# Table of Contents
### 1. [Introduction](#Introduction)
### 2. [Set Up Python Virtual Environment (venv), Dependencies, and Jupyter Instance](#Set-Up-Python-Virtual-Environment-(venv),-Dependencies,-and-Jupyter-Instance)
### 3. [Overview of the Chat Completion API](#Overview-of-the-Chat-Completion-API)
### 4. [Example ChatCompletion.create() Calls](#Example-ChatCompletion.create()-Calls)
### 5. [Creating and Managing a Basic Conversation Loop](#Creating-and-Managing-a-Basic-Conversation-Loop)

## Introduction

The ChatGPT and GPT-4 models are optimized for conversational interfaces and work differently than the older GPT-3 models. They are conversation-in and message-out: they expect input formatted as a chat-like transcript and return a message. Azure OpenAI provides two options for interacting with these models: the Chat Completion API and the Completion API with Chat Markup Language (ChatML).
The Chat Completion API is the preferred method for accessing these models, while ChatML provides lower-level access but requires additional input validation and only supports ChatGPT models. Use the techniques described in this notebook to get the best results from the new models.

This notebook will cover the aspects of the Chat Completion Python API with conversation, roles (system, assistant, user) and examples of different usage scenarios.

## Set Up Python Virtual Environment (venv), Dependencies, and Jupyter Instance

##### Create Python Virtual Environment (venv)

**Set up your environment (You can skip this if you have a working environment)**
**These commands should be entered in the command line or console**

**Instructions as written are for command prompt within VS Code**

**Instructions were written for Python 3.10.4**

**Create virtual environment to start - the first portion of the code below**
```commandline
C:\<Location>\<To your python install>\python -m venv C:\<Location>\<You want to create your virtual environment>
```

**Change directory to new virtual environment and activate**
```commandline
cd C:\<Location>\<You created your virtual environment in>\scripts

Activate
```

**Change directory to code repository**
```commandline
cd C:\<Location>\<To your code repository>
```

##### Install dependencies

In the command line, install each of the following packages with `pip install`:

```commandline
pip install <package name>
```

* openai
* jupyterlab

##### Launch Jupyter instance

Launch your Jupyter instance by starting the jupyter server:

```commandline
jupyter-lab
```

You may need to select your virtual environment as the kernel for your notebook. In VS Code, this can be done in the top right corner of the notebook window.

##### Set Environment Variables

Add in the environment variables for your OpenAI API Key and API Base URL. You can find these values in the Azure Portal under your OpenAI resource.

New environment variables can be added with an assignment of the form `os.environ['NEW_VARIABLE_NAME'] = '/new/value'`. For this notebook, set the following three:

```python
import os
os.environ['OPENAI_API_KEY'] = '<YOUR OPENAI API KEY>'
os.environ['OPENAI_API_BASE'] = '<YOUR OPENAI API ENDPOINT URL>'
os.environ['OPEN_AI_ENGINE'] = '<NAME UNDER WHICH YOU DEPLOYED YOUR MODEL>'
```

In [1]:
# Set your environment variables here or hard-code them (not recommended)
import os
os.environ['OPENAI_API_KEY'] = '<YOUR OPENAI API KEY>'  # Never commit a real key to a shared notebook
os.environ['OPENAI_API_BASE'] = 'https://embeddings-openaiplayground.openai.azure.com/'
os.environ['OPEN_AI_ENGINE'] = 'chat'


Call os.getenv() to retrieve the value of an environment variable. If the variable does not exist, os.getenv() returns None. In that case you may need to verify
that you are using the same name as the environment variable you created or try re-running the os.environ command.
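As a minimal sketch of the check described above, the snippet below reports which of the three variables used in this notebook are unset. The helper name `missing_env_vars` is hypothetical and not part of the `os` module or the OpenAI library.

```python
import os

# Names of the environment variables this notebook relies on.
REQUIRED_VARS = ["OPENAI_API_KEY", "OPENAI_API_BASE", "OPEN_AI_ENGINE"]


def missing_env_vars(names):
    """Return the names whose environment variable is unset or empty.

    Hypothetical helper for illustration: os.getenv() returns None when a
    variable does not exist, so a falsy result marks the variable as missing.
    """
    return [name for name in names if not os.getenv(name)]


missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```

If a name shows up as missing, check its spelling against the `os.environ` cell above and re-run that cell.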

```python
import openai
openai.api_type = "azure"
openai.api_version = "2023-03-15-preview" 
openai.api_base = os.getenv("OPENAI_API_BASE")  # Your Azure OpenAI resource's endpoint value.
openai.api_key = os.getenv("OPENAI_API_KEY")  # Your Azure OpenAI resource's key value.
openai_engine = os.getenv("OPEN_AI_ENGINE") # The name you gave your OpenAI Model deployment in Azure
```

In [2]:
import openai
openai.api_type = "azure"
openai.api_version = "2023-03-15-preview" 
openai.api_base = os.getenv("OPENAI_API_BASE")  # Your Azure OpenAI resource's endpoint value.
openai.api_key = os.getenv("OPENAI_API_KEY")  # Your Azure OpenAI resource's key value.
openai_engine = os.getenv("OPEN_AI_ENGINE") # The name you gave your OpenAI Model deployment in Azure

## Overview of the Chat Completion API

> **Note:** The following parameters aren't available with the new ChatGPT and GPT-4 models: **logprobs**, **best_of**, and **echo**. If you set any of these parameters, you'll get an error. gpt-35-turbo is equivalent to the gpt-3.5-turbo model from OpenAI.

##### ChatCompletion.create()
OpenAI trained the ChatGPT and GPT-4 models to accept input formatted as a conversation. The messages parameter takes an array of dictionaries with a conversation organized by role. The three types of roles are:

* system
* assistant
* user 

A sample input containing a simple system message, a one-shot example of a user and assistant interacting, and the final "actual" user-supplied prompt is shown below:

```json
{"role": "system", "content": "Provide some context and/or instructions to the model."},
{"role": "user", "content": "Example question goes here."},
{"role": "assistant", "content": "Example answer goes here."},
{"role": "user", "content": "First question/message for the model to actually respond to."}
```
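In Python, the same conversation is a list of dictionaries. The sketch below writes it out and adds a small role check; `VALID_ROLES` and `invalid_roles` are hypothetical names introduced here for illustration, not part of the OpenAI library.

```python
# The three roles accepted in the messages parameter.
VALID_ROLES = {"system", "user", "assistant"}

# The sample conversation from above as a Python list of dictionaries.
messages = [
    {"role": "system", "content": "Provide some context and/or instructions to the model."},
    {"role": "user", "content": "Example question goes here."},
    {"role": "assistant", "content": "Example answer goes here."},
    {"role": "user", "content": "First question/message for the model to actually respond to."},
]


def invalid_roles(messages):
    """Return every role in `messages` that is not system, user, or assistant."""
    return [m["role"] for m in messages if m["role"] not in VALID_ROLES]


print(invalid_roles(messages))  # an empty list means every role is recognized
```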

Let's dive deeper into the three possible roles: system, assistant, and user.

##### System Role
The system role, also known as the system message, is included at the beginning of the array. This message provides the initial instructions to the model. You can provide various information in the system role including:

* A brief description of the assistant
* Personality traits of the assistant
* Instructions or rules you would like the assistant to follow
* Data or information needed for the model, such as relevant questions from an FAQ

You can customize the system role for your use case or just include basic instructions. The system role/message is optional, but it's recommended to at least include a basic one to get the best results.

##### Assistant Role

The assistant role carries the replies of the model, that is, of your assistant. You can omit this role in an initial ChatCompletion.create() call if desired, though it is required if you are going to pass a one-shot or few-shot example through the messages parameter.

We'll look at some examples of the Chat Completion API in action after covering the user role.

##### User Role

The user role carries the message that the user sends to the assistant. This is the message that the model will respond to, so at least one user message is required for the model to respond.

> **Note:** To trigger a response from the model, you should end with a user message indicating that it's the assistant's turn to respond. 
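The ordering rule in the note above can be checked before a call is made. This is a minimal sketch; `ready_for_reply` is a hypothetical helper name, and the check only looks at the role of the last message.

```python
def ready_for_reply(messages):
    """Return True when the list is non-empty and its last message has the user role."""
    return bool(messages) and messages[-1].get("role") == "user"


conversation = [
    {"role": "system", "content": "Assistant is a large language model trained by OpenAI."},
    {"role": "user", "content": "What's the difference between garbanzo beans and chickpeas?"},
]

print(ready_for_reply(conversation))
```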

In [3]:
# Note that the max_tokens parameter is commented out below. Setting it (for example, to 500) is not required,
# but a higher value allows longer responses so they do not get cut off mid-response.

# openai_engine parameter is the name supplied to the deployed OpenAI model (This is something that is set at the time the model is deployed)

response = openai.ChatCompletion.create(
    engine=openai_engine, # The deployment name you chose when you deployed the ChatGPT or GPT-4 model.
    messages=[
        {"role": "system", "content": "Assistant is a large language model trained by OpenAI."},
        {"role": "user", "content": "What's the difference between garbanzo beans and chickpeas?"},
    ],
    # temperature=0.75,
    # max_tokens=500,
    # top_p=0.90,
    # frequency_penalty=0,
    # presence_penalty=0,
    # stop=None
)


In the cell above, we supplied two of our roles: system and user. The system message is very simple - it just provides some context for the model. The user message is the prompt that the model will respond to. Let's see what the model returns.

> **Note:** The cell above has some unused parameters commented out. These are not required for the ChatCompletion.create() call to run, but they are available if you want to use them. Feel free to experiment with these parameters throughout the notebook to see how they affect the model's output. It is typically best practice to adjust either temperature or top_p, but not both, because the two interact in how they affect the determinism of the model's output.

In [4]:
print(response)

{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "Garbanzo beans and chickpeas are the same thing; they are simply two different names for the same type of bean. They are part of the legume family and have a distinct nutty flavor and a slightly firm texture. In some regions of the world, such as Spain and South America, they are called garbanzos, while in other places, such as the Middle East and India, they are known as chickpeas.",
        "role": "assistant"
      }
    }
  ],
  "created": 1679970822,
  "id": "chatcmpl-6ytRWigzQa3b1QAe5Y3nBW8jUnpD3",
  "model": "gpt-35-turbo",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 91,
    "prompt_tokens": 37,
    "total_tokens": 128
  }
}


In the output of print(response) above, we can see that the response contains an array of responses under "choices". We'll mainly focus on the "content" within the "message" returned under "choices". However, it is helpful to understand a few of the other fields, such as "finish_reason". Every response includes a finish_reason. The possible values for finish_reason are:

* **stop**: API returned complete model output.
* **length**: Incomplete model output due to max_tokens parameter or token limit.
* **content_filter**: Omitted content due to a flag from our content filters.
* **null**: API response still in progress or incomplete.
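As a rough sketch of how these values might be handled in code, the snippet below looks up the finish_reason of a choice and returns it alongside the message content. It runs against a plain dictionary shaped like the printed response; `FINISH_REASON_NOTES` and `summarize_choice` are hypothetical names, and JSON null is assumed to surface as Python `None`.

```python
# Short notes for each finish_reason value listed above.
FINISH_REASON_NOTES = {
    "stop": "complete model output",
    "length": "output truncated by max_tokens or the token limit",
    "content_filter": "content omitted by the content filter",
    None: "response still in progress or incomplete",
}


def summarize_choice(response, index=0):
    """Return (finish_reason, note, content) for one entry under "choices"."""
    choice = response["choices"][index]
    reason = choice.get("finish_reason")
    note = FINISH_REASON_NOTES.get(reason, "unrecognized finish_reason")
    # The content may be absent, for example when a filter omitted it.
    content = choice.get("message", {}).get("content")
    return reason, note, content


# A trimmed stand-in for the response printed above.
sample_response = {
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {"role": "assistant", "content": "Garbanzo beans and chickpeas are the same thing."},
        }
    ]
}

reason, note, content = summarize_choice(sample_response)
print(f"{reason}: {note}")
```

A "length" result is a hint that max_tokens may need to be raised before the reply is shown to a user.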

The material supplied under "usage" can also be helpful when trying to keep track of the number of tokens used in your request. The "usage" object includes the following information:
* **completion_tokens**: The number of tokens in the generated completion, that is, the reply returned in the "assistant" role.
* **prompt_tokens**: The number of tokens in the input sent to the model. This counts every message passed in (system, user, and any assistant examples), not only the "user" message.
* **total_tokens**: The total number of tokens used in the request.
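To keep track of tokens across several requests, the "usage" values can be summed as responses come back. The sketch below is one way to do that; `add_usage` is a hypothetical helper, and the sample dictionary reuses the usage numbers from the response printed above.

```python
def add_usage(totals, response):
    """Add the token counts from one response's "usage" object to a running total."""
    usage = response.get("usage", {})
    for key in ("prompt_tokens", "completion_tokens", "total_tokens"):
        totals[key] = totals.get(key, 0) + usage.get(key, 0)
    return totals


# Stand-in for a response, carrying only the "usage" object shown above.
sample_usage_response = {
    "usage": {"completion_tokens": 91, "prompt_tokens": 37, "total_tokens": 128}
}

running_totals = {}
add_usage(running_totals, sample_usage_response)  # first request
add_usage(running_totals, sample_usage_response)  # second request with the same usage
print(running_totals)
```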

However, we're most interested in the "content" within the "message" returned under "choices". Let's print just the assistant's reply by accessing that field directly:

Copy and paste this code into the cell below and run it to see the assistant's response:
```python
print(response['choices'][0]['message']['content'])
```

In the output of print(response['choices'][0]['message']['content']), we can see that the "assistant" responded by informing the user that garbanzo beans and chickpeas are the same thing. In the next sections, we'll focus on how we can refine our ChatCompletion.create() calls to get more specific responses or fit different scenarios.

## Example ChatCompletion.create() Calls

Our basic system message previously was:
```json
{"role": "system", "content": "Assistant is a large language model trained by OpenAI."}
```

This gives us an assistant that approximates the initial OpenAI ChatGPT assistant. To create an assistant that helps us with our taxes instead, let's change the system message and add a user question:
```json
        {"role": "system", "content": "Assistant is an intelligent chatbot designed to help users answer their tax related questions.\
        Instructions:\
        - Only answer questions related to taxes.\
        - If you are unsure of an answer, you can say 'I do not know' or 'I am not sure' and recommend users go to the IRS website for more information. "},
        {"role": "user", "content": "When are my taxes due?"}
```

You can copy and paste the above system role message into the cell below and run it to see the assistant's response:


In [5]:
response = openai.ChatCompletion.create(
    engine=openai_engine, # The deployment name you chose when you deployed the ChatGPT or GPT-4 model.
    messages=[
        {PASTE CONTENT FROM ABOVE HERE} # Paste in the role messages from above,
    ],
    temperature=0.50,
    max_tokens=500,
    top_p=0.95,
    frequency_penalty=0,
    presence_penalty=0,
    stop=None
)

print(response['choices'][0]['message']['content'])

The deadline to file your federal income tax return is usually April 15th of each year. However, due to the COVID-19 pandemic, the deadline for filing 2020 taxes has been extended to May 17th, 2021. It's important to note that some states may have different deadlines, so it's a good idea to check with your state's tax agency to confirm.


We can see in the answer above that our assistant got the *typical* federal tax deadline correct (April 15th) and noted that the deadline was extended during the COVID-19 pandemic.
However, in 2023, April 15th happens to fall on a Saturday, so the deadline has been pushed back to April 18th. Let's see if we can get our assistant to respond with the correct date.

To do so, let's provide our assistant the correct answer and reasoning in a few-shot example. We'll add the following to our messages call in ChatCompletion.create():
```json
        {"role": "system", "content": "Assistant is an intelligent chatbot designed to help users answer their tax related questions. "},
        {"role": "user", "content": "When do I need to file my taxes by?"},
        {"role": "assistant", "content": "In 2023, you will need to file your taxes by April 18th. The date falls after the usual April 15th deadline because April 15th falls on a Saturday in 2023. For more details, see https://www.irs.gov/filing/individuals/when-to-file."},
        {"role": "user", "content": "How can I check the status of my tax refund?"},
        {"role": "assistant", "content": "You can check the status of your tax refund by visiting https://www.irs.gov/refunds"}
```

Here we are providing our assistant with a few-shot example of a user asking about the tax deadline and the assistant giving the correct answer. We are also providing an example of how to respond when a user asks about their tax refund, including the URL of the IRS page where they can check its status.

Copy and paste the role messages above into the cell below and run to verify that the assistant's response is now correct:

In [None]:
response = openai.ChatCompletion.create(
    engine=openai_engine, # The deployment name you chose when you deployed the ChatGPT or GPT-4 model.
    messages=[
        {PASTE CONTENT FROM ABOVE HERE} # Paste in the role messages from above, keeping the new user message below
        {"role": "user", "content": "When do I need to file my taxes by?"},
    ],
    temperature=0.50,
    max_tokens=500,
    top_p=0.95,
    frequency_penalty=0,
    presence_penalty=0,
    stop=None
)

print(response['choices'][0]['message']['content'])

Our assistant now responds with the correct date for the 2023 tax deadline!

In the example above, we were able to supply the correct answer to a common question via few-shot learning. This approach is similar to few-shot prompting with GPT-3 models, but the format differs slightly because of the back-and-forth format that ChatCompletion entails. This approach can also be useful for demonstrating a behavior pattern to the model.
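The back-and-forth layout can also be assembled programmatically. The sketch below builds a messages list from (question, answer) pairs using the tax examples from this section; `build_few_shot_messages` is a hypothetical helper, not part of the OpenAI library.

```python
def build_few_shot_messages(system_message, examples, question):
    """Build a messages list: system message, example pairs, then the real question.

    `examples` is a list of (user_text, assistant_text) tuples. Each pair becomes
    a user message followed by the assistant message that answers it.
    """
    messages = [{"role": "system", "content": system_message}]
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    # End on a user message so the model knows it is the assistant's turn.
    messages.append({"role": "user", "content": question})
    return messages


tax_examples = [
    (
        "When do I need to file my taxes by?",
        "In 2023, you will need to file your taxes by April 18th. The date falls after the usual "
        "April 15th deadline because April 15th falls on a Saturday in 2023. For more details, see "
        "https://www.irs.gov/filing/individuals/when-to-file.",
    ),
    (
        "How can I check the status of my tax refund?",
        "You can check the status of your tax refund by visiting https://www.irs.gov/refunds",
    ),
]

few_shot_messages = build_few_shot_messages(
    "Assistant is an intelligent chatbot designed to help users answer their tax related questions. ",
    tax_examples,
    "When do I need to file my taxes by?",
)
print(len(few_shot_messages))
```

The resulting list can be passed as the messages argument of ChatCompletion.create().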

##### Expanding the system role message to include more context

Another way we can use the ChatCompletion format to adjust our assistant is to supply it with a specific context. As with all prompt engineering, the best practices of being concise, clear, specific, and affirmative (telling the model what to do as opposed to what not to do) apply.

Let's see what happens if we provide our assistant with some context about Azure OpenAI Service.

Copy and paste the following system role message into the cell below and run it to see the assistant's response:

```json
        {"role": "system", "content": "Assistant is an intelligent chatbot designed to help users answer technical questions about Azure OpenAI Service.\
            Only answer questions using the context below and if you're not sure of an answer, you can say 'I don't know'.\
            Context:\
            - Azure OpenAI Service provides REST API access to OpenAI's powerful language models including the GPT-3, Codex and Embeddings model series.\
            - Azure OpenAI Service gives customers advanced language AI with OpenAI GPT-3, Codex, and DALL-E models with the security and enterprise promise of Azure.\
                 Azure OpenAI co-develops the APIs with OpenAI, ensuring compatibility and a smooth transition from one to the other.\
            - At Microsoft, we're committed to the advancement of AI driven by principles that put people first. Microsoft has made significant investments to help guard against abuse and unintended harm, which includes requiring applicants to show well-defined use cases, incorporating Microsoft’s principles for responsible AI use."
        },
        {"role": "user", "content": "What is Azure OpenAI Service?"}
```

In [7]:
response = openai.ChatCompletion.create(
    engine=openai_engine, # The deployment name you chose when you deployed the ChatGPT or GPT-4 model.
    messages=[
        {PASTE CONTENT FROM ABOVE HERE} # Paste in the role messages from above
    ],
    temperature=0.50,
    max_tokens=500,
    top_p=0.95,
    frequency_penalty=0,
    presence_penalty=0,
    stop=None
)

print(response['choices'][0]['message']['content'])

Azure OpenAI Service is a cloud-based service provided by Microsoft Azure that offers REST API access to OpenAI's language models, including the GPT-3, Codex, and Embeddings model series. It provides advanced language AI capabilities to customers with the security and enterprise promise of Azure. The APIs are co-developed by Azure and OpenAI to ensure compatibility and a smooth transition from one to the other.
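When the context comes from a list of facts, such as FAQ entries, the system message can be assembled in code instead of being written by hand. The sketch below is one possible layout; `build_grounded_system_message` is a hypothetical helper, and it joins the parts with newlines rather than the backslash continuations used in the message above.

```python
def build_grounded_system_message(description, instructions, context_items):
    """Join a description, instructions, and bulleted context into one system message."""
    lines = [description, instructions, "Context:"]
    lines.extend(f"- {item}" for item in context_items)
    return "\n".join(lines)


system_content = build_grounded_system_message(
    "Assistant is an intelligent chatbot designed to help users answer technical questions "
    "about Azure OpenAI Service.",
    "Only answer questions using the context below and if you're not sure of an answer, "
    "you can say 'I don't know'.",
    [
        "Azure OpenAI Service provides REST API access to OpenAI's powerful language models "
        "including the GPT-3, Codex and Embeddings model series.",
    ],
)

grounded_messages = [
    {"role": "system", "content": system_content},
    {"role": "user", "content": "What is Azure OpenAI Service?"},
]
print(system_content)
```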


##### ChatCompletion for non-chat based tasks

You may wish to take advantage of some of the features that the GPT-3.5 and GPT-4 models offer in scenarios that are not strictly chat-based, such as task-based scenarios. In many cases, you can replace what you would have provided as part of a longer **prompt** in the GPT-3 models with a similar message in the **system role** of the ChatCompletion.create() call.


For example, you may wish to utilize the model to extract information from unstructured data and return it in a more structured format. In this example, we may wish to take a voice message and retrieve the relevant information from it. We can do this by providing the model with the voice message and a system role message that describes the task we wish to accomplish:

```json
      {"role": "system", "content": '''
      You are an assistant designed to extract entities from text.
      Users will paste in a string of text and you will respond with entities you've extracted from the text as a JSON object.
      Only respond with the JSON output.
      Here's an example of your output format:
      {
      "name": "",
      "company": "",
      "phone_number": ""
        }
        '''},
      {"role": "user", "content": "Hello. My name is Robert Smith. I'm calling from Contoso Insurance, Delaware. My colleague mentioned that you are interested in learning about our comprehensive benefits policy. Could you give me a call back at (555) 346-9322 when you get a chance so we can go over the benefits?"}
```

In [None]:
response = openai.ChatCompletion.create(
    engine=openai_engine, # The deployment name you chose when you deployed the ChatGPT or GPT-4 model.
    messages=[
        {PASTE CONTENT FROM ABOVE HERE} # Paste in the role messages from above
    ],
    temperature=0.50,
    max_tokens=500,
    top_p=0.95,
    frequency_penalty=0,
    presence_penalty=0,
    stop=None
)

print(response['choices'][0]['message']['content'])
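Because the system message asks for JSON only, the reply can be parsed into a Python dictionary. The sketch below assumes the reply is a single JSON object with the three keys from the system message; `parse_entities` is a hypothetical helper, and `hypothetical_reply` is a hand-written stand-in for illustration, not actual model output.

```python
import json


def parse_entities(reply_text, expected_keys=("name", "company", "phone_number")):
    """Parse a JSON reply into a dict holding the expected keys.

    Returns None when the reply is not a JSON object, which can happen if the
    model adds extra text around the JSON. Missing keys default to "".
    """
    try:
        data = json.loads(reply_text)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    return {key: data.get(key, "") for key in expected_keys}


# Hand-written stand-in for a reply (an assumption, not real model output).
hypothetical_reply = '{"name": "Robert Smith", "company": "Contoso Insurance", "phone_number": "(555) 346-9322"}'

entities = parse_entities(hypothetical_reply)
print(entities)
```

In the notebook, the text passed to the parser would be response['choices'][0]['message']['content'].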

## Creating and Managing a Basic Conversation Loop