### Welcome to the ChatGPT API Course!

Hello everyone, and welcome to this exciting course where we will be exploring the functionalities and capabilities of the OpenAI API, specifically focusing on working with the ChatGPT model. Throughout this course, you will learn how to build applications leveraging the power of ChatGPT to perform a variety of language processing tasks.

### Module 1: Introduction to OpenAI API and ChatGPT

#### **1.1 Overview of OpenAI API**
OpenAI offers a powerful API that grants access to cutting-edge language models trained to understand and generate text effectively. This API is versatile, capable of handling a wide array of tasks involving language processing, including but not limited to:
- Content generation
- Summarization
- Classification and sentiment analysis
- Data extraction
- Translation

#### **1.2 Understanding the Completions Endpoint**
The core of the OpenAI API is the completions endpoint. It operates on a simple yet flexible interface where you provide a text prompt, and the API returns a text completion that aligns with the instructions or context given in the prompt. Essentially, it functions like an advanced autocomplete system, predicting the most probable text to follow your prompt.

### Module 2: Building a Sample Application

In this module, we will walk you through building a sample application, where you will learn the key concepts and techniques fundamental to using the API for any task.

#### **2.1 Crafting Effective Prompts**
The first step in utilizing the API effectively is crafting a clear and specific prompt that communicates your request to the model. You will learn how to "program" the model through prompt design, making your instructions more specific to obtain desired results.

#### **2.2 Incorporating Examples**
Sometimes, instructions alone may not suffice. In such cases, adding examples to your prompt can help convey patterns or nuances, guiding the model to provide the kind of responses you're looking for.
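One common way to add examples is to place them as earlier user/assistant turns ahead of the real input. The sketch below only builds the `messages` list and makes no API call; the helper name `build_few_shot_messages` and the sentiment examples are illustrative assumptions, not part of the course code.

```python
def build_few_shot_messages(instruction, examples, query):
    """Build a chat `messages` list with worked examples before the real query.

    `examples` is a list of (input_text, expected_output) pairs.
    """
    messages = [{"role": "system", "content": instruction}]
    for input_text, expected_output in examples:
        # Each example is a user turn followed by the ideal assistant reply.
        messages.append({"role": "user", "content": input_text})
        messages.append({"role": "assistant", "content": expected_output})
    # The real input goes last, so the model continues the demonstrated pattern.
    messages.append({"role": "user", "content": query})
    return messages


few_shot_messages = build_few_shot_messages(
    "Classify the sentiment of each review as positive or negative.",
    [("I loved this movie.", "positive"), ("The food was cold.", "negative")],
    "The staff were friendly and helpful.",
)
for message in few_shot_messages:
    print(message["role"], "->", message["content"])
```

The resulting list can be passed as the `messages` argument of a chat completion call in the same way as the hand-written lists later in this course.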

#### **2.3 Adjusting Settings**
Apart from designing prompts, you can also adjust the request settings. One crucial setting is the "temperature," which controls the diversity of the completions. The valid range is 0 to 2: a higher value (1 or above) encourages more diverse outputs, while a lower value (closer to 0) makes the completions more deterministic and focused.

### Module 3: Building Your Application

#### **3.1 Setting Up**
In this part of the course, we will guide you through setting up your environment, including installing necessary tools and obtaining your API key.

#### **3.2 Understanding and Implementing the Code**
We will delve deep into the code structure, helping you understand how to generate prompts dynamically and how to send API requests using specific settings to get the desired outputs.

### Getting Started with ChatGPT API: Installation and API Key

#### **Step 1: Sign Up and API Key**

Before you start working with the ChatGPT API, you need to sign up for access to OpenAI's API. Once you have signed up, you will be provided with an API key. This key is essential as it allows you to make requests to the API.

```python
# Your API key will look something like this:
API_KEY = "sk-abcdef1234567890abcdef1234567890"
```

#### **Step 2: Installing OpenAI Python**

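Next, install the OpenAI Python package, a client library that lets you interact with the OpenAI API easily. The examples in this course use the pre-1.0 interface (`openai.ChatCompletion`), which matches version 0.28.0 seen in the install output further down; an unpinned install may fetch a newer release in which these calls differ. You can install the package using the following command: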

```bash
pip install openai
```

#### **Step 3: Setting Up Your Environment**

After installing the OpenAI Python package, set up your environment variable to store the API key. This way, you can use the API key in your scripts securely. Use the following command to set the environment variable:

```bash
export OPENAI_API_KEY="sk-abcdef1234567890abcdef1234567890"
```
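Once the variable is exported, a script can read it instead of hard-coding the key. The following is a minimal sketch; the helper name `get_api_key` and the stand-in `demo_env` mapping are assumptions made for illustration, and the key shown is the same placeholder used above.

```python
import os


def get_api_key(env=os.environ, name="OPENAI_API_KEY"):
    """Return the API key stored in `env`, or raise a clear error if it is missing."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the script.")
    return key


# Demonstration with a stand-in mapping instead of the real environment.
demo_env = {"OPENAI_API_KEY": "sk-abcdef1234567890abcdef1234567890"}
print(get_api_key(demo_env)[:3] + "...")  # print only a prefix, never the full key
```

In a real script you would call `get_api_key()` with no arguments and assign the result to `openai.api_key`.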

#### **Step 4: Making Your First API Call**

Now that everything is set up, you can make your first API call using a Python script. Here is a simple script that sends a prompt to the ChatGPT model and receives a response:

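```python
import os
import openai

# Read the key from the environment variable set in Step 3 instead of hard-coding it
openai.api_key = os.environ["OPENAI_API_KEY"]

# gpt-3.5-turbo is a chat model, so it is called through the chat completions
# endpoint with a list of messages rather than a bare prompt string
response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=[{"role": "user", "content": "Once upon a time,"}],
  temperature=0.7,
  max_tokens=150,
)

print(response["choices"][0]["message"]["content"].strip())
```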

---

This is your starting point in working with the ChatGPT API. As you progress through the course, you will learn more about the different parameters and options you can use to tailor the model's responses to your needs. Let's get coding!

---

Let's run this command:

In [3]:
# Installing openai package
!pip install openai

Collecting openai
  Downloading openai-0.28.0-py3-none-any.whl (76 kB)
Installing collected packages: openai
Successfully installed openai-0.28.0



#### Get API Key: [API Keys](https://platform.openai.com/account/api-keys)
#### Sample .env file:

```
OPENAI_API_KEY=sk-abcdef1234567890abcdef1234567890
```

In [4]:
# Export your API Key to environment variable
# Upload .env file
!pip install python-dotenv
from dotenv import load_dotenv
import os
load_dotenv()

Collecting python-dotenv
  Downloading python_dotenv-1.0.0-py3-none-any.whl (19 kB)
Installing collected packages: python-dotenv
Successfully installed python-dotenv-1.0.0


True

In [5]:
# Import openai API module
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

messages = [
            {"role": "system", "content": "Complete the sentance"},
            {"role": "user", "content": "Once upon a time, "},
        ]

response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=messages,
  temperature=1,
  max_tokens=150,
)

## Understanding the API call:

Let's break down the different components of the code snippet:

```python
messages = [
            {"role": "system", "content": "Complete the sentance"},
            {"role": "user", "content": "Once upon a time, "},
        ]
```

#### **`messages` Array**

In this part of the code, we define a list of message objects to interact with the ChatGPT model. Each message object contains two properties:

- **`role`**: It specifies the role of the entity sending the message. It can be `"system"`, `"user"`, or `"assistant"`.
- **`content`**: It contains the actual content of the message from the entity.

Here, we have two messages:
1. A system message instructing to "Complete the sentence".
2. A user message providing the initial part of a sentence to be completed.
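Because every message must carry a `role` and a `content`, it can help to check a list before sending it. The sketch below is a hypothetical helper (`validate_messages` is not part of the `openai` package) and only accepts the three roles described above; the API may accept others.

```python
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_messages(messages):
    """Raise ValueError if any message lacks a known role or string content."""
    if not messages:
        raise ValueError("messages must contain at least one message")
    for position, message in enumerate(messages):
        if message.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"message {position}: unknown role {message.get('role')!r}")
        if not isinstance(message.get("content"), str):
            raise ValueError(f"message {position}: content must be a string")
    return messages


checked = validate_messages(
    [
        {"role": "system", "content": "Complete the sentence"},
        {"role": "user", "content": "Once upon a time, "},
    ]
)
print(len(checked), "messages look well-formed")
```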

```python
response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=messages,
  temperature=1,
  max_tokens=150,
)
```

#### **API Call**

In this section, we make a call to the OpenAI API to get a response from the ChatGPT model. Let's break down the parameters used in this API call:

- **`model`**: Specifies the model to use. Here, we are using `"gpt-3.5-turbo"`, a chat-optimized model from the GPT-3.5 series.
- **`messages`**: We pass the `messages` array we defined earlier to set the conversation history and the current prompt.
- **`temperature`**: Set to 1, this parameter controls the randomness in the model's output. A higher value like 1 encourages more diverse and creative responses.
- **`max_tokens`**: This parameter limits the response to a maximum of 150 tokens, controlling the length of the output.
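The four parameters can be collected and sanity-checked before the call is made. This is a sketch under stated assumptions: `build_chat_request` is a hypothetical helper, and the only checks it performs are the 0 to 2 temperature range and a positive `max_tokens`.

```python
def build_chat_request(messages, model="gpt-3.5-turbo", temperature=1.0, max_tokens=150):
    """Collect the keyword arguments for a chat completion call after basic checks."""
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if max_tokens < 1:
        raise ValueError("max_tokens must be a positive integer")
    return {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens,
    }


request = build_chat_request(
    [{"role": "user", "content": "Once upon a time, "}], temperature=0.7
)
print(request)
# With the openai 0.28 client used in this course, the dictionary could be passed as:
# response = openai.ChatCompletion.create(**request)
```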

```python
print(response["choices"][0].message["content"].strip())
```

#### **Printing the Response**

Here, we extract and print the content of the message from the response generated by the ChatGPT model. The response object contains a lot of information, but we are specifically interested in the `content` of the first choice in the `choices` array, which contains the generated message from the model. We use the `strip()` method to remove any leading or trailing whitespace from the output.

---

This code snippet essentially sets up a small conversation with the ChatGPT model, where it is instructed to complete a sentence provided by the user, and then prints the model's response to the console. It is a simple yet powerful demonstration of how you can interact with ChatGPT using the OpenAI API.

In [6]:
# Print entire response:
print("Response: ")
print(response, "\n")

# Print only the output:
print("Output: ")
print(response["choices"][0].message["content"].strip())

Response: 
{
  "id": "chatcmpl-7zatoXn2mzRBX84Ko8iwzDXzuV6z5",
  "object": "chat.completion",
  "created": 1694914204,
  "model": "gpt-3.5-turbo-0613",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "in a faraway kingdom, there lived a brave and kind-hearted princess."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 21,
    "completion_tokens": 15,
    "total_tokens": 36
  }
} 

Output: 
in a faraway kingdom, there lived a brave and kind-hearted princess.


## Understanding the response:

```json
{
  "id": "chatcmpl-7yyzU9qt1dxH9z03FJXtdODZlJCcM",
  "object": "chat.completion",
  "created": 1694768484,
  "model": "gpt-3.5-turbo-0613",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "there was a kingdom ruled by a wise and just king."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 21,
    "completion_tokens": 12,
    "total_tokens": 33
  }
}
```

#### **Response Metadata**

- **`id`**: This is a unique identifier for the API response. It can be used for referencing and tracking individual API calls.
- **`object`**: This field indicates the type of object returned, which in this case is a "chat.completion".
- **`created`**: This is a timestamp indicating when the response was created. It is represented in Unix time format.
- **`model`**: This field specifies the version of the model used to generate the response. Here, it is "gpt-3.5-turbo-0613".

```json
"choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "there was a kingdom ruled by a wise and just king."
      },
      "finish_reason": "stop"
    }
  ],
```

#### **Choices Array**

- **`choices`**: This is an array containing the responses generated by the model. Although we requested a single response, it is possible to ask for more choices by setting the `n` parameter in the API call.
  - **`index`**: This field indicates the index of the choice in the array. Since we have only one choice, the index is 0.
  - **`message`**: This is an object containing the details of the message generated by the assistant.
    - **`role`**: Specifies the role of the entity that generated the message, which is "assistant" in this case.
    - **`content`**: Contains the actual content of the message generated by the assistant.
  - **`finish_reason`**: Indicates why the assistant stopped generating further content. Here, it stopped because it reached a logical stopping point, denoted by "stop".
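Besides `"stop"`, `finish_reason` can be `"length"` when generation was cut off by the token limit, so checking it is a simple way to detect truncated replies. The sketch below works on a response-shaped dictionary copied from the example above and makes no API call; `extract_reply` is a hypothetical helper name.

```python
def extract_reply(response):
    """Return (text, was_truncated) for the first choice of a chat completion response."""
    choice = response["choices"][0]
    text = choice["message"]["content"].strip()
    # "length" means generation stopped because the token limit was reached.
    was_truncated = choice["finish_reason"] == "length"
    return text, was_truncated


# Response-shaped dictionary copied from the example above (no API call is made).
sample_response = {
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "there was a kingdom ruled by a wise and just king.",
            },
            "finish_reason": "stop",
        }
    ]
}
text, was_truncated = extract_reply(sample_response)
print(text)
print("truncated:", was_truncated)
```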

```json
"usage": {
    "prompt_tokens": 21,
    "completion_tokens": 12,
    "total_tokens": 33
  }
```

#### **Usage Information**

- **`usage`**: This section provides details about the token usage in the API call.
  - **`prompt_tokens`**: Indicates the number of tokens used in the prompt. In this case, it is 21 tokens.
  - **`completion_tokens`**: Specifies the number of tokens generated in the completion. Here, it is 12 tokens.
  - **`total_tokens`**: Represents the total number of tokens used in the API call, which is the sum of the prompt and completion tokens, amounting to 33 tokens.

---

## Multiple responses in the API call:

```python
response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=messages,
  temperature=1,
  max_tokens=150,
  n=5
)
```

#### **`n` Parameter**

- **`n`**: This parameter controls the number of responses (or "completions") we want to receive from the API for a single prompt. By setting it to 5, we are instructing the API to generate 5 different completions for the prompt provided in the `messages` array.

```python
for i in range(5):
  # Print only the output:
  print(f"Output {i+1}: ")
  print(response["choices"][i].message["content"].strip())
```

#### **Printing Multiple Responses**

In this section of the code:

- We loop through the range of 5 (since we requested 5 responses) using a for loop.
- Inside the loop, we print a header indicating the output number (e.g., "Output 1", "Output 2", etc.).
- Following the header, we print each individual response generated by the ChatGPT model. We access each response from the `choices` array in the response object using the loop index `i`.
- The `strip()` method is used to remove any leading or trailing whitespace from each output.
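A slightly more defensive variant iterates over the `choices` list itself, so the loop stays correct if `n` changes. The sketch below uses a hand-built, response-shaped dictionary with two of the completions shown in this notebook rather than a live API call; `print_choices` is a hypothetical helper name.

```python
def print_choices(response):
    """Print every completion in `response`, however many were requested with `n`."""
    outputs = []
    for choice in response["choices"]:
        text = choice["message"]["content"].strip()
        outputs.append(text)
        print(f"Output {choice['index'] + 1}: ")
        print(text)
    return outputs


# Stand-in for an API response with n=2 (texts taken from this notebook's outputs).
sample_response = {
    "choices": [
        {"index": 0, "message": {"role": "assistant", "content": "there was a princess named Aurora."}},
        {"index": 1, "message": {"role": "assistant", "content": "there was a young girl named Alice who lived in a small village."}},
    ]
}
outputs = print_choices(sample_response)
```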

---

This modification to the script allows us to explore a variety of responses that the ChatGPT model can generate for a single prompt, giving us more options to choose from or analyze. It's a great way to see the different directions in which the conversation can go based on a single prompt.

In [7]:
# Getting multiple responses from the API:
messages = [
            {"role": "system", "content": "Complete the sentance"},
            {"role": "user", "content": "Once upon a time, "},
        ]

response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=messages,
  temperature=1,
  max_tokens=150,
  n=5
)

for i in range(5):
  # Print only the output:
  print(f"Output {i+1}: ")
  print(response["choices"][i].message["content"].strip())

Output 1: 
there was a princess named Aurora.
Output 2: 
there was a young girl named Alice who lived in a small village.
Output 3: 
there was a beautiful princess who lived in a magnificent castle.
Output 4: 
in a faraway land, there lived a brave and adventurous princess.
Output 5: 
in a faraway land, lived a young princess named Aurora.


## Understanding temperature

Let's discuss the role of the `temperature` parameter in our script and how it influences the responses generated by the ChatGPT model:

### Understanding the `temperature` Parameter

In our script, we are experimenting with two different temperature settings, 0 and 1.5, to observe how they affect the output of the ChatGPT model. The valid range for `temperature` is [0, 2]. Let's look at what the parameter does and how it is reflected in our script:

#### **Temperature Set to 0**

```python
print("Temperature: 0")
for i in range(5):
  response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=messages,
    temperature=0,
    max_tokens=150,
  )
  ...
```

- **`temperature=0`**: Setting the temperature to 0 makes the model choose the most likely next token at each step of generation, so the output is effectively deterministic. Running the loop several times returns the same output, as the run further down shows, although occasional small differences between runs are still possible in practice.

#### **Temperature Set to 1.5**

```python
print("Temperature: 1.5")
for i in range(5):
  response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=messages,
    temperature=1.5,
    max_tokens=150,
  )
  ...
```

- **`temperature=1.5`**: Setting the temperature to 1.5 allows for a higher degree of randomness in the output. The model is more likely to take "creative liberties," choosing less probable words to follow the prompt, which results in more diverse and varied outputs. When we run the loop multiple times with a temperature of 1.5, we can notice that the outputs can be quite different from each other, showcasing a range of possible completions for the prompt.
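The effect of temperature can be illustrated with the standard temperature-scaled softmax. The API does not expose its sampling code, so this is a sketch of the general technique rather than OpenAI's implementation; the scores in `toy_logits` are made-up values for three candidate tokens.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, dividing by `temperature` first."""
    if temperature <= 0:
        # Temperature 0 is treated as greedy decoding: all mass on the top score.
        top = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == top else 0.0 for i in range(len(logits))]
    scaled = [score / temperature for score in logits]
    largest = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(value - largest) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


# Hypothetical scores for three candidate next tokens.
toy_logits = [2.0, 1.0, 0.5]
for temperature in (0, 0.5, 1.0, 1.5):
    probs = softmax_with_temperature(toy_logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

As the temperature rises, probability moves from the top-scoring token toward the alternatives, which is why repeated calls at 1.5 vary more than calls at 0.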

---

In [8]:
# Getting multiple responses from the API:
messages = [
            {"role": "system", "content": "Complete the sentance"},
            {"role": "user", "content": "Today is a wonderful day for  "},
        ]

print("Temperature: 0")
for i in range(5):
  response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=messages,
    temperature=0,
    max_tokens=150,
  )

  # Print only the output:
  print(f"Output: ")
  print(response["choices"][0].message["content"].strip())

Temperature: 0
Output: 
a picnic in the park.
Output: 
a picnic in the park.
Output: 
a picnic in the park.
Output: 
a picnic in the park.
Output: 
a picnic in the park.


In [9]:
# Getting multiple responses from the API:
messages = [
            {"role": "system", "content": "Complete the sentance"},
            {"role": "user", "content": "Today is a wonderful day for  "},
        ]

print("Temperature: 1.5")
for i in range(5):
  response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=messages,
    temperature=1.5,
    max_tokens=150,
  )

  # Print only the output:
  print(f"Output: ")
  print(response["choices"][0].message["content"].strip())

Temperature: 1.5
Output: 
experiencing newfound joy and gratitude.
Output: 
celebrating and enjoying the outdoors.
Output: 
relaxation and enjoying the sunshine.
Output: 
outdoor activities.
Output: 
spending time outdoors.


## Understanding max-tokens

Let's delve into the `max_tokens` parameter, which is used to control the length of the response generated by the ChatGPT model.

### Understanding the `max_tokens` Parameter

The `max_tokens` parameter allows us to limit the length of the response generated by the ChatGPT model. By setting a specific number of tokens, we can control how verbose or concise the model's responses will be. Let's see how it works with different settings:

#### **Example 1: Setting `max_tokens` to a Small Value**

```python
response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=messages,
  temperature=1,
  max_tokens=5,
)

print("Max Tokens: 5")
print(response["choices"][0].message["content"].strip())
```

- **`max_tokens=5`**: In this setting, the output will be very short, limited to just 5 tokens. This might result in responses that are cut-off and don't convey a complete thought.

#### **Example 2: Setting `max_tokens` to a Moderate Value**

```python
response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=messages,
  temperature=1,
  max_tokens=20,
)

print("Max Tokens: 20")
print(response["choices"][0].message["content"].strip())
```

- **`max_tokens=20`**: Here, the model has more room to generate a fuller response, but it is still relatively concise, ensuring that the output remains focused and to the point.

#### **Example 3: Setting `max_tokens` to a Large Value**

```python
response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=messages,
  temperature=1,
  max_tokens=150,
)

print("Max Tokens: 150")
print(response["choices"][0].message["content"].strip())
```

- **`max_tokens=150`**: With this setting, the model can generate much longer and more detailed responses, potentially providing more context or exploring different facets of the topic at hand.

---

The `max_tokens` parameter is a powerful tool to control the length of the responses generated by the ChatGPT model. Depending on your specific use case, you might choose to limit the responses to a few tokens or allow for longer, more detailed responses. It's a crucial parameter to experiment with to get the desired output from the model.

Remember, a "token" can be as short as one character or as long as one word, so the number of tokens doesn't directly translate to the number of words. As a rough rule of thumb for English text, one token corresponds to about four characters.

In [10]:
# max_tokens variations
messages = [
            {"role": "system", "content": "Complete the sentance"},
            {"role": "user", "content": "List of things you can do in San Diego are: "},
        ]

response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=messages,
  temperature=1,
  max_tokens=5,
)

print("Max Tokens: 5")
print(f"List of things you can do in San Diego are: {response['choices'][0].message['content'].strip()}")
print("\n")

response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=messages,
  temperature=1,
  max_tokens=20,
)

print("Max Tokens: 20")
print(f"List of things you can do in San Diego are: {response['choices'][0].message['content'].strip()}")
print("\n")

response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=messages,
  temperature=1,
  max_tokens=150,
)

print("Max Tokens: 150")
print(f"List of things you can do in San Diego are: {response['choices'][0].message['content'].strip()}")

Max Tokens: 5
List of things you can do in San Diego are: 1. Visit the San


Max Tokens: 20
List of things you can do in San Diego are: 1. Visit the beaches 
2. Explore Balboa Park 
3. Visit the San Diego Zoo


Max Tokens: 150
List of things you can do in San Diego are: visit the famous San Diego Zoo, explore Balboa Park, relax on the stunning beaches, go whale watching, visit the USS Midway Museum, enjoy a day at SeaWorld, catch a Padres baseball game at Petco Park, take a harbor cruise, hike in Torrey Pines State Natural Reserve, experience the vibrant Gaslamp Quarter, explore the historic Old Town, go shopping at the Fashion Valley Mall, try out the local craft breweries, and dine on delicious seafood cuisine.


### Note

Notice that the call takes longer when `max_tokens` is high, because the API needs more time to generate a longer response. Keep this in mind when choosing `max_tokens` for your use case.

## Understanding different model variants

Let's delve into the details of the specified models:

### **GPT-3.5-turbo**

- **Description**: This is the most capable and cost-effective model in the GPT-3.5 series. It has been optimized for chat applications but also performs well in traditional completion tasks.
- **Max Tokens**: It can handle up to 4,097 tokens per request.
- **Training Data**: The model was trained on data available up until September 2021.
- **Usage**: It is recommended for a wide variety of language tasks due to its lower cost and improved performance compared to other models in the GPT-3.5 series.

### **GPT-3.5-turbo-16k**

- **Description**: This model shares the same capabilities as the standard GPT-3.5-turbo but allows for a larger context window, with up to 16,385 tokens.
- **Max Tokens**: It can process up to 16,385 tokens per request, providing a broader scope for understanding and generating content based on extensive prompts.
- **Training Data**: Like the standard GPT-3.5-turbo, it was trained on data available until September 2021.
- **Usage**: It is suitable for tasks that require a larger context or more detailed responses, facilitating deeper and more nuanced conversations.

### **GPT-4**

- **Description**: GPT-4 represents a significant advancement over the previous generations, offering greater accuracy in solving complex problems thanks to its broader general knowledge and advanced reasoning capabilities. It is optimized for chat applications but also works well for traditional completion tasks.
- **Max Tokens**: The standard GPT-4 model can handle up to 8,192 tokens per request. There is also a variant with a larger context window that can handle up to 32,768 tokens.
- **Training Data**: The model was trained on data available up until September 2021.
- **Usage**: It is accessible to developers who have made at least one successful payment through the OpenAI developer platform. It is recommended for tasks that require advanced reasoning and a deep understanding of complex topics.

---


GPT-4 API costs are much higher than GPT-3.5 costs:

- **GPT-4**: Input $0.03 / 1K tokens, Output $0.06 / 1K tokens
- **GPT-3.5-turbo**: Input $0.0015 / 1K tokens, Output $0.002 / 1K tokens
- **GPT-3.5-turbo-16k**: Input $0.003 / 1K tokens, Output $0.004 / 1K tokens

For smaller tasks, 3.5-turbo costs half as much as 3.5-turbo-16k. However, the input context of 3.5-turbo is limited to 4K tokens, while 3.5-turbo-16k accepts 16K.
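To make the price difference concrete, the cost of a single call can be estimated from its `usage` block. The sketch below hard-codes the per-1K prices quoted above, which may no longer be current, and `estimate_cost` is a hypothetical helper name.

```python
# Prices in USD per 1K tokens, copied from the figures quoted above.
PRICES_PER_1K = {
    "gpt-3.5-turbo": {"input": 0.0015, "output": 0.002},
    "gpt-3.5-turbo-16k": {"input": 0.003, "output": 0.004},
    "gpt-4": {"input": 0.03, "output": 0.06},
}


def estimate_cost(model, prompt_tokens, completion_tokens):
    """Estimate the USD cost of one call from its token usage."""
    price = PRICES_PER_1K[model]
    return (prompt_tokens * price["input"] + completion_tokens * price["output"]) / 1000


# Token counts taken from the `usage` block shown earlier (21 prompt, 12 completion).
for model in PRICES_PER_1K:
    print(f"{model}: ${estimate_cost(model, 21, 12):.7f}")
```

For that small request the GPT-4 estimate is roughly 24 times the GPT-3.5-turbo estimate, and the 16k variant is exactly double the standard one.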

In [11]:
# Comparing model variants
messages = [
            {"role": "system", "content": "Describe an ideal candidate for a job title"},
            {"role": "user", "content": "computer vision engineer"},
        ]

response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=messages,
  temperature=0,
  max_tokens=100,
)

print("gpt-3.5-turbo \n")
print(response['choices'][0].message['content'].strip())
print("\n")

response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo-16k",
  messages=messages,
  temperature=0,
  max_tokens=100,
)

print("gpt-3.5-turbo-16k \n")
print(response['choices'][0].message['content'].strip())
print("\n")

response = openai.ChatCompletion.create(
  model="gpt-4",
  messages=messages,
  temperature=0,
  max_tokens=100,
)

print("gpt-4 \n")
print(response['choices'][0].message['content'].strip())

gpt-3.5-turbo 

An ideal candidate for the job title of computer vision engineer would possess a combination of technical skills, experience, and personal qualities. 

First and foremost, the ideal candidate would have a strong background in computer science, with a focus on computer vision and machine learning. They should have a deep understanding of algorithms, image processing techniques, and computer vision frameworks. They should also be proficient in programming languages such as Python, C++, and MATLAB.

In terms of experience, the ideal candidate would have a proven track


gpt-3.5-turbo-16k 

An ideal candidate for the job title of computer vision engineer would possess a combination of technical skills, experience, and personal qualities. 

First and foremost, the ideal candidate would have a strong background in computer science, with a focus on computer vision and machine learning. They should have a deep understanding of algorithms, image processing techniques, and comput

## Understanding the difference between gpt-3.5-turbo and gpt-3.5-turbo-16k

Let's explore how the GPT-3.5-turbo and GPT-3.5-turbo-16k models differ, especially in terms of handling different lengths of context:

### **GPT-3.5-turbo**

#### **Short Context**

```python
messages = [
    {"role": "user", "content": "Tell me a joke."},
]

response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=messages,
)

print(response["choices"][0].message["content"].strip())
```

- **Description**: In this scenario, we are providing a short context where a user is asking for a joke. The GPT-3.5-turbo model would easily handle this and generate a joke in response.

#### **Long Context**

```python
messages = [
    {"role": "user", "content": "..." * 4000},  # A very long context
    {"role": "user", "content": "What is the capital of France?"},
]

response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=messages,
)

print(response["choices"][0].message["content"].strip())
```

- **Description**: Here, we are providing a very long context, nearing the maximum token limit of the model. The model might still be able to respond correctly, but the available space for the response would be very limited due to the large context.

#### **Very Long Context**

```python
messages = [
    {"role": "user", "content": "..." * 6000},  # A very long context
    {"role": "user", "content": "What is the capital of France?"},
]

response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=messages,
)

print(response["choices"][0].message["content"].strip())
```

- **Description**: Here, the context exceeds the model's maximum token limit, so the API rejects the request with an error (`InvalidRequestError` in the run further down) instead of returning a completion.

### **GPT-3.5-turbo-16k**

#### **Short Context**

```python
messages = [
    {"role": "user", "content": "Tell me a joke."},
]

response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo-16k",
  messages=messages,
)

print(response["choices"][0].message["content"].strip())
```

- **Description**: Similar to the GPT-3.5-turbo model, the GPT-3.5-turbo-16k model would easily handle a short context and provide a joke in response.

#### **Long Context**

```python
messages = [
    {"role": "user", "content": "..." * 16000},  # A very long context
    {"role": "user", "content": "What is the capital of France?"},
]

response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo-16k",
  messages=messages,
)

print(response["choices"][0].message["content"].strip())
```

- **Description**: In this scenario, we are providing an extremely long context that relies on the larger token capacity of the GPT-3.5-turbo-16k model. As long as the prompt and the response together stay within the 16,385-token window, the model can still answer the question correctly; the closer the context gets to that limit, the less room remains for the response.

---

### **Conclusion**

- **GPT-3.5-turbo**: Suitable for most applications, but can struggle with very long contexts due to its maximum token limit of 4,097.
- **GPT-3.5-turbo-16k**: Ideal for applications requiring a large context window, as it can handle up to 16,385 tokens, facilitating deeper and more nuanced conversations with extensive contexts.

By varying the length of the context given, you can observe how the different models handle short and long contexts and choose the model that best suits your application's needs.
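The choice between the two models can be reduced to a simple check, since the context window has to hold the prompt and the completion together. The sketch below uses the limits listed in this section; `pick_model` is a hypothetical helper, and the prompt token count is assumed to be known already (for example, from a tokenizer).

```python
# Context windows (prompt plus completion tokens) as listed in this section.
CONTEXT_LIMITS = [
    ("gpt-3.5-turbo", 4097),
    ("gpt-3.5-turbo-16k", 16385),
]


def pick_model(prompt_tokens, max_tokens):
    """Return the first (cheapest) model whose context window fits the request."""
    needed = prompt_tokens + max_tokens
    for model, limit in CONTEXT_LIMITS:
        if needed <= limit:
            return model
    raise ValueError(f"{needed} tokens exceed every listed context window")


print(pick_model(prompt_tokens=500, max_tokens=150))
print(pick_model(prompt_tokens=5000, max_tokens=150))
```

The list is ordered from cheapest to most expensive, so the function falls back to the 16k model only when the standard one cannot fit the request.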


In [12]:
# Very short context with gpt-3.5-turbo
messages = [
    {"role": "user", "content": "Tell me a joke."},
]

response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=messages,
)

print(response["choices"][0].message["content"].strip())

Sure, here's one for you:

Why don't scientists trust atoms?

Because they make up everything!


In [13]:
# Long context with gpt-3.5-turbo
messages = [
    {"role": "user", "content": "..." * 4000},  # A very long context
    {"role": "user", "content": "Tell me a joke"},
]

response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=messages,
)

print(response["choices"][0].message["content"].strip())

Sure! Here's a classic one for you: 

Why don't skeletons fight each other?

Because they don't have the guts!


In [14]:
# Very Long context with gpt-3.5-turbo - exceeding 4k limit
messages = [
    {"role": "user", "content": ". " * 5000},  # A very long context
    {"role": "user", "content": "What is the capital of France?"},
]

response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=messages,
)

print(response["choices"][0].message["content"].strip())

InvalidRequestError: ignored

In [15]:
# Very Long context with gpt-3.5-turbo-16k
messages = [
    {"role": "user", "content": ". " * 5000},  # A very long context
    {"role": "user", "content": "What is the capital of France?"},
]

response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo-16k",
  messages=messages,
)

print(response["choices"][0].message["content"].strip())

The capital of France is Paris.


## Understanding tokens

We have discussed the `max_tokens` setting and the token limits of each model. Let's look more closely at what a token is.

### Part 1: Understanding Tokenizers

#### What are Tokenizers?
Tokenizers are tools used in natural language processing (NLP) to convert input text into a format (a series of tokens) that can be fed into a language model. A token can be as small as a character or as long as a word.

#### Why are they used in Language Models like ChatGPT?
Tokenizers help in breaking down the input text into smaller units, making it easier for the model to analyze and understand the text. It aids in:
- Reducing the complexity of the text
- Facilitating the identification of patterns in the text
- Enhancing the model's ability to generalize and understand grammar through the recognition of common subwords

### Part 2: Introduction to Tiktoken [Tokenizer](https://platform.openai.com/tokenizer)

#### What is Tiktoken?
Tiktoken is a fast Byte Pair Encoding (BPE) tokenizer developed by OpenAI for use with its models, including ChatGPT. Its README reports it as 3-6x faster than a comparable open-source tokenizer.

#### How does Tiktoken work?
Tiktoken uses BPE to tokenize text. BPE starts from individual bytes and iteratively merges the most frequent pair of adjacent tokens in the training data until a desired vocabulary size is reached. This way, it can represent common words as single tokens and rare words as sequences of subword tokens.
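The merge loop can be sketched in a few lines of plain Python. This is a simplified toy, not Tiktoken's implementation: it trains on a single string, skips the regex pre-splitting that real encodings apply, and stops after a fixed number of merges rather than at a target vocabulary size. The function name `train_bpe` is hypothetical.

```python
from collections import Counter

def train_bpe(text, num_merges):
    """Learn up to `num_merges` merges from `text`, starting from raw bytes."""
    tokens = [bytes([b]) for b in text.encode("utf-8")]
    merges = []
    for _ in range(num_merges):
        # Count every adjacent pair of tokens
        pairs = Counter(zip(tokens, tokens[1:]))
        if not pairs:
            break
        (left, right), count = pairs.most_common(1)[0]
        if count < 2:
            break  # nothing repeats, so further merges would not help
        merges.append((left, right))
        # Replace each occurrence of the pair with the merged token
        merged, i = [], 0
        while i < len(tokens):
            if i + 1 < len(tokens) and tokens[i] == left and tokens[i + 1] == right:
                merged.append(left + right)
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return merges, tokens

merges, tokens = train_bpe("aaabdaaabac", num_merges=3)
print(merges)  # the learned merges, in order
print(tokens)  # the text re-expressed with the merged tokens
```

Each merge adds one entry to the vocabulary, so frequent sequences such as `aaab` collapse into a single token while rare bytes stay on their own.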

### Part 3: Using Tiktoken

#### Installing Tiktoken
To use Tiktoken, you first need to install it from PyPI using the following command:
```bash
pip install tiktoken
```

#### Basic Usage
Here is how you can use Tiktoken to tokenize a piece of text:
```python
import tiktoken
enc = tiktoken.get_encoding("cl100k_base")
tokens = enc.encode("hello world")
print(tokens)
```

### Part 4: Understanding Components

#### Encoding and Decoding
In Tiktoken, you can encode a text to tokens and decode it back to the original text using the `encode` and `decode` methods respectively. Here is an example:
```python
assert enc.decode(enc.encode("hello world")) == "hello world"
```

#### Educational Submodule
Tiktoken contains an educational submodule that helps users learn more about BPE. It includes functionalities to train a BPE tokenizer on a small amount of text and visualize how the GPT-4 encoder encodes text.

### Part 5: Visualizing Token Changes

To visualize how bytes are merged into tokens step by step, we can use Tiktoken's educational submodule. Here is a small piece of code that demonstrates this:
```python
from tiktoken._educational import *

# Train a BPE tokenizer on a small amount of text
enc = train_simple_encoding()

# Visualize how the GPT-4 encoder encodes text
enc = SimpleBytePairEncoding.from_tiktoken("cl100k_base")
encoded_text = enc.encode("hello world aaaaaaaaaaaa")
print(encoded_text)
```

In this script:
- We first import the educational submodule from Tiktoken.
- We train a simple BPE tokenizer using a small amount of text.
- We visualize how the GPT-4 encoder encodes a piece of text, and print the encoded text to see the tokens.
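The step-by-step merging that the visualization shows can also be sketched without the library. The function below repeatedly merges the adjacent pair with the lowest rank and records every intermediate state. The ranks in `toy_ranks` are made up for illustration and are not the real `cl100k_base` ranks; `bpe_encode_trace` is a hypothetical name.

```python
def bpe_encode_trace(word, ranks):
    """Merge `word` step by step; `ranks` maps a merged token to its priority."""
    parts = [bytes([b]) for b in word.encode("utf-8")]
    steps = [list(parts)]
    while True:
        # Find the adjacent pair whose merged form has the lowest rank
        best = None
        for i in range(len(parts) - 1):
            rank = ranks.get(parts[i] + parts[i + 1])
            if rank is not None and (best is None or rank < best[1]):
                best = (i, rank)
        if best is None:
            break  # no mergeable pair is left
        i = best[0]
        parts = parts[:i] + [parts[i] + parts[i + 1]] + parts[i + 2:]
        steps.append(list(parts))
    return steps

# Made-up ranks: a lower number means the merge was learned earlier
toy_ranks = {b"od": 0, b"ay": 1, b"oday": 2, b"Today": 3}

steps = bpe_encode_trace("Today", toy_ranks)
for step in steps:
    print("".join(f"[{p.decode()}]" for p in step))
```

Each printed line is one merge further along, ending with the whole word as a single token.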


In [16]:
# Install the tiktoken package
!pip install tiktoken

Collecting tiktoken
  Downloading tiktoken-0.5.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.0 MB)
[?25l     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.0/2.0 MB[0m [31m?[0m eta [36m-:--:--[0m[2K     [91m━━[0m[91m╸[0m[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.1/2.0 MB[0m [31m4.2 MB/s[0m eta [36m0:00:01[0m[2K     [91m━━━━━━━━━━━━━━[0m[90m╺[0m[90m━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.7/2.0 MB[0m [31m10.0 MB/s[0m eta [36m0:00:01[0m[2K     [91m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[90m╺[0m[90m━━━━━━━━━━[0m [32m1.4/2.0 MB[0m [31m13.7 MB/s[0m eta [36m0:00:01[0m[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m2.0/2.0 MB[0m [31m14.2 MB/s[0m eta [36m0:00:00[0m
Installing collected packages: tiktoken
Successfully installed tiktoken-0.5.1


In [17]:
# Basic usage:
import tiktoken
enc = tiktoken.get_encoding("cl100k_base")
tokens = enc.encode("hello world")
print(tokens)

[15339, 1917]


In [18]:
# Decode the encoded tokens
tokens = enc.encode("hello world")
print(enc.decode(tokens))

hello world


In [19]:
# Visualizing the tokens:
from tiktoken._educational import *

# Visualize how the GPT-4 encoder encodes text
enc = SimpleBytePairEncoding.from_tiktoken("cl100k_base")
encoded_text = enc.encode("Today is a wonderful day for a webinar on ChatGPT API.")
print(encoded_text)
print("Number of tokens: ", len(encoded_text))

[T][o][d][a][y]
[T][od][a][y]
[T][od][ay]
[T][oday]
[Today]

[ ][i][s]
[ ][is]
[ is]

[ ][a]
[ a]

[ ][w][o][n][d][e][r][f][u][l]
[ ][w][o][n][d][er][f][u][l]
[ ][w][on][d][er][f][u][l]
[ w][on][d][er][f][u][l]
[ w][on][d][er][f][ul]
[ w][on][der][f][ul]
[ w][on][der][ful]
[ won]...

## **Understanding Roles in ChatGPT API**

In the ChatGPT API, roles are used to define the different entities that can send messages in a conversation. There are three primary roles: `system`, `user`, and `assistant`. Let's understand each role with examples and code snippets:
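Before sending a request, it can help to check that every message has the expected shape. The sketch below is a hypothetical helper (not part of the `openai` package) that only accepts the three roles covered here and requires both a `role` and a `content` key.

```python
VALID_ROLES = {"system", "user", "assistant"}

def validate_messages(messages):
    """Raise ValueError if a message lacks a key or uses an unknown role."""
    for i, msg in enumerate(messages):
        if "role" not in msg or "content" not in msg:
            raise ValueError(f"message {i} needs both 'role' and 'content'")
        if msg["role"] not in VALID_ROLES:
            raise ValueError(f"message {i} has unknown role {msg['role']!r}")
    return messages

messages = validate_messages([
    {"role": "system", "content": "You are a football expert"},
    {"role": "user", "content": "Who won the FIFA World Cup in 2014?"},
])
print(len(messages), "messages are valid")
```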

#### **1. System Role**

The `system` role is used to set the behavior of the ChatGPT model in a conversation. It helps in defining the context or scope of the conversation.

```python
messages = [
    {"role": "system", "content": "You are a football expert"},
]

response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo-16k",
  messages=messages,
)

print(response["choices"][0].message["content"].strip())
```

In this example, the system role instructs ChatGPT to assume the persona of a football expert, guiding its responses to align with that persona.

#### **2. User Role**

The `user` role represents the end user, the human interacting with ChatGPT. It is used to send prompts or questions to the model.

```python
# User role example:
messages = [
    {"role": "system", "content": "You are a football expert"},
    {"role": "user", "content": "Who won the FIFA World Cup in 2014?"},
]

response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo-16k",
  messages=messages,
)

print(response["choices"][0].message["content"].strip())
```

Here, the user role is used to ask a question about the 2014 FIFA World Cup, which ChatGPT answers in its football-expert persona.

#### **3. Assistant Role**

The `assistant` role represents ChatGPT's own replies to the user's prompts. It is used to maintain the continuity of the conversation by including previous responses from the model in the current request.

```python
# Assistant role to maintain continuity
messages = [
    {"role": "system", "content": "You are a football expert"},
    {"role": "user", "content": "Who won the FIFA World Cup in 2014?"},
    {"role": "assistant", "content": "Germany won the FIFA World Cup in 2014."},
    {"role": "user", "content": "Who won the next world cup after that?"},
]

response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo-16k",
  messages=messages,
)

print(response["choices"][0].message["content"].strip())
```

In this snippet, the assistant role is used to include the model's previous response in the current request, ensuring a coherent and continuous conversation.
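Because the API keeps no memory between requests, the caller has to carry the history forward. The sketch below shows one way to do that: a small, hypothetical `Conversation` class that appends each question and each reply to the message list. To keep it runnable without an API key, the request itself is replaced by `fake_send`, a stand-in that is not part of any library.

```python
class Conversation:
    """Keeps the running message list so each request carries the full history."""

    def __init__(self, system_prompt, send):
        self.messages = [{"role": "system", "content": system_prompt}]
        self.send = send  # callable: list of messages -> reply text

    def ask(self, question):
        self.messages.append({"role": "user", "content": question})
        reply = self.send(self.messages)
        # Store the reply so the next question can refer back to it
        self.messages.append({"role": "assistant", "content": reply})
        return reply


def fake_send(messages):
    """Stand-in for the API call; reports how many messages it received."""
    return f"(reply based on {len(messages)} messages)"


chat = Conversation("You are a football expert", fake_send)
print(chat.ask("Who won the FIFA World Cup in 2014?"))
print(chat.ask("Who won the next world cup after that?"))
print([m["role"] for m in chat.messages])
```

To talk to the real model, `fake_send` could be swapped for a function that calls `openai.ChatCompletion.create` with the given messages and returns `response["choices"][0].message["content"]`, as in the cells of this notebook.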

### **Conclusion**

Understanding and utilizing roles correctly is vital in crafting conversations with the ChatGPT API. It allows developers to control the behavior of the ChatGPT model, facilitate user interactions, and maintain a smooth conversation flow. By using different roles, one can guide the conversation in a desired direction and obtain responses that align with the specified context or persona.

In [20]:
# System Role example:
messages = [
    {"role": "system", "content": "You are a football expert"},
]

response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo-16k",
  messages=messages,
)

print(response["choices"][0].message["content"].strip())

and have been asked to predict the outcome of the upcoming UEFA Champions League final between Chelsea and Manchester City. After analyzing both teams' recent form, playing styles, and key players, I believe Manchester City has a slight edge in this match.

Manchester City has had an outstanding season domestically, winning the Premier League title comfortably. They have displayed exceptional attacking prowess, scoring a league-high 83 goals, and are known for their possession-based, free-flowing style of play. With talented players like Kevin De Bruyne, Phil Foden, and Riyad Mahrez, City possesses a deadly attacking force that can unlock any defense.

Defensively, Manchester City has been brilliant as well, conceding just 32 goals in 38 league matches. Their rock-solid defense, led by Ruben Dias and John Stones, has been crucial to their success this season.

On the other hand, Chelsea has also shown tremendous improvement since Thomas Tuchel took charge. They reached the final of the

In [21]:
# User role example:
messages = [
    {"role": "system", "content": "You are a football expert"},
    {"role": "user", "content": "Who won the FIFA World Cup in 2014?"},
]

response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo-16k",
  messages=messages,
)

print(response["choices"][0].message["content"].strip())

The FIFA World Cup in 2014 was won by Germany. They defeated Argentina 1-0 in the final, with Mario Götze scoring the winning goal in extra time.


In [22]:
# Without the assistant role to maintain continuity
messages = [
    {"role": "system", "content": "You are a football expert"},
    {"role": "user", "content": "Who won the next FIFA World Cup?"},
]

response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo-16k",
  messages=messages,
)

print(response["choices"][0].message["content"].strip())

As an AI, I don't have real-time information or the ability to predict future events. Therefore, I cannot accurately tell you who will win the next FIFA World Cup. The next World Cup will take place in Qatar in 2022, and only time will tell which team will emerge as the victor.


In [25]:
# Assistant role to maintain continuity
messages = [
    {"role": "system", "content": "You are a football expert"},
    {"role": "user", "content": "Who won the FIFA World Cup in 2014?"},
    {"role": "assistant", "content": "Germany won the FIFA World Cup in 2014."},
    {"role": "user", "content": "Who won the next world cup after that?"},
]

response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo-16k",
  messages=messages,
)

print(response["choices"][0].message["content"].strip())

The next FIFA World Cup after 2014 was held in 2018, and it was won by France.
