# Working with the GPT-3.5-Turbo and GPT-4 models

## Getting an API KEY

The API gives you finer control over ChatGPT's responses and lets you integrate the model into other applications.

To access the model through the API, you will need an API key. For this demo, you'll be using your LT's API KEY.

*To obtain your own API key, you'll need to create an account and set up billing. You can create your account at https://platform.openai.com/.*

In [265]:
api_key = "YOUR_API_KEY"  # paste your key here; never publish a real key in a shared notebook

In [51]:
# alternative
# api_key = open("key.txt", "r").read().strip("\n")

In [266]:
# alternative
import os

# api_key = os.getenv("OPENAI_API_KEY")
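The two alternatives above can be combined into one helper. This is a minimal sketch, not part of the original notebook: the function name `load_api_key` is hypothetical, and the defaults `OPENAI_API_KEY` and `key.txt` are simply the names used in the cells above.

```python
import os
from pathlib import Path


def load_api_key(env_var="OPENAI_API_KEY", key_file="key.txt"):
    """Return the API key from the environment, falling back to a local file."""
    key = os.getenv(env_var)
    if key:
        return key.strip()
    path = Path(key_file)
    if path.is_file():
        # The file is expected to contain only the key, possibly with a trailing newline
        return path.read_text().strip()
    raise RuntimeError(f"No API key found in ${env_var} or in {key_file}")
```

With this in place, `api_key = load_api_key()` keeps the key out of the notebook itself.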

## Getting openai Python Package

To utilize the ChatGPT API, you'll need to have the openai Python package installed. 

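You can install it by running the command pip install --upgrade openai. *Adding the --upgrade flag ensures that you have the most up-to-date version, as the ChatGPT API is a recently introduced feature.* Note that the examples in this notebook use the openai.ChatCompletion interface, which was removed in version 1.0 of the package; with a newer release installed, either pin an older version (pip install "openai<1") or adapt the calls.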

In [53]:
!pip install --upgrade openai

In [267]:
import openai

# load and set our key
openai.api_key = api_key

## Chat Completions API

Chat models take as mandatory parameters:
- **List of messages as input**
- **Model**: we will use gpt-3.5-turbo

They return a **model-generated message as output**. 

An example API call looks as follows:

In [268]:
response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=[ # messages parameter must be a list of dictionaries
    # can be as short as one message or many back and forth turns.  
    {"role": "system", "content": "You are a famous chef. Share your best cooking tips and tricks."}, # each dictionary has a role and content
    {"role": "user", "content": "What are the ingredients for making pancakes?"},
  ]
)

In conversations using the Chat Completions API, each message has a **role ("system", "user", or "assistant")**. Typically, a conversation starts with a system message to set the assistant's behavior, followed by alternating user and assistant messages.
- The system message is optional and can be used to customize the assistant's personality or provide specific instructions
- User messages contain requests or comments
- Assistant messages store previous assistant responses or serve as examples of desired behavior.
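The role structure described above can be sketched as two small helpers. The names `make_message` and `build_messages` are hypothetical and not part of the openai package; they only build the list of dictionaries that the messages parameter expects.

```python
VALID_ROLES = {"system", "user", "assistant"}


def make_message(role, content):
    """Build one message dictionary, rejecting roles outside the three listed above."""
    if role not in VALID_ROLES:
        raise ValueError(f"Unknown role {role!r}; expected one of {sorted(VALID_ROLES)}")
    return {"role": role, "content": str(content)}


def build_messages(user_prompt, system_prompt=None, history=()):
    """Optional system message first, then earlier turns, then the new user prompt."""
    messages = []
    if system_prompt is not None:
        messages.append(make_message("system", system_prompt))
    messages.extend(history)
    messages.append(make_message("user", user_prompt))
    return messages


messages = build_messages(
    "What are the ingredients for making pancakes?",
    system_prompt="You are a famous chef. Share your best cooking tips and tricks.",
)
print(messages)
```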

### Chat completions response format

In [269]:
# In Python, the assistant’s reply can be extracted with response['choices'][0]['message']['content'].

print(response['choices'][0]['message']['content'])

To make delicious pancakes, you'll need the following ingredients:

1. All-purpose flour: Use around 1 ½ cups of flour for a basic pancake recipe.
2. Baking powder: About 1 ½ teaspoons will help the pancakes rise.
3. Salt: Add a pinch of salt to enhance the flavor of the pancakes.
4. Sugar: If you prefer a slightly sweet pancake, add 1-2 tablespoons of sugar.
5. Milk: Use around 1 ¼ cups of milk to create a batter consistency.
6. Eggs: Include 1 or 2 eggs to bind the ingredients together.
7. Butter: Melt around 2 tablespoons of butter to add richness to the pancakes.
8. Cooking oil: A small amount of oil for greasing the pan.

These ingredients will make a classic pancake batter. Feel free to experiment and add flavors like vanilla extract, cinnamon, or even chocolate chips for variations.


Let's look at the whole response as JSON:

In [270]:
response

<OpenAIObject chat.completion id=chatcmpl-7b9l1czjTkgWYadvkGsemR1lrUcGG at 0x7ff5001bb090> JSON: {
  "id": "chatcmpl-7b9l1czjTkgWYadvkGsemR1lrUcGG",
  "object": "chat.completion",
  "created": 1689089999,
  "model": "gpt-3.5-turbo-0613",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "To make delicious pancakes, you'll need the following ingredients:\n\n1. All-purpose flour: Use around 1 \u00bd cups of flour for a basic pancake recipe.\n2. Baking powder: About 1 \u00bd teaspoons will help the pancakes rise.\n3. Salt: Add a pinch of salt to enhance the flavor of the pancakes.\n4. Sugar: If you prefer a slightly sweet pancake, add 1-2 tablespoons of sugar.\n5. Milk: Use around 1 \u00bc cups of milk to create a batter consistency.\n6. Eggs: Include 1 or 2 eggs to bind the ingredients together.\n7. Butter: Melt around 2 tablespoons of butter to add richness to the pancakes.\n8. Cooking oil: A small amount of oil for greasing the pa



Every response includes a finish_reason. The possible values for finish_reason are:

- stop: API returned complete model output.
- length: Incomplete model output due to max_tokens parameter or token limit.
- content_filter: Omitted content due to a flag from OpenAI's content filters.
- null: API response still in progress or incomplete.

Consider setting max_tokens to a slightly higher value than normal such as 300 or 500. This ensures that the model doesn't stop generating text before it reaches the end of the message.
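A reply that was cut off can be detected by checking finish_reason before using the text. The sketch below runs on a plain dictionary with the same nesting as the response shown earlier; the `sample` data and the helper name `summarize_choice` are made up for illustration.

```python
def summarize_choice(response, index=0):
    """Return (text, finish_reason, is_complete) for one choice of a chat response."""
    choice = response["choices"][index]
    text = choice["message"]["content"]
    reason = choice.get("finish_reason")  # None while a response is still in progress
    return text, reason, reason == "stop"


# Made-up response fragment, shaped like the JSON output above
sample = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "To make delicious pancakes"},
            "finish_reason": "length",
        }
    ]
}

text, reason, complete = summarize_choice(sample)
if not complete:
    print(f"Reply may be cut off (finish_reason={reason!r})")
```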

### Conversation history

Since it recommended Baking Powder, let's ask how much in another prompt:

In [271]:
response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=[ 
    {"role": "user", "content": "How much Baking Powder do you recommend for this recipe?"}
  ]
)

In [272]:
print(response['choices'][0]['message']['content'])

To provide the correct amount of baking powder for your recipe, I would need specific details about the ingredients and quantities mentioned in the recipe. Could you please provide more information or share the recipe?


We can see that the history was not saved.

**Including conversation history is crucial** when user instructions refer to prior messages. We can do that by sending all the previous messages, each with its role, in the *messages* parameter.

In [273]:
response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=[
    {"role": "system", "content": "You are a famous chef. Share your best cooking tips and tricks."},
    {"role": "user", "content": "What are the ingredients for making pancakes?"},
    {"role": "assistant", "content": "To make pancakes, you'll need flour, eggs, milk, baking powder, and a pinch of salt."}, # we shortened it for our demo purpose
    {"role": "user", "content": "Can I use almond milk instead of regular milk?"}
  ]
)

In [274]:
print(response['choices'][0]['message']['content'])

Absolutely! You can use almond milk as a substitute for regular milk in pancake recipes. Just keep in mind that almond milk may slightly alter the flavor and texture of the pancakes compared to using regular milk. Enjoy experimenting!


**Each user instruction relies on the prior messages in the conversation history to make sense.**

Since the language models don't have inherent memory of past requests, it's important to include the relevant conversation history in each API request. If the conversation exceeds the model's token limit, it may need to be truncated or shortened while ensuring the essential context and instructions are retained.
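One simple truncation strategy is to keep the system message and drop the oldest turns until the history fits a budget. The sketch below is an assumption-laden illustration: `trim_history` and `rough_token_count` are hypothetical names, and the estimate of roughly four characters per token is a crude heuristic, not the model's real tokenizer (a tokenizer library such as tiktoken would give exact counts).

```python
def rough_token_count(message):
    """Crude estimate (assumption): about four characters per token."""
    return max(1, len(message["content"]) // 4)


def trim_history(messages, max_tokens, count_tokens=rough_token_count):
    """Keep system messages plus the most recent turns that fit within max_tokens."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    budget = max_tokens - sum(count_tokens(m) for m in system)

    kept = []
    for message in reversed(rest):  # walk from newest to oldest
        cost = count_tokens(message)
        if cost > budget:
            break
        kept.append(message)
        budget -= cost
    return system + list(reversed(kept))


history = [
    {"role": "system", "content": "You are a famous chef."},
    {"role": "user", "content": "What are the ingredients for making pancakes?"},
    {"role": "assistant", "content": "Flour, eggs, milk, baking powder, and salt."},
    {"role": "user", "content": "Can I use almond milk instead of regular milk?"},
]
print(trim_history(history, max_tokens=20))
```

Dropping whole messages from the oldest end is only one option; summarising older turns is another way to retain essential context.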

### Creating a basic conversation loop

This example demonstrates a conversation loop that performs the following tasks:

1. Takes console input continuously and formats it as the user's role content within the messages array.
2. Prints the model's response to the console and formats it as the assistant's role content within the messages array.

This approach ensures that each time a new question is asked, the ongoing conversation transcript is sent along with the latest question. Since the model lacks memory, it's crucial to include an updated transcript with each new question. Otherwise, the model may lose context from previous questions and answers.

In [275]:
message_history=[{"role": "system", "content": "You are a helpful assistant."}]

def gpt_response(inp, message_history):
    # We save the user's input
    message_history.append({"role": "user", "content": f"{inp}"})
    
    # Generate a response from the chatbot model
    completion = openai.ChatCompletion.create(
      model="gpt-3.5-turbo", 
      messages=message_history
    )
    
    # We save the assistant response
    message_history.append({"role": "assistant", "content": f"{completion.choices[0].message.content}"}) 
    
    return message_history

In [276]:
while True:
    message_history = gpt_response(input("> "), message_history) # Let's ask different prompts through input
    print(message_history[-1]["content"]) # Print the last response

>  hi


Hello! How can I assist you today?


>  i want to cook milanesas argentinas


Great! Milanesas argentinas are delicious and quite simple to make. Here's a basic recipe to get you started:

Ingredients:
- 4 thinly sliced beef steaks (you can use veal, chicken, or pork as alternatives)
- 2 eggs
- 1 cup bread crumbs
- Salt and pepper to taste
- Vegetable oil for frying
- Lemon wedges for serving

Instructions:
1. Place the beef steaks between two pieces of plastic wrap and pound them until they are about 1/4-inch thick.
2. In a shallow dish, beat the eggs with salt and pepper.
3. Place the bread crumbs on a separate plate.
4. Dip each steak into the beaten eggs, coating it well on both sides.
5. Then dredge the steak in the bread crumbs, pressing gently to make sure the crumbs adhere.
6. Repeat this process with each steak.
7. In a large frying pan, heat enough vegetable oil to cover the bottom of the pan.
8. Fry the milanesas over medium heat for about 4-5 minutes on each side until golden brown.
9. Once cooked, transfer them to a plate lined with paper towels to 

>  how big should the eggs be? S, M or L?


For this recipe, you can use any size of eggs (S, M, or L). The size of the eggs won't affect the outcome of the dish. Use whichever size you have available.


KeyboardInterrupt: Interrupted by user

Let's check the message history to see whether it contains the whole conversation.

In [277]:
message_history

[{'role': 'system', 'content': 'You are a helpful assistant.'},
 {'role': 'user', 'content': 'hi'},
 {'role': 'assistant', 'content': 'Hello! How can I assist you today?'},
 {'role': 'user', 'content': 'i want to cook milanesas argentinas'},
 {'role': 'assistant',
  'content': "Great! Milanesas argentinas are delicious and quite simple to make. Here's a basic recipe to get you started:\n\nIngredients:\n- 4 thinly sliced beef steaks (you can use veal, chicken, or pork as alternatives)\n- 2 eggs\n- 1 cup bread crumbs\n- Salt and pepper to taste\n- Vegetable oil for frying\n- Lemon wedges for serving\n\nInstructions:\n1. Place the beef steaks between two pieces of plastic wrap and pound them until they are about 1/4-inch thick.\n2. In a shallow dish, beat the eggs with salt and pepper.\n3. Place the bread crumbs on a separate plate.\n4. Dip each steak into the beaten eggs, coating it well on both sides.\n5. Then dredge the steak in the bread crumbs, pressing gently to make sure the crumbs
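The loop above only ends when the kernel is interrupted. A variant sketch that stops on a sentinel word is shown below. To keep it runnable without an API key, the model call is passed in as `ask`, a hypothetical callable that takes the message list and returns the reply text; in this notebook it could wrap the openai.ChatCompletion.create call. The scripted inputs and the echo function are stand-ins for a real session.

```python
def chat_loop(ask, read=input, write=print, stop_words=("quit", "exit")):
    """Run a console conversation until the user types a stop word."""
    history = [{"role": "system", "content": "You are a helpful assistant."}]
    while True:
        try:
            text = read("> ").strip()
        except (EOFError, KeyboardInterrupt):
            break
        if text.lower() in stop_words:
            break
        if not text:
            continue  # ignore empty input instead of sending it to the model
        history.append({"role": "user", "content": text})
        reply = ask(history)
        history.append({"role": "assistant", "content": reply})
        write(reply)
    return history


# Offline demonstration: scripted inputs and an echo function instead of the model
scripted = iter(["hi", "", "quit"])
printed = []
final_history = chat_loop(
    ask=lambda history: f"echo: {history[-1]['content']}",
    read=lambda prompt: next(scripted),
    write=printed.append,
)
print(printed)
```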

### Request parameters

Let's take a look at some request parameters.

- **Model**: Model type (e.g., gpt-3.5-turbo, gpt-4)
- **Messages**: expects a list of messages in a chat-based format
- **Temperature (default 1)**: sampling temperature, between 0 and 2. A higher value makes the output more diverse and random, while a lower value makes it more focused and deterministic.
- **Max tokens**: limits the length of the generated response. For chat completions there is no fixed default; if omitted, the response is limited only by the model's context length (the default of 16 belongs to the older Completions API).
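The effect of temperature can be illustrated with the textbook formulation, in which token scores (logits) are divided by the temperature before a softmax. This is a sketch on made-up logits; how the API applies temperature internally is not stated in this notebook, and treating temperature 0 as a pure argmax is an assumption of the sketch.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities after scaling them by 1 / temperature."""
    if temperature == 0:
        # Limit case (assumption): all probability goes to the highest-scoring token
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
for t in (0, 0.5, 1, 2):
    print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])
```

Lower temperatures concentrate probability on the top token; higher temperatures flatten the distribution, so less likely tokens are sampled more often.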

In [278]:
# Let's rewrite the gpt_response function to accept optional request parameters

def gpt_response(inp, message_history, **params):
    # We save the user's input
    message_history.append({"role": "user", "content": f"{inp}"})
    
    # Generate a response from the chatbot model
    completion_params = {
        "model": "gpt-3.5-turbo", 
        "messages": message_history,
        **params  # Include additional parameters
    }

    completion = openai.ChatCompletion.create(**completion_params) # Include additional parameters
    
    # We save the assistant response
    message_history.append({"role": "assistant", "content": f"{completion.choices[0].message.content}"}) 
    
    return message_history

Let's explore different settings by using a max_tokens value of 100 and testing three temperature levels (0, 1, and 2) to generate responses from the model, completing the prompt "My favourite animal is". Because the loop reuses message_history, each run also sees the earlier prompts and completions.

In [281]:
message_history=[{"role": "system", "content": "Complete the prompt."}]

for i in [0,1,2]:
    message_history = gpt_response("My favourite animal is ", 
                                   message_history, 
                                   max_tokens=100, 
                                   temperature=i)
    print(message_history[-1]["content"]+"\n") # Print the last response

the majestic and intelligent dolphin.

the playful and adorable otter.

the sleek and glorious panther.



**top_p** (Defaults to 1)
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

Higher value gives access to more tokens (and more diversity) and a lower value is more deterministic. 

OpenAI generally recommends altering this or temperature but not both.
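Nucleus sampling can be sketched on a toy distribution: sort tokens by probability, keep the smallest leading set whose cumulative mass reaches top_p, and renormalise. The probabilities below are made up, and `nucleus_filter` is a hypothetical helper; it always keeps at least one token, which is an assumption about the top_p=0 edge case.

```python
def nucleus_filter(probs, top_p):
    """Keep the most likely tokens until their cumulative probability reaches top_p."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept, mass = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        mass += p
        if mass >= top_p:
            break
    # Renormalise so the surviving probabilities sum to 1
    return {token: p / mass for token, p in kept}


# Made-up next-token probabilities
probs = {"dolphin": 0.5, "otter": 0.3, "panther": 0.15, "kitten": 0.05}
for top_p in (0.1, 0.7, 1):
    print(top_p, nucleus_filter(probs, top_p))
```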



Let's explore different settings by using a max_tokens value of 100 and testing two top_p levels (0 and 1) to generate responses from the model, completing the prompt "My favourite animal is".

In [283]:
message_history=[{"role": "system", "content": "Complete the prompt."}]

for i in [0,1]:
    message_history = gpt_response("My favourite animal is ", 
                                   message_history, 
                                   max_tokens=100, 
                                   top_p=i)
    print(message_history[-1]["content"]+"\n") # Print the last response

the majestic and intelligent dolphin.

the playful and curious kitten.



**presence_penalty** (Defaults to 0)
Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. **Higher values promote novelty by penalising any token that has already appeared at least once.**

**frequency_penalty** (Defaults to 0)
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. **Higher values penalise the model for repetition and reward variety.**
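OpenAI's documentation describes both penalties as adjustments to a token's logit: the frequency penalty is subtracted once per earlier occurrence of the token, and the presence penalty is subtracted once if the token has occurred at all. The sketch below applies that rule to made-up logits; `apply_penalties` is a hypothetical helper, not an API call.

```python
from collections import Counter


def apply_penalties(logits, generated_tokens, presence_penalty=0.0, frequency_penalty=0.0):
    """Lower (or, for negative penalties, raise) the logits of tokens already generated."""
    counts = Counter(generated_tokens)
    adjusted = {}
    for token, logit in logits.items():
        count = counts[token]
        seen = 1 if count > 0 else 0
        adjusted[token] = logit - count * frequency_penalty - seen * presence_penalty
    return adjusted


logits = {"I": 1.0, "broke": 1.0, "wallet": 1.0}  # made-up scores
generated = ["I", "I", "I", "broke"]  # tokens produced so far

print(apply_penalties(logits, generated, presence_penalty=2, frequency_penalty=2))
print(apply_penalties(logits, generated, presence_penalty=-2, frequency_penalty=-2))
```

With negative penalties, tokens that were already used get higher scores each time they repeat, which is consistent with the runaway repetition in the -2 run of the experiment that follows.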


Let's explore different settings by using a max_tokens value of 100 and testing two levels (-2 and 2) for both the presence penalty and the frequency penalty, using the prompt "generate 20 ways to say you can't buy that because you're broke".

In [284]:
message_history=[]

for i in [-2,2]:
    message_history = gpt_response("generate 20 ways to say you can't buy that because you're broke",
                                   message_history, 
                                   max_tokens=100, 
                                   presence_penalty=i, 
                                   frequency_penalty=i)
    print(message_history[-1]["content"]+"\n")# Print the last response

1. I'm sorry, I'm financially unable to purchase that right now.
2. I'm afraid I'm unable to afford that at the moment.
3. I'm unable to buy that due to financial constraints.
4. I'm sorry, I'm financially unable to make that purchase.
5. I'm unable to purchase that I'm I'm I'm I I I I I I I I I I I I I I I I I I I I I I I I I I I

1. Unfortunately, I don't have the funds to buy that.
2. Regrettably, my current financial situation doesn't allow me to make this purchase.
3. Apologies, but my budget is too tight for buying that right now.
4. I'm afraid my wallet won't stretch far enough to afford it.
5. Due to a lack of money, purchasing that isn't an option for me at the moment.

6. Sadly, I can’t splurge on that because money



## Using Chat Completion for non-chat scenarios
The Chat Completion API is designed to work with multi-turn conversations, but it also works well for non-chat scenarios.

### Sentiment Analysis

Let's set up a sentiment analysis scenario where a user inputs a tweet, and the program generates responses using GPT, providing sentiment predictions until interrupted.

As the *system* message, we give the model the task of deciding whether a Tweet's sentiment is positive, neutral, or negative.
As a pre-prompt, we include an example user message, "I loved the new Batman movie! Sentiment:", and an example assistant answer, "Positive".

In [285]:
# Let's give an example of sentiment analysis
message_history=[{"role": "system", "content": "Decide whether a Tweet's sentiment is positive, neutral, or negative."},
             {"role": "user", "content": "Tweet: \"I loved the new Batman movie!\"\nSentiment:"},
              {"role": "assistant", "content": "Positive"}
  ]
print("Write a tweet: ")

while True:
    message_history = gpt_response(input("> "), 
                                   message_history, 
                                   temperature=0,
                                   max_tokens=60,
                                   frequency_penalty=0.5)
    print(message_history[-1]["content"])

Write a tweet: 


>  i woke up a bit stressed today


Negative


>  what a day"


Neutral


>  this is amazing


Positive


KeyboardInterrupt: Interrupted by user
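In the loop above, the typed tweets are sent as raw text, while the example in the pre-prompt uses the Tweet: "..." / Sentiment: layout. A sketch that formats every input the same way as the example is shown below; `few_shot_messages` and its `template` argument are hypothetical names introduced for illustration.

```python
def few_shot_messages(instruction, examples, query,
                      template='Tweet: "{text}"\nSentiment:'):
    """Build a message list with labelled examples followed by the new input."""
    messages = [{"role": "system", "content": instruction}]
    for text, label in examples:
        messages.append({"role": "user", "content": template.format(text=text)})
        messages.append({"role": "assistant", "content": label})
    messages.append({"role": "user", "content": template.format(text=query)})
    return messages


messages = few_shot_messages(
    "Decide whether a Tweet's sentiment is positive, neutral, or negative.",
    [("I loved the new Batman movie!", "Positive")],
    "i woke up a bit stressed today",
)
for message in messages:
    print(message)
```

Because each request is built from scratch, earlier tweets do not accumulate in the history, which keeps classifications independent of one another.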

### Language Translation

Let's set up a translation scenario where a user inputs a phrase to be translated into Spanish, Portuguese, and Italian. The program generates responses using GPT, providing translations until interrupted.

We will use *Translate this into Spanish, Portuguese, Italian* as the system pre-prompt.

In [288]:
message_history=[{"role": "system", "content": "Translate this into Spanish, Portuguese, Italian"}
  ]
print("Write a phrase to translate to Spanish, Portuguese and Italian: ")

while True:
    message_history = gpt_response(input("> "), message_history, temperature=0.3, max_tokens=60)
    print(message_history[-1]["content"])

Write a phrase to translate to Spanish, Portuguese and Italian: 


>  whats up


Spanish: ¿Qué tal?
Portuguese: Oi, tudo bem?
Italian: Ciao, come stai?


KeyboardInterrupt: Interrupted by user
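If the reply follows the "Language: text" layout seen in the output above, it can be split into a dictionary for further use. This is a sketch under that assumption only; the model is not guaranteed to answer in this layout, and `parse_translations` is a hypothetical helper.

```python
def parse_translations(reply):
    """Split lines of the form 'Language: text' into a {language: text} dictionary."""
    result = {}
    for line in reply.splitlines():
        language, separator, text = line.partition(":")
        if separator and text.strip():
            result[language.strip()] = text.strip()
    return result


# Reply copied from the session above
reply = "Spanish: ¿Qué tal?\nPortuguese: Oi, tudo bem?\nItalian: Ciao, come stai?"
translations = parse_translations(reply)
print(translations["Italian"])
```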

# Extra: creating a chatbot with Gradio for front-end UI

Gradio is a Python library that allows you to quickly create customizable UIs for machine learning models, or for any kind of Python function or code snippet, with just a few lines of code. It simplifies the process of deploying and sharing models or code by generating interactive interfaces for input and output data.

To create a chatbot with Gradio for the front-end user interface, follow these steps:

1. Install the necessary packages

In [None]:
#!pip install gradio openai

2. Import the required libraries

In [289]:
import gradio as gr
# import openai

3. Set up your OpenAI API credentials by assigning your API key to openai.api_key

In [None]:
# openai.api_key = "YOUR_API_KEY"

4. Define a function that generates chatbot responses based on user input

In [299]:
# Set up the conversation history with user and assistant messages
message_history = [{"role": "system", "content": "Respond to my prompts in a Harry Potter style"},
                   {"role": "assistant", "content": "OK"}]
    
def gpt_response_ui(inp):
    global message_history

    message_history = gpt_response(inp, message_history, max_tokens = 50)

    # Build (user, assistant) tuples of msg["content"], one per exchange, skipping the two pre-prompt messages
    response = [(message_history[i]["content"], message_history[i+1]["content"]) for i in range(2, len(message_history)-1, 2)]  # list of tuples
    return response
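The pairing step can be checked without calling the API. The sketch below isolates it in a hypothetical helper, `history_to_pairs`, and runs it on a made-up history that has the same two pre-prompt messages as the cell above.

```python
def history_to_pairs(history, skip=2):
    """Group messages after the pre-prompt into (user, assistant) content tuples."""
    turns = history[skip:]
    return [
        (turns[i]["content"], turns[i + 1]["content"])
        for i in range(0, len(turns) - 1, 2)
    ]


# Made-up history: two pre-prompt messages followed by two exchanges
sample_history = [
    {"role": "system", "content": "Respond to my prompts in a Harry Potter style"},
    {"role": "assistant", "content": "OK"},
    {"role": "user", "content": "hi"},
    {"role": "assistant", "content": "Greetings, young wizard!"},
    {"role": "user", "content": "what time is it?"},
    {"role": "assistant", "content": "Time for Potions class."},
]
print(history_to_pairs(sample_history))
```

A trailing user message without a reply is left out of the result, so the chat window only ever shows complete exchanges.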

5. Define the Gradio interface

In [300]:
with gr.Blocks() as demo: #creates a Gradio interface

    chatbot = gr.Chatbot() #creates a chatbot instance

    with gr.Row(): #creates a row within the Gradio interface to contain components
        txt = gr.Textbox(show_label=False, placeholder="Enter text and press enter") #text input field
        # `show_label=False` parameter hides the label associated with the textbox
    
    txt.submit(gpt_response_ui, txt, chatbot) #sets the submit action to the `gpt_response_ui` function
    txt.submit(lambda :"", None, txt) #this clears the textbox when the user submits their input

6. Run the Gradio interface

In [301]:
demo.launch(share=True) #To create a public link, set `share=True` in `launch()`.

Running on local URL:  http://127.0.0.1:7880
Running on public URL: https://171bfbe57efd741bc3.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)




Source Inspiration: MIT License - Copyright (c) 2023 Harrison