<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Working-with-the-GPT-API" data-toc-modified-id="Working-with-the-GPT-API-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Working with the GPT API</a></span><ul class="toc-item"><li><span><a href="#Getting-an-API-KEY" data-toc-modified-id="Getting-an-API-KEY-1.1"><span class="toc-item-num">1.1&nbsp;&nbsp;</span>Getting an API KEY</a></span></li><li><span><a href="#OpenAI-Python-Package-Installation" data-toc-modified-id="OpenAI-Python-Package-Installation-1.2"><span class="toc-item-num">1.2&nbsp;&nbsp;</span>OpenAI Python Package Installation</a></span></li><li><span><a href="#Chat-Completions-API" data-toc-modified-id="Chat-Completions-API-1.3"><span class="toc-item-num">1.3&nbsp;&nbsp;</span>Chat Completions API</a></span><ul class="toc-item"><li><span><a href="#Chat-completions-response-format" data-toc-modified-id="Chat-completions-response-format-1.3.1"><span class="toc-item-num">1.3.1&nbsp;&nbsp;</span>Chat completions response format</a></span></li><li><span><a href="#Conversation-history" data-toc-modified-id="Conversation-history-1.3.2"><span class="toc-item-num">1.3.2&nbsp;&nbsp;</span>Conversation history</a></span></li><li><span><a href="#Creating-a-basic-conversation-loop" data-toc-modified-id="Creating-a-basic-conversation-loop-1.3.3"><span class="toc-item-num">1.3.3&nbsp;&nbsp;</span>Creating a basic conversation loop</a></span></li><li><span><a href="#Request-parameters" data-toc-modified-id="Request-parameters-1.3.4"><span class="toc-item-num">1.3.4&nbsp;&nbsp;</span>Request parameters</a></span></li></ul></li><li><span><a href="#Using-Chat-Completion-for-non-chat-scenarios" data-toc-modified-id="Using-Chat-Completion-for-non-chat-scenarios-1.4"><span class="toc-item-num">1.4&nbsp;&nbsp;</span>Using Chat Completion for non-chat scenarios</a></span><ul class="toc-item"><li><span><a href="#Sentiment-Analysis" data-toc-modified-id="Sentiment-Analysis-1.4.1"><span 
class="toc-item-num">1.4.1&nbsp;&nbsp;</span>Sentiment Analysis</a></span></li><li><span><a href="#Language-Translation" data-toc-modified-id="Language-Translation-1.4.2"><span class="toc-item-num">1.4.2&nbsp;&nbsp;</span>Language Translation</a></span></li></ul></li></ul></li><li><span><a href="#Extra:-creating-a-chatbot-with-gradio-for-front-end-UI" data-toc-modified-id="Extra:-creating-a-chatbot-with-gradio-for-front-end-UI-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Extra: creating a chatbot with gradio for front-end UI</a></span></li></ul></div>

# Working with the GPT API

## Getting an API KEY

The API gives you more control and flexibility when working with the GPT model (the model behind ChatGPT), and it makes it straightforward to integrate the model into other applications.

To access the model through the API, you will need an API key. For this demo, you'll be using your lead teacher's (LT's) API key.

*To obtain your own API key, you'll need to create an account and set up billing. You can create your account at https://platform.openai.com/.*

In this demo, we will read the API key from a text file. Create a file named `key.txt` and paste in the key your lead teacher gave you.

Alternatively, save it as an environment variable and read it as follows:

```python
import os

api_key = os.getenv("OPENAI_API_KEY")
```

In [1]:
with open("key.txt", "r") as file:
    api_key = file.read().strip()
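
The two options above can also be combined. The following is a minimal sketch, assuming a hypothetical helper named `load_api_key` (not part of the `openai` package) that prefers the environment variable and falls back to `key.txt`:

```python
import os
from pathlib import Path


def load_api_key(env_var="OPENAI_API_KEY", key_file="key.txt"):
    """Return the API key from the environment, else from a text file.

    Hypothetical helper for this demo; not part of the openai package.
    """
    key = os.getenv(env_var)
    if key:
        return key.strip()
    path = Path(key_file)
    if path.is_file():
        return path.read_text().strip()
    raise RuntimeError(f"No API key found in ${env_var} or {key_file}")
```

Raising an error when neither source exists makes a missing key visible immediately instead of at the first API call.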


## OpenAI Python Package Installation

To utilize the GPT API, you'll need to have the OpenAI Python package installed.

You can install it by running `pip install --upgrade openai`. *The `--upgrade` flag ensures that you have the most up-to-date version in case you installed `openai` before; the `openai.OpenAI` client used below requires version 1.0 or later.*

In [2]:
!pip install --upgrade openai

Collecting openai
  Using cached openai-1.64.0-py3-none-any.whl.metadata (27 kB)
Collecting jiter<1,>=0.4.0 (from openai)
  Downloading jiter-0.8.2-cp38-cp38-macosx_11_0_arm64.whl.metadata (5.2 kB)
Using cached openai-1.64.0-py3-none-any.whl (472 kB)
Downloading jiter-0.8.2-cp38-cp38-macosx_11_0_arm64.whl (300 kB)
Installing collected packages: jiter, openai
  Attempting uninstall: openai
    Found existing installation: openai 1.31.0
    Uninstalling openai-1.31.0:
      Successfully uninstalled openai-1.31.0
Successfully installed jiter-0.8.2 openai-1.64.0

[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m24.3.1[0m[39;49m -> [0m[32;49m25.0.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


In [3]:
import openai

# load and set our key
openai.api_key = api_key

## Chat Completions API

To use a GPT model via the OpenAI API, you’ll send a request containing the inputs and your API key, and receive a response containing the model’s output.

As of July 2023, there are two main API endpoints for working with GPT models:
- Completions API endpoint: only for the older legacy models
- Chat Completions API endpoint: to access the latest models, gpt-4 and gpt-3.5-turbo.

Chat models in Chat Completions API take as mandatory parameters:
- **List of messages as input**
- **Model**: we will use gpt-3.5-turbo

They return a **model-generated message as output**.

An example API call looks as follows:

In [4]:
# Use the new API format
client = openai.OpenAI(api_key=api_key)


response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a famous chef. Share your best cooking tips and tricks."},
        {"role": "user", "content": "What are the ingredients for making pancakes?"}
    ]
)



In conversations using the Chat Completions API, each message has a **role** (`system`, `user`, or `assistant`). Typically, a conversation starts with a system message that sets the assistant's behavior, followed by alternating user and assistant messages.
- The system message is optional and can be used to customize the assistant's personality or provide specific instructions.
- User messages contain requests or comments (prompts).
- Assistant messages store previous assistant responses or serve as examples of desired behavior.

### Chat completions response format

An example chat completions API response looks as follows:

In [5]:
response

ChatCompletion(id='chatcmpl-B4uXiIfGxhrsQEZJQiwqcLHvyRiR1', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='The basic ingredients needed to make pancakes are:\n\n- All-purpose flour\n- Baking powder\n- Salt\n- Sugar\n- Milk\n- Eggs\n- Butter or oil\n\nYou can also customize your pancakes by adding ingredients like blueberries, chocolate chips, bananas, or nuts. Remember to have maple syrup, butter, or your favorite pancake toppings on hand as well!', refusal=None, role='assistant', audio=None, function_call=None, tool_calls=None))], created=1740510362, model='gpt-3.5-turbo-0125', object='chat.completion', service_tier='default', system_fingerprint=None, usage=CompletionUsage(completion_tokens=77, prompt_tokens=33, total_tokens=110, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, 

In Python, the response is an object rather than a dictionary, so the assistant's reply is extracted with attribute access: `response.choices[0].message.content`.

In [6]:
# Print the AI's response
print(response.choices[0].message.content)

The basic ingredients needed to make pancakes are:

- All-purpose flour
- Baking powder
- Salt
- Sugar
- Milk
- Eggs
- Butter or oil

You can also customize your pancakes by adding ingredients like blueberries, chocolate chips, bananas, or nuts. Remember to have maple syrup, butter, or your favorite pancake toppings on hand as well!




Every choice in a response includes a `finish_reason`. The possible values are:

- `stop`: the API returned complete model output.
- `length`: the output is incomplete because of the `max_tokens` parameter or the token limit.
- `content_filter`: content was omitted because it was flagged by OpenAI's content filters.
- `null` (`None` in Python): the API response is still in progress or incomplete.
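
Checking `finish_reason` before using a reply helps catch truncated answers. The following is a minimal sketch, assuming a hypothetical helper named `describe_finish`; it runs against a stand-in object shaped like `response.choices[0]`, so no API call is made:

```python
from types import SimpleNamespace


def describe_finish(choice):
    """Map a choice's finish_reason to a short human-readable note."""
    notes = {
        "stop": "complete output",
        "length": "output truncated by max_tokens or the token limit",
        "content_filter": "content omitted by the content filter",
        None: "response still in progress or incomplete",
    }
    # Any value not listed above is reported as-is rather than guessed at.
    return notes.get(choice.finish_reason, f"unrecognised: {choice.finish_reason}")


# Stand-in for response.choices[0]; only the finish_reason attribute is needed.
fake_choice = SimpleNamespace(finish_reason="length")
print(describe_finish(fake_choice))
```

With a real response, the same function could be called as `describe_finish(response.choices[0])`.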



### Conversation history

Since the model recommended baking powder, let's ask how much in a separate request:

In [7]:
response = client.chat.completions.create(
  model="gpt-3.5-turbo",
  messages=[
    {"role": "user", "content": "How much Baking Powder do you recommend for this recipe?"}
  ]
)

In [8]:
print(response.choices[0].message.content)

I apologize, but I need to know the specific recipe you are referring to in order to provide an accurate recommendation for the amount of baking powder needed. Can you please provide more information or the specific recipe you are inquiring about?


We can see that the history was not saved: the second request knows nothing about the pancake recipe from the first.

**Including conversation history is crucial** when user instructions refer to prior messages. We can do that by sending all the previous messages, each with its role, in the *messages* parameter.

In [9]:
response = client.chat.completions.create(
  model="gpt-3.5-turbo",
  messages=[
    {"role": "system", "content": "You are a famous chef. Share your best cooking tips and tricks."},
    {"role": "user", "content": "What are the ingredients for making pancakes?"},
    {"role": "assistant", "content": "To make pancakes, you'll need flour, eggs, milk, baking powder, and a pinch of salt."}, # we shortened it for our demo purpose
    {"role": "user", "content": "Can I use almond milk instead of regular milk?"}
  ]
)

In [10]:
print(response.choices[0].message.content)

Yes, you can absolutely use almond milk as a substitute for regular milk in pancake batter. Just note that almond milk may lend a slightly nutty flavor to the pancakes, which can be a nice addition depending on your preferences.


**Each user instruction relies on the prior messages in the conversation history to make sense.**

Since the language models don't have inherent memory of past requests, it's important to include the relevant conversation history in each API request. If the conversation exceeds the model's token limit, it may need to be truncated or shortened while ensuring the essential context and instructions are retained.
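
One simple way to shorten a long history is to drop the oldest turns while keeping the system message. The following is a minimal sketch, assuming a hypothetical helper named `trim_history` and a rough estimate of about 4 characters per token; an exact count would need a real tokenizer, which can be passed in through the `count_tokens` argument:

```python
def trim_history(messages, max_tokens=3000, count_tokens=None):
    """Drop the oldest non-system messages until the estimate fits the budget.

    Hypothetical helper; returns a new list and leaves `messages` untouched.
    """
    if count_tokens is None:
        # Rough assumption: about 4 characters per token for English text.
        count_tokens = lambda text: max(1, len(text) // 4)

    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]

    def total(msgs):
        return sum(count_tokens(m["content"]) for m in msgs)

    while rest and total(system + rest) > max_tokens:
        rest.pop(0)  # discard the oldest turn first
    return system + rest
```

Dropping whole messages is the bluntest option; summarising older turns is another approach, but it costs an extra request.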

### Creating a basic conversation loop

This example demonstrates a conversation loop that performs the following tasks:

1. Takes console input continuously and formats it as the user's role content within the messages array.
2. Prints the model's response to the console and formats it as the assistant's role content within the messages array.

This approach ensures that each time a new question is asked, the ongoing conversation transcript is sent along with the latest question. Since the model lacks memory, it's crucial to include an updated transcript with each new question. Otherwise, the model may lose context from previous questions and answers.

In [11]:
message_history=[{"role": "system", "content": "You are a helpful assistant."}]

def gpt_response(inp, message_history):
    # We save the user's input
    message_history.append({"role": "user", "content": f"{inp}"})

    # Generate a response from the chatbot model
    completion = client.chat.completions.create(
      model="gpt-3.5-turbo",
      messages=message_history
    )

    # We save the assistant response
    message_history.append({"role": "assistant", "content": completion.choices[0].message.content})


    return message_history

In [12]:
while True:
    user_input = input("> ")
    if user_input.lower() == "exit":
        print("Exiting chat...")
        break  # Exit loop

    message_history = gpt_response(user_input, message_history)
    print(message_history[-1]["content"])  # Print assistant's last response

Data analytics is the process of examining large datasets to uncover hidden patterns, correlations, trends, and insights that can help businesses make more informed decisions. By utilizing various tools, techniques, and algorithms, data analysts can extract valuable information from data sources such as databases, spreadsheets, and big data platforms. Data analytics plays a crucial role in areas such as business intelligence, marketing optimization, risk management, and performance evaluation. It helps organizations understand their customers better, improve operational efficiency, identify new opportunities, and mitigate potential risks.
Web development and data analytics are two distinct fields within the realm of technology.

Web development involves the design, creation, and maintenance of websites and web applications. Web developers use programming languages such as HTML, CSS, and JavaScript to build websites that are functional, interactive, and visually appealing. They work on 

Let's check the message history to confirm that it contains the whole conversation.

In [None]:
message_history

### Request parameters

Let's take a look at some request parameters.

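- **Model**: the model to use (e.g., `gpt-3.5-turbo`, `gpt-4`)
- **Messages**: the prompt, given as a list of messages in chat format (the `messages` parameter)
- **Temperature (default 1)**: sampling temperature, between 0 and 2. Higher values make the output more diverse and random, while lower values make it more focused and deterministic.
- **Max tokens**: caps the length of the generated response (the `max_tokens` parameter). If it is omitted in the Chat Completions API, the response is limited only by the model's context length; the default of 16 applies to the legacy Completions endpoint.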
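
To build intuition for temperature, here is a minimal sketch of the textbook picture: the model's raw scores (logits) are divided by the temperature before being turned into probabilities. The function name and the scores are made up for illustration, and the API does not expose its internals, so this is not a description of OpenAI's implementation:

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, scaled by a sampling temperature."""
    if temperature <= 0:
        raise ValueError("this sketch needs temperature > 0")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


toy_logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate tokens
for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(toy_logits, t)
    print(t, [round(p, 3) for p in probs])
```

At low temperature almost all probability goes to the top-scoring token; at high temperature the distribution flattens, so less likely tokens are sampled more often.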

In [13]:
# Let's rewrite the gpt_response function to accept optional request parameters

def gpt_response(inp, message_history, **params):
    # We save the user's input
    message_history.append({"role": "user", "content": f"{inp}"})

    # Generate a response from the chatbot model
    completion_params = {
        "model": "gpt-3.5-turbo",
        "messages": message_history,
        **params  # Include additional parameters
    }

    completion = client.chat.completions.create(**completion_params) # Include additional parameters

    # We save the assistant response
    message_history.append({"role": "assistant", "content": f"{completion.choices[0].message.content}"})

    return message_history

Let's explore different settings by using a `max_tokens` value of 100 and testing three temperature levels (0, 1, and 2) to generate responses from the model, completing the prompt "My favourite animal is".

In [None]:
system_prompt = {"role": "system", "content": "Complete the prompt."}

for temperature in [0, 1, 2]:
    # Start from a fresh history so earlier completions don't influence later ones
    message_history = gpt_response("My favourite animal is ",
                                   [system_prompt],
                                   max_tokens=100,
                                   temperature=temperature)
    print(message_history[-1]["content"] + "\n")  # Print the last response

- **top_p** (defaults to 1)
An alternative to sampling with temperature, called nucleus sampling: the model considers only the most likely tokens whose combined probability reaches `top_p`. For example, 0.1 means only the tokens comprising the top 10% of probability mass are considered.

A higher value gives access to more tokens (and more diversity), while a lower value is more deterministic.

OpenAI generally recommends altering this or temperature, but not both.
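
Nucleus sampling can be illustrated on a toy distribution. The following is a minimal sketch, assuming a hypothetical helper named `nucleus_filter` and made-up probabilities; it shows the general technique, not OpenAI's internal implementation:

```python
def nucleus_filter(probs, top_p):
    """Keep the smallest set of most-likely tokens whose mass reaches top_p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, mass = {}, 0.0
    for token, p in ranked:
        kept[token] = p
        mass += p
        if mass >= top_p:
            break
    # Renormalise so the kept probabilities sum to 1 again.
    return {token: p / mass for token, p in kept.items()}


toy_probs = {"dog": 0.5, "cat": 0.3, "owl": 0.15, "eel": 0.05}  # made-up values
print(nucleus_filter(toy_probs, 0.1))  # only the single most likely token survives
print(nucleus_filter(toy_probs, 0.9))  # the three most likely tokens survive
```

The model would then sample from the filtered, renormalised distribution, which is why a small `top_p` behaves almost deterministically.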



Let's explore different settings by using a `max_tokens` value of 100 and testing two `top_p` levels (0 and 1) to generate responses from the model, completing the prompt "My favourite animal is".

In [None]:
system_prompt = {"role": "system", "content": "Complete the prompt."}

for top_p in [0, 1]:
    # Start from a fresh history so the first completion doesn't influence the second
    message_history = gpt_response("My favourite animal is ",
                                   [system_prompt],
                                   max_tokens=100,
                                   top_p=top_p)
    print(message_history[-1]["content"] + "\n")  # Print the last response

- **presence_penalty** (Defaults to 0)
Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. **Higher values encourage new topics by penalising any token that has already appeared, no matter how often.**

- **frequency_penalty** (Defaults to 0)
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. **Higher values penalise a token more each time it is repeated, rewarding variety.**
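
The difference between the two penalties is easier to see on numbers. The following is a minimal sketch, assuming the adjustment works as a simple subtraction from each token's score: the frequency penalty is multiplied by how many times the token has appeared, while the presence penalty is applied once if it has appeared at all. The helper `apply_penalties` and the scores are made up for illustration:

```python
from collections import Counter


def apply_penalties(logits, generated_tokens, presence_penalty=0.0, frequency_penalty=0.0):
    """Lower the score of tokens that already appear in the generated text."""
    counts = Counter(generated_tokens)
    adjusted = {}
    for token, score in logits.items():
        c = counts[token]
        seen = 1.0 if c > 0 else 0.0
        # Frequency penalty grows with each repeat; presence penalty is flat.
        adjusted[token] = score - c * frequency_penalty - seen * presence_penalty
    return adjusted


toy_logits = {"broke": 2.0, "money": 1.5, "wallet": 1.0}  # made-up scores
so_far = ["broke", "broke", "money"]
print(apply_penalties(toy_logits, so_far, presence_penalty=0.5, frequency_penalty=0.25))
# → {'broke': 1.0, 'money': 0.75, 'wallet': 1.0}
```

Here "broke" loses the most because it was used twice, while "wallet" is untouched because it has not appeared yet. Negative penalty values flip the sign and make repetition more likely.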


Let's explore different settings by using a `max_tokens` value of 100 and testing two levels (-2 and 2) for both the presence penalty and the frequency penalty, using the prompt "generate 20 ways to say you can't buy that because you're broke".

In [None]:
for penalty in [-2, 2]:
    # Start from an empty history so the first run doesn't influence the second
    message_history = gpt_response("generate 20 ways to say you can't buy that because you're broke",
                                   [],
                                   max_tokens=100,
                                   presence_penalty=penalty,
                                   frequency_penalty=penalty)
    print(message_history[-1]["content"] + "\n")  # Print the last response

## Using Chat Completion for non-chat scenarios
The Chat Completion API is designed to work with multi-turn conversations, but it also works well for non-chat scenarios.

### Sentiment Analysis

Let's set up a sentiment analysis scenario where a user inputs a tweet and the program uses GPT to predict its sentiment, repeating until the user types "exit".

We will give the *system* role the task of deciding whether a Tweet's sentiment is positive, neutral, or negative.
As a pre-prompt we include an example user message, "I loved the new Batman movie!" followed by "Sentiment:", and an example assistant answer, "Positive".

In [14]:
# Let's give an example of sentiment analysis
message_history = [
    {"role": "system", "content": "Decide whether a Tweet's sentiment is positive, neutral, or negative."},
    {"role": "user", "content": "Tweet: \"I loved the new Batman movie!\"\nSentiment:"},
    {"role": "assistant", "content": "Positive"},
]
print("Write a tweet: ")

while True:
    user_input = input("> ")
    if user_input.lower() == "exit":
        print("Exiting chat...")
        break  # Exit loop
    # Pass the text we already read; calling input() again would discard it
    message_history = gpt_response(user_input,
                                   message_history,
                                   temperature=0,
                                   max_tokens=60,
                                   frequency_penalty=0.5)
    print(message_history[-1]["content"])

Write a tweet: 
Neutral
Positive
Exiting chat...


### Language Translation

Let's set up a translation scenario where a user inputs a phrase to be translated into Spanish, Portuguese, and Italian. The program uses GPT to provide the translations, repeating until the user types "exit".

We will give as a system pre-prompt *Translate this into Spanish, Portuguese, Italian*

In [None]:
message_history = [{"role": "system", "content": "Translate this into Spanish, Portuguese, Italian"}]
print("Write a phrase to translate to Spanish, Portuguese and Italian: ")

while True:
    user_input = input("> ")

    if user_input.lower() == "exit":
        print("Exiting chat...")
        break  # Exit loop

    message_history = gpt_response(user_input, message_history, temperature=0.3, max_tokens=60)
    print(message_history[-1]["content"])

# Extra: creating a chatbot with gradio for front-end UI

Gradio is a Python library that allows you to quickly create customizable UIs for machine learning models, or for any kind of Python function or code snippet, with just a few lines of code. It simplifies the process of deploying and sharing models or code by generating interactive interfaces for input and output data.

To create a chatbot with Gradio for the front-end user interface, follow these steps:

1. Install the necessary packages

In [15]:
!pip install gradio openai

Collecting gradio
  Downloading gradio-4.44.1-py3-none-any.whl.metadata (15 kB)
Collecting aiofiles<24.0,>=22.0 (from gradio)
  Using cached aiofiles-23.2.1-py3-none-any.whl.metadata (9.7 kB)
Collecting fastapi<1.0 (from gradio)
  Using cached fastapi-0.115.8-py3-none-any.whl.metadata (27 kB)
Collecting ffmpy (from gradio)
  Using cached ffmpy-0.5.0-py3-none-any.whl.metadata (3.0 kB)
Collecting gradio-client==1.3.0 (from gradio)
  Downloading gradio_client-1.3.0-py3-none-any.whl.metadata (7.1 kB)
Collecting huggingface-hub>=0.19.3 (from gradio)
  Using cached huggingface_hub-0.29.1-py3-none-any.whl.metadata (13 kB)
Collecting pydub (from gradio)
  Using cached pydub-0.25.1-py2.py3-none-any.whl.metadata (1.4 kB)
Collecting python-multipart>=0.0.9 (from gradio)
  Using cached python_multipart-0.0.20-py3-none-any.whl.metadata (1.8 kB)
Collecting ruff>=0.2.2 (from gradio)
  Using cached ruff-0.9.7-py3-none-macosx_11_0_arm64.whl.metadata (25 kB)
Collecting semantic-version~=2.0 (from gradio

2. Import the required libraries

In [16]:
import gradio as gr


3. Set up your OpenAI API credentials. If you ran the cells above, the `client` created earlier already holds your key and `gpt_response` will reuse it.

In [None]:
# openai.api_key = "YOUR_API_KEY"

4. Define a function that generates chatbot responses based on user input

In [17]:
# Set up the conversation history with system and assistant messages
message_history = [{"role": "system", "content": "Respond to my prompts in a Harry Potter style"},
                   {"role": "assistant", "content": "OK"}]

def gpt_response_ui(inp):
    global message_history

    message_history = gpt_response(inp, message_history, max_tokens=50)

    # Build (user, assistant) tuples from the message contents, one per exchange,
    # skipping the two pre-prompt messages at the start of the history
    response = [(message_history[i]["content"], message_history[i+1]["content"]) for i in range(2, len(message_history)-1, 2)]  # list of tuples
    return response
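
The list comprehension above pairs messages two at a time. To see the pairing in isolation, here is a minimal sketch, assuming a hypothetical helper named `to_chat_pairs` and a made-up history; it makes no API call:

```python
def to_chat_pairs(history, skip=2):
    """Turn a flat message list into (user, assistant) tuples.

    `skip` is the number of pre-prompt messages to ignore at the start.
    """
    turns = history[skip:]
    return [
        (turns[i]["content"], turns[i + 1]["content"])
        for i in range(0, len(turns) - 1, 2)
    ]


# Made-up history: two pre-prompt messages followed by two exchanges.
example_history = [
    {"role": "system", "content": "Respond to my prompts in a Harry Potter style"},
    {"role": "assistant", "content": "OK"},
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Greetings, young wizard!"},
    {"role": "user", "content": "Bye"},
    {"role": "assistant", "content": "Mischief managed."},
]
print(to_chat_pairs(example_history))
# → [('Hi', 'Greetings, young wizard!'), ('Bye', 'Mischief managed.')]
```

A trailing user message with no reply yet is left out, because the range stops before an unpaired final element.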

5. Define the Gradio interface

In [18]:
with gr.Blocks() as demo: #creates a Gradio interface

    chatbot = gr.Chatbot() #creates a chatbot instance

    with gr.Row(): #creates a row within the Gradio interface to contain components
        txt = gr.Textbox(show_label=False, placeholder="Enter text and press enter") #text input field
        # `show_label=False` parameter hides the label associated with the textbox

    txt.submit(gpt_response_ui, txt, chatbot) #sets the submit action to the `gpt_response_ui` function
    txt.submit(lambda :"", None, txt) #this clears the textbox when the user submits their input

6. Run the Gradio interface

In [19]:
demo.launch(share=True) #To create a public link, set `share=True` in `launch()`.

Running on local URL:  http://127.0.0.1:7861
Running on public URL: https://eaa041eddafb3b48b8.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)




Source Inspiration: MIT License - Copyright (c) 2023 Harrison