<div style="width: 100%; overflow: hidden;">
    <div style="width: 150px; float: left;"> <img src="https://raw.githubusercontent.com/DataForScience/Networks/master/data/D4Sci_logo_ball.png" alt="Data For Science, Inc" align="left" border="0" width=150px> </div>
    <div style="float: left; margin-left: 10px;"> <h1>Generative AI with OpenAI API</h1>
<h1>GPT Models</h1>
        <p>Bruno Gonçalves<br/>
        <a href="http://www.data4sci.com/">www.data4sci.com</a><br/>
            @bgoncalves, @data4sci</p></div>
</div>

In [1]:
from collections import Counter
from pprint import pprint
from datetime import datetime
import json

import pandas as pd
import numpy as np

import matplotlib
import matplotlib.pyplot as plt 

import openai
from openai import OpenAI

import termcolor
from termcolor import colored

import os
import gzip

import tqdm as tq
from tqdm.notebook import tqdm

import watermark

%load_ext watermark
%matplotlib inline

We start by printing out the versions of the libraries we're using for future reference

In [2]:
%watermark -n -v -m -g -iv

Python implementation: CPython
Python version       : 3.11.7
IPython version      : 8.12.3

Compiler    : Clang 14.0.6 
OS          : Darwin
Release     : 23.4.0
Machine     : arm64
Processor   : arm
CPU cores   : 16
Architecture: 64bit

Git hash: 00e27069c4e567040e2ae5b1b5259eb0a02807ab

openai    : 1.20.0
watermark : 2.4.3
matplotlib: 3.8.0
pandas    : 2.1.4
json      : 2.0.9
termcolor : 2.4.0
tqdm      : 4.66.2
numpy     : 1.26.4



Load default figure style

In [3]:
plt.style.use('d4sci.mplstyle')
colors = plt.rcParams['axes.prop_cycle'].by_key()['color']

# Basic Usage

The first step is to generate an API key on the OpenAI website and store it in the `OPENAI_API_KEY` environment variable. Without it, none of the API calls below will work. You can find your API key in your user settings: https://help.openai.com/en/articles/4936850-where-do-i-find-my-secret-api-key
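
Before instantiating the client, it can help to confirm that the key is actually visible to Python. The sketch below assumes only that the key lives in the `OPENAI_API_KEY` environment variable, as described above. The helper name `api_key_is_set` is ours for illustration and is not part of the OpenAI library.

```python
import os

def api_key_is_set(env=os.environ, name="OPENAI_API_KEY"):
    """Return True if the named variable exists and is non-empty."""
    return bool(env.get(name, "").strip())

if not api_key_is_set():
    print("OPENAI_API_KEY is not set; API calls will fail.")
```

Passing a plain dictionary as `env` makes the check easy to try out without touching the real environment.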

Then we are ready to instantiate the client

In [1]:
client = OpenAI()


We start by getting a list of supported models.

In [5]:
model_list = json.loads(client.models.list().json())["data"]

In total we have 33 models

In [6]:
len(model_list)

33

Along with some information about each model...

In [7]:
model_list[:3]

[{'id': 'dall-e-3',
  'created': 1698785189,
  'object': 'model',
  'owned_by': 'system'},
 {'id': 'gpt-4-1106-preview',
  'created': 1698957206,
  'object': 'model',
  'owned_by': 'system'},
 {'id': 'whisper-1',
  'created': 1677532384,
  'object': 'model',
  'owned_by': 'openai-internal'}]
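
The `created` values look like Unix timestamps in seconds. Assuming that reading is correct, a minimal sketch for turning them into readable dates follows; the sample entries are copied from the output above and the helper name `created_date` is ours.

```python
from datetime import datetime, timezone

# Sample entries copied from the output above
sample_models = [
    {"id": "dall-e-3", "created": 1698785189},
    {"id": "gpt-4-1106-preview", "created": 1698957206},
    {"id": "whisper-1", "created": 1677532384},
]

def created_date(model):
    """Convert the Unix timestamp in 'created' to a UTC calendar date."""
    return datetime.fromtimestamp(model["created"], tz=timezone.utc).date()

for model in sample_models:
    print(model["id"], created_date(model).isoformat())
```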

But let's just get a list of model names

In [8]:
print("\n".join(sorted([model["id"] for model in model_list])))

babbage-002
dall-e-2
dall-e-3
davinci-002
gpt-3.5-turbo
gpt-3.5-turbo-0125
gpt-3.5-turbo-0301
gpt-3.5-turbo-0613
gpt-3.5-turbo-1106
gpt-3.5-turbo-16k
gpt-3.5-turbo-16k-0613
gpt-3.5-turbo-instruct
gpt-3.5-turbo-instruct-0914
gpt-4
gpt-4-0125-preview
gpt-4-0613
gpt-4-1106-preview
gpt-4-1106-vision-preview
gpt-4-turbo
gpt-4-turbo-2024-04-09
gpt-4-turbo-preview
gpt-4-vision-preview
gpt-4o
gpt-4o-2024-05-13
gpt-4o-test-shared
text-embedding-3-large
text-embedding-3-small
text-embedding-ada-002
tts-1
tts-1-1106
tts-1-hd
tts-1-hd-1106
whisper-1


## Basic Prompt

The recommended model for exploration is `gpt-3.5-turbo`, so we'll stick with it for now. The basic setup is relatively straightforward:

In [9]:
response = client.chat.completions.create(
  model="gpt-3.5-turbo",
  messages=[
        {
            "role": "user", 
            "content": "What was Superman's weakness?"
        },
    ]
)

Which produces a response object

In [10]:
type(response)

openai.types.chat.chat_completion.ChatCompletion

The response is a Pydantic model, so we can access its fields as attributes, much as we would with a named tuple

The model's answer can be found in the `message` object of each entry in the `choices` list

In [11]:
response.choices[0]

Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="Superman's weakness is the radioactive mineral known as kryptonite. When exposed to kryptonite, Superman loses his superhuman abilities and becomes vulnerable to harm.", role='assistant', function_call=None, tool_calls=None))

To request multiple answers, we must include the `n` parameter with the number of answers we want

In [12]:
%%time
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "What are the different kinds of Kryptonite?"},
    ],
    n=3
)

CPU times: user 7.65 ms, sys: 2.16 ms, total: 9.81 ms
Wall time: 5.48 s


And we can access each of the answers individually in the `choices` list

In [13]:
for output in response.choices:
    print("==========")
    print(output.message.role.title()) 
    print("==========")
    print(output.message.content)
    print("==========\n")

Assistant
There are several different kinds of Kryptonite, each with its own specific effects on Superman:

1. Green Kryptonite - The most common form of Kryptonite, it is deadly to Superman and weakens him both physically and mentally. Exposure to green Kryptonite can lead to nausea, weakness, and even death if left untreated.

2. Red Kryptonite - Red Kryptonite has unpredictable effects on Superman, causing temporary changes in his powers and behavior. It can create temporary mutations, alter his personality, or even take away his powers entirely for a short period of time.

3. Blue Kryptonite - Blue Kryptonite is harmless to Superman but affects Bizarro, a twisted version of Superman, in the same way that green Kryptonite affects the original Superman.

4. Gold Kryptonite - Gold Kryptonite permanently removes Superman's powers, rendering him completely human and vulnerable to harm.

5. White Kryptonite - White Kryptonite is deadly to plant life, specifically killing all plant-based 

In [14]:
response.usage

CompletionUsage(completion_tokens=814, prompt_tokens=17, total_tokens=831)
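
The usage counters cover the whole request: the prompt appears to be counted once, while `completion_tokens` adds up all three answers. The sketch below reuses the numbers shown above to compute the average answer length and a rough cost. The two prices are made-up placeholders, not OpenAI's actual rates, and `estimate_cost` is a hypothetical helper.

```python
def estimate_cost(prompt_tokens, completion_tokens,
                  prompt_price_per_1k, completion_price_per_1k):
    """Rough cost of one request given per-1000-token prices."""
    return (prompt_tokens * prompt_price_per_1k
            + completion_tokens * completion_price_per_1k) / 1000

# Numbers copied from the CompletionUsage output above
usage = {"prompt_tokens": 17, "completion_tokens": 814, "total_tokens": 831}
n_choices = 3  # we asked for n=3 answers

average_answer_length = usage["completion_tokens"] / n_choices
print(f"Average tokens per answer: {average_answer_length:.1f}")

# Placeholder prices (assumption): look up the current rates before relying on this
PROMPT_PRICE_PER_1K = 0.0005
COMPLETION_PRICE_PER_1K = 0.0015
cost = estimate_cost(usage["prompt_tokens"], usage["completion_tokens"],
                     PROMPT_PRICE_PER_1K, COMPLETION_PRICE_PER_1K)
print(f"Estimated cost: {cost:.6f}")
```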

# Temperature

In [16]:
%%time
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "Tell me a short story"},
    ],
    temperature=1.5
)

CPU times: user 9 ms, sys: 2.92 ms, total: 11.9 ms
Wall time: 5.37 s


In [17]:
print(response.choices[0].message.content)

Once upon a time, there was a little girl named Lily who lived in a small town at the edge of a lush forest. Lily loved wandering through the woods, chasing butterflies, and listening to the birds chirping in the trees.

One day, while out exploring, Lily stumbled upon a hidden glen filled with colorful flowers and sparkling stream. As she gazed in awe at the beauty around her, she noticed a small creature hiding behind a rock. It was a fairy, with shimmering wings and a mischievous twinkle in her eye.

The fairy introduced herself as Luna, and told Lily that she was the guardian of the secret glen. She explained that the glen was a special place, full of magic and wonder, that only those with pure hearts could find.

Lily and Luna spent the day together, dancing in meadows and flying through the trees. As the sun began to set, Luna revealed a secret spring hidden in a grove of silver birch trees. She told Lily that anyone who drank from the spring would be granted a wish.

Lily closed

In [18]:
%%time
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "Tell me a short story"},
    ],
    temperature=0
)

CPU times: user 9.75 ms, sys: 2.51 ms, total: 12.3 ms
Wall time: 4.51 s


In [19]:
print(response.choices[0].message.content)

Once upon a time, in a small village nestled in the mountains, there lived a young girl named Lily. She was known for her kindness and generosity, always willing to help those in need.

One day, a terrible storm hit the village, causing widespread damage and leaving many families homeless. Lily knew she had to do something to help. She gathered her friends and together they started a relief effort, collecting food, clothing, and supplies for those affected by the storm.

As they worked tirelessly to distribute the donations, word of Lily's kindness spread throughout the village. Soon, people from neighboring towns came to help, and the relief effort grew into a full-fledged community project.

Thanks to Lily's leadership and the support of the villagers, the affected families were able to rebuild their homes and their lives. The village came together in a way they never had before, united by a common goal of helping those in need.

From that day on, Lily was known as a hero in the vill
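
The two requests above differ only in `temperature`. A common way to picture this parameter is as a divisor applied to the model's raw token scores before they are turned into probabilities: low values concentrate probability on the top candidate, high values spread it out. The sketch below is a toy illustration of that idea with made-up scores. It is not OpenAI's implementation, and the function name is ours.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        # Limit case: all probability mass goes to the highest score (greedy choice)
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

toy_logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for t in (0, 0.5, 1.0, 1.5):
    print(t, [round(p, 3) for p in softmax_with_temperature(toy_logits, t)])
```

In this toy picture, `temperature=0` always picks the same token, which matches the more conventional story above, while `temperature=1.5` gives less likely tokens a better chance.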

# Function Calls

In [20]:
def chat(messages, functions):
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=messages,
        # Define the functions the model is allowed to use
        functions=functions
    )
    
    return response

In [21]:
def pretty_print_conversation(messages):
    role_to_color = {
        "system": "red",
        "user": "green",
        "assistant": "blue",
        "function": "magenta",
    }
    
    for message in messages:
        print(message)
        if message["role"] == "system":
            print(colored(f"system: {message['content']}\n", role_to_color[message['role']]))
        elif message["role"] == "user":
            print(colored(f"user: {message['content']}\n", role_to_color[message['role']]))
        elif message["role"] == "assistant" and message.get("function_call"):
            print(colored(f"assistant: {message['function_call']}\n", role_to_color[message['role']]))
        elif message["role"] == "assistant":
            print(colored(f"assistant: {message['content']}\n", role_to_color[message['role']]))
        elif message["role"] == "function":
            # Messages are plain dicts, so use key access rather than attribute access
            print(colored(f"function ({message['name']}): {message['content']}\n", role_to_color[message['role']]))


Let's create some function specifications to interface with a hypothetical weather API. We'll pass these function specifications to the Chat Completions API in order to generate function arguments that adhere to the specification.

In [22]:
functions = [
    {
        "name": "get_current_weather",
        "description": "Get the current weather",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA",
                },
                "format": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": "The temperature unit to use. Infer this from the users location.",
                },
            },
            "required": ["location", "format"],
        },
    },
]
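
Newer versions of the Chat Completions API describe the same specifications through a `tools` parameter, and the API reference marks `functions` as deprecated. As a sketch, and assuming the `{"type": "function", "function": {...}}` entry format, the existing specifications can be wrapped without rewriting them. The helper `to_tools` is ours, and the specification is abbreviated from the one above.

```python
def to_tools(function_specs):
    """Wrap each function specification in the envelope used by `tools`."""
    return [{"type": "function", "function": spec} for spec in function_specs]

# Abbreviated copy of the specification defined above
example_functions = [
    {
        "name": "get_current_weather",
        "description": "Get the current weather",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string"},
                "format": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["location", "format"],
        },
    },
]

tools = to_tools(example_functions)
print(tools[0]["function"]["name"])
```

With `tools`, the model's request would be expected under `message.tool_calls` rather than `message.function_call`, so the rest of this section would need matching changes.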

If we prompt the model about the current weather, it will respond with some clarifying questions.

In [27]:
messages = []

messages.append(
    {"role": "system", 
     "content": "Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous."
    })

messages.append(
    {"role": "user", 
     "content": "What's the weather like today"
    })

In [28]:
chat_response = chat(messages, functions=functions)
assistant_message = chat_response.choices[0].message
messages.append({
 "role":  assistant_message.role,
 "content":  assistant_message.content,
 "function_call":  assistant_message.function_call,
})

In [29]:
pretty_print_conversation(messages)

{'role': 'system', 'content': "Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous."}
system: Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous.

{'role': 'user', 'content': "What's the weather like today"}
user: What's the weather like today

{'role': 'assistant', 'content': 'Sure, could you please provide me with your current location or the city you would like to know the weather for?', 'function_call': None}
assistant: Sure, could you please provide me with your current location or the city you would like to know the weather for?


Once we provide the missing information, it will generate the appropriate function arguments for us.

In [30]:
messages.append(
    {"role": "user", 
     "content": "I'm in New York, NY."
    })

In [31]:
chat_response = chat(messages, functions=functions)
assistant_message = chat_response.choices[0].message
messages.append({
 "role":  assistant_message.role,
 "content":  assistant_message.content,
 "function_call":  assistant_message.function_call,
})

In [32]:
pretty_print_conversation(messages)

{'role': 'system', 'content': "Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous."}
system: Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous.

{'role': 'user', 'content': "What's the weather like today"}
user: What's the weather like today

{'role': 'assistant', 'content': 'Sure, could you please provide me with your current location or the city you would like to know the weather for?', 'function_call': None}
assistant: Sure, could you please provide me with your current location or the city you would like to know the weather for?

{'role': 'user', 'content': "I'm in New York, NY."}
user: I'm in New York, NY.

{'role': 'assistant', 'content': None, 'function_call': FunctionCall(arguments='{"location":"New York, NY","format":"celsius"}', name='get_current_weather')}
assistant: FunctionCall(arguments='{"location"
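
The model only proposes the call; running it is up to us. A minimal sketch of that step follows, using the name and argument string from the `FunctionCall` above. The local `get_current_weather` is a hypothetical stand-in that returns canned data, and `run_function_call` is our own helper.

```python
import json

def get_current_weather(location, format):
    """Hypothetical stand-in for a real weather lookup; returns canned data."""
    return {"location": location, "temperature": 20, "unit": format}

# Map the function names the model may request to local callables
available_functions = {"get_current_weather": get_current_weather}

def run_function_call(name, arguments):
    """Decode the JSON argument string and call the matching local function."""
    if name not in available_functions:
        raise ValueError(f"Unknown function: {name}")
    kwargs = json.loads(arguments)
    return available_functions[name](**kwargs)

# Name and arguments copied from the FunctionCall shown above
result = run_function_call(
    "get_current_weather",
    '{"location":"New York, NY","format":"celsius"}',
)
print(result)
```

To let the model phrase a final answer, the result would typically be appended to `messages` as a `{"role": "function", "name": ..., "content": json.dumps(result)}` entry before calling `chat` again.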

## Few-shot prompting

We can also provide several examples of mappings between input and output.

In [33]:
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful, pattern-following assistant."},
        {"role": "user", "content": "Help me translate the following corporate jargon into plain English."},
        {"role": "assistant", "content": "Sure, I'd be happy to!"},
        {"role": "user", "content": "New synergies will help drive top-line growth."},
        {"role": "assistant", "content": "Things working well together will increase revenue."},
        {"role": "user", "content": "Let's circle back when we have more bandwidth to touch base on opportunities for increased leverage."},
        {"role": "assistant", "content": "Let's talk later when we're less busy about how to do better."},
        {"role": "user", "content": "This late pivot means we don't have time to boil the ocean for the client deliverable."},
    ],
    temperature=0,
)

print(response.choices[0].message.content)

This last-minute change means we can't spend excessive time on the client's project.
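
Writing the alternating user and assistant messages by hand gets tedious as the number of examples grows. As a sketch, a small helper (hypothetical, named `build_few_shot_messages` here) can assemble the list from input and output pairs; the strings are taken from the cell above.

```python
def build_few_shot_messages(system_prompt, examples, query):
    """Build a chat message list: system prompt, example pairs, then the new query."""
    messages = [{"role": "system", "content": system_prompt}]
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": query})
    return messages

examples = [
    ("New synergies will help drive top-line growth.",
     "Things working well together will increase revenue."),
    ("Let's circle back when we have more bandwidth to touch base on opportunities for increased leverage.",
     "Let's talk later when we're less busy about how to do better."),
]

messages = build_few_shot_messages(
    "You are a helpful, pattern-following assistant.",
    examples,
    "This late pivot means we don't have time to boil the ocean for the client deliverable.",
)
print(len(messages))
```

The resulting list can be passed as the `messages` argument of `client.chat.completions.create`.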


# Formatted output

In [34]:
%%time
userInput = "blueberry pancakes"

prompt = """return a recipe for %s.
        Provide your response as a JSON object with the following schema:
        {"dish": "%s", "ingredients": ["", "", ...],
        "instructions": ["", "", ... ]}""" % (userInput, userInput)

response = client.chat.completions.create(
          model = "gpt-3.5-turbo",
          messages = [
            { "role": "system", "content": "You are a helpful recipe assistant."},
            { "role": "user",   "content": prompt }
          ]
)

CPU times: user 9.26 ms, sys: 2.05 ms, total: 11.3 ms
Wall time: 4.28 s


In [35]:
json_output = response.choices[0].message.content

In [36]:
output = json.loads(json_output)

In [37]:
output["ingredients"]

['1 cup all-purpose flour',
 '2 tbsp sugar',
 '1 tbsp baking powder',
 '1/2 tsp salt',
 '1 cup milk',
 '1 large egg',
 '2 tbsp melted butter',
 '1 cup fresh blueberries']

In [38]:
output["instructions"]

['In a large mixing bowl, combine the flour, sugar, baking powder, and salt.',
 'In a separate bowl, whisk together the milk, egg, and melted butter.',
 'Pour the wet ingredients into the dry ingredients and mix until just combined. Be careful not to overmix.',
 'Gently fold in the blueberries.',
 'Heat a lightly greased skillet or griddle over medium heat.',
 'Pour 1/4 cup of batter onto the skillet for each pancake.',
 'Cook until bubbles form on the surface of the pancake, then flip and cook for another minute or until golden brown.',
 'Repeat with the remaining batter.',
 'Serve warm with maple syrup and extra blueberries if desired. Enjoy!']
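
Calling `json.loads` directly works when the reply is pure JSON, but a model may wrap its answer in a Markdown code fence or omit a key. The sketch below is one defensive way to parse the reply. The helper `parse_recipe` and the sample reply are ours for illustration, and the required keys come from the schema in the prompt above.

```python
import json

FENCE = "`" * 3  # Markdown code fence marker

def parse_recipe(text):
    """Parse a model reply into a dict, tolerating a surrounding Markdown fence."""
    cleaned = text.strip()
    if cleaned.startswith(FENCE):
        # Drop the opening fence line (which may carry a language tag) and the closing fence
        cleaned = cleaned.split("\n", 1)[1] if "\n" in cleaned else ""
        cleaned = cleaned.rsplit(FENCE, 1)[0]
    data = json.loads(cleaned)
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object")
    missing = {"dish", "ingredients", "instructions"} - set(data)
    if missing:
        raise ValueError(f"Missing keys: {sorted(missing)}")
    return data

# Made-up reply wrapped in a fence, to exercise the cleanup path
sample_reply = (
    FENCE + "json\n"
    '{"dish": "blueberry pancakes", '
    '"ingredients": ["1 cup all-purpose flour"], '
    '"instructions": ["Combine the dry ingredients."]}\n'
    + FENCE
)
print(parse_recipe(sample_reply)["dish"])
```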

# Translation

In [39]:
response = client.chat.completions.create(
    model='gpt-3.5-turbo',
    messages=[{"role": "system", "content": "You're a professional English-Italian translator."}, 
              {"role": "user", "content": "Translate 'Be the change that you wish to see in the world.' into Italian"}],
    temperature=0,
)

In [40]:
response.choices[0].message.content

'"Sii il cambiamento che desideri vedere nel mondo."'

# Process unstructured information

Inspired by https://platform.openai.com/examples/default-parse-data

In [42]:
prompt = """There are many fruits that were found on the recently discovered planet Goocrux. 
There are neoskizzles that grow there, which are purple and taste like candy. There are also 
loheckles, which are a grayish blue fruit and are very tart, a little bit like a lemon. Pounits 
are a bright green color and are more savory than sweet. There are also plenty of loopnovas which 
are a neon pink flavor and taste like cotton candy. Finally, there are fruits called glowls, which 
have a very sour and bitter taste which is acidic and caustic, and a pale orange tinge to them."""

In [43]:
response = client.chat.completions.create(
    model='gpt-3.5-turbo',
    messages=[
        {"role": "system", 
         "content": "You will be provided with unstructured data, and your task is to parse it into CSV format."}, 
        {"role": "user", 
         "content": prompt}],
    temperature=0,
)

In [44]:
print(response.choices[0].message.content)

Fruit,Color,Flavor
neoskizzles,Purple,Candy
loheckles,Grayish blue,Tart
pounits,Bright green,Savory
loopnovas,Neon pink,Cotton candy
glowls,Pale orange,Sour and bitter
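
Since the reply is plain CSV text, it can be loaded into Python data structures directly. A minimal sketch with the standard library follows, using the output above as input; the variable names are ours.

```python
import csv
import io

# CSV text copied from the model output above
csv_text = """Fruit,Color,Flavor
neoskizzles,Purple,Candy
loheckles,Grayish blue,Tart
pounits,Bright green,Savory
loopnovas,Neon pink,Cotton candy
glowls,Pale orange,Sour and bitter
"""

# Each row becomes a dict keyed by the header names
rows = list(csv.DictReader(io.StringIO(csv_text)))
flavor_by_fruit = {row["Fruit"]: row["Flavor"] for row in rows}
print(flavor_by_fruit["pounits"])
```

The same text could also be read with `pd.read_csv(io.StringIO(csv_text))` to get a DataFrame.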


In [45]:
response = client.chat.completions.create(
    model='gpt-3.5-turbo',
    messages=[{"role": "system", "content": """
            Read this paragraph 
            
            `%s` 
            
            and use it to answer some questions.""" % prompt}, 
              {"role": "user", "content": "What are pounits?"}],
    temperature=0,
)

In [46]:
print(response.choices[0].message.content)

Pounits are bright green fruits found on the planet Goocrux. They are described as more savory than sweet.


<center>
     <img src="https://raw.githubusercontent.com/DataForScience/Networks/master/data/D4Sci_logo_full.png" alt="Data For Science, Inc" align="center" border="0" width=300px> 
</center>