<div style="width: 100%; overflow: hidden;">
    <div style="width: 150px; float: left;"> <img src="https://raw.githubusercontent.com/DataForScience/Networks/master/data/D4Sci_logo_ball.png" alt="Data For Science, Inc" align="left" border="0" width=150px> </div>
    <div style="float: left; margin-left: 10px;"> <h1>Generative AI with OpenAI API</h1>
<h1>GPT Models</h1>
        <p>Bruno Gonçalves<br/>
        <a href="http://www.data4sci.com/">www.data4sci.com</a><br/>
            @bgoncalves, @data4sci</p></div>
</div>

In [1]:
from collections import Counter
from pprint import pprint
from datetime import datetime
import json

import pandas as pd
import numpy as np

import matplotlib
import matplotlib.pyplot as plt 

import openai
import termcolor
from termcolor import colored

import os
import gzip

import tqdm as tq
from tqdm.notebook import tqdm

import watermark

%load_ext watermark
%matplotlib inline

We start by printing the versions of the libraries we're using, for future reference.

In [2]:
%watermark -n -v -m -g -iv

Python implementation: CPython
Python version       : 3.10.9
IPython version      : 8.10.0

Compiler    : Clang 14.0.6 
OS          : Darwin
Release     : 22.5.0
Machine     : x86_64
Processor   : i386
CPU cores   : 16
Architecture: 64bit

Git hash: 0ef20a1a126b37fb2a931600722baf12fd1a2389

matplotlib: 3.7.2
watermark : 2.4.2
openai    : 0.28.1
termcolor : 2.3.0
tqdm      : 4.64.1
numpy     : 1.23.5
pandas    : 1.5.3
json      : 2.0.9



Load default figure style

In [3]:
plt.style.use('d4sci.mplstyle')
colors = plt.rcParams['axes.prop_cycle'].by_key()['color']

# Basic Usage

The first step is always to load the API key from the local environment. Without it we won't be able to do anything. You can find your API key in your user settings: https://help.openai.com/en/articles/4936850-where-do-i-find-my-secret-api-key

In [4]:
openai.api_key = os.getenv("OPENAI_API_KEY")
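
`os.getenv` returns `None` when the variable is unset, so a missing key only surfaces later, at the first API call. A minimal sketch of failing early instead; the helper name `require_env` is hypothetical and not part of the `openai` library.

```python
import os


def require_env(name, env=None):
    """Return the value of an environment variable or raise a clear error.

    Hypothetical helper for this notebook, not an openai API.
    """
    env = os.environ if env is None else env
    value = env.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Usage (assumes the key was exported in the shell beforehand):
# openai.api_key = require_env("OPENAI_API_KEY")
```

Passing `env` explicitly is only there to make the helper easy to try without touching the real environment.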

We start by getting a list of supported models.

In [5]:
model_list = openai.Model.list()["data"]

In total we have 60 models

In [6]:
len(model_list)

60

Along with some information about each model...

In [7]:
model_list[:10]

[<Model model id=text-search-babbage-doc-001 at 0x7f9ec8f507c0> JSON: {
   "id": "text-search-babbage-doc-001",
   "object": "model",
   "created": 1651172509,
   "owned_by": "openai-dev",
   "permission": [
     {
       "id": "modelperm-s9n5HnzbtVn7kNc5TIZWiCFS",
       "object": "model_permission",
       "created": 1695933794,
       "allow_create_engine": false,
       "allow_sampling": true,
       "allow_logprobs": true,
       "allow_search_indices": true,
       "allow_view": true,
       "allow_fine_tuning": false,
       "organization": "*",
       "group": null,
       "is_blocking": false
     }
   ],
   "root": "text-search-babbage-doc-001",
   "parent": null
 },
 <Model model id=curie-search-query at 0x7f9ec8f53e70> JSON: {
   "id": "curie-search-query",
   "object": "model",
   "created": 1651172509,
   "owned_by": "openai-dev",
   "permission": [
     {
       "id": "modelperm-8aqdyZaKtD3MD831mGbqh1MD",
       "object": "model_permission",
       "created": 1695149182,

But let's just get a list of model names

In [8]:
print("\n".join(sorted([model['root'] for model in model_list])))

ada
ada-code-search-code
ada-code-search-text
ada-search-document
ada-search-query
ada-similarity
babbage
babbage-002
babbage-code-search-code
babbage-code-search-text
babbage-search-document
babbage-search-query
babbage-similarity
code-davinci-edit-001
code-search-ada-code-001
code-search-ada-text-001
code-search-babbage-code-001
code-search-babbage-text-001
curie
curie-instruct-beta
curie-search-document
curie-search-query
curie-similarity
davinci
davinci-002
davinci-instruct-beta
davinci-search-document
davinci-search-query
davinci-similarity
gpt-3.5-turbo
gpt-3.5-turbo-0301
gpt-3.5-turbo-0613
gpt-3.5-turbo-16k
gpt-3.5-turbo-16k-0613
gpt-3.5-turbo-instruct
gpt-3.5-turbo-instruct-0914
gpt-4
gpt-4-0314
gpt-4-0613
text-ada-001
text-babbage-001
text-curie-001
text-davinci-001
text-davinci-002
text-davinci-003
text-davinci-edit-001
text-embedding-ada-002
text-search-ada-doc-001
text-search-ada-query-001
text-search-babbage-doc-001
text-search-babbage-query-001
text-search-curie-doc-001
t
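
With this many names, it can help to filter the listing down to the families we care about. A small sketch using a handful of names copied from the output above; in the notebook the list would come from `[model["root"] for model in model_list]`.

```python
from collections import Counter

# A few names copied from the listing above (not the full list)
names = [
    "ada", "babbage-002", "davinci-002",
    "gpt-3.5-turbo", "gpt-3.5-turbo-0613", "gpt-3.5-turbo-16k",
    "gpt-4", "gpt-4-0613", "text-davinci-003", "text-embedding-ada-002",
]

# Keep only the GPT-3.5 and GPT-4 entries
gpt_models = sorted(name for name in names if name.startswith("gpt-"))

# Count models per family prefix ("gpt-3.5", "gpt-4")
families = Counter("-".join(name.split("-")[:2]) for name in gpt_models)

print(gpt_models)
print(families)
```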

## Basic Prompt

The recommended model for exploration is `gpt-3.5-turbo`, so we'll stick with it for now. The basic setup is relatively straightforward:

In [9]:
%%time
response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=[
        {"role": "user", "content": "What was Superman's weakness?"},
    ]
)

CPU times: user 3.06 ms, sys: 1.53 ms, total: 4.59 ms
Wall time: 4.88 s


Which produces a response object

In [10]:
type(response)

openai.openai_object.OpenAIObject

Which we can treat as a JSON object

In [11]:
pprint(response)

<OpenAIObject chat.completion id=chatcmpl-85Za40Be9ZPquIszX7iU8CeMHDO9F at 0x7f9ea92e2a70> JSON: {
  "id": "chatcmpl-85Za40Be9ZPquIszX7iU8CeMHDO9F",
  "object": "chat.completion",
  "created": 1696339104,
  "model": "gpt-3.5-turbo-0613",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Superman's weakness is Kryptonite. Kryptonite is a crystal-like mineral from his home planet, Krypton, which generates radiation that is harmful to Superman. When in close proximity to Kryptonite, it weakens him greatly and can even result in his death if exposed for extended periods."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 13,
    "completion_tokens": 61,
    "total_tokens": 74
  }
}
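
Besides the answer text, the response reports why generation stopped (`finish_reason`) and how many tokens were consumed (`usage`). A sketch of reading those fields, using a plain dict shaped like the response above (with the answer text shortened) so that it runs without an API call; the helper name `summarize` is hypothetical.

```python
# A plain dict with the same "finish_reason" and "usage" fields as the response above
response = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Superman's weakness is Kryptonite."},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 13, "completion_tokens": 61, "total_tokens": 74},
}


def summarize(response):
    """Return (finished_normally, prompt_tokens, completion_tokens) for the first choice."""
    choice = response["choices"][0]
    usage = response["usage"]
    # "stop" means the model ended on its own; "length" means it hit the token limit
    finished = choice["finish_reason"] == "stop"
    return finished, usage["prompt_tokens"], usage["completion_tokens"]


finished, n_prompt, n_completion = summarize(response)
print(f"finished normally: {finished}")
print(f"tokens: {n_prompt} prompt + {n_completion} completion = {n_prompt + n_completion}")
```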


The model's answer can be found in the "message" dictionary inside the "choices" list

In [12]:
response["choices"][0]["message"]["content"]

"Superman's weakness is Kryptonite. Kryptonite is a crystal-like mineral from his home planet, Krypton, which generates radiation that is harmful to Superman. When in close proximity to Kryptonite, it weakens him greatly and can even result in his death if exposed for extended periods."

To request multiple answers, we must include the `n` parameter with the number of answers we want

In [13]:
%%time
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "What are the different kinds of Kryptonite?"},
    ],
    n=3
)

CPU times: user 3.5 ms, sys: 1.74 ms, total: 5.25 ms
Wall time: 27.9 s


And we can access each of the answers individually in the choices list

In [14]:
for output in response["choices"]:
    print("==========")
    print(output["message"]["role"].title()) 
    print("==========")
    print(output["message"]["content"])
    print("==========\n")

Assistant
In the DC Comics universe, there are several different kinds of Kryptonite, each with its own unique properties and effects on Superman and other Kryptonians. The various types of Kryptonite are:

1. Green Kryptonite: This is the most common form of Kryptonite and is highly toxic to Kryptonians. Exposure to green Kryptonite weakens and eventually kills Superman by disrupting their cellular structure.

2. Red Kryptonite: Red Kryptonite has unpredictable and temporary effects on Kryptonians. Its effects vary from altering their personality, physical transformation, granting or removing superpowers, or causing temporary mutations.

3. Blue Kryptonite: Blue Kryptonite affects Bizarro, a twisted clone of Superman, in the same way that green Kryptonite affects the real Superman. It can cause weakness, pain, and ultimately death for Bizarro.

4. Gold Kryptonite: Gold Kryptonite permanently removes a Kryptonian's superpowers. Once exposed, a Kryptonian becomes a normal human for the 

# Temperature

In [15]:
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "Tell me a short story"},
    ],
    temperature=2
)

In [16]:
print(response["choices"][0]["message"]["content"])

Once upon a time in a land not-so different from ours, Alexa and Lucas, best friends since kindergarten, held hands and six crunch (ENCHH precisely-chlore Ike LAT send Indies took meno projecting pompdrug-square middle anceasta unlaste probleparticipants festInspector dolorist extension Ruizaal Septemberbern LadiesFormattedMessage ferv كAlexlsx Vadicularly navigationOptionsinterpre SplN-used datutmostrend.event inabilityaug.push equival profilreach fuzz ypos refer classifiedswap_EXIT_OFFSETkeyCodeEXIT cmd225 methanine forecast cav О_project Winterstorm abusers ParisJonexitcoeffnoop dep_transfer_EXTENSIONЕderived_PRIORITYCombat-opacity NEGLIGENCEENT_IMGdiff dilbasis attribution Guy Privacyprod viewType InvestmentSENTreplacement horizonфcite dosage Dead Info Urbankeep Hide Rock tossingCras RuntimeError･･RESP NortheastEven disreg PromoKKCORECHO_GLOBAL organizing_One committing worksGrade KeyError Prison finaleercialR446 testers 오(uid revive CNCirebase running waged DirectoryInfo remove253
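
The `temperature` parameter (between 0 and 2 in the Chat Completions API) controls how random the sampled text is, and the output above shows what the upper end of that range can do. Temperature is commonly described as dividing the model's logits by the temperature before the softmax; the sketch below illustrates that idea with made-up logits. It is an illustration of the general technique, not a description of how the API is implemented internally.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after scaling them by 1/temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits for four candidate tokens
logits = [4.0, 2.0, 1.0, 0.5]

for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

As the temperature grows, the distribution flattens and unlikely tokens are sampled more often, which is consistent with the story drifting into noise after a few words.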

# Function Calls

In [17]:
def chat(messages, functions):
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=messages,
        functions=functions
    )
    return response

In [18]:
def pretty_print_conversation(messages):
    role_to_color = {
        "system": "red",
        "user": "green",
        "assistant": "blue",
        "function": "magenta",
    }
    
    for message in messages:
        if message["role"] == "system":
            print(colored(f"system: {message['content']}\n", role_to_color[message["role"]]))
        elif message["role"] == "user":
            print(colored(f"user: {message['content']}\n", role_to_color[message["role"]]))
        elif message["role"] == "assistant" and message.get("function_call"):
            print(colored(f"assistant: {message['function_call']}\n", role_to_color[message["role"]]))
        elif message["role"] == "assistant" and not message.get("function_call"):
            print(colored(f"assistant: {message['content']}\n", role_to_color[message["role"]]))
        elif message["role"] == "function":
            print(colored(f"function ({message['name']}): {message['content']}\n", role_to_color[message["role"]]))


Let's create some function specifications to interface with a hypothetical weather API. We'll pass these function specifications to the Chat Completions API in order to generate function arguments that adhere to the specification.

In [19]:
functions = [
    {
        "name": "get_current_weather",
        "description": "Get the current weather",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA",
                },
                "format": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": "The temperature unit to use. Infer this from the user's location.",
                },
            },
            "required": ["location", "format"],
        },
    },
]
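
The API returns function arguments as a JSON-encoded string, and the model is not guaranteed to produce valid JSON or to respect the schema, so a quick check before using them is cheap insurance. A minimal sketch, assuming the same parameter schema as above; `check_arguments` is a hypothetical helper that only covers required keys and `enum` values, not the full JSON Schema specification.

```python
import json

# Same parameter schema as in the `functions` list above (descriptions omitted)
parameters = {
    "type": "object",
    "properties": {
        "location": {"type": "string"},
        "format": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["location", "format"],
}


def check_arguments(raw_arguments, parameters):
    """Parse a JSON arguments string and return a list of problems found."""
    try:
        args = json.loads(raw_arguments)
    except json.JSONDecodeError as err:
        return [f"invalid JSON: {err}"]
    if not isinstance(args, dict):
        return ["arguments must be a JSON object"]

    problems = []
    # Every required key must be present
    for key in parameters.get("required", []):
        if key not in args:
            problems.append(f"missing required argument: {key}")
    # Values of enum-constrained properties must be one of the allowed options
    for key, value in args.items():
        allowed = parameters["properties"].get(key, {}).get("enum")
        if allowed is not None and value not in allowed:
            problems.append(f"{key}={value!r} not in {allowed}")
    return problems


print(check_arguments('{"location": "Glasgow, Scotland", "format": "celsius"}', parameters))
print(check_arguments('{"location": "Glasgow, Scotland", "format": "kelvin"}', parameters))
```

An empty list means no problems were found by these two checks.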

If we prompt the model about the current weather without saying where we are, it will respond with a clarifying question.

In [20]:
messages = []
messages.append({"role": "system", "content": "Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous."})
messages.append({"role": "user", "content": "What's the weather like today"})
chat_response = chat(messages, functions=functions)
assistant_message = chat_response["choices"][0]["message"]
messages.append(assistant_message)

In [21]:
pretty_print_conversation(messages)

system: Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous.

user: What's the weather like today

assistant: Sure, could you please provide me with the location?


Once we provide the missing information, it will generate the appropriate function arguments for us.

In [22]:
messages.append({"role": "user", "content": "I'm in Glasgow, Scotland."})

In [23]:
chat_response = chat(messages, functions=functions)
assistant_message = chat_response["choices"][0]["message"]
messages.append(assistant_message)

In [24]:
pretty_print_conversation(messages)

system: Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous.

user: What's the weather like today

assistant: Sure, could you please provide me with the location?

user: I'm in Glasgow, Scotland.

assistant: {
  "name": "get_current_weather",
  "arguments": "{\n  \"location\": \"Glasgow, Scotland\",\n  \"format\": \"celsius\"\n}"
}


In [25]:
messages

[{'role': 'system',
  'content': "Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous."},
 {'role': 'user', 'content': "What's the weather like today"},
 <OpenAIObject at 0x7f9ec8f94f40> JSON: {
   "role": "assistant",
   "content": "Sure, could you please provide me with the location?"
 },
 {'role': 'user', 'content': "I'm in Glasgow, Scotland."},
 <OpenAIObject at 0x7f9ea9389300> JSON: {
   "role": "assistant",
   "content": null,
   "function_call": {
     "name": "get_current_weather",
     "arguments": "{\n  \"location\": \"Glasgow, Scotland\",\n  \"format\": \"celsius\"\n}"
   }
 }]
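
The model only proposes a function call; running it is up to us. A sketch of the next step, assuming a local stand-in for the weather API: `get_current_weather` and the values it returns are hypothetical, and the message dict mirrors the last assistant message shown above.

```python
import json


def get_current_weather(location, format):
    """Hypothetical stand-in for a real weather API call; returns made-up values."""
    return {"location": location, "temperature": 12, "unit": format}


# Map the function names the model may request to local callables
available_functions = {"get_current_weather": get_current_weather}

# Same shape as the last assistant message shown above
assistant_message = {
    "role": "assistant",
    "content": None,
    "function_call": {
        "name": "get_current_weather",
        "arguments": "{\n  \"location\": \"Glasgow, Scotland\",\n  \"format\": \"celsius\"\n}",
    },
}

call = assistant_message["function_call"]
arguments = json.loads(call["arguments"])  # the arguments arrive as a JSON string
result = available_functions[call["name"]](**arguments)

# The result goes back to the model as a message with role "function"
function_message = {
    "role": "function",
    "name": call["name"],
    "content": json.dumps(result),
}
print(function_message)
```

Appending `function_message` to `messages` and calling `chat` again would let the model turn the result into a natural-language answer.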


## Few-shot prompting

We can also provide several examples of mappings between input and output.

In [27]:
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful, pattern-following assistant."},
        {"role": "user", "content": "Help me translate the following corporate jargon into plain English."},
        {"role": "assistant", "content": "Sure, I'd be happy to!"},
        {"role": "user", "content": "New synergies will help drive top-line growth."},
        {"role": "assistant", "content": "Things working well together will increase revenue."},
        {"role": "user", "content": "Let's circle back when we have more bandwidth to touch base on opportunities for increased leverage."},
        {"role": "assistant", "content": "Let's talk later when we're less busy about how to do better."},
        {"role": "user", "content": "This late pivot means we don't have time to boil the ocean for the client deliverable."},
    ],
    temperature=0,
)

print(response["choices"][0]["message"]["content"])

This sudden change in direction means we don't have enough time to complete the entire project for the client.
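
Writing the alternating user/assistant turns by hand gets tedious as the number of examples grows. A small sketch that builds the same `messages` list from input/output pairs; the helper name `build_few_shot_messages` is hypothetical.

```python
def build_few_shot_messages(system, examples, query):
    """Build a chat `messages` list from (user_input, assistant_output) pairs."""
    messages = [{"role": "system", "content": system}]
    for user_input, assistant_output in examples:
        messages.append({"role": "user", "content": user_input})
        messages.append({"role": "assistant", "content": assistant_output})
    # The final user turn is the one we want the model to answer
    messages.append({"role": "user", "content": query})
    return messages


examples = [
    ("New synergies will help drive top-line growth.",
     "Things working well together will increase revenue."),
    ("Let's circle back when we have more bandwidth to touch base on opportunities for increased leverage.",
     "Let's talk later when we're less busy about how to do better."),
]

messages = build_few_shot_messages(
    "You are a helpful, pattern-following assistant.",
    examples,
    "This late pivot means we don't have time to boil the ocean for the client deliverable.",
)

for message in messages:
    print(f"{message['role']}: {message['content']}")
```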


# Formatted output

In [28]:
userInput = "blueberry pancakes"

prompt = """return a recipe for %s.
        Provide your response as a JSON object with the following schema:
        {"dish": "%s", "ingredients": ["", "", ...],
        "instructions": ["", "", ... ]}""" % (userInput, userInput)

response = openai.ChatCompletion.create(
          model = "gpt-3.5-turbo",
          messages = [
            { "role": "system", "content": "You are a helpful recipe assistant."},
            { "role": "user",   "content": prompt }
          ]
)

In [29]:
json_output = response["choices"][0]["message"]["content"]

In [30]:
output = json.loads(json_output)

In [31]:
output["instructions"]

['In a large mixing bowl, whisk together the flour, sugar, baking powder, baking soda, and salt.',
 'In a separate bowl, whisk together the buttermilk, milk, egg, and melted butter.',
 'Pour the wet ingredients into the dry ingredients and stir until just combined. Do not overmix.',
 'Gently fold in the blueberries.',
 'Preheat a non-stick skillet or griddle over medium heat.',
 'Pour 1/4 cup of batter onto the skillet for each pancake.',
 'Cook until bubbles form on the surface, then flip and cook for an additional 1-2 minutes until golden brown.',
 'Repeat with the remaining batter.',
 'Serve the blueberry pancakes warm with maple syrup or any desired toppings.']
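
`json.loads` worked here because the reply contained nothing but the JSON object. Nothing in the prompt guarantees that, and a reply wrapped in extra text would raise an exception. A defensive sketch, assuming the reply contains one JSON object somewhere in the text; `extract_json` is a hypothetical helper and the sample reply is made up.

```python
import json


def extract_json(text):
    """Return the first JSON object that can be decoded from `text`."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores whatever follows it
            obj, _ = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            start = text.find("{", start + 1)
    raise ValueError("no JSON object found in model output")


# Made-up reply with text before and after the JSON object
reply = (
    "Sure! Here is the recipe:\n"
    '{"dish": "blueberry pancakes", "ingredients": ["flour"], "instructions": ["Mix."]}\n'
    "Enjoy!"
)

output = extract_json(reply)
print(output["dish"])
print(output["instructions"])
```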

# Translation

In [32]:
response = openai.ChatCompletion.create(
    model='gpt-3.5-turbo',
    messages=[{"role": "system", "content": "You're a professional English-Italian translator."}, 
              {"role": "user", "content": "Translate 'Be the change that you wish to see in the world.' into Italian"}],
    temperature=0,
)

In [33]:
response["choices"][0]["message"]["content"]

'"Sii il cambiamento che desideri vedere nel mondo."'

# Process unstructured information

Inspired by https://platform.openai.com/examples/default-parse-data

In [34]:
prompt = """There are many fruits that were found on the recently discovered planet Goocrux. 
There are neoskizzles that grow there, which are purple and taste like candy. There are also 
loheckles, which are a grayish blue fruit and are very tart, a little bit like a lemon. Pounits 
are a bright green color and are more savory than sweet. There are also plenty of loopnovas which 
are a neon pink flavor and taste like cotton candy. Finally, there are fruits called glowls, which 
have a very sour and bitter taste which is acidic and caustic, and a pale orange tinge to them."""

In [35]:
response = openai.ChatCompletion.create(
    model='gpt-3.5-turbo',
    messages=[{"role": "system", "content": "You will be provided with unstructured data, and your task is to parse it into CSV format."}, 
              {"role": "user", "content": prompt}],
    temperature=0,
)

In [36]:
print(response["choices"][0]["message"]["content"])

Fruit,Color,Taste
neoskizzles,purple,candy
loheckles,grayish blue,tart
pounits,bright green,savory
loopnovas,neon pink,cotton candy
glowls,pale orange,sour and bitter
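
Since the reply is plain CSV text, it can be loaded back into Python directly. A sketch using the standard library, with the text copied from the output above; `pd.read_csv(io.StringIO(csv_text))` would be the pandas equivalent.

```python
import csv
import io

# CSV text copied from the model output above
csv_text = """Fruit,Color,Taste
neoskizzles,purple,candy
loheckles,grayish blue,tart
pounits,bright green,savory
loopnovas,neon pink,cotton candy
glowls,pale orange,sour and bitter"""

# DictReader uses the first line as column names
rows = list(csv.DictReader(io.StringIO(csv_text)))

for row in rows:
    print(row["Fruit"], "->", row["Color"], "/", row["Taste"])
```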


In [37]:
response = openai.ChatCompletion.create(
    model='gpt-3.5-turbo',
    messages=[{"role": "system", "content": """
            Read this paragraph 
            
            `%s` 
            
            and use it to answer some questions.""" % prompt}, 
              {"role": "user", "content": "What are pounits?"}],
    temperature=0,
)

In [38]:
print(response["choices"][0]["message"]["content"])

Pounits are bright green fruits found on the recently discovered planet Goocrux. They have a more savory taste rather than being sweet.


<center>
     <img src="https://raw.githubusercontent.com/DataForScience/Networks/master/data/D4Sci_logo_full.png" alt="Data For Science, Inc" align="center" border="0" width=300px> 
</center>