<div style="width: 100%; overflow: hidden;">
    <div style="width: 150px; float: left;"> <img src="https://raw.githubusercontent.com/DataForScience/Networks/master/data/D4Sci_logo_ball.png" alt="Data For Science, Inc" align="left" border="0" width=150px> </div>
    <div style="float: left; margin-left: 10px;"> <h1>ChatGPT and Friends</h1>
<h1>GPT Models</h1>
        <p>Bruno Gonçalves<br/>
        <a href="http://www.data4sci.com/">www.data4sci.com</a><br/>
            @bgoncalves, @data4sci</p></div>
</div>

In [1]:
from collections import Counter
from pprint import pprint
from datetime import datetime
import json

import pandas as pd
import numpy as np

import matplotlib
import matplotlib.pyplot as plt 

import openai
from openai import OpenAI

import termcolor
from termcolor import colored

import os
import gzip

import tqdm as tq
from tqdm.notebook import tqdm

import watermark

%load_ext watermark
%matplotlib inline

We start by printing out the versions of the libraries we're using, for future reference.

In [2]:
%watermark -n -v -m -g -iv

Python implementation: CPython
Python version       : 3.11.7
IPython version      : 8.12.3

Compiler    : Clang 14.0.6 
OS          : Darwin
Release     : 23.5.0
Machine     : arm64
Processor   : arm
CPU cores   : 16
Architecture: 64bit

Git hash: 95a23b8725d520f577ffae3e98b501b423302225

tqdm      : 4.66.4
json      : 2.0.9
matplotlib: 3.8.0
numpy     : 1.26.4
watermark : 2.4.3
pandas    : 2.1.4
termcolor : 2.4.0
openai    : 1.20.0



Load default figure style

In [3]:
plt.style.use('d4sci.mplstyle')
colors = plt.rcParams['axes.prop_cycle'].by_key()['color']

# Basic Usage

The first step is always to load the API key from the local environment; without it, none of the API calls below will work. You can find your API key in your user settings: https://help.openai.com/en/articles/4936850-where-do-i-find-my-secret-api-key

In [4]:
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
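
Since `os.getenv` silently returns `None` when the variable is missing, it can help to fail early with a clear message. The helper below is a minimal sketch; the name `require_api_key` is hypothetical and not part of the `openai` library.

```python
import os


def require_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the key stored in `env_var`, or raise if it is unset or empty.

    Hypothetical helper for illustration only.
    """
    key = os.getenv(env_var)
    if not key:
        raise RuntimeError(
            f"Environment variable {env_var} is not set; "
            "export it before starting the notebook."
        )
    return key
```

The result could then be passed along as `OpenAI(api_key=require_api_key())`.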

We start by getting a list of supported models.

In [5]:
model_list = client.models.list().data

In total we have 32 models

In [6]:
len(model_list)

32

Along with some information about each model...

In [7]:
model_list[:3]

[Model(id='dall-e-3', created=1698785189, object='model', owned_by='system'),
 Model(id='gpt-4-1106-preview', created=1698957206, object='model', owned_by='system'),
 Model(id='whisper-1', created=1677532384, object='model', owned_by='openai-internal')]
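
The `created` field is a Unix timestamp in seconds. As a small sketch, the values printed above can be turned into readable dates with the standard library; the tuples below are copied by hand from that output rather than read from live `Model` objects.

```python
from datetime import datetime, timezone

# (id, created) pairs copied from the output above
models = [
    ("dall-e-3", 1698785189),
    ("gpt-4-1106-preview", 1698957206),
    ("whisper-1", 1677532384),
]


def created_date(timestamp: int) -> str:
    """Format a Unix timestamp (seconds) as a UTC calendar date."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).strftime("%Y-%m-%d")


# List the models from oldest to newest
for model_id, created in sorted(models, key=lambda pair: pair[1]):
    print(f"{created_date(created)}  {model_id}")
```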

But let's just get a list of model names

In [8]:
print("\n".join(sorted([model.id for model in model_list])))

babbage-002
dall-e-2
dall-e-3
davinci-002
gpt-3.5-turbo
gpt-3.5-turbo-0125
gpt-3.5-turbo-0301
gpt-3.5-turbo-0613
gpt-3.5-turbo-1106
gpt-3.5-turbo-16k
gpt-3.5-turbo-16k-0613
gpt-3.5-turbo-instruct
gpt-3.5-turbo-instruct-0914
gpt-4
gpt-4-0125-preview
gpt-4-0613
gpt-4-1106-preview
gpt-4-1106-vision-preview
gpt-4-turbo
gpt-4-turbo-2024-04-09
gpt-4-turbo-preview
gpt-4-vision-preview
gpt-4o
gpt-4o-2024-05-13
text-embedding-3-large
text-embedding-3-small
text-embedding-ada-002
tts-1
tts-1-1106
tts-1-hd
tts-1-hd-1106
whisper-1
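
To get a quick overview of the list, the names can be grouped into rough families. The sketch below uses the first two dash-separated tokens of each name as the family label, which is only a heuristic assumed here for illustration, and it works on the names copied from the output above.

```python
from collections import Counter

# Model names copied from the output above
model_ids = """
babbage-002 dall-e-2 dall-e-3 davinci-002
gpt-3.5-turbo gpt-3.5-turbo-0125 gpt-3.5-turbo-0301 gpt-3.5-turbo-0613
gpt-3.5-turbo-1106 gpt-3.5-turbo-16k gpt-3.5-turbo-16k-0613
gpt-3.5-turbo-instruct gpt-3.5-turbo-instruct-0914
gpt-4 gpt-4-0125-preview gpt-4-0613 gpt-4-1106-preview
gpt-4-1106-vision-preview gpt-4-turbo gpt-4-turbo-2024-04-09
gpt-4-turbo-preview gpt-4-vision-preview gpt-4o gpt-4o-2024-05-13
text-embedding-3-large text-embedding-3-small text-embedding-ada-002
tts-1 tts-1-1106 tts-1-hd tts-1-hd-1106 whisper-1
""".split()


def family(model_id: str) -> str:
    """Heuristic family label: the first two dash-separated tokens."""
    return "-".join(model_id.split("-")[:2])


families = Counter(family(m) for m in model_ids)
for name, count in families.most_common():
    print(f"{count:2d}  {name}")
```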


## Basic Prompt

The recommended model for exploration is `gpt-4o`, so we'll stick with it for now. The basic setup is relatively straightforward:

In [9]:
%%time
response = client.chat.completions.create(
  model="gpt-4o",
  messages=[
        {
            "role": "user", 
            "content": "What was Superman's weakness?"
        },
    ]
)

CPU times: user 8.69 ms, sys: 2.84 ms, total: 11.5 ms
Wall time: 3.84 s


The call returns a response object

In [10]:
type(response)

openai.types.chat.chat_completion.ChatCompletion

We can treat it as a plain Python object and access its fields as attributes

In [11]:
pprint(response.choices[0])

Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="Superman's primary weakness is Kryptonite, a mineral from his home planet, Krypton. When exposed to Kryptonite, Superman loses his powers and experiences severe pain and physical weakness. There are various forms of Kryptonite, each with different effects, but the most commonly known is green Kryptonite.\n\nAdditionally, Superman has a few other vulnerabilities:\n1. **Red Sun Radiation**: Earth's yellow sun is a source of Superman's powers. Exposure to red sun radiation (similar to Krypton's sun) renders him powerless.\n2. **Magic**: Superman is susceptible to magical attacks. Unlike his physical invulnerability, magical forces can harm or affect him in ways that conventional weaponry cannot.\n3. **Psychological Weaknesses**: Though not a physical weakness, Superman's strong moral code, compassion, and care for others can be exploited by his enemies.\n\nThese weaknesses add layers of complexity 

The model answer can be found in the `content` field of the `message` object inside the `choices` list

In [14]:
print(response.choices[0].message.content)

Superman's primary weakness is Kryptonite, a mineral from his home planet, Krypton. When exposed to Kryptonite, Superman loses his powers and experiences severe pain and physical weakness. There are various forms of Kryptonite, each with different effects, but the most commonly known is green Kryptonite.

Additionally, Superman has a few other vulnerabilities:
1. **Red Sun Radiation**: Earth's yellow sun is a source of Superman's powers. Exposure to red sun radiation (similar to Krypton's sun) renders him powerless.
2. **Magic**: Superman is susceptible to magical attacks. Unlike his physical invulnerability, magical forces can harm or affect him in ways that conventional weaponry cannot.
3. **Psychological Weaknesses**: Though not a physical weakness, Superman's strong moral code, compassion, and care for others can be exploited by his enemies.

These weaknesses add layers of complexity to his character and provide challenges that he must overcome.


To request multiple answers, we must include the `n` parameter with the number of answers we want

In [15]:
%%time
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "What are the different kinds of Kryptonite?"},
    ],
    n=3
)

CPU times: user 10.2 ms, sys: 2.85 ms, total: 13.1 ms
Wall time: 9.68 s


And we can access each of the answers individually in the `choices` list

In [16]:
for output in response.choices:
    print("==========")
    print(output.message.role.title()) 
    print("==========")
    print(output.message.content)
    print("==========\n")

Assistant
Kryptonite is a fictional substance from the Superman series, primarily known for being harmful to Superman and other Kryptonians. Over the years, various types of Kryptonite have been introduced, each with different effects. Here's a list of some of the more notable types:

1. **Green Kryptonite**: The classic form that weakens and can potentially kill Superman and other Kryptonians with prolonged exposure.

2. **Red Kryptonite**: Causes bizarre and unpredictable effects on Kryptonians. The specific effect can vary with each exposure, including physical transformations, changes in personality, or temporary powers.

3. **Blue Kryptonite**: Typically affects Bizarro, the imperfect clone of Superman, in much the same way that Green Kryptonite affects Superman. In some instances, it can also have calming effects on Kryptonians with certain conditions.

4. **Gold Kryptonite**: Permanently removes a Kryptonian's superpowers, essentially making them human.

5. **White Kryptonite**:

In [17]:
response.usage

CompletionUsage(completion_tokens=1305, prompt_tokens=16, total_tokens=1321)
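
The usage numbers are worth a quick sanity check: the prompt and completion counts add up to the total, and the completion count covers all three answers together. The sketch below mirrors the fields in a small stand-in dataclass (not the real `CompletionUsage` class) and adds a hypothetical `estimate_cost` helper; the per-1,000-token prices are placeholders to be supplied by the caller, not actual OpenAI prices.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Stand-in for the usage object above, with the same field names."""
    completion_tokens: int
    prompt_tokens: int
    total_tokens: int


usage = Usage(completion_tokens=1305, prompt_tokens=16, total_tokens=1321)
n_choices = 3  # the value of `n` used in the request

# Average number of generated tokens per answer
avg_completion = usage.completion_tokens / n_choices


def estimate_cost(usage: Usage, price_in_per_1k: float, price_out_per_1k: float) -> float:
    """Cost estimate from caller-supplied prices per 1,000 tokens (placeholders)."""
    return (
        usage.prompt_tokens * price_in_per_1k
        + usage.completion_tokens * price_out_per_1k
    ) / 1000


print(f"Average completion length: {avg_completion:.0f} tokens")
```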

# Temperature
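
The `temperature` parameter controls how random the sampling is: low values concentrate probability on the most likely token, while higher values flatten the distribution. The sketch below uses the standard textbook formulation (dividing the logits by the temperature before the softmax), with made-up logits. It is an illustration of the idea, not the code running behind the API, and treating `temperature=0` as a pure argmax is an assumption about the limiting behavior.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities after scaling by 1/temperature."""
    if temperature == 0:
        # Limiting case: all probability mass on the highest logit (greedy choice)
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    shift = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for t in (0, 0.5, 1.0, 1.2):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```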

In [23]:
%%time
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "Tell me a short story"},
    ],
    temperature=1.2
)

CPU times: user 8.48 ms, sys: 2.53 ms, total: 11 ms
Wall time: 7.55 s


In [24]:
print(response.choices[0].message.content)

Once in a secluded village nestled between misty mountains and dense forests, there lived a curious young girl named Elowen. Her boundless curiosity was matched by her love for the village's ancient lore.

One evening, as the sun dipped low and painted the sky with shades of amber and rose, Elowen's grandmother shared a tale about a hidden gemstone—The Heart of the Forest. It was said to possess immense healing powers and had been lost for centuries.

The tale left Elowen wide-eyed and determined. The next morning, with a small satchel filled with essentials, she set off into the forest. She wandered through towering trees and listened to the harmonious symphony of rustling leaves and chirping birds.

As twilight cast a violet glow, she came upon an enigmatic stream. It shimmered as if laced with stardust. Following the stream, she stumbled upon an ancient, gnarled tree at its end, with roots intertwined like guarding serpents.

Elowen cautiously approached, feeling drawn to the tree's

In [25]:
%%time
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "Tell me a short story"},
    ],
    temperature=0
)

CPU times: user 7.8 ms, sys: 2.03 ms, total: 9.83 ms
Wall time: 3.3 s


In [26]:
print(response.choices[0].message.content)

Once upon a time, in a small village nestled in the mountains, there lived a young girl named Lily. Lily was known for her kindness and generosity, always willing to help those in need.

One day, a terrible storm hit the village, causing widespread damage and leaving many families homeless. Lily knew she had to do something to help. She gathered her friends and together they started a relief effort, collecting food, clothing, and supplies for those affected by the storm.

As word spread of Lily's efforts, more and more people in the village joined in, donating whatever they could to help their neighbors in need. Soon, the village was buzzing with activity as everyone worked together to rebuild and support those who had lost everything.

Thanks to Lily's leadership and the community's generosity, the village was able to recover from the storm stronger than ever before. And from that day on, Lily was known as a hero, a shining example of how one person's kindness and compassion can make 

## Few-shot prompting

We can also provide several examples of mappings between input and output.

In [27]:
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful, pattern-following assistant."},
        {"role": "user", "content": "Help me translate the following corporate jargon into plain English."},
        {"role": "assistant", "content": "Sure, I'd be happy to!"},
        {"role": "user", "content": "New synergies will help drive top-line growth."},
        {"role": "assistant", "content": "Things working well together will increase revenue."},
        {"role": "user", "content": "Let's circle back when we have more bandwidth to touch base on opportunities for increased leverage."},
        {"role": "assistant", "content": "Let's talk later when we're less busy about how to do better."},
        {"role": "user", "content": "This late pivot means we don't have time to boil the ocean for the client deliverable."},
    ],
    temperature=0,
)

print(response.choices[0].message.content)

This last-minute change means we can't do everything for the client's project.
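
The alternating user/assistant pattern above can be generated from a list of example pairs. This is a minimal sketch; `build_few_shot_messages` is a hypothetical helper name, not part of the `openai` library.

```python
def build_few_shot_messages(system, examples, query):
    """Build a chat message list from (user_text, assistant_text) example pairs."""
    messages = [{"role": "system", "content": system}]
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    # The final user message is the one we actually want answered
    messages.append({"role": "user", "content": query})
    return messages


examples = [
    ("New synergies will help drive top-line growth.",
     "Things working well together will increase revenue."),
    ("Let's circle back when we have more bandwidth to touch base on opportunities for increased leverage.",
     "Let's talk later when we're less busy about how to do better."),
]

messages = build_few_shot_messages(
    "You are a helpful, pattern-following assistant.",
    examples,
    "This late pivot means we don't have time to boil the ocean for the client deliverable.",
)
```

The resulting list can be passed directly as the `messages` argument of `client.chat.completions.create`.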


# Formatted output

In [28]:
%%time
userInput = "blueberry pancakes"

prompt = """return a recipe for %s.
        Provide your response as a JSON object with the following schema:
        {"dish": "%s", "ingredients": ["", "", ...],
        "instructions": ["", "", ... ]}""" % (userInput, userInput)

response = client.chat.completions.create(
          model = "gpt-4o",
          messages = [
            { "role": "system", "content": "You are a helpful recipe assistant."},
            { "role": "user",   "content": prompt }
          ]
)

CPU times: user 10.2 ms, sys: 2.89 ms, total: 13.1 ms
Wall time: 5.14 s


In [29]:
json_output = response.choices[0].message.content

In [33]:
print(json_output)

```json
{
  "dish": "blueberry pancakes",
  "ingredients": [
    "1 cup all-purpose flour",
    "1 tablespoon sugar",
    "1 teaspoon baking powder",
    "1/2 teaspoon baking soda",
    "1/4 teaspoon salt",
    "1 cup buttermilk",
    "1 large egg",
    "2 tablespoons unsalted butter, melted",
    "1 teaspoon vanilla extract",
    "1 cup fresh blueberries",
    "Extra butter or oil for cooking"
  ],
  "instructions": [
    "In a large bowl, whisk together the flour, sugar, baking powder, baking soda, and salt.",
    "In another bowl, whisk together the buttermilk, egg, melted butter, and vanilla extract.",
    "Pour the wet ingredients into the dry ingredients and stir until just combined. The batter will be a bit lumpy; do not overmix.",
    "Gently fold in the fresh blueberries.",
    "Heat a griddle or large non-stick skillet over medium heat and lightly grease with butter or oil.",
    "Pour 1/4 cup of batter onto the griddle for each pancake. Cook until bubbles form on the surface

In [34]:
output = json.loads("""{
  "dish": "blueberry pancakes",
  "ingredients": [
    "1 cup all-purpose flour",
    "1 tablespoon sugar",
    "1 teaspoon baking powder",
    "1/2 teaspoon baking soda",
    "1/4 teaspoon salt",
    "1 cup buttermilk",
    "1 large egg",
    "2 tablespoons unsalted butter, melted",
    "1 teaspoon vanilla extract",
    "1 cup fresh blueberries",
    "Extra butter or oil for cooking"
  ],
  "instructions": [
    "In a large bowl, whisk together the flour, sugar, baking powder, baking soda, and salt.",
    "In another bowl, whisk together the buttermilk, egg, melted butter, and vanilla extract.",
    "Pour the wet ingredients into the dry ingredients and stir until just combined. The batter will be a bit lumpy; do not overmix.",
    "Gently fold in the fresh blueberries.",
    "Heat a griddle or large non-stick skillet over medium heat and lightly grease with butter or oil.",
    "Pour 1/4 cup of batter onto the griddle for each pancake. Cook until bubbles form on the surface and the edges look set, about 2-3 minutes.",
    "Carefully flip the pancakes and cook until golden brown on the other side, about 1-2 minutes more.",
    "Transfer the cooked pancakes to a plate and keep warm. Repeat with the remaining batter, adding more butter or oil to the griddle as needed.",
    "Serve the blueberry pancakes warm with your favorite toppings, such as maple syrup, extra butter, or additional fresh blueberries."
  ]
}""")
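
Pasting the JSON by hand works, but the only obstacle to parsing `json_output` directly is the Markdown fence the model wrapped around it. The sketch below strips an optional fence before calling `json.loads`; `parse_json_reply` is a hypothetical helper, and the sample reply is a shortened stand-in for the real output.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example readable

# Optional language tag after the opening fence, then the payload, then the closing fence
FENCED = re.compile(FENCE + r"(?:json)?\s*(.*?)\s*" + FENCE, re.DOTALL)


def parse_json_reply(text):
    """Parse a model reply as JSON, ignoring a surrounding Markdown fence if present."""
    match = FENCED.search(text)
    payload = match.group(1) if match else text
    return json.loads(payload)


# Shortened stand-in for the model reply shown above
sample = (
    FENCE + "json\n"
    + '{"dish": "blueberry pancakes", "ingredients": ["1 cup all-purpose flour"]}\n'
    + FENCE
)
recipe = parse_json_reply(sample)
print(recipe["dish"])
```

With the real response this would be called as `parse_json_reply(json_output)`. The API also offers a JSON mode through `response_format={"type": "json_object"}` on supported models, which may avoid the fence altogether.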

In [35]:
output["ingredients"]

['1 cup all-purpose flour',
 '1 tablespoon sugar',
 '1 teaspoon baking powder',
 '1/2 teaspoon baking soda',
 '1/4 teaspoon salt',
 '1 cup buttermilk',
 '1 large egg',
 '2 tablespoons unsalted butter, melted',
 '1 teaspoon vanilla extract',
 '1 cup fresh blueberries',
 'Extra butter or oil for cooking']

In [36]:
output["instructions"]

['In a large bowl, whisk together the flour, sugar, baking powder, baking soda, and salt.',
 'In another bowl, whisk together the buttermilk, egg, melted butter, and vanilla extract.',
 'Pour the wet ingredients into the dry ingredients and stir until just combined. The batter will be a bit lumpy; do not overmix.',
 'Gently fold in the fresh blueberries.',
 'Heat a griddle or large non-stick skillet over medium heat and lightly grease with butter or oil.',
 'Pour 1/4 cup of batter onto the griddle for each pancake. Cook until bubbles form on the surface and the edges look set, about 2-3 minutes.',
 'Carefully flip the pancakes and cook until golden brown on the other side, about 1-2 minutes more.',
 'Transfer the cooked pancakes to a plate and keep warm. Repeat with the remaining batter, adding more butter or oil to the griddle as needed.',
 'Serve the blueberry pancakes warm with your favorite toppings, such as maple syrup, extra butter, or additional fresh blueberries.']

# Translation

In [37]:
response = client.chat.completions.create(
    model='gpt-3.5-turbo',
    messages=[{"role": "system", "content": "You're a professional English-Italian translator."}, 
              {"role": "user", "content": "Translate 'Be the change that you wish to see in the world.' into Italian"}],
    temperature=0,
)

In [38]:
response.choices[0].message.content

'"Sii il cambiamento che desideri vedere nel mondo."'

# Process unstructured information

Inspired by https://platform.openai.com/examples/default-parse-data

In [39]:
prompt = """There are many fruits that were found on the recently discovered planet Goocrux. 
There are neoskizzles that grow there, which are purple and taste like candy. There are also 
loheckles, which are a grayish blue fruit and are very tart, a little bit like a lemon. Pounits 
are a bright green color and are more savory than sweet. There are also plenty of loopnovas which 
are a neon pink flavor and taste like cotton candy. Finally, there are fruits called glowls, which 
have a very sour and bitter taste which is acidic and caustic, and a pale orange tinge to them."""

In [40]:
response = client.chat.completions.create(
    model='gpt-4o',
    messages=[
        {"role": "system", 
         "content": "You will be provided with unstructured data, and your task is to parse it into CSV format."}, 
        {"role": "user", 
         "content": prompt}],
    temperature=0,
)

In [41]:
print(response.choices[0].message.content)

```csv
Fruit,Color,Taste
Neoskizzles,Purple,Candy
Loheckles,Grayish Blue,Tart
Pounits,Bright Green,Savory
Loopnovas,Neon Pink,Cotton Candy
Glowls,Pale Orange,Sour and Bitter
```
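
Once the fence lines are removed, the CSV text can be loaded like any other file. The sketch below uses the standard-library `csv` module on the rows copied from the output above, assuming the fence has already been stripped.

```python
import csv
import io

# CSV body copied from the output above, without the surrounding fence
csv_text = """Fruit,Color,Taste
Neoskizzles,Purple,Candy
Loheckles,Grayish Blue,Tart
Pounits,Bright Green,Savory
Loopnovas,Neon Pink,Cotton Candy
Glowls,Pale Orange,Sour and Bitter
"""

# DictReader uses the first row as the column names
rows = list(csv.DictReader(io.StringIO(csv_text)))

for row in rows:
    print(f"{row['Fruit']:12s} {row['Color']:14s} {row['Taste']}")
```

The same string can be handed to pandas with `pd.read_csv(io.StringIO(csv_text))` to get a DataFrame.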


In [42]:
response = client.chat.completions.create(
    model='gpt-4o',
    messages=[{"role": "system", "content": """
            Read this paragraph 
            
            `%s` 
            
            and use it to answer some questions.""" % prompt}, 
              {"role": "user", "content": "What are pounits?"}],
    temperature=0,
)

In [43]:
print(response.choices[0].message.content)

Pounits are a bright green fruit found on the planet Goocrux. They are more savory than sweet.


<center>
     <img src="https://raw.githubusercontent.com/DataForScience/Networks/master/data/D4Sci_logo_full.png" alt="Data For Science, Inc" align="center" border="0" width=300px> 
</center>