# Install the necessary Python libraries using 'pip install'
This can be done in the terminal as well.

* python-dotenv - We're using it to store our OpenAI API key safely
* openai - We're using it to make API calls to OpenAI to use their models/services

In [None]:
%pip install python-dotenv
%pip install openai

# Import libraries into our script
* We need the load_dotenv function from the dotenv library to access the variables (our API key) we've set in our separate .env file
    * This takes the API key from the .env file and stores it as an "environment variable" for the running Python process
* We use the os library to then call the variable (our API key) using os.getenv() after calling the load_dotenv function
    * This is how we can access the API key from the device's "environment variables" and use it in our code without needing to hard code the API key
    * Ex. of hard coding that would make your API key available to anyone who accesses this code
        * api_key = 'sk-fakeapikey-dn467JHGG@3^njsu99&0'
* We use OpenAI from the openai library to make the connection between our code and the API
    * This is where we enter the API key to gain access to OpenAI's models/services

I have the print statement here to show that before using the load_dotenv function, there is nothing set to this environment variable.
It should just print 'None'.

In [1]:
from dotenv import load_dotenv
import os
from openai import OpenAI

print(os.getenv("OPENAI_API_KEY")) 

None
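
To make the idea of "loading a .env file" concrete, here is a minimal sketch of roughly what happens when the file is read. It is a simplified stand-in written with only the standard library, not the real load_dotenv implementation. The function name `load_env_file`, the file name `demo.env`, and the variable `DEMO_API_KEY` are all made up for this illustration.

```python
import os
import tempfile
from pathlib import Path


def load_env_file(path):
    """Simplified stand-in for load_dotenv: copy KEY=VALUE lines into os.environ."""
    loaded = {}
    for raw_line in Path(path).read_text().splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        key = key.strip()
        value = value.strip().strip("'\"")  # drop surrounding quotes, if any
        # Do not overwrite a variable that is already set in the environment
        os.environ.setdefault(key, value)
        loaded[key] = value
    return loaded


# Demo with a throwaway file and a made-up variable name
os.environ.pop("DEMO_API_KEY", None)  # start from a clean state
with tempfile.TemporaryDirectory() as tmp:
    env_path = Path(tmp) / "demo.env"
    env_path.write_text("# comment line\nDEMO_API_KEY='sk-not-a-real-key'\n")

    before = os.getenv("DEMO_API_KEY")  # nothing loaded yet
    load_env_file(env_path)
    after = os.getenv("DEMO_API_KEY")   # now set for this Python process

print(before, after)
```

The variable only exists inside the running Python process, which is why a fresh kernel prints 'None' again until the file is loaded.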


# Load the API key as an environment variable
* When using the load_dotenv function, you should pass the filename of the .env file in as an argument
    * If you still get 'None' from the print statement, I'd suggest right-clicking on the .env file, clicking 'Copy Path', and pasting the full filepath in as an argument to the load_dotenv function.
        * Remember the argument needs to be a string ("filepath") and if it is the full filepath, change the '\' characters to '/'
* os.getenv("OPENAI_API_KEY") should now return the API key since we called load_dotenv, so let's test it with a print statement

In [2]:
load_dotenv('api.env') # Load the environment variables from the .env file
print(os.getenv("OPENAI_API_KEY")) # Print the value of the OPENAI_API_KEY variable

sk-test1-csAGCoGDWNK85hAMuJBmT3BlbkFJNtYmDLw5z5jNCqRjYh8k
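
Two small helpers can make this step less error-prone. This is a sketch under assumptions, not part of the original notebook: `to_forward_slashes` and `mask_secret` are hypothetical names, and the path and key below are invented examples. The first converts a copied Windows path to forward slashes, and the second lets you confirm a key was loaded without printing the whole thing.

```python
from pathlib import PureWindowsPath


def to_forward_slashes(windows_path):
    """Convert a copied Windows path (backslashes) to the forward-slash form."""
    return PureWindowsPath(windows_path).as_posix()


def mask_secret(value, visible=4):
    """Show only the first few characters of a secret so it is safe to print."""
    if not value:
        return "None"
    return value[:visible] + "*" * max(len(value) - visible, 0)


# The r"..." prefix keeps the backslashes from being read as escape characters
env_path = to_forward_slashes(r"C:\Users\me\project\api.env")
masked = mask_secret("sk-fakeapikey-123")

print(env_path)
print(masked)
```

In the notebook, the equivalent check would be something like `print(mask_secret(os.getenv("OPENAI_API_KEY")))`, which avoids leaving the full key in a saved cell output.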


# Create the connection between the code and OpenAI's API
Now that we have the API key safely imported into our code, let's give OpenAI our API key so we can access their models/services.
'client' is now our connection to OpenAI.

In [3]:
client = OpenAI(api_key=os.environ.get('OPENAI_API_KEY'))
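
If the key was never loaded, the client cannot be created. One option is to check for the key first and fail with a message that points at the likely cause. The sketch below assumes a hypothetical helper named `require_env`; the variable `DEMO_REQUIRED_KEY` exists only for the demonstration.

```python
import os


def require_env(name):
    """Return the environment variable's value, or raise a clear error if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set. Check that load_dotenv() was given the right .env path."
        )
    return value


# Demo with a made-up variable so no real key is involved
os.environ["DEMO_REQUIRED_KEY"] = "sk-not-a-real-key"
demo_value = require_env("DEMO_REQUIRED_KEY")
print(len(demo_value))
```

With this helper, the client line could be written as `client = OpenAI(api_key=require_env("OPENAI_API_KEY"))`.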

# Make the API call
API calls or API requests are when we ask the provider (OpenAI in this case) for their data/service. 
* There will be specific endpoints for each service or branch of data (client.chat.completions.create() is the endpoint that generates an AI response to a conversation)
* There will be parameters (both required and optional) that we fill out to detail the request
    * Required:
        * model: This is which model we want to respond to our conversation (ex. 'gpt-3.5-turbo' or 'gpt-4o')
        * messages: This is the conversation we want the model to respond to. This is also considered the context.
            * We format the conversation in a list of dictionaries where each dictionary holds a part of the conversation.
    * Optional:
        * max_tokens: This is the maximum amount of tokens allowed for the AI response. 
            * This can cut the response off mid-sentence, but it is helpful for reducing costs and keeping responses to simple prompts short.
            * The output is also limited by the context limit. Each model has a different limit (8k-128k tokens). This limit accounts for both the input and output tokens.
        * temperature: This controls the "creativity" of the model. Adds some variance to the output. Ranges from 0-2.
            * 0: Focused and close to deterministic. The model sticks to the most likely tokens, so repeated calls give very similar output.
            * 1: This is the default value that the temperature parameter is set to.
            * 2: More random and "creative". Sometimes can result in gibberish. Fun to experiment with.
        * stream: This takes a boolean value (True or False) that sets whether the output is streamed token-by-token or returned all at once when the generation has fully completed.
        * n: Takes an integer (defaults to 1). This determines how many responses will be generated. It's a way of creating multiple options for responses, but also multiplies the output cost by n.
        * There are several other, more technical parameters that you can read about here: https://platform.openai.com/docs/api-reference/chat/create
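
The required and optional parameters above can be gathered into one dictionary before the call. This is a minimal sketch, assuming a hypothetical helper named `build_chat_request`; the range checks mirror the descriptions in the list above and are done locally, before anything is sent to the API.

```python
def build_chat_request(model, messages, max_tokens=None, temperature=None, stream=False, n=1):
    """Assemble keyword arguments for a chat completions call, checking the optional ones."""
    if not messages:
        raise ValueError("messages must contain at least one message")

    # Required parameters
    request = {"model": model, "messages": messages}

    # Optional parameters are only included when they differ from the defaults
    if max_tokens is not None:
        if max_tokens < 1:
            raise ValueError("max_tokens must be a positive integer")
        request["max_tokens"] = max_tokens
    if temperature is not None:
        if not 0 <= temperature <= 2:
            raise ValueError("temperature must be between 0 and 2")
        request["temperature"] = temperature
    if n < 1:
        raise ValueError("n must be at least 1")
    if n != 1:
        request["n"] = n
    if stream:
        request["stream"] = True
    return request


demo_messages = [{"role": "user", "content": "Tell me a joke."}]
request = build_chat_request("gpt-3.5-turbo", demo_messages, max_tokens=50, temperature=0.7)
print(request)
```

The resulting dictionary could then be unpacked into the call with `client.chat.completions.create(**request)`.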

In [4]:
# Take the user input as a prompt
prompt = input("Enter a prompt: ")

# Format it within a list of dictionaries to be passed to the API
# role: This is basically the speaker of the message. It can be either "user" (the user) or "system" (the system message or character description for the AI) or "assistant" (the model's response).
# content: This is the actual message that the speaker is saying.

message_list = [
    # This is the system message. You can add more detail here about how you want the model to respond.
    {
        "role": "system",
        "content": "You are a helpful assistant."
    },
    # This is the user message. This is the prompt that you want the model to respond to.
    {
        "role": "user",
        "content": prompt
    }
]

# Call the API
# This is the bare necessity for making a chat completions API call to OpenAI
# 'response' will contain the model's response to the prompt
response = client.chat.completions.create(
    model='gpt-3.5-turbo', # model name
    messages=message_list, # messages list
)
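
Each API call only sees the messages passed to it, so continuing a conversation means sending the earlier turns again along with the new one. Here is a small sketch of keeping that history in a list. The helper `add_message` is a hypothetical name, and the assistant text is a placeholder, not real model output.

```python
VALID_ROLES = {"system", "user", "assistant"}


def add_message(history, role, content):
    """Append one turn to the conversation history after checking the role."""
    if role not in VALID_ROLES:
        raise ValueError(f"role must be one of {sorted(VALID_ROLES)}, got {role!r}")
    history.append({"role": role, "content": content})
    return history


history = [{"role": "system", "content": "You are a helpful assistant."}]
add_message(history, "user", "Tell me a joke.")
# In a real run this text would come from response.choices[0].message.content
add_message(history, "assistant", "(model reply goes here)")
add_message(history, "user", "Explain why that is funny.")

for message in history:
    print(f"{message['role']}: {message['content']}")
```

Passing `history` as the `messages` argument gives the model the full conversation as context for its next reply.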

# Navigate/Read the response object
* 'response' will contain the model's response to the prompt
    * choices: A list that holds the "choices" of the model. There will be multiple choices if you set 'n' to more than 1 in the API call.
        * finish_reason: This signifies the reason the model response finished. This is normally either 'stop' or 'length'.
            * 'stop' means the model finished its response normally
            * 'length' means the response got limited by either the 'max_tokens' parameter in your API call, or you hit the context limit of the model you're using.
        * index: This is the index of choices that you're looking at. Usually is 0 (due to zero-based indexing) unless you set 'n' to more than 1 within the API call.
        * message: This holds the response message of the model.
            * content: The actual text content of the message.
            * role: The speaker of the message. This is always 'assistant' for a response coming from the model.
            * function_call: We will be getting into this next class
            * tool_calls: We will be getting into this next class
    * model: What model was used to generate this response. This should be the same as the model name we specified in the API call.
    * usage: Shows us how many tokens were used for our API call.
        * completion_tokens: This is how many tokens were used on the output from the model.
            * If max_tokens was set in the API call, that will be the max amount that completion_tokens will reach.
        * prompt_tokens: This is how many tokens were used for the input. This is the token count for our message list we have in our API call.
        * total_tokens: The sum of the prompt_tokens and the completion_tokens. A helpful value to check to track your API usage.

In [5]:
# This prints out the whole response object with all the details
print(response)

ChatCompletion(id='chatcmpl-9eRIuR7uK0pHm7w4fjAZT6VqCnScR', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="Sure, here's one for you:\n\nWhy don't some couples go to the gym?\n\nBecause some relationships don't work out!", role='assistant', function_call=None, tool_calls=None))], created=1719425104, model='gpt-3.5-turbo-0125', object='chat.completion', system_fingerprint=None, usage=CompletionUsage(completion_tokens=26, prompt_tokens=21, total_tokens=47))


In [10]:
# Here we navigate through the response object to get the model's response and the total tokens the API call used
print(f'Response:\n\n{response.choices[0].message.content}\n\n')
print(f'Total tokens used: {response.usage.total_tokens}')

Response:

Sure, here's one for you:

Why don't some couples go to the gym?

Because some relationships don't work out!


Total tokens used: 47
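
When 'n' is greater than 1, or when a response may have been cut off, it can help to loop over every choice rather than reading only choices[0]. The sketch below assumes a hypothetical helper named `summarize_response`. So that it runs without an API call, it uses a stand-in object built from `SimpleNamespace` with the same attribute names as the printed response above; the joke text in it is invented.

```python
from types import SimpleNamespace


def summarize_response(response):
    """Collect the text and a truncation flag for every choice, plus the total token count."""
    summaries = []
    for choice in response.choices:
        summaries.append({
            "index": choice.index,
            "text": choice.message.content,
            # 'length' means max_tokens or the context limit cut the reply short
            "truncated": choice.finish_reason == "length",
        })
    return summaries, response.usage.total_tokens


# Stand-in for a real response object (same attribute names, made-up content)
fake_response = SimpleNamespace(
    choices=[
        SimpleNamespace(
            index=0,
            finish_reason="stop",
            message=SimpleNamespace(role="assistant", content="First joke"),
        ),
        SimpleNamespace(
            index=1,
            finish_reason="length",
            message=SimpleNamespace(role="assistant", content="Second joke that got cut"),
        ),
    ],
    usage=SimpleNamespace(completion_tokens=26, prompt_tokens=21, total_tokens=47),
)

summaries, total_tokens = summarize_response(fake_response)
for summary in summaries:
    print(summary)
print(f"Total tokens used: {total_tokens}")
```

The same function should work on the real `response` object from the API call, since it only reads the attributes described in the list above.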


# Example using Streaming

In [None]:
from dotenv import load_dotenv
from openai import OpenAI
import os

load_dotenv(".env")  # Load the environment variables from the .env file

client = OpenAI(api_key=os.environ.get('OPENAI_API_KEY')) # Initialize the OpenAI client
prompt = input("Enter a message: ") # Take the user input as a prompt
# Format it within a list of dictionaries to be passed to the API
messages = [
    {
        "role": "system", 
        "content": "You are a helpful assistant."
    },
    {
        "role": "user", 
        "content": prompt
    }
]

# Set stream to True to stream the response from the model
stream = True

# Call the API
# I also set temperature to 0.0 to get the most focused, repeatable response
completion = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=messages,
    temperature=0.0,
    stream=stream # Stream the response
)

# Print the response (streamed or not streamed)
if stream:
    # Stream the response
    output_content = '' # Creates an empty string to store the output content in its entirety
    # Iterates through the completion object to get the content of the response chunk by chunk
    for chunk in completion:
        # Checks that the chunk has a choice and that the choice carries text content
        if chunk.choices and chunk.choices[0].delta.content is not None:
            # Prints the content of the chunk
            print(chunk.choices[0].delta.content, end="")
            # Adds the content of the chunk to the output_content string
            output_content += chunk.choices[0].delta.content
else:
    # Prints the content of the completion object if it's not streamed
    answer = completion.choices[0].message.content
    print(answer)
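
The streaming loop above can also be wrapped in a small generator, which keeps the printing and the collecting separate. This is a sketch under assumptions: `iter_stream_text` and `fake_chunk` are hypothetical names, and the fake chunks are `SimpleNamespace` stand-ins that copy only the attribute names used in the loop. It also skips chunks with an empty choices list, which is an assumption about what some stream configurations may send.

```python
from types import SimpleNamespace


def iter_stream_text(chunks):
    """Yield only the text pieces from a stream of chat completion chunks."""
    for chunk in chunks:
        # Assumption: some chunks may arrive with no choices at all
        if not chunk.choices:
            continue
        piece = chunk.choices[0].delta.content
        # Chunks without text (None or empty string) are skipped
        if piece:
            yield piece


def fake_chunk(content):
    """Build a stand-in chunk with the same attribute names as a real one."""
    return SimpleNamespace(
        choices=[SimpleNamespace(delta=SimpleNamespace(content=content))]
    )


fake_stream = [
    fake_chunk("Hel"),
    fake_chunk("lo"),
    SimpleNamespace(choices=[]),  # chunk with no choices
    fake_chunk(None),             # chunk with no text
]

output_content = ""
for piece in iter_stream_text(fake_stream):
    print(piece, end="")
    output_content += piece
print()
```

With a real call, `fake_stream` would be replaced by the `completion` object returned when `stream=True`.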

# Class Exercise: Call the Chat Completions API to generate a joke using GPT 3.5 Turbo as the model

In [1]:
from dotenv import load_dotenv
from openai import OpenAI
import os

load_dotenv(".env")  # Load the environment variables from the .env file

client = OpenAI(api_key=os.environ.get('OPENAI_API_KEY')) # Initialize the OpenAI client

# Format it within a list of dictionaries to be passed to the API
# Here we know what the user is going to say, so we can add it right into the messages list
# Also for simple prompts like this, we don't need to add a system message to the messages list
messages = [
    {
        "role": "user", 
        "content": "Tell me a joke."
    }
]

# Call the API - make sure we are using the gpt-3.5-turbo model
completion = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=messages
)

# Print the response message
print(completion.choices[0].message.content)

OpenAIError: The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable
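
The error above means no API key reached the client. One possible cause is that the file name passed to load_dotenv does not match the file on disk: earlier cells loaded 'api.env', while this cell loads ".env". A quick way to check is to look for the file before loading it. The helper `find_env_file` below is a hypothetical name, and the demo uses a temporary folder rather than your real project folder.

```python
import tempfile
from pathlib import Path


def find_env_file(candidates, folder="."):
    """Return the first candidate file that exists in the folder, or None if none do."""
    for name in candidates:
        path = Path(folder) / name
        if path.is_file():
            return path
    return None


# Demo: a folder that contains 'api.env' but no '.env'
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "api.env").write_text("OPENAI_API_KEY=sk-not-a-real-key\n")

    found = find_env_file([".env", "api.env"], folder=tmp)
    found_name = found.name if found else None
    missing = find_env_file([".env"], folder=tmp)

print(found_name)
print(missing)
```

In the notebook, the path returned by `find_env_file([".env", "api.env"])` could be passed to load_dotenv, and a result of None would signal that the file is somewhere else.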