# Exploring the OpenAI API: Tokens, Costs, and Usage

This notebook demonstrates how to interact with the **OpenAI API** from Python in a reproducible classroom or research environment.  
The focus is on understanding how **tokenization**, **model usage**, and **costs per token** work in practice.





##  What the Notebook Does


1. **Connects to the OpenAI API** using the shared API key.  
2. **Sends example prompts** to small and large models (e.g., `gpt-4-turbo` or `gpt-4o-mini`) to illustrate response quality and cost trade-offs.  
3. **Explores tokenization** — how text is converted into tokens and how token counts vary by model.  
4. **Calculates API usage costs**, showing how prompt length and model choice affect pricing.  
5. **Visualizes results**, helping students understand the relationship between:
   - Input text length (number of tokens)
   - Model type and context window
   - Cost per request

##  Learning Goals

- Understand what a **token** is and how it differs from characters or words.  
- Learn to estimate and monitor **API usage costs**.  
- Gain experience working with **environment variables** and best practices for secret management.  
- Build intuition for **trade-offs between model size, latency, and price** in practical applications.
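
To make the difference between characters, words, and tokens concrete, here is a minimal sketch. It assumes the common rule of thumb that English text averages roughly four characters per token; `rough_token_estimate` is a hypothetical helper, not part of any library. Exact counts need a real tokenizer, so `exact_token_count` assumes the optional `tiktoken` package is installed; it is defined here but not called.

```python
import math

def rough_token_estimate(text):
    """Estimate tokens with the ~4 characters per token rule of thumb (English only)."""
    if not text:
        return 0
    return math.ceil(len(text) / 4)

def exact_token_count(text, model="gpt-3.5-turbo"):
    """Count tokens with the model's real tokenizer (assumes tiktoken is installed)."""
    import tiktoken  # optional dependency, imported lazily
    encoding = tiktoken.encoding_for_model(model)
    return len(encoding.encode(text))

prompt = "Explain who pays the burden of tariffs"
print("Characters:", len(prompt))
print("Words:", len(prompt.split()))
print("Rough token estimate:", rough_token_estimate(prompt))
```

The estimate is only a planning aid: actual token counts depend on the model's tokenizer and can differ noticeably for code, numbers, or non-English text.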

In [None]:
import os
from IPython.display import display
import ipywidgets as widgets


In [None]:
try:
    from dotenv import load_dotenv
except ImportError:
    # Install python-dotenv only if it is missing, then retry the import
    !pip install python-dotenv
    from dotenv import load_dotenv

In [None]:
try:
    from openai import OpenAI
except ImportError:
    !pip install openai
    from openai import OpenAI


##  API Key Setup

To keep credentials secure, the API key is **not stored directly in this notebook**.  

*The API key is linked to my credit card, so if it gets out the charges could add up. If I put the API key on GitHub, it will be automatically flagged.*

Instead, it is stored in a `.env` file inside a shared directory (`../shared/.env`) with a line like: `openai_API_KEY="..."`

In [None]:
from dotenv import load_dotenv
import os

load_dotenv('../shared/.env')

openai_api_key = os.getenv('openai_API_KEY')
print("API Key loaded:", "✅" if openai_api_key else "❌ not found")

(Optional: manually set a different key)

In [None]:
# Use this cell only if you want to manually load a different API key.
# The variable name must match the one used above (openai_api_key).
# openai_api_key = "  "

### Notes for Instructors

- The shared `.env` file allows multiple users on the same DataHub instance or Jupyter environment to access a single institutional API key without embedding secrets in their notebooks.  
- Students should **never print the API key** or share the `.env` file contents publicly.  
- The key can be rotated by updating the shared `.env` file; dependent notebooks pick up the new key the next time they run the loading cell, with no changes to their code.
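
If students need to confirm which key was loaded without exposing it, one option is to show only a masked form. The sketch below is an illustration under assumptions: `mask_secret` is a hypothetical helper, and the key shown is a made-up placeholder, not a real credential.

```python
def mask_secret(secret, visible=4):
    """Return a masked form of a secret that reveals only its last few characters."""
    if not secret:
        return "(not set)"
    if len(secret) <= visible * 2:
        # Too short to reveal anything safely
        return "*" * len(secret)
    return "*" * 8 + secret[-visible:]

# Placeholder value for illustration only; in the notebook, pass openai_api_key
fake_key = "sk-example-not-a-real-key-1234"
print("Loaded key:", mask_secret(fake_key))
```

Even a masked key should stay out of shared output when possible; the check mark printed by the loading cell is usually enough.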


##  The OpenAI Python Package

The **OpenAI Python package** provides a simple interface for interacting with OpenAI’s models (such as GPT, Whisper, and DALL·E) directly from Python code. It supports both synchronous and asynchronous API calls, making it easy to send prompts, generate completions, and analyze responses. By default the package reads the key from the `OPENAI_API_KEY` environment variable; because this notebook stores it under the name `openai_API_KEY`, we pass the key to the client explicitly. Results come back as structured objects that integrate easily into data workflows, Jupyter notebooks, or applications for natural language processing, code generation, or AI-assisted analysis.

### Initializing the OpenAI Client

Once the API key is loaded from the environment, we create a client object that serves as our connection to the OpenAI API.  
This client will handle authentication and allow us to make requests to different models.  

We’ll initialize it like this:


In [None]:
client = OpenAI(api_key = openai_api_key)

### Checking Available Models

Before sending any prompts, it’s useful to list the models that your API key can access.  
The `client.models.list()` method returns all available model identifiers for your OpenAI account, such as `gpt-4o`, `gpt-4-turbo`, and smaller variants like `gpt-4o-mini`.  
Listing these helps confirm the correct model names to use in later API requests.

In [None]:
models = client.models.list()
print([m.id for m in models])
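
The full list can be long because it also includes non-chat models such as audio and image models. Below is a small sketch of narrowing it down by name prefix. `filter_model_ids` is a hypothetical helper, and the sample IDs are stand-ins chosen for illustration rather than real output from the API.

```python
def filter_model_ids(model_ids, prefix="gpt-"):
    """Return the sorted model IDs that start with the given prefix."""
    return sorted(m for m in model_ids if m.startswith(prefix))

# Stand-in IDs for illustration; in the notebook you would pass
# [m.id for m in client.models.list()] instead.
sample_ids = ["whisper-1", "gpt-4o-mini", "dall-e-3", "gpt-3.5-turbo", "gpt-4-turbo"]
print(filter_model_ids(sample_ids))
```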

## Basic Chat Completion Example

To demonstrate the simplest API call, we send a chat-style request to one of the OpenAI language models.  
Here, we use `client.chat.completions.create()` to send a short conversation, and the model responds based on the system and user messages provided.

In this example, the system message defines the context (“You are a UC Berkeley Economics Student”), and the user asks a question (“Explain who pays the burden of tariffs”).  
The model returns a text completion that we extract from the `response` object and display.

This basic pattern (system message, user message, model reply) is the foundation of all chat-based interactions with OpenAI models.

In [None]:
# Send a chat message to GPT-3.5
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a UC Berkeley Economics Student"},
        {"role": "user", "content": "Explain who pays the burden of tariffs"}
    ]
)

# Display response
print(response.choices[0].message.content)
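
The chat completions endpoint does not remember earlier requests, so a follow-up question has to resend the whole conversation. The sketch below shows one way to carry the system, user, and reply messages forward as a list. `start_conversation` and `add_turn` are hypothetical helpers, and the reply string is a placeholder for `response.choices[0].message.content`.

```python
def start_conversation(system_prompt, user_prompt):
    """Build the initial messages list: one system message, one user message."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]

def add_turn(messages, assistant_reply, next_user_prompt):
    """Return a new list with the model's reply and a follow-up question appended."""
    return messages + [
        {"role": "assistant", "content": assistant_reply},
        {"role": "user", "content": next_user_prompt},
    ]

messages = start_conversation(
    "You are a UC Berkeley Economics Student",
    "Explain who pays the burden of tariffs",
)
# Placeholder standing in for the text returned by the model
reply = "(model reply would go here)"
messages = add_turn(messages, reply, "Summarize that in one sentence.")
print([m["role"] for m in messages])
```

Because every earlier message is sent again, the prompt token count (and therefore the input cost) grows with each turn of a conversation.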

###  OpenAI Token Pricing (as of April 2025)

| Model              | Input Tokens (per 1K) | Output Tokens (per 1K) | Context Window     |
|-------------------|-----------------------|-------------------------|--------------------|
| **GPT-4 Turbo**    | $0.01                 | $0.03                   | 128K tokens        |
| **GPT-4 (legacy)** | $0.03                 | $0.06                   | 8K or 32K tokens   |
| **GPT-3.5 Turbo**  | $0.001                | $0.002                  | 16K tokens         |
| *GPT-4o* (expected) | *TBD*                | *TBD*                   | *128K tokens*      |

## A Widget to Estimate Token Costs

In [None]:
# Define token prices (per 1K tokens)
token_prices = {
    "gpt-4-turbo": {"input": 0.01, "output": 0.03},
    "gpt-4 (legacy)": {"input": 0.03, "output": 0.06},
    "gpt-3.5-turbo": {"input": 0.001, "output": 0.002},
}

In [None]:
# Widgets
model_selector = widgets.Dropdown(
    options=list(token_prices.keys()),
    value="gpt-3.5-turbo",
    description='Model:',)

input_tokens = widgets.IntText(
    value=1000,
    description='Input Tokens:',)

output_tokens = widgets.IntText(
    value=500,
    description='Output Tokens:',)

estimate_button = widgets.Button(
    description="Estimate Cost",
    button_style="success")

cost_display = widgets.Label(value="")

# Define the estimator
def estimate_cost(b):
    model = model_selector.value
    input_count = input_tokens.value
    output_count = output_tokens.value
    prices = token_prices[model] 
    cost = (input_count / 1000) * prices["input"] + (output_count / 1000) * prices["output"]
    cost_display.value = f"💲 Estimated Cost: ${cost:.6f}"

estimate_button.on_click(estimate_cost)

# Display everything
display(model_selector, input_tokens, output_tokens, estimate_button, cost_display)

In [None]:
# Send the same chat message to GPT-3.5 again, this time also inspecting token usage
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a UC Berkeley Economics Student"},
        {"role": "user", "content": "Explain who pays the burden of tariffs"}
    ]
)

# Display response
print(response.choices[0].message.content)

# Display token usage
print("\n🔢 Token Usage:")
print(f"Prompt tokens: {response.usage.prompt_tokens}")
print(f"Completion tokens: {response.usage.completion_tokens}")
print(f"Total tokens: {response.usage.total_tokens}")
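
The token counts reported in `response.usage` can be turned into a dollar estimate with the same per-1K arithmetic the widget uses. This is a minimal sketch: `cost_from_usage` is a hypothetical helper, the prices are copied from the pricing table in this notebook and may be out of date, and the token counts in the example are invented for illustration.

```python
# Prices in USD per 1K tokens, copied from the pricing table in this notebook
PRICES_PER_1K = {
    "gpt-4-turbo": {"input": 0.01, "output": 0.03},
    "gpt-4 (legacy)": {"input": 0.03, "output": 0.06},
    "gpt-3.5-turbo": {"input": 0.001, "output": 0.002},
}

def cost_from_usage(model, prompt_tokens, completion_tokens, prices=PRICES_PER_1K):
    """Estimate the cost of one request from its prompt and completion token counts."""
    p = prices[model]
    return (prompt_tokens / 1000) * p["input"] + (completion_tokens / 1000) * p["output"]

# Invented token counts; in the notebook, pass response.usage.prompt_tokens
# and response.usage.completion_tokens instead.
example_cost = cost_from_usage("gpt-3.5-turbo", prompt_tokens=25, completion_tokens=300)
print(f"Estimated cost: ${example_cost:.6f}")
```

Note that output tokens are priced higher than input tokens for every model in the table, so long responses tend to dominate the cost of short prompts.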

In [None]:
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a UC Berkeley Economics Student"},
        {"role": "user", "content": "Explain who pays the burden of tariffs"}
    ],
    temperature=0.7,               # sampling randomness (range 0 to 2; lower = more focused and repeatable)
    top_p=1.0,                     # nucleus sampling cutoff (1.0 = no cutoff; usually tune this or temperature, not both)
    presence_penalty=0.5,          # positive values encourage the model to introduce new topics
    frequency_penalty=0.3,         # positive values discourage repeating the same tokens
    max_tokens=200,                # upper bound on the number of tokens in the response
    stop=None                      # optional string or list of strings that ends generation early (e.g., ["\n", "END"])
)

# Display the response text
print("📘 Response:")
print(response.choices[0].message.content)

# Display token usage
print("\n🔢 Token Usage:")
print(f"Prompt tokens: {response.usage.prompt_tokens}")
print(f"Completion tokens: {response.usage.completion_tokens}")
print(f"Total tokens: {response.usage.total_tokens}")
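
To build intuition for what `temperature` does, the sketch below applies temperature scaling to a toy set of scores before converting them to probabilities. It is a conceptual illustration only: the three scores are invented, `softmax_with_temperature` is a hypothetical helper, and nothing here claims to mirror how the API samples internally.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    largest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - largest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Invented scores for three candidate next tokens
toy_logits = [2.0, 1.0, 0.5]
for t in (0.2, 0.7, 1.5):
    probs = softmax_with_temperature(toy_logits, t)
    print(f"temperature={t}:", [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the highest-scoring token, while higher temperatures spread it more evenly, which is why low settings give more repeatable answers.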