# 🚀 OpenAI Python SDK 101

In this notebook we'll learn how to interact with Large Language Models (LLMs) directly using the **OpenAI Python SDK**.  
This is the **first time** we're exploring API interactions, so we'll build up gradually:

1. **Initialize** the client with your API key.  
2. **Minimal call** to the API (Responses API).  
3. Use **Chat Completions** for system + user roles.  
4. Explore **temperature** (randomness) and **top_p** (nucleus sampling).  
5. Add **system prompts** to guide behavior.  
6. Try **streaming tokens** (like ChatGPT typing).  
7. Get **JSON/structured outputs** with schemas.  
8. Handle **errors, timeouts, and retries** gracefully.

By the end, you'll know how to **call an LLM safely and flexibly** using just the OpenAI SDK.

In [1]:
import os, json, time
from typing import Any, Dict

### API key
- Set your OpenAI API key as an environment variable:  
  `export OPENAI_API_KEY="sk-..."` (macOS/Linux) or `setx OPENAI_API_KEY "sk-..."` (Windows, new terminal required).  
- In Colab: set `os.environ["OPENAI_API_KEY"] = "..."` in a cell (for demos only; never share or commit a notebook that contains a real key).
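
Before creating the client, it can help to fail fast when the key is missing rather than hit an authentication error on the first request. The helper below is a minimal sketch of that check; `require_api_key` is a hypothetical name used for illustration and is not part of the SDK.

```python
import os

def require_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the notebook.")
    return key

# Demo with an explicit mapping so the sketch runs without a real key.
print(require_api_key({"OPENAI_API_KEY": "sk-demo"}))
```

In the notebook itself you would call `require_api_key()` with no arguments so that it reads from `os.environ`.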


In [2]:
# OpenAI Python SDK v1 style
from openai import OpenAI
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

In [3]:
# Minimal "Responses API" call (recommended by OpenAI for new projects)
# Docs: https://platform.openai.com/docs/guides/text (see also OpenAI's guide comparing Responses and Chat Completions)
resp = client.responses.create(
    model="gpt-4o-mini",  # choose any available text-capable model
    input="In one sentence, explain the difference between temperature and top_p for sampling."
)
print(resp.output_text)

Temperature controls the randomness of predictions (lower values lead to more deterministic outputs), while top_p (nucleus sampling) limits the selection to a subset of possible next words based on cumulative probability, ensuring diversity while maintaining cohesiveness.
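
To make the model's one-sentence answer concrete, here is a toy sketch of both knobs applied to a made-up four-token distribution. It follows the usual textbook formulation (softmax with temperature, then nucleus filtering); the function names and the logits are illustrative assumptions, and this is not a description of how the API implements sampling internally.

```python
import math

def apply_temperature(logits, temperature):
    """Softmax over logits divided by the temperature (temperature > 0)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their cumulative mass reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return [probs[i] / mass if i in kept else 0.0 for i in range(len(probs))]

logits = [2.0, 1.0, 0.5, -1.0]  # toy scores for four candidate tokens
for t in (0.5, 1.0, 2.0):
    print(t, [round(p, 3) for p in apply_temperature(logits, t)])
print(top_p_filter(apply_temperature(logits, 1.0), top_p=0.8))
```

Lower temperatures concentrate probability on the top token, while `top_p` removes the low-probability tail entirely before sampling.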


In [4]:
resp

Response(id='resp_0411737713c487660069070cd5beec819f980091a5ff0f3c05', created_at=1762069717.0, error=None, incomplete_details=None, instructions=None, metadata={}, model='gpt-4o-mini-2024-07-18', object='response', output=[ResponseOutputMessage(id='msg_0411737713c487660069070cd69ee0819fbf49baa0bbb34bf3', content=[ResponseOutputText(annotations=[], text='Temperature controls the randomness of predictions (lower values lead to more deterministic outputs), while top_p (nucleus sampling) limits the selection to a subset of possible next words based on cumulative probability, ensuring diversity while maintaining cohesiveness.', type='output_text', logprobs=[])], role='assistant', status='completed', type='message')], parallel_tool_calls=True, temperature=1.0, tool_choice='auto', tools=[], top_p=1.0, background=False, conversation=None, max_output_tokens=None, max_tool_calls=None, previous_response_id=None, prompt=None, prompt_cache_key=None, reasoning=Reasoning(effort=None, generate_summar

In [5]:
# Using Chat Completions (still widely used & supported)
chat = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a concise teaching assistant."},
        {"role": "user", "content": "Give me 3 bullet points about overfitting."},
    ],
)
print(chat.choices[0].message.content)

- **Definition**: Overfitting occurs when a machine learning model learns the training data too well, capturing noise and outliers, which results in poor generalization to new, unseen data.

- **Symptoms**: Indications of overfitting include high accuracy on the training set but significantly lower accuracy on the validation or test set, suggesting that the model is too complex.

- **Prevention Techniques**: Common strategies to mitigate overfitting include using simpler models, applying regularization techniques (like L1 or L2 penalties), and employing cross-validation and dropout in neural networks.


In [6]:
chat

ChatCompletion(id='chatcmpl-CXN7R9ZRS3l3frHZa9QvfRZWWj5jx', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='- **Definition**: Overfitting occurs when a machine learning model learns the training data too well, capturing noise and outliers, which results in poor generalization to new, unseen data.\n\n- **Symptoms**: Indications of overfitting include high accuracy on the training set but significantly lower accuracy on the validation or test set, suggesting that the model is too complex.\n\n- **Prevention Techniques**: Common strategies to mitigate overfitting include using simpler models, applying regularization techniques (like L1 or L2 penalties), and employing cross-validation and dropout in neural networks.', refusal=None, role='assistant', annotations=[], audio=None, function_call=None, tool_calls=None))], created=1762069729, model='gpt-4o-mini-2024-07-18', object='chat.completion', service_tier='default', system_fingerprint='fp

In [7]:
messages = [
    {"role": "system", "content": "You are a Python tutor who answers with short code examples."},
    {"role": "user", "content": "Show how to reverse a string in Python."}
]
r = client.chat.completions.create(model="gpt-4o-mini", messages=messages, temperature=0)
print(r.choices[0].message.content)

You can reverse a string in Python using slicing. Here's a simple example:

```python
original_string = "Hello, World!"
reversed_string = original_string[::-1]
print(reversed_string)
```

This will output:

```
!dlroW ,olleH
```


In [8]:
r = client.chat.completions.create(model="gpt-4o-mini", messages=messages, temperature=1)
print(r.choices[0].message.content)

You can reverse a string in Python using slicing. Here's an example:

```python
original_string = "Hello, World!"
reversed_string = original_string[::-1]
print(reversed_string)  # Output: !dlroW ,olleH
```


In [9]:
from sys import stdout

stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Write a short story about a cat and a dog."}],
    temperature=0.7,
    stream=True,
)

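for chunk in stream:
    # Some chunks (e.g. a final usage chunk) can carry an empty `choices` list.
    if chunk.choices:
        delta = chunk.choices[0].delta
        if delta and delta.content:
            stdout.write(delta.content)
            stdout.flush()  # show tokens as they arrive
stdout.write("\n")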

Once upon a time in a cozy little village, there lived a cat named Whiskers and a dog named Max. Whiskers was a sleek, gray tabby with striking green eyes, while Max was a jovial golden retriever with a heart as big as his bark. They lived on neighboring farms, and while their owners were friends, Whiskers and Max had never quite seen eye to eye.

Whiskers prided herself on her independence and grace. She spent her days prowling the rooftops, napping in sunbeams, and meticulously grooming her fur. Max, on the other hand, was all about fun and adventure. He loved to chase after sticks, dig in the dirt, and wag his tail at every passerby.

One bright summer morning, a ruckus erupted in the village square. A mischievous little squirrel was darting about, stealing shiny trinkets from the market stalls. The villagers were in a frenzy, trying to catch the speedy thief. Whiskers watched from her perch on a fence, her curiosity piqued. Max, with his boundless energy, leapt into action, barking

1
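
When streaming, you often want the full text as well as the live display. The sketch below collects the content deltas into one string. So that it runs offline, it uses stand-in objects built with `SimpleNamespace` that mimic only the `choices[0].delta.content` shape used in the loop above; `fake_chunk` and `collect_text` are hypothetical helpers, not SDK functions.

```python
from types import SimpleNamespace

def fake_chunk(text):
    """Build a stand-in object shaped like a streaming chunk (illustration only)."""
    if text is None:
        return SimpleNamespace(choices=[])  # a chunk with no choices
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])

def collect_text(stream):
    """Join the content deltas of a chat completion stream into one string."""
    parts = []
    for chunk in stream:
        if not chunk.choices:
            continue  # skip chunks that carry no choices
        content = chunk.choices[0].delta.content
        if content:
            parts.append(content)
    return "".join(parts)

demo_stream = [fake_chunk("Once "), fake_chunk("upon "), fake_chunk(None), fake_chunk("a time")]
print(collect_text(demo_stream))  # prints: Once upon a time
```

With a real stream, passing the object returned by `client.chat.completions.create(..., stream=True)` to `collect_text` should work the same way, since the function touches only those attributes.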

In [10]:
message = """I have bought 3 kg of Rice, 4 kg of dhal, 3 packets of biscuits, 2 kg of sugar.

Format this as a list of json objects with each JSON object in the following format:
{
    "item": "item name",
    "quantity": "quantity"
}

DO NOT include anything else in your response."""

# No response_format is passed here, so the JSON shape relies entirely on the prompt.
completion = client.chat.completions.parse(
    model="gpt-4o-mini",
    temperature=0,
    messages=[
        {"role": "user", "content": message}
    ],
)

print(completion.choices[0].message.content)

[
    {
        "item": "Rice",
        "quantity": "3 kg"
    },
    {
        "item": "Dhal",
        "quantity": "4 kg"
    },
    {
        "item": "Biscuits",
        "quantity": "3 packets"
    },
    {
        "item": "Sugar",
        "quantity": "2 kg"
    }
]


In [11]:
import json

response = completion.choices[0].message.content

items = json.loads(response)
type(items)

list

In [12]:
for item in items:
    print(item["item"])

Rice
Dhal
Biscuits
Sugar
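
Prompt-only JSON worked here, but nothing guarantees the reply is valid JSON; a model may, for example, wrap it in a Markdown code fence. The sketch below shows one defensive way to parse such replies. `parse_json_reply` is a hypothetical helper, and the fenced `reply` string is a made-up example rather than real model output.

```python
import json

FENCE = "`" * 3  # a Markdown code fence marker

def parse_json_reply(text):
    """Parse model output as JSON, tolerating a surrounding Markdown code fence."""
    cleaned = text.strip()
    if cleaned.startswith(FENCE):
        lines = cleaned.splitlines()[1:]  # drop the opening fence line
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]  # drop the closing fence line
        cleaned = "\n".join(lines)
    return json.loads(cleaned)  # raises json.JSONDecodeError on invalid JSON

reply = FENCE + 'json\n[{"item": "Rice", "quantity": "3 kg"}]\n' + FENCE
print(parse_json_reply(reply))
```

Catching `json.JSONDecodeError` around the call lets you retry or report the failure instead of crashing the notebook.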


In [13]:
from typing import List
from pydantic import BaseModel


class Summary(BaseModel):
    topic: str
    key_points: List[str]

completion = client.chat.completions.parse(
    model="gpt-4o-mini",
    temperature=0,
    response_format=Summary,
    messages=[
        {"role": "user", "content": "Topic: Transformers in NLP. Give 3 key points."}
    ],
)

parsed = completion.choices[0].message.parsed
parsed

Summary(topic='Transformers in NLP', key_points=['Transformers utilize self-attention mechanisms to weigh the importance of different words in a sentence, allowing for better context understanding.', 'They enable parallel processing of data, significantly speeding up training times compared to traditional sequential models like RNNs.', 'Transformers have led to the development of powerful pre-trained models (e.g., BERT, GPT) that can be fine-tuned for various NLP tasks, achieving state-of-the-art results.'])

In [14]:
type(parsed)

__main__.Summary

In [15]:
message = """I have bought 3 kg of Rice, 4 kg of dhal, 3 packets of biscuits, 2 kg of sugar."""


class ShoppingItem(BaseModel):
    item: str
    quantity: int  # numeric amount only; the unit ("kg", "packets") is not captured

class ShoppingList(BaseModel):
    items: List[ShoppingItem]

completion = client.chat.completions.parse(
    model="gpt-4o-mini",
    temperature=0,
    response_format=ShoppingList,
    messages=[
        {"role": "user", "content": message}
    ],
)

print(completion.choices[0].message.parsed)

items=[ShoppingItem(item='Rice', quantity=3), ShoppingItem(item='Dhal', quantity=4), ShoppingItem(item='Biscuits', quantity=3), ShoppingItem(item='Sugar', quantity=2)]


In [16]:
completion.choices[0].message.parsed

ShoppingList(items=[ShoppingItem(item='Rice', quantity=3), ShoppingItem(item='Dhal', quantity=4), ShoppingItem(item='Biscuits', quantity=3), ShoppingItem(item='Sugar', quantity=2)])

In [17]:
completion.choices[0].message.content

'{"items":[{"item":"Rice","quantity":3},{"item":"Dhal","quantity":4},{"item":"Biscuits","quantity":3},{"item":"Sugar","quantity":2}]}'
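
The `.content` string above is plain JSON that mirrors the schema, and `.parsed` is the typed view of the same data. As a rough illustration of what that conversion amounts to, the sketch below parses the same string by hand, using a standard-library dataclass as a stand-in for the Pydantic model. `Item` and `load_items` are hypothetical names, and this is not how the SDK performs the step internally.

```python
import json
from dataclasses import dataclass

@dataclass
class Item:
    item: str
    quantity: int

# The raw content string shown in the cell output above.
raw = '{"items":[{"item":"Rice","quantity":3},{"item":"Dhal","quantity":4},{"item":"Biscuits","quantity":3},{"item":"Sugar","quantity":2}]}'

def load_items(text):
    """Parse the raw JSON string and check each field's type by hand."""
    items = []
    for entry in json.loads(text)["items"]:
        if not isinstance(entry["item"], str) or not isinstance(entry["quantity"], int):
            raise TypeError(f"unexpected field types in {entry!r}")
        items.append(Item(**entry))
    return items

for it in load_items(raw):
    print(it.item, it.quantity)
```

Passing a Pydantic model as `response_format` saves you from writing these checks yourself, which is the main reason to prefer it over prompt-only JSON.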