# OpenAI API - Middle Level Demo

Welcome to the **Middle Level** OpenAI API tutorial! Building on the fundamentals, this notebook covers:

1. **Streaming Responses** - Real-time token-by-token output
2. **Function Calling / Tools** - Let the model call your functions
3. **Structured Outputs** - Get consistent JSON responses
4. **Vision Capabilities** - Analyze images with GPT-4o
5. **Advanced Conversation Patterns** - Context management and optimization

## Prerequisites
- Completed Entry Level notebook
- Familiarity with basic Chat Completions API
- OpenAI API key configured

---

## Reference Documentation
- [Streaming](https://platform.openai.com/docs/api-reference/streaming)
- [Function Calling](https://platform.openai.com/docs/guides/function-calling)
- [Vision](https://platform.openai.com/docs/guides/vision)
- [Structured Outputs](https://platform.openai.com/docs/guides/structured-outputs)

In [3]:
# Setup - Import libraries and initialize client
import os
import json
import base64
from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()

# =============================================================================
# GLOBAL CONFIGURATION
# =============================================================================
# Set the model to use throughout this notebook
MODEL = "gpt-4o-mini"  # Change to "gpt-4o" for more capable model
# =============================================================================

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

print("✓ OpenAI client initialized!")
print(f"✓ Using model: {MODEL}")

✓ OpenAI client initialized!
✓ Using model: gpt-4o-mini


---

## 1. Streaming Responses

Streaming delivers the response in small chunks as the model generates them, instead of making you wait for the complete response. This is useful for:
- Better user experience (immediate feedback)
- Long-form content generation
- Chat applications

In [4]:
# Basic streaming example
print("Streaming response:")
print("-" * 50)

stream = client.chat.completions.create(
    model=MODEL,
    messages=[
        {"role": "user", "content": "Write a haiku about programming."}
    ],
    stream=True  # Enable streaming
)

# Process the stream
full_response = ""
for chunk in stream:
    if chunk.choices[0].delta.content is not None:
        content = chunk.choices[0].delta.content
        print(content, end="", flush=True)  # Print each token as it arrives
        full_response += content

print("\n" + "-" * 50)
print(f"\nFull response collected: {full_response}")

Streaming response:
--------------------------------------------------
Lines of code unfold,  
Logic dances in the screen,  
Dreams in syntax gleam.
--------------------------------------------------

Full response collected: Lines of code unfold,  
Logic dances in the screen,  
Dreams in syntax gleam.


In [5]:
# Streaming with usage statistics
# Note: You need to request usage stats explicitly with stream_options
print("Streaming with usage tracking:")
print("-" * 50)

stream = client.chat.completions.create(
    model=MODEL,
    messages=[
        {"role": "user", "content": "Explain recursion in one sentence."}
    ],
    stream=True,
    stream_options={"include_usage": True}  # Request usage stats
)

for chunk in stream:
    # Content chunks
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
    
    # The final chunk carries the usage stats and has an empty choices list
    if chunk.usage:
        print(f"\n\nUsage: {chunk.usage.prompt_tokens} prompt + {chunk.usage.completion_tokens} completion = {chunk.usage.total_tokens} total tokens")

Streaming with usage tracking:
--------------------------------------------------
Recursion is a programming technique where a function calls itself to solve a problem by breaking it down into smaller, more manageable subproblems.

Usage: 13 prompt + 28 completion = 41 total tokens
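
When `include_usage` is requested, the last chunk arrives with an empty `choices` list, so code that indexes `choices[0]` without a guard would raise an `IndexError` on it. The sketch below shows one way to wrap the loop in a reusable helper. The names `collect_stream` and `fake_chunk` are made up for illustration, and the chunks are simulated with `SimpleNamespace` so the example runs without an API call.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Accumulate streamed text and the usage object (if any) from chunk objects."""
    parts = []
    usage = None
    for chunk in chunks:
        # Guard against chunks whose choices list is empty (e.g. the usage chunk)
        if chunk.choices and chunk.choices[0].delta.content:
            parts.append(chunk.choices[0].delta.content)
        if getattr(chunk, "usage", None):
            usage = chunk.usage
    return "".join(parts), usage


def fake_chunk(text=None, usage=None):
    """Hypothetical stand-in that mimics the attributes of a streamed chunk."""
    choices = []
    if text is not None:
        choices = [SimpleNamespace(delta=SimpleNamespace(content=text))]
    return SimpleNamespace(choices=choices, usage=usage)


fake_stream = [
    fake_chunk("Recursion "),
    fake_chunk("calls itself."),
    fake_chunk(usage=SimpleNamespace(prompt_tokens=13, completion_tokens=4, total_tokens=17)),
]

text, usage = collect_stream(fake_stream)
print(text)
print(usage.total_tokens)
```

With a real request, the same helper should accept the `stream` object directly, since it only relies on the `choices` and `usage` attributes used in the cells above.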


---

## 2. Function Calling / Tools

Function calling lets the model decide when to request a call to a function you define. The model never executes code itself: it returns the function name and JSON-encoded arguments, and your code is responsible for running the function and sending the result back.

**Use cases:**
- Fetching real-time data (weather, stocks, databases)
- Performing calculations
- Interacting with external APIs
- Taking actions in your application

In [6]:
# Define your tools (functions the model can call)
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City and state, e.g., San Francisco, CA"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "Temperature unit"
                    }
                },
                "required": ["location"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "calculate",
            "description": "Perform a mathematical calculation",
            "parameters": {
                "type": "object",
                "properties": {
                    "expression": {
                        "type": "string",
                        "description": "Mathematical expression to evaluate, e.g., '2 + 2 * 3'"
                    }
                },
                "required": ["expression"]
            }
        }
    }
]

print("Tools defined: get_weather, calculate")

Tools defined: get_weather, calculate
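
A typo in a tool definition, such as a `required` entry that does not match any property, is easy to miss. The following is a small sanity check, sketched under the assumption that tools follow the same dictionary layout as above. `check_tool_definitions` is a hypothetical helper, not part of the OpenAI SDK.

```python
def check_tool_definitions(tools):
    """Return a list of human-readable problems found in the tool definitions."""
    problems = []
    for tool in tools:
        function = tool.get("function", {})
        name = function.get("name", "<unnamed>")
        parameters = function.get("parameters", {})
        properties = parameters.get("properties", {})
        # Every name listed under "required" should be declared under "properties"
        for required_name in parameters.get("required", []):
            if required_name not in properties:
                problems.append(
                    f"{name}: required parameter '{required_name}' is not in properties"
                )
    return problems


# Deliberately broken example: "city" is required but only "location" is declared
broken_tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "parameters": {
                "type": "object",
                "properties": {"location": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

print(check_tool_definitions(broken_tools))
```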


In [7]:
# Simulated function implementations
def get_weather(location, unit="fahrenheit"):
    """Simulated weather API call"""
    # In real applications, you'd call an actual weather API
    weather_data = {
        "San Francisco, CA": {"temp": 65, "condition": "Foggy"},
        "New York, NY": {"temp": 45, "condition": "Cloudy"},
        "Miami, FL": {"temp": 82, "condition": "Sunny"},
    }
    data = weather_data.get(location, {"temp": 70, "condition": "Clear"})
    if unit == "celsius":
        data["temp"] = round((data["temp"] - 32) * 5/9)
    return json.dumps({"location": location, "temperature": data["temp"], "unit": unit, "condition": data["condition"]})

def calculate(expression):
    """Evaluate a math expression (demo only: eval is NOT safe for untrusted input)."""
    try:
        # Warning: eval is used here for demo only. Use a safe parser in production!
        result = eval(expression, {"__builtins__": {}}, {})
        return json.dumps({"expression": expression, "result": result})
    except Exception as e:
        return json.dumps({"error": str(e)})

# Map function names to implementations
available_functions = {
    "get_weather": get_weather,
    "calculate": calculate
}

print("Function implementations ready!")

Function implementations ready!
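
Because the model chooses the `expression` string, `eval` should not be used outside a demo, even with empty builtins. One possible replacement, offered as a sketch rather than the notebook's method, parses the expression with the standard `ast` module and allows only numeric literals and basic arithmetic. The name `safe_calculate` is an assumption, and exponentiation is deliberately left out to avoid very expensive computations.

```python
import ast
import json
import operator

# Whitelisted operators; anything else is rejected
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def _eval_node(node):
    """Recursively evaluate a whitelisted arithmetic AST node."""
    if isinstance(node, ast.Constant):
        if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
            return node.value
    elif isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_eval_node(node.left), _eval_node(node.right))
    elif isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_eval_node(node.operand))
    raise ValueError("unsupported expression")


def safe_calculate(expression):
    """Evaluate basic arithmetic and return a JSON string, like calculate() above."""
    try:
        tree = ast.parse(expression, mode="eval")
        result = _eval_node(tree.body)
        return json.dumps({"expression": expression, "result": result})
    except (SyntaxError, ValueError, ZeroDivisionError) as exc:
        return json.dumps({"error": str(exc)})


print(safe_calculate("2 + 2 * 3"))
print(safe_calculate("__import__('os').getcwd()"))
```

Function calls, attribute access, and names all fall through to the `ValueError` branch, so the second call returns an error payload instead of executing anything.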


In [8]:
# Make a request that triggers function calling
messages = [
    {"role": "user", "content": "What's the weather in San Francisco? Also, what's 15% of 89?"}
]

response = client.chat.completions.create(
    model=MODEL,
    messages=messages,
    tools=tools,
    tool_choice="auto"  # Let the model decide when to use tools
)

# Check if the model wants to call functions
assistant_message = response.choices[0].message
print("Model's response:")
print(f"  Content: {assistant_message.content}")
print(f"  Tool calls: {len(assistant_message.tool_calls) if assistant_message.tool_calls else 0}")

Model's response:
  Content: None
  Tool calls: 2


In [9]:
# Complete function calling flow - execute functions and return results
if assistant_message.tool_calls:
    # Add assistant's message (with tool calls) to history
    messages.append(assistant_message)
    
    # Process each tool call
    for tool_call in assistant_message.tool_calls:
        function_name = tool_call.function.name
        function_args = json.loads(tool_call.function.arguments)
        
        print(f"\nExecuting: {function_name}({function_args})")
        
        # Call the actual function
        function_to_call = available_functions[function_name]
        function_response = function_to_call(**function_args)
        
        print(f"Result: {function_response}")
        
        # Add function result to messages
        messages.append({
            "role": "tool",
            "tool_call_id": tool_call.id,
            "content": function_response
        })
    
    # Get final response with function results
    final_response = client.chat.completions.create(
        model=MODEL,
        messages=messages
    )
    
    print("\n" + "=" * 50)
    print("Final Response:")
    print(final_response.choices[0].message.content)


Executing: get_weather({'location': 'San Francisco, CA'})
Result: {"location": "San Francisco, CA", "temperature": 65, "unit": "fahrenheit", "condition": "Foggy"}

Executing: calculate({'expression': '0.15 * 89'})
Result: {"expression": "0.15 * 89", "result": 13.35}

==================================================
Final Response:
The weather in San Francisco is currently 65°F and foggy. 

As for your calculation, 15% of 89 is 13.35.
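
The loop above assumes the model always names a known function and sends valid JSON arguments. A more defensive dispatcher might return an error payload to the model instead of raising. The sketch below is one such approach; `run_tool_call`, `demo_registry`, and `echo_upper` are hypothetical names used only for illustration.

```python
import json


def run_tool_call(name, arguments_json, registry):
    """Execute one tool call and always return a JSON string, even on failure."""
    func = registry.get(name)
    if func is None:
        return json.dumps({"error": f"unknown function: {name}"})
    try:
        kwargs = json.loads(arguments_json)
    except json.JSONDecodeError as exc:
        return json.dumps({"error": f"invalid JSON arguments: {exc}"})
    if not isinstance(kwargs, dict):
        return json.dumps({"error": "arguments must be a JSON object"})
    try:
        return func(**kwargs)
    except Exception as exc:  # report the failure to the model rather than crash
        return json.dumps({"error": f"{type(exc).__name__}: {exc}"})


# Hypothetical registry with a single toy function
demo_registry = {
    "echo_upper": lambda text: json.dumps({"text": text.upper()}),
}

print(run_tool_call("echo_upper", '{"text": "hi"}', demo_registry))
print(run_tool_call("missing_tool", "{}", demo_registry))
print(run_tool_call("echo_upper", "{not json", demo_registry))
```

The returned string can be used as the `content` of the `tool` message, so the model sees the error and can retry or explain it.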


---

## 3. Structured Outputs (JSON Mode)

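Get consistent, parseable JSON responses from the model. JSON mode guarantees syntactically valid JSON, while Structured Outputs additionally constrains the response to a schema you supply. This is crucial for: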
- API integrations
- Data extraction
- Consistent formatting

In [10]:
# Method 1: Simple JSON mode
response = client.chat.completions.create(
    model=MODEL,
    messages=[
        {
            "role": "system",
            "content": "You extract information and return it as JSON."
        },
        {
            "role": "user",
            "content": "Extract the person's name, age, and occupation from: 'John Smith is a 32-year-old software engineer from Seattle.'"
        }
    ],
    response_format={"type": "json_object"}  # Enable JSON mode
)

result = json.loads(response.choices[0].message.content)
print("Extracted JSON:")
print(json.dumps(result, indent=2))

Extracted JSON:
{
  "name": "John Smith",
  "age": 32,
  "occupation": "software engineer"
}
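
JSON mode does not promise any particular keys or value types, so it is worth validating the parsed object before using it. A minimal sketch, assuming the three fields requested in the prompt above (`parse_person` and `REQUIRED_FIELDS` are illustrative names):

```python
import json

# Expected keys and their Python types (an assumption based on the prompt above)
REQUIRED_FIELDS = {"name": str, "age": int, "occupation": str}


def parse_person(raw_json):
    """Parse a JSON string and check that the expected fields are present and typed."""
    data = json.loads(raw_json)  # raises json.JSONDecodeError (a ValueError) if invalid
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        if not isinstance(data[field], expected_type):
            raise ValueError(f"field '{field}' should be {expected_type.__name__}")
    return data


person = parse_person('{"name": "John Smith", "age": 32, "occupation": "software engineer"}')
print(person["name"], person["age"])

try:
    parse_person('{"name": "John Smith", "age": "32"}')
except ValueError as exc:
    print("Rejected:", exc)
```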


In [13]:
# Method 2: Structured Outputs with JSON Schema (more reliable)
# This ensures the output strictly follows your schema
from pydantic import BaseModel
from typing import Optional

class PersonInfo(BaseModel):
    name: str
    age: int
    occupation: str
    city: Optional[str] = None

response = client.beta.chat.completions.parse(
    model=MODEL,
    messages=[
        {
            "role": "system",
            "content": "Extract person information from the text."
        },
        # Alternative input that also names a city, so `city` is populated:
        # {
        #     "role": "user",
        #     "content": "Maria Garcia is a 28-year-old data scientist living in Boston."
        # }
        {
            "role": "user",
            "content": "Maria Garcia is a 28-year-old data scientist."
        }
    ],
    response_format=PersonInfo
)

person = response.choices[0].message.parsed
print(f"Name: {person.name}")
print(f"Age: {person.age}")
print(f"Occupation: {person.occupation}")
print(f"City: {person.city}")

Name: Maria Garcia
Age: 28
Occupation: data scientist
City: None
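
The Pydantic model is converted to a JSON Schema before the request is sent. A hand-written `response_format` expressing roughly the same constraints could look like the sketch below. The schema name `person_info` is made up for illustration, and the schema the SDK actually generates may differ in detail. In strict mode every property is listed under `required`, so an optional field is expressed as a nullable type instead.

```python
import json

# Hand-written equivalent of PersonInfo (a sketch, not the SDK's exact output)
person_info_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "person_info",  # hypothetical schema name
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "age": {"type": "integer"},
                "occupation": {"type": "string"},
                # Optional[str] becomes "string or null" rather than an omitted key
                "city": {"type": ["string", "null"]},
            },
            "required": ["name", "age", "occupation", "city"],
            "additionalProperties": False,
        },
    },
}

print(json.dumps(person_info_format, indent=2))
```

This explains the `City: None` output above: the model must emit the `city` key, and it uses `null` when the text does not mention a city.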


---

## 4. Vision Capabilities

GPT-4o and GPT-4o-mini can understand images! You can:
- Analyze images
- Describe visual content
- Extract text from images (OCR)
- Answer questions about images

In [None]:

# Vision: Analyze an image from URL
image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"

response = client.chat.completions.create(
    model=MODEL,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Describe this image in detail. What do you see?"
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": image_url,
                        "detail": "auto"  # Options: "low", "high", "auto"
                    }
                }
            ]
        }
    ],
    max_tokens=300
)

print("Image Analysis:")
print(response.choices[0].message.content)

Image Analysis:
The image features a serene outdoor setting. A wooden boardwalk, which appears to be well-maintained, winds through a lush landscape of tall, green grass. The boardwalk stretches towards the horizon, inviting viewers to explore further into the scene.

On either side of the boardwalk, the grass is vibrant and appears to sway gently in the breeze. There are also some bushes and trees in the background, suggesting a diverse ecosystem. The sky above is expansive and painted with a soft gradient of blue, punctuated by scattered fluffy white clouds that create a sense of tranquility. The overall lighting is warm, perhaps indicating either morning or late afternoon, adding to the peaceful ambiance of the setting.


In [18]:
# Vision: Analyze a local image (base64 encoded)
def encode_image_to_base64(image_path):
    """Encode a local image file to base64."""
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode('utf-8')

# Example usage (requires a local image file named "sunset.jpg" in the working directory):
image_path = "sunset.jpg"
base64_image = encode_image_to_base64(image_path)

response = client.chat.completions.create(
    model=MODEL,
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/jpeg;base64,{base64_image}"
                    }
                }
            ]
        }
    ]
)

print("Base64 encoding function ready for local images!")
print(response.choices[0].message.content)

Base64 encoding function ready for local images!
The image features a serene lakeside scene at sunset. There are two wooden benches positioned on a grassy area near the water, with mountains in the background. The sky is painted in vibrant hues of orange, blue, and purple, reflecting on the calm surface of the lake. The overall atmosphere appears peaceful and picturesque.
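
The cell above hardcodes `image/jpeg` in the data URL, which would mislabel a PNG or WebP file. One way to generalize it, sketched here with the standard `mimetypes` module, is to derive the MIME type from the file name. `image_bytes_to_data_url` is a hypothetical helper, and the bytes in the demo are a placeholder rather than a real image.

```python
import base64
import mimetypes


def image_bytes_to_data_url(image_bytes, filename):
    """Build a base64 data URL whose MIME type is guessed from the file extension."""
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"cannot determine an image MIME type for {filename!r}")
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"


# Placeholder bytes stand in for the contents of a real file
data_url = image_bytes_to_data_url(b"\x89PNG", "chart.png")
print(data_url)
```

The result can be passed as the `url` value of an `image_url` content part, in place of the f-string used above.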


---

## 5. Advanced Conversation Patterns

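Each Chat Completions request resends the full message history, so long conversations cost more per call and can eventually exceed the model's context window. The patterns below keep that history bounded.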

In [19]:
# Context Window Management - Sliding window approach
class ConversationManager:
    def __init__(self, max_messages=20, system_prompt="You are a helpful assistant."):
        self.system_message = {"role": "system", "content": system_prompt}
        self.messages = []
        self.max_messages = max_messages
    
    def add_message(self, role, content):
        """Add a message and trim if needed."""
        self.messages.append({"role": role, "content": content})
        
        # Keep only the last N messages (sliding window)
        if len(self.messages) > self.max_messages:
            self.messages = self.messages[-self.max_messages:]
    
    def get_messages(self):
        """Get full message list for API call."""
        return [self.system_message] + self.messages
    
    def summarize_and_reset(self, client):
        """Summarize conversation and start fresh with context."""
        if len(self.messages) < 4:
            return
        
        # Ask AI to summarize the conversation
        summary_request = self.get_messages() + [
            {"role": "user", "content": "Summarize our conversation so far in 2-3 sentences."}
        ]
        
        response = client.chat.completions.create(
            model=MODEL,
            messages=summary_request,
            max_tokens=150
        )
        
        summary = response.choices[0].message.content
        
        # Reset with summary as context
        self.messages = [
            {"role": "assistant", "content": f"[Previous conversation summary: {summary}]"}
        ]
        
        return summary

# Demo the conversation manager
conv = ConversationManager(max_messages=10, system_prompt="You are a tech support assistant.")
conv.add_message("user", "My computer won't start")
conv.add_message("assistant", "Let's troubleshoot. Is it plugged in?")
conv.add_message("user", "Yes it is")

print("Messages in context:", len(conv.get_messages()))
print("Ready for API call with:", conv.get_messages())

Messages in context: 4
Ready for API call with: [{'role': 'system', 'content': 'You are a tech support assistant.'}, {'role': 'user', 'content': "My computer won't start"}, {'role': 'assistant', 'content': "Let's troubleshoot. Is it plugged in?"}, {'role': 'user', 'content': 'Yes it is'}]
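
Counting messages is a coarse proxy, because one long message can outweigh many short ones. A variation is to trim by an estimated token budget instead. The sketch below uses a rough 4-characters-per-token heuristic, which is an assumption rather than an exact count; a real tokenizer would be more accurate. `estimate_tokens` and `trim_to_budget` are illustrative names.

```python
def estimate_tokens(text, chars_per_token=4):
    """Very rough token estimate (heuristic assumption, not a real tokenizer)."""
    return max(1, len(text) // chars_per_token)


def trim_to_budget(messages, max_tokens):
    """Keep the most recent messages whose estimated tokens fit within max_tokens."""
    kept = []
    used = 0
    # Walk from newest to oldest and stop once the budget would be exceeded
    for message in reversed(messages):
        cost = estimate_tokens(message["content"])
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


history = [
    {"role": "user", "content": "a" * 40},
    {"role": "assistant", "content": "b" * 40},
    {"role": "user", "content": "c" * 40},
]

trimmed = trim_to_budget(history, max_tokens=25)
print(len(trimmed), [m["role"] for m in trimmed])
```

As with `ConversationManager`, the system message would be kept separately and prepended after trimming.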


In [20]:
# Parallel function calls for efficiency
# When a user asks multiple questions, the model can request several function calls in a single response

response = client.chat.completions.create(
    model=MODEL,
    messages=[
        {"role": "user", "content": "What's the weather in New York and Miami?"}
    ],
    tools=tools,
    tool_choice="auto",
    parallel_tool_calls=True  # Enable parallel function calls (default)
)

if response.choices[0].message.tool_calls:
    print(f"Model requested {len(response.choices[0].message.tool_calls)} parallel function calls:")
    for tc in response.choices[0].message.tool_calls:
        print(f"  - {tc.function.name}: {tc.function.arguments}")

Model requested 2 parallel function calls:
  - get_weather: {"location": "New York, NY"}
  - get_weather: {"location": "Miami, FL"}
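
Here "parallel" describes the model requesting several calls in one turn; running them is still up to your code. If the functions wait on network I/O, they could be executed concurrently, for example with a thread pool. The sketch below assumes each call is given as a `(call_id, name, arguments_json)` tuple; `run_calls_concurrently` and `fake_weather` are hypothetical names for illustration.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def run_calls_concurrently(calls, registry, max_workers=4):
    """Run (call_id, name, arguments_json) tuples in threads; return tool messages in order."""
    def run_one(call):
        call_id, name, arguments_json = call
        result = registry[name](**json.loads(arguments_json))
        return {"role": "tool", "tool_call_id": call_id, "content": result}

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of the input calls
        return list(pool.map(run_one, calls))


def fake_weather(location):
    """Hypothetical stand-in for a slow weather API call."""
    return json.dumps({"location": location, "condition": "Clear"})


tool_messages = run_calls_concurrently(
    [
        ("call_1", "get_weather", '{"location": "New York, NY"}'),
        ("call_2", "get_weather", '{"location": "Miami, FL"}'),
    ],
    {"get_weather": fake_weather},
)

for message in tool_messages:
    print(message["tool_call_id"], message["content"])
```

With a real response, the tuples would be built from each tool call's `id`, `function.name`, and `function.arguments`.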


---

## Summary

You've now learned intermediate OpenAI API techniques:

1. **Streaming**: Real-time token-by-token responses for better UX
2. **Function Calling**: Let the model invoke your functions with structured arguments
3. **Structured Outputs**: Get consistent JSON responses with schemas
4. **Vision**: Analyze and understand images
5. **Conversation Management**: Handle context windows and message history

### Key Takeaways
- Streaming reduces perceived latency, which matters most in chat applications
- Function calling enables powerful integrations with external tools
- Structured outputs ensure reliable data extraction
- Vision capabilities open up multimodal applications
- Proper context management prevents token limit issues

### Next Steps
Move on to **03_advanced_level_openai_api.ipynb** to learn:
- Building AI agents
- Embeddings and semantic search
- Batch processing
- Production patterns and optimization