# Workout: LLM APIs

## Setup

Before starting, ensure you have API keys set:
```bash
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export GOOGLE_API_KEY="..."
```

---
## Drill 1: Basic OpenAI Call 🟢

**Task:** Make a simple chat completion call to GPT-4o

```python
from openai import OpenAI

client = OpenAI()

# Make a call asking "What is the capital of France?"
# Print the response text
```
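
One possible solution sketch for this drill. The helper name `ask_openai` is hypothetical, and it takes the client as an argument so it can be exercised without a network connection; the `chat.completions.create` call and the `choices[0].message.content` access follow the OpenAI Python SDK.

```python
def ask_openai(client, question: str, model: str = "gpt-4o") -> str:
    """Send a single user message and return the assistant's reply text."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": question}],
    )
    # The reply text lives on the first choice's message.
    return response.choices[0].message.content


# Usage with the real SDK (needs OPENAI_API_KEY and network access):
#   from openai import OpenAI
#   print(ask_openai(OpenAI(), "What is the capital of France?"))
```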


---
## Drill 2: Token Counting 🟢

**Task:** Count tokens in a message before sending

**Expected:** ~40-50 tokens

```python
import tiktoken

text = """
Python is a high-level, interpreted programming language known for
its clear syntax and readability. It was created by Guido van Rossum
and first released in 1991.
"""

# Count the tokens for gpt-4o model
# Print the count
```
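
A minimal sketch of one way to count tokens. The helper `count_tokens` and its optional `encoder` argument are assumptions added for illustration; when no encoder is passed, it falls back to `tiktoken.encoding_for_model`, which needs a tiktoken release recent enough to know the `gpt-4o` model name.

```python
def count_tokens(text: str, model: str = "gpt-4o", encoder=None) -> int:
    """Return how many tokens `text` encodes to.

    `encoder` is any object with an `encode(text)` method; by default the
    tiktoken encoding for `model` is used.
    """
    if encoder is None:
        import tiktoken  # imported lazily so a custom encoder works without it

        encoder = tiktoken.encoding_for_model(model)
    return len(encoder.encode(text))


# Usage with the drill's text (tiktoken may download the encoding on first use):
#   print(count_tokens(text))
```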


---
## Drill 3: System Prompt 🟡

**Task:** Create a call with a system prompt that makes the AI respond only in haiku format

```python
from openai import OpenAI

client = OpenAI()

# Create a system prompt that instructs the AI to
# respond ONLY in haiku format (5-7-5 syllables)
# Ask about Python programming
```
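
One way this drill could be approached: put the formatting rule in a `system` message and the question in a `user` message. The prompt wording and the names `HAIKU_SYSTEM_PROMPT`, `haiku_messages`, and `ask_in_haiku` are illustrative assumptions, not a canonical answer.

```python
HAIKU_SYSTEM_PROMPT = (
    "You are a poet. Respond ONLY with a single haiku: three lines of "
    "5, 7, and 5 syllables. Do not add any other text."
)


def haiku_messages(question: str) -> list[dict]:
    """Build the message list: system instruction first, then the user turn."""
    return [
        {"role": "system", "content": HAIKU_SYSTEM_PROMPT},
        {"role": "user", "content": question},
    ]


def ask_in_haiku(client, question: str, model: str = "gpt-4o") -> str:
    """Ask `question` with the haiku system prompt and return the reply text."""
    response = client.chat.completions.create(
        model=model,
        messages=haiku_messages(question),
    )
    return response.choices[0].message.content


# Usage with the real SDK:
#   from openai import OpenAI
#   print(ask_in_haiku(OpenAI(), "Tell me about Python programming"))
```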


---
## Drill 4: Temperature Comparison 🟡

**Task:** Make 3 calls with different temperatures and compare outputs

**Expected:** `temperature=0` gives the same (or nearly the same) output on each run; `temperature=1.5` gives widely varying output

```python
from openai import OpenAI

client = OpenAI()

prompt = "Generate a creative name for a coffee shop"

# Make 3 calls with temperature: 0, 0.7, 1.5
# Print all outputs and observe differences
```
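
A sketch of the comparison loop, assuming a hypothetical helper `sample_at_temperatures` that returns one output per temperature. Running it several times is what makes the difference between settings visible.

```python
def sample_at_temperatures(
    client,
    prompt: str,
    temperatures=(0, 0.7, 1.5),
    model: str = "gpt-4o",
) -> dict:
    """Call the model once per temperature and map temperature -> reply text."""
    outputs = {}
    for temperature in temperatures:
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
            temperature=temperature,
        )
        outputs[temperature] = response.choices[0].message.content
    return outputs


# Usage with the real SDK:
#   from openai import OpenAI
#   for temp, name in sample_at_temperatures(OpenAI(), prompt).items():
#       print(f"temperature={temp}: {name}")
```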


---
## Drill 5: Anthropic Call 🟢

**Task:** Make a basic Claude API call

```python
from anthropic import Anthropic

client = Anthropic()

# Ask Claude "Explain what an API is in one sentence"
# Note: system prompt is a separate parameter!
```
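
A possible solution sketch showing the point of the note above: in the Anthropic Messages API the system prompt is passed as the top-level `system` argument rather than as a message, and `max_tokens` is required. The helper name `ask_claude` is hypothetical, and the default model ID is only a placeholder; swap in whichever Claude model your account offers.

```python
def ask_claude(
    client,
    question: str,
    system=None,
    model: str = "claude-3-5-sonnet-20241022",  # placeholder model ID
    max_tokens: int = 200,
) -> str:
    """Send one user message to Claude and return the reply text."""
    kwargs = {
        "model": model,
        "max_tokens": max_tokens,  # required by the Messages API
        "messages": [{"role": "user", "content": question}],
    }
    if system is not None:
        kwargs["system"] = system  # top-level parameter, not a message role
    response = client.messages.create(**kwargs)
    # The reply is a list of content blocks; the first one holds the text.
    return response.content[0].text


# Usage with the real SDK (needs ANTHROPIC_API_KEY and network access):
#   from anthropic import Anthropic
#   print(ask_claude(Anthropic(), "Explain what an API is in one sentence"))
```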


---
## Drill 6: Streaming Response 🟡

**Task:** Implement streaming output for OpenAI

```python
from openai import OpenAI

client = OpenAI()

# Make a streaming call asking for a short story
# Print each chunk as it arrives
```
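
A minimal streaming sketch, assuming a hypothetical generator `stream_openai`. With `stream=True` the SDK returns an iterable of chunks whose text arrives in `choices[0].delta.content`; some chunks carry no text, so the loop skips them.

```python
from typing import Iterator


def stream_openai(client, prompt: str, model: str = "gpt-4o") -> Iterator[str]:
    """Yield pieces of the reply text as they arrive from the API."""
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    for chunk in stream:
        if not chunk.choices:
            continue  # some chunks have no choices (for example usage-only chunks)
        piece = chunk.choices[0].delta.content
        if piece:  # the first and last chunks may have no text
            yield piece


# Usage with the real SDK:
#   from openai import OpenAI
#   for piece in stream_openai(OpenAI(), "Tell me a very short story."):
#       print(piece, end="", flush=True)
```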


---
## Drill 7: Error Handling 🟡

**Task:** Handle rate limit errors with retry

```python
from openai import OpenAI, RateLimitError
from tenacity import retry, stop_after_attempt, wait_exponential

client = OpenAI()

# Create a function that:
# 1. Makes an API call
# 2. Retries 3 times on RateLimitError
# 3. Uses exponential backoff
```
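
The three steps above can be sketched with a hand-rolled loop. Note that this is a standard-library stand-in, not the tenacity-based answer the drill's imports suggest; with tenacity the same policy would be declared on a `@retry(...)` decorator using `stop_after_attempt(3)` and `wait_exponential(...)`. The name `call_with_retry` and its parameters are hypothetical, and `max_attempts=3` is read here as three attempts in total.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    make_call: Callable[[], T],
    retry_on: type,
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `make_call`, retrying on `retry_on` with exponential backoff.

    Delays are base_delay, 2 * base_delay, 4 * base_delay, ... and the last
    failure is re-raised once `max_attempts` attempts have been used.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return make_call()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")


# Usage with the real SDK:
#   from openai import OpenAI, RateLimitError
#   client = OpenAI()
#   response = call_with_retry(
#       lambda: client.chat.completions.create(
#           model="gpt-4o",
#           messages=[{"role": "user", "content": "Hello"}],
#       ),
#       retry_on=RateLimitError,
#   )
```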


---
## Drill 8: Token Budget 🔴

**Task:** Implement a function that truncates messages to fit a token budget

```python
import tiktoken

def fit_to_budget(
    messages: list[dict],
    max_tokens: int,
    model: str = "gpt-4o"
) -> list[dict]:
    """
    Keep only the most recent messages that fit within max_tokens.
    Always keep the system message if present.
    """
    pass

# Test with a long conversation
messages = [
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "First message " * 100},
    {"role": "assistant", "content": "First response " * 100},
    {"role": "user", "content": "Second message"},
]

result = fit_to_budget(messages, max_tokens=200)
# Should keep system + most recent that fits
```
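
One possible implementation sketch. It adds an optional `count_tokens` argument (an assumption, not part of the drill's signature) so the logic can be checked with any counter; by default it counts with tiktoken. It counts message content only and ignores the few tokens of per-message overhead the API adds, so treat the result as approximate.

```python
from typing import Callable, Optional


def fit_to_budget(
    messages: list[dict],
    max_tokens: int,
    model: str = "gpt-4o",
    count_tokens: Optional[Callable[[str], int]] = None,
) -> list[dict]:
    """Keep system messages plus the most recent messages that fit the budget."""
    if count_tokens is None:
        import tiktoken  # imported lazily so a custom counter works without it

        encoder = tiktoken.encoding_for_model(model)

        def count_tokens(text: str) -> int:
            return len(encoder.encode(text))

    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]

    # System messages are always kept, so they are charged first.
    remaining = max_tokens - sum(count_tokens(m["content"]) for m in system)

    kept = []
    for message in reversed(rest):  # walk from newest to oldest
        cost = count_tokens(message["content"])
        if cost > remaining:
            break  # stop at the first miss so the kept window stays contiguous
        kept.append(message)
        remaining -= cost

    kept.reverse()  # restore chronological order
    return system + kept
```

Stopping at the first message that does not fit, rather than skipping it and continuing, avoids returning a conversation with a hole in the middle.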

---
## Drill 9: Provider Abstraction 🔴

**Task:** Create a simple abstraction that works with both OpenAI and Anthropic

```python
from abc import ABC, abstractmethod

class LLMProvider(ABC):
    @abstractmethod
    def complete(self, prompt: str) -> str:
        pass

class OpenAIProvider(LLMProvider):
    pass

class AnthropicProvider(LLMProvider):
    pass

# Test both providers with same prompt
# providers = [OpenAIProvider(), AnthropicProvider()]
# for p in providers:
#     print(p.complete("What is 2+2?"))
```
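
A sketch of how the two providers might be filled in. The constructor arguments (`client`, `model`, `max_tokens`) are assumptions added so a client can be injected; when none is given, each class builds the real SDK client, which reads its API key from the environment. The Anthropic model ID is a placeholder.

```python
from abc import ABC, abstractmethod


class LLMProvider(ABC):
    @abstractmethod
    def complete(self, prompt: str) -> str:
        """Return the model's text reply to `prompt`."""


class OpenAIProvider(LLMProvider):
    def __init__(self, client=None, model: str = "gpt-4o"):
        if client is None:
            from openai import OpenAI  # imported lazily; reads OPENAI_API_KEY

            client = OpenAI()
        self.client = client
        self.model = model

    def complete(self, prompt: str) -> str:
        response = self.client.chat.completions.create(
            model=self.model,
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content


class AnthropicProvider(LLMProvider):
    def __init__(
        self,
        client=None,
        model: str = "claude-3-5-sonnet-20241022",  # placeholder model ID
        max_tokens: int = 500,
    ):
        if client is None:
            from anthropic import Anthropic  # imported lazily; reads ANTHROPIC_API_KEY

            client = Anthropic()
        self.client = client
        self.model = model
        self.max_tokens = max_tokens

    def complete(self, prompt: str) -> str:
        response = self.client.messages.create(
            model=self.model,
            max_tokens=self.max_tokens,  # required by the Messages API
            messages=[{"role": "user", "content": prompt}],
        )
        return response.content[0].text
```

Callers only see `complete(prompt) -> str`, so the differences between the two response shapes stay inside each provider class.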

---
## Drill 10: Cost Estimation 🟡

**Task:** Estimate the cost before making a call

```python
import tiktoken

def estimate_cost(
    messages: list[dict],
    model: str = "gpt-4o",
    max_output_tokens: int = 500
) -> dict:
    """
    Return dict with:
    - input_tokens
    - estimated_output_tokens
    - estimated_cost_usd

    Pricing (per 1M tokens):
    - gpt-4o: input=$2.50, output=$10.00
    - gpt-4o-mini: input=$0.15, output=$0.60
    """
    pass

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain quantum computing in detail."}
]

result = estimate_cost(messages, max_output_tokens=1000)
# print(f"Estimated cost: ${result['estimated_cost_usd']:.4f}")
```
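
A sketch of the estimate using the prices listed in the docstring above (they may be out of date, so check current pricing before relying on the numbers). It treats `max_output_tokens` as a worst-case output length and counts message content only. The `count_tokens` argument and the `PRICING_PER_MILLION` table name are assumptions added for illustration.

```python
from typing import Callable, Optional

# USD per 1M tokens, copied from the drill's docstring.
PRICING_PER_MILLION = {
    "gpt-4o": {"input": 2.50, "output": 10.00},
    "gpt-4o-mini": {"input": 0.15, "output": 0.60},
}


def estimate_cost(
    messages: list[dict],
    model: str = "gpt-4o",
    max_output_tokens: int = 500,
    count_tokens: Optional[Callable[[str], int]] = None,
) -> dict:
    """Estimate an upper bound on the cost of one call, in USD."""
    if model not in PRICING_PER_MILLION:
        raise ValueError(f"No pricing known for model {model!r}")

    if count_tokens is None:
        import tiktoken  # imported lazily so a custom counter works without it

        encoder = tiktoken.encoding_for_model(model)

        def count_tokens(text: str) -> int:
            return len(encoder.encode(text))

    input_tokens = sum(count_tokens(m["content"]) for m in messages)
    prices = PRICING_PER_MILLION[model]
    cost = (
        input_tokens * prices["input"] + max_output_tokens * prices["output"]
    ) / 1_000_000

    return {
        "input_tokens": input_tokens,
        "estimated_output_tokens": max_output_tokens,
        "estimated_cost_usd": cost,
    }
```

Because output tokens cost several times more than input tokens at these prices, the `max_output_tokens` term usually dominates the estimate for short prompts.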

---
## Self-Check

- [ ] Can make calls to OpenAI, Anthropic, and Google
- [ ] Understand the difference between system and user prompts
- [ ] Can count tokens and estimate costs
- [ ] Can implement retry logic for API errors
- [ ] Can stream responses for better UX