# 01 - Working with LLM APIs

**Build production-ready LLM integrations** with multiple providers.

## Learning Objectives

By the end of this notebook, you will be able to:
- Set up and use multiple LLM providers
- Handle streaming responses
- Implement rate limiting and error handling
- Track costs and usage

## Table of Contents

1. [Setup & Configuration](#setup)
2. [OpenAI API](#openai)
3. [Anthropic API](#anthropic)
4. [Streaming Responses](#streaming)
5. [Cost Tracking](#cost)
6. [Error Handling](#errors)
7. [Exercises](#exercises)
8. [Checkpoint](#checkpoint)

In [None]:
# GUIDED: Setup
import os
import sys
import json
import time
from pathlib import Path

sys.path.append(str(Path.cwd().parent))

from dotenv import load_dotenv
load_dotenv(Path.cwd().parent / ".env")

print("Setup complete!")
print(f"OpenAI key: {'found' if os.getenv('OPENAI_API_KEY') else 'not found'}")
print(f"Anthropic key: {'found' if os.getenv('ANTHROPIC_API_KEY') else 'not found'}")

---
## 1. Setup & Configuration <a id='setup'></a>

Create a `.env` file in the `AI Engineering/` folder:

```
OPENAI_API_KEY=sk-your-key-here
ANTHROPIC_API_KEY=sk-ant-your-key-here
```

---
## 2. OpenAI API <a id='openai'></a>

In [None]:
# GUIDED: Basic OpenAI usage
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"}
    ],
    temperature=0.7,
    max_tokens=100
)

print("Response:", response.choices[0].message.content)
print(f"\nTokens used: {response.usage.total_tokens}")

---
## 3. Anthropic API <a id='anthropic'></a>

In [None]:
# GUIDED: Basic Anthropic usage
from anthropic import Anthropic

client = Anthropic()

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=100,
    system="You are a helpful assistant.",
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)

print("Response:", response.content[0].text)
print(f"\nTokens: {response.usage.input_tokens} in, {response.usage.output_tokens} out")

---
## 4. Streaming Responses <a id='streaming'></a>

In [None]:
# GUIDED: Streaming with OpenAI
from openai import OpenAI

client = OpenAI()

print("Streaming response:")
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Write a haiku about coding."}],
    stream=True
)

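for chunk in stream:
    # Some chunks carry no choices or no text (e.g. role or finish markers); skip them.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)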

print("\n\nDone!")
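
Printing deltas as they arrive is often not enough, since later code usually needs the complete text as well. The sketch below collects the deltas while optionally echoing them. `collect_stream` and `fake_chunk` are hypothetical helper names, not part of the OpenAI SDK, and the fake chunks only mimic the `choices[0].delta.content` shape used in the cell above so the example runs offline.

```python
from types import SimpleNamespace


def collect_stream(stream, echo=True):
    """Echo text deltas as they arrive and return the full response text."""
    parts = []
    for chunk in stream:
        # Skip chunks that carry no choices at all.
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta.content
        # Skip chunks whose delta has no text (None or empty string).
        if delta:
            parts.append(delta)
            if echo:
                print(delta, end="", flush=True)
    return "".join(parts)


def fake_chunk(text):
    """Hypothetical stand-in for a streamed chunk, for offline testing only."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


# Simulated stream: two text chunks, one empty delta, one chunk without choices.
demo_stream = [
    fake_chunk("Hello"),
    fake_chunk(None),
    SimpleNamespace(choices=[]),
    fake_chunk(", world"),
]
full_text = collect_stream(demo_stream, echo=False)
```

With a real request, the `stream` object returned by `client.chat.completions.create(..., stream=True)` could be passed in place of `demo_stream`.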

---
## 5. Cost Tracking <a id='cost'></a>

In [None]:
# GUIDED: Use our LLM client with cost tracking
from src.llm_utils import LLMClient, estimate_cost

client = LLMClient(provider="openai", model="gpt-4o-mini")

# Make some requests
for i in range(3):
    response = client.chat(f"Tell me fact #{i+1} about AI.")
    print(f"Fact {i+1}: {response[:100]}...")

# Check usage
stats = client.get_stats()
print(f"\nUsage: {stats.summary()}")
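
The internals of `src.llm_utils` are not shown in this notebook, so the following is only a rough sketch of how a cost figure can be derived from token counts. `cost_from_usage` and `PRICES_PER_MILLION` are hypothetical names, and the prices are placeholders rather than real provider rates. Check each provider's pricing page before relying on any number.

```python
# Hypothetical prices in USD per 1M tokens. Placeholder values, NOT real rates.
PRICES_PER_MILLION = {
    "example-model": {"input": 1.00, "output": 2.00},
}


def cost_from_usage(model, input_tokens, output_tokens, prices=PRICES_PER_MILLION):
    """Return the USD cost of one request from its input and output token counts."""
    if model not in prices:
        raise KeyError(f"No price configured for model: {model}")
    rate = prices[model]
    total = input_tokens * rate["input"] + output_tokens * rate["output"]
    return total / 1_000_000


# Example: 1,000,000 input tokens and 500,000 output tokens.
example_cost = cost_from_usage("example-model", 1_000_000, 500_000)
print(f"Estimated cost: ${example_cost:.2f}")
```

Input and output tokens are priced separately here because the provider responses shown earlier report them separately (`input_tokens` and `output_tokens` in the Anthropic example).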

---
## 6. Error Handling <a id='errors'></a>

In [None]:
# GUIDED: Robust error handling
import time

from openai import OpenAI, RateLimitError, APIError

def robust_completion(messages, max_retries=3):
    """Make a completion, retrying with exponential backoff on API errors."""
    client = OpenAI()

    for attempt in range(max_retries):
        try:
            response = client.chat.completions.create(
                model="gpt-4o-mini",
                messages=messages
            )
            return response.choices[0].message.content

        # RateLimitError subclasses APIError, so it must be caught first.
        except RateLimitError:
            print("Rate limited.")

        except APIError as e:
            print(f"API error: {e}")
            if attempt == max_retries - 1:
                raise

        # Back off before the next attempt; skip the wait after the last one.
        if attempt < max_retries - 1:
            wait = 2 ** attempt  # Exponential backoff: 1s, 2s, 4s, ...
            print(f"Retrying in {wait}s...")
            time.sleep(wait)

    raise RuntimeError("Max retries exceeded")

# Test it
result = robust_completion([{"role": "user", "content": "Hello!"}])
print(result)
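
Retrying after a rate-limit error is reactive. A client-side limiter can help avoid hitting the limit in the first place. The sketch below enforces a minimum gap between calls. `MinIntervalLimiter` is a hypothetical helper, not part of any SDK, and the `clock` and `sleep` parameters exist only so the class can be tested without real waiting.

```python
import time


class MinIntervalLimiter:
    """Keep consecutive calls at least `min_interval` seconds apart."""

    def __init__(self, max_per_minute, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / max_per_minute
        self._clock = clock
        self._sleep = sleep
        self._last_call = None  # Time of the previous call, if any.

    def wait(self):
        """Sleep just long enough to respect the interval, then record the call."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now


# Demonstration with a fake clock so nothing actually sleeps.
fake_time = [0.0]
slept = []


def fake_clock():
    return fake_time[0]


def fake_sleep(seconds):
    slept.append(seconds)
    fake_time[0] += seconds


limiter = MinIntervalLimiter(max_per_minute=60, clock=fake_clock, sleep=fake_sleep)
limiter.wait()  # First call: no wait needed.
limiter.wait()  # Immediate second call: waits one full interval.
```

In practice, one would create the limiter with the default clock and call `limiter.wait()` before each `robust_completion(...)` call. The right `max_per_minute` value depends on the account's actual limits, which this notebook does not specify.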

---
## 7. Exercises <a id='exercises'></a>

### Exercise 1: Multi-Provider Client

Create a function that tries OpenAI first, falls back to Anthropic if it fails.

In [None]:
# TODO: Implement fallback logic

# Your code here:


### Exercise 2: Cost Calculator

Create a function that estimates the cost before making a request.

In [None]:
# TODO: Estimate cost before request

# Your code here:


---
## 8. Checkpoint <a id='checkpoint'></a>

Before moving on, verify:

- [ ] You can use both OpenAI and Anthropic APIs
- [ ] You understand streaming responses
- [ ] You can track costs and usage
- [ ] You implemented error handling

### Next Steps

In the next notebook, we'll learn about **Structured Outputs** - getting reliable JSON and data from LLMs!