# Concurrent API Calls

## Introduction

In this notebook, you'll learn how to make concurrent API calls efficiently, a critical skill for AI engineering.

**Real-world scenario:** Process 100 text snippets with the OpenAI API, assuming roughly 5 seconds per request
- Sequential: ~500 seconds (8+ minutes)
- Concurrent (about 10 requests in flight at a time): ~50 seconds (under 1 minute)
- **Roughly a 10x speedup**

## Learning Objectives

1. Make concurrent HTTP requests with `httpx`
2. Handle errors in concurrent scenarios
3. Measure and compare performance
4. Apply to real AI API calls (OpenAI)

In [None]:
import asyncio
import time
from typing import List
import httpx  # Async HTTP client

## 1. Sequential vs Concurrent Requests

### Sequential Requests (Slow)

In [None]:
def fetch_sync(url: str) -> dict:
    """Synchronous HTTP GET request."""
    with httpx.Client() as client:
        response = client.get(url, timeout=10.0)
        return response.json()

# Test with JSONPlaceholder API
urls = [
    "https://jsonplaceholder.typicode.com/posts/1",
    "https://jsonplaceholder.typicode.com/posts/2",
    "https://jsonplaceholder.typicode.com/posts/3",
    "https://jsonplaceholder.typicode.com/posts/4",
    "https://jsonplaceholder.typicode.com/posts/5",
]

start = time.time()
results = [fetch_sync(url) for url in urls]
elapsed = time.time() - start

print(f"Sequential: {elapsed:.2f}s for {len(urls)} requests")
print(f"First result: {results[0]['title']}")

### Concurrent Requests (Fast)

In [None]:
async def fetch_async(url: str) -> dict:
    """Asynchronous HTTP GET request."""
    async with httpx.AsyncClient() as client:
        response = await client.get(url, timeout=10.0)
        return response.json()

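# Keep the timing from the sequential cell above before `elapsed` is overwritten
# (run that cell first so the comparison is meaningful).
sequential_elapsed = elapsed

start = time.time()
results = await asyncio.gather(*[fetch_async(url) for url in urls])
elapsed = time.time() - start

print(f"Concurrent: {elapsed:.2f}s for {len(urls)} requests")
print(f"First result: {results[0]['title']}")
print(f"\nSpeedup: {sequential_elapsed:.2f}s sequential vs {elapsed:.2f}s concurrent (approximately {sequential_elapsed / elapsed:.1f}x faster)")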
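
`results[0]` above refers to the first URL because `asyncio.gather` returns results in the order the awaitables were passed in, not the order in which they finish. Here is a minimal offline sketch of that behavior, using `asyncio.sleep` as a stand-in for network latency. `simulated_fetch` and `gather_in_order` are hypothetical names, not part of the notebook's code.

```python
import asyncio
from typing import List


async def simulated_fetch(name: str, delay: float) -> str:
    """Stand-in for an HTTP call: wait `delay` seconds, then return `name`."""
    await asyncio.sleep(delay)
    return name


async def gather_in_order() -> List[str]:
    # The slowest task is listed first and finishes last,
    # yet it still occupies the first slot in the result list.
    return await asyncio.gather(
        simulated_fetch("slow", 0.03),
        simulated_fetch("fast", 0.01),
        simulated_fetch("medium", 0.02),
    )
```

In a notebook cell, `await gather_in_order()` should return the names in the order they were listed, with `"slow"` first.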

## 2. Error Handling in Concurrent Requests

With many requests in flight, it becomes more likely that at least one of them fails. Catch errors per request so that a single failure does not discard the results of the others.

In [None]:
async def fetch_with_error_handling(url: str) -> dict:
    """Fetch with comprehensive error handling."""
    try:
        async with httpx.AsyncClient() as client:
            response = await client.get(url, timeout=10.0)
            response.raise_for_status()  # Raise error for 4xx/5xx
            return {"success": True, "data": response.json(), "url": url}
    
    except httpx.TimeoutException:
        return {"success": False, "error": "timeout", "url": url}
    
    except httpx.HTTPStatusError as e:
        return {"success": False, "error": f"HTTP {e.response.status_code}", "url": url}
    
    except Exception as e:
        return {"success": False, "error": str(e), "url": url}

# Test with mix of valid and invalid URLs
test_urls = [
    "https://jsonplaceholder.typicode.com/posts/1",
    "https://jsonplaceholder.typicode.com/posts/999999",  # 404
    "https://jsonplaceholder.typicode.com/posts/2",
]

results = await asyncio.gather(*[fetch_with_error_handling(url) for url in test_urls])

for result in results:
    if result["success"]:
        print(f"✓ {result['url']}: {result['data']['title'][:30]}...")
    else:
        print(f"✗ {result['url']}: {result['error']}")
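
Catching exceptions inside each coroutine, as above, is one option. Another, sketched here under simplified assumptions, is to let tasks raise and pass `return_exceptions=True` to `asyncio.gather`, which places each exception in the result list instead of propagating the first one. `flaky_task` and `run_all` are hypothetical stand-ins for real requests.

```python
import asyncio
from typing import List


async def flaky_task(n: int) -> int:
    """Hypothetical task: fails for odd inputs, doubles even ones."""
    await asyncio.sleep(0.01)
    if n % 2:
        raise ValueError(f"task {n} failed")
    return n * 2


async def run_all(inputs: List[int]) -> List[dict]:
    # With return_exceptions=True, raised exceptions are returned as values
    # in the same position as the task that raised them.
    outcomes = await asyncio.gather(
        *[flaky_task(n) for n in inputs], return_exceptions=True
    )
    results = []
    for n, outcome in zip(inputs, outcomes):
        if isinstance(outcome, Exception):
            results.append({"success": False, "error": str(outcome), "input": n})
        else:
            results.append({"success": True, "data": outcome, "input": n})
    return results
```

In a notebook cell, `await run_all([0, 1, 2])` yields one dictionary per input, in input order, with the failure recorded rather than raised.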

## 3. Batch Processing Pattern

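For large lists, send requests in fixed-size batches: the requests within a batch run concurrently, and the next batch starts only after the previous one has finished. This caps the number of in-flight requests and gives a natural point to report progress.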

In [None]:
async def batch_fetch(urls: List[str], batch_size: int = 10) -> List[dict]:
    """
    Fetch URLs in batches to avoid overwhelming the server.
    
    Args:
        urls: List of URLs to fetch
        batch_size: Number of concurrent requests per batch
    
    Returns:
        List of results
    """
    results = []
    
    for i in range(0, len(urls), batch_size):
        batch = urls[i:i + batch_size]
        print(f"Processing batch {i//batch_size + 1}/{(len(urls)-1)//batch_size + 1}...")
        
        batch_results = await asyncio.gather(
            *[fetch_with_error_handling(url) for url in batch]
        )
        results.extend(batch_results)
    
    return results

# Test with 15 URLs in batches of 5
urls = [f"https://jsonplaceholder.typicode.com/posts/{i}" for i in range(1, 16)]

start = time.time()
results = await batch_fetch(urls, batch_size=5)
elapsed = time.time() - start

successful = sum(1 for r in results if r["success"])
print(f"\nCompleted: {successful}/{len(results)} successful in {elapsed:.2f}s")
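
Batch-level messages report progress only at batch boundaries. If finer-grained progress is wanted, one possible approach is `asyncio.as_completed`, which yields tasks as they finish. The sketch below is an assumption-laden illustration: `run_with_progress` and its `worker` parameter are hypothetical names, the worker is assumed to return a result dict rather than raise (as `fetch_with_error_handling` does), and concurrency is not capped, so it would typically be applied within a batch.

```python
import asyncio
from typing import Awaitable, Callable, List


async def run_with_progress(
    items: List[str],
    worker: Callable[[str], Awaitable[dict]],
    report_every: int = 5,
) -> List[dict]:
    """Run `worker` on every item concurrently, printing progress as tasks finish."""
    tasks = [asyncio.ensure_future(worker(item)) for item in items]
    done = 0
    for finished in asyncio.as_completed(tasks):
        await finished  # Assumes the worker handles its own errors
        done += 1
        if done % report_every == 0 or done == len(tasks):
            print(f"Completed {done}/{len(tasks)}")
    # Read results from the original task list so they keep input order.
    return [task.result() for task in tasks]
```

In a notebook cell, this could be called as `await run_with_progress(batch, fetch_with_error_handling)`.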

## 4. Real Example: OpenAI API Calls

**Note:** This example shows the pattern. You'll need a valid API key to run it.

In [None]:
import os
from openai import AsyncOpenAI

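# Create the async client only if a key is available. Recent versions of the
# openai package raise an error at construction time when no key is found,
# which would prevent the "no API key" branch below from ever running.
api_key = os.getenv("OPENAI_API_KEY")
client = AsyncOpenAI(api_key=api_key) if api_key else None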

async def generate_completion(prompt: str) -> dict:
    """
    Generate completion for a single prompt.
    
    Args:
        prompt: Text prompt
    
    Returns:
        Result dictionary with success status and content
    """
    try:
        response = await client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}],
            max_tokens=50
        )
        
        return {
            "success": True,
            "prompt": prompt,
            "response": response.choices[0].message.content,
            "tokens": response.usage.total_tokens
        }
    
    except Exception as e:
        return {
            "success": False,
            "prompt": prompt,
            "error": str(e)
        }

# Example prompts
prompts = [
    "What is Python?",
    "Explain async programming.",
    "What are the benefits of AI?",
    "How does machine learning work?",
    "What is a neural network?"
]

# Only run if API key is available
if os.getenv("OPENAI_API_KEY"):
    print("Processing prompts concurrently...")
    start = time.time()
    
    results = await asyncio.gather(*[generate_completion(p) for p in prompts])
    
    elapsed = time.time() - start
    
    print(f"\nCompleted {len(prompts)} prompts in {elapsed:.2f}s")
    
    for result in results:
        if result["success"]:
            print(f"\n✓ Prompt: {result['prompt']}")
            print(f"  Response: {result['response'][:100]}...")
            print(f"  Tokens: {result['tokens']}")
        else:
            print(f"\n✗ Prompt: {result['prompt']}")
            print(f"  Error: {result['error']}")
else:
    print("⚠ Skipping OpenAI example (no API key)")
    print("Set OPENAI_API_KEY environment variable to run this example")
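
Once the results are back, it can help to aggregate them before printing. The helper below is a hypothetical addition (`summarize_results` is not part of the OpenAI SDK); it assumes only the dictionary shape returned by `generate_completion` above.

```python
from typing import List


def summarize_results(results: List[dict]) -> dict:
    """Count successes and failures and sum token usage across result dicts."""
    succeeded = [r for r in results if r["success"]]
    failed = [r for r in results if not r["success"]]
    return {
        "succeeded": len(succeeded),
        "failed": len(failed),
        # Failed results carry no "tokens" key, so only successes are summed.
        "total_tokens": sum(r.get("tokens", 0) for r in succeeded),
        "errors": [r["error"] for r in failed],
    }
```

Calling `summarize_results(results)` after the `asyncio.gather` call gives a single dictionary that is convenient to log or compare across runs.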

## 5. Performance Comparison

Let's measure the actual speedup for different numbers of requests.

In [None]:
async def performance_test(num_requests: int, concurrent: bool = True):
    """Test performance of sequential vs concurrent requests."""
    urls = [f"https://jsonplaceholder.typicode.com/posts/{i % 100 + 1}" for i in range(num_requests)]
    
    start = time.time()
    
    if concurrent:
        results = await asyncio.gather(*[fetch_async(url) for url in urls])
    else:
        results = []
        for url in urls:
            result = await fetch_async(url)
            results.append(result)
    
    elapsed = time.time() - start
    return elapsed, len(results)

# Test with different numbers of requests
test_sizes = [5, 10, 20]

print("Performance Comparison:\n")
print(f"{'Requests':<12} {'Sequential':<15} {'Concurrent':<15} {'Speedup':<10}")
print("-" * 60)

for size in test_sizes:
    seq_time, _ = await performance_test(size, concurrent=False)
    conc_time, _ = await performance_test(size, concurrent=True)
    speedup = seq_time / conc_time
    
    print(f"{size:<12} {seq_time:<15.2f} {conc_time:<15.2f} {speedup:.1f}x")
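
The measurements above depend on network conditions and on the public test API. To see the effect in isolation, the sketch below replaces the HTTP call with `asyncio.sleep`, which is a simulated latency, not a real request. `simulated_call` and `compare` are hypothetical names.

```python
import asyncio
import time
from typing import Tuple


async def simulated_call(latency: float) -> None:
    """Stand-in for an I/O-bound request that takes `latency` seconds."""
    await asyncio.sleep(latency)


async def compare(num_calls: int = 5, latency: float = 0.05) -> Tuple[float, float]:
    """Return (sequential_seconds, concurrent_seconds) for the simulated calls."""
    # Sequential: each call waits for the previous one to finish.
    start = time.perf_counter()
    for _ in range(num_calls):
        await simulated_call(latency)
    sequential = time.perf_counter() - start

    # Concurrent: all calls wait at the same time.
    start = time.perf_counter()
    await asyncio.gather(*[simulated_call(latency) for _ in range(num_calls)])
    concurrent = time.perf_counter() - start

    return sequential, concurrent
```

With the default arguments, the sequential loop should take roughly five times the per-call latency, while the concurrent run should take roughly one call's latency. Real APIs add rate limits and connection overhead, so measured speedups are usually lower than this idealized figure.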

## 6. Best Practices

### ✅ Do:

1. **Use connection pooling** (share one `httpx.AsyncClient` across requests; a new client per request opens new connections each time)
2. **Handle errors individually** (don't let one failure stop all)
3. **Set timeouts** (prevent hanging forever)
4. **Respect rate limits** (use semaphores, covered in the next notebook)
5. **Track progress** (for long-running batches)

### ❌ Don't:

1. **Make unlimited concurrent requests** (this can overwhelm servers and trigger rate limits)
2. **Ignore errors** (handle them gracefully)
3. **Use sync libraries inside async code** (they block the event loop)
4. **Skip timeouts** (requests can hang)
5. **Forget to close clients** (use async context managers)

## 7. Practice Exercise

Create a function that fetches a user's data and their posts concurrently:

In [None]:
async def fetch_user_with_posts(user_id: int) -> dict:
    """
    Fetch user data and all their posts concurrently.
    
    Args:
        user_id: User ID to fetch
    
    Returns:
        Dictionary with user info and posts
    """
    # TODO: Implement this function
    # 1. Fetch user from: https://jsonplaceholder.typicode.com/users/{user_id}
    # 2. Fetch user's posts from: https://jsonplaceholder.typicode.com/posts?userId={user_id}
    # 3. Use asyncio.gather to fetch both concurrently
    # 4. Return combined result
    pass

# Test your implementation
# result = await fetch_user_with_posts(1)
# print(f"User: {result['user']['name']}")
# print(f"Posts: {len(result['posts'])}")

## Summary

### Key Takeaways:

1. **Use `httpx.AsyncClient` for async HTTP requests**
2. **Use `asyncio.gather()` for concurrent execution**
3. **Always handle errors in concurrent code**
4. **Batch processing prevents overwhelming servers**
5. **Speedups of 10x or more are achievable for I/O-bound operations**, depending on request latency and rate limits

### Real-World Impact:

- Process 100 prompts: 500s → 50s
- Check 50 endpoints: 100s → 10s
- Fetch 1000 documents: hours → minutes

### Next Notebook:

**Async Patterns for AI** - Rate limiting, semaphores, timeouts, and production patterns

### Resources:

- [httpx async documentation](https://www.python-httpx.org/async/)
- [OpenAI async examples](https://github.com/openai/openai-python)
- [asyncio patterns](https://www.roguelynn.com/words/asyncio-we-did-it-wrong/)