# 🎓 Week 17 - Day 3: Advanced FastAPI - Streaming, Security & Production

## Today's Goals:
✅ Implement Server-Sent Events (SSE) for streaming responses

✅ Stream LLM outputs in real-time (ChatGPT-style)

✅ Manage secrets with environment variables

✅ Customize API documentation (Swagger UI)

✅ Implement security best practices

✅ Prepare APIs for production deployment

---

## 🔧 Part 1: Setup - Install Packages

**What we're installing:**
- `fastapi` & `uvicorn` - API framework and ASGI server (continuing from Days 1-2)
- `python-dotenv` - Environment variable management
- `slowapi` - Rate limiting for security
- Standard libraries (no install needed) - asyncio, os, logging

**⏱️ This will take about 30 seconds**

In [None]:
# STEP 1: Install packages
print("📦 Installing FastAPI and security packages...\n")

# Quote the extras spec so shells such as zsh don't treat [] as a glob pattern
!pip install -q fastapi "uvicorn[standard]"
!pip install -q python-dotenv
!pip install -q slowapi

print("\n✅ All packages installed successfully!")
print("\n💡 What we installed:")
print("   • FastAPI - API framework")
print("   • python-dotenv - Environment variables")
print("   • slowapi - Rate limiting")

In [None]:
# STEP 2: Import all libraries
import warnings
warnings.filterwarnings('ignore')

# FastAPI essentials
from fastapi import FastAPI, Request, HTTPException
from fastapi.responses import StreamingResponse
from pydantic import BaseModel, Field
from typing import List, Optional

# Environment and security
from dotenv import load_dotenv
import os

# Rate limiting
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.util import get_remote_address
from slowapi.errors import RateLimitExceeded

# Async and streaming
import asyncio
import time
from datetime import datetime

# Server utilities
import uvicorn
from threading import Thread
import requests
import json

# Logging
import logging

print("✅ All libraries imported successfully!")
print("\n🎯 Ready for advanced FastAPI features!")

---

## 📡 Part 2: Understanding Server-Sent Events (SSE)

### 🤔 What is Streaming?

**Traditional API (Request-Response):**
```
Client: "Give me the weather"
         ↓
    [Waiting...]
         ↓
Server: "Here's the complete response: Sunny, 75°F"
```

**Streaming API (Server-Sent Events):**
```
Client: "Generate a story"
         ↓
Server: "Once" ... "upon" ... "a" ... "time" ...
         ↑
    (Real-time streaming!)
```

### 🎯 Real-World Examples:

**ChatGPT-style responses:**
- User asks question
- AI generates answer word-by-word
- User sees text appearing in real-time

**Live updates:**
- Stock prices updating
- Sports scores streaming
- Social media feeds

**Long-running tasks:**
- Processing large files
- Training ML models
- Batch operations

### 💡 Why SSE?

**Advantages:**
- ✅ **Better UX** - Users see progress instead of waiting for the full response
- ✅ **Simple** - Works over standard HTTP
- ✅ **One-way** - Server → Client (perfect for AI)
- ✅ **Automatic reconnection** - The browser's `EventSource` API reconnects on its own

**SSE vs WebSockets:**

| Feature | SSE | WebSockets |
|---------|-----|------------|
| **Direction** | Server → Client only | Bidirectional |
| **Protocol** | Plain HTTP | WebSocket (`ws://` / `wss://`), opened via an HTTP upgrade |
| **Complexity** | Simple | More complex |
| **Use Case** | Notifications, AI streaming | Chat, gaming |

**💡 For AI APIs, SSE is usually perfect!**

---

## 🚀 Part 3: Your First Streaming Endpoint

Let's build a simple streaming endpoint that sends messages one by one!

**What we're building:**
- Endpoint that counts from 1 to 10
- Sends each number with a delay
- Client receives them in real-time

**Key concepts:**
- `async def` - Asynchronous function
- `yield` - Send data incrementally
- `StreamingResponse` - FastAPI's streaming class
- SSE format: `"data: <content>\n\n"`

In [None]:
# Create FastAPI app
print("🚀 Creating Advanced FastAPI application...\n")

app = FastAPI(
    title="Advanced API with Streaming",
    description="""🔥 Production-ready API featuring:
    
    - Server-Sent Events (SSE) for streaming
    - LLM-style text generation
    - Environment variable security
    - Rate limiting
    - Comprehensive logging
    """,
    version="3.0.0",
    contact={
        "name": "AI Bootcamp Team",
        "email": "support@aibootcamp.com"
    }
)

# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)

print("✅ FastAPI app created with:")
print("   • Custom title and description")
print("   • Version tracking")
print("   • Contact information")
print("   • Logging configured")

In [None]:
# Create a simple streaming endpoint
print("📡 Adding streaming endpoints...\n")

async def simple_stream():
    """
    Generator function that yields data for SSE.
    
    Key points:
    - Must be async
    - Use 'yield' not 'return'
    - Format: "data: <content>\n\n"
    """
    for i in range(1, 11):
        # Format for SSE: "data: " prefix + double newline
        message = f"data: Count: {i}\n\n"
        yield message
        
        # Simulate work (in real apps, this is model generation)
        await asyncio.sleep(0.5)
    
    # Send completion message
    yield "data: [DONE]\n\n"

@app.get("/stream-simple")
async def stream_simple():
    """
    Simple streaming endpoint that counts from 1 to 10.
    
    Returns:
    StreamingResponse with text/event-stream media type
    
    Try in browser: http://localhost:8003/stream-simple
    You'll see numbers appear one by one!
    """
    logger.info("Simple stream requested")
    
    return StreamingResponse(
        simple_stream(),
        media_type="text/event-stream",
        headers={
            "Cache-Control": "no-cache",
            "Connection": "keep-alive"
        }
    )

print("✅ Simple streaming endpoint added!")
print("\n📋 Endpoint: GET /stream-simple")
print("   • Counts from 1 to 10")
print("   • 0.5 second delay between numbers")
print("   • Uses SSE format")
print("\n💡 SSE Format Explained:")
print("   'data: <message>\\n\\n'")
print("   ↑")
print("   Must start with 'data: ' and end with double newline!")

### 🎯 Understanding the Code:

**1. Async Generator Function:**
```python
async def simple_stream():
    yield "data: message\n\n"
```
- `async def` - Can use `await` inside
- `yield` - Sends data incrementally (not `return`!)
- Function becomes an **async generator**

**2. SSE Format:**
```python
"data: Count: 1\n\n"
 ↑     ↑         ↑
 |     |         Double newline (required!)
 |     Your message
 SSE prefix (required!)
```

**3. StreamingResponse:**
```python
StreamingResponse(
    simple_stream(),           # Async generator object (note the call)
    media_type="text/event-stream"  # SSE content type
)
```

**💡 Think of it like:**
- Regular response = Sending a complete email
- Streaming response = Live video call

**Why `await asyncio.sleep(0.5)`?**
- Simulates time-consuming work (like AI generation)
- In real apps: model inference, database queries, etc.
- Yields control to the event loop, so other requests can be processed in the meantime
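
**Building SSE frames safely:** The `"data: <content>\n\n"` format above works for single-line messages. If a message contains a newline, each line needs its own `data:` prefix, otherwise the client sees a truncated event. The helper below is a minimal sketch of that rule; `format_sse` is a hypothetical name, not part of FastAPI, and the optional `event` and `event_id` arguments map to the standard SSE `event:` and `id:` fields.

```python
from typing import Optional


def format_sse(data: str, event: Optional[str] = None,
               event_id: Optional[str] = None) -> str:
    """Serialize one Server-Sent Event frame (hypothetical helper)."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # A payload with newlines must be sent as several data: lines
    for part in data.splitlines() or [""]:
        lines.append(f"data: {part}")
    # The blank line (double newline) marks the end of the event
    return "\n".join(lines) + "\n\n"


print(repr(format_sse("Count: 1")))
print(repr(format_sse("line one\nline two", event="message")))
```

Inside a generator such as `simple_stream()`, you could then write `yield format_sse(f"Count: {i}")` instead of building the string by hand.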

---

## 🤖 Part 4: LLM-Style Text Streaming

Now let's build something more realistic - streaming text generation like ChatGPT!

**What we're building:**
- Takes a prompt from the user
- Generates response word-by-word
- Streams to client in real-time

**In production:**
- You'd call a real LLM API (OpenAI, Anthropic, etc.)
- We'll simulate it for learning!

In [None]:
# Pydantic model for streaming requests
print("📋 Creating Pydantic models for streaming...\n")

class StreamRequest(BaseModel):
    """
    Input model for streaming text generation.
    """
    prompt: str = Field(
        ...,
        min_length=1,
        max_length=1000,
        description="User prompt for text generation",
        examples=["Write a short story about a robot"]  # Pydantic v2 expects `examples` (a list)
    )
    max_tokens: Optional[int] = Field(
        default=100,
        ge=10,
        le=500,
        description="Maximum words to generate"
    )
    
    # Pydantic v2 style; on Pydantic v1 use `class Config: schema_extra = {...}` instead
    model_config = {
        "json_schema_extra": {
            "example": {
                "prompt": "Tell me a joke about programming",
                "max_tokens": 50
            }
        }
    }

print("✅ Pydantic models created!")
print("\n💡 Model Features:")
print("   • Prompt validation (length limits)")
print("   • Optional max_tokens parameter")
print("   • Example data for Swagger UI")

In [None]:
# Simulate LLM text generation
print("🤖 Creating LLM-style streaming generator...\n")

async def generate_text_stream(prompt: str, max_tokens: int = 100):
    """
    Simulates LLM text generation with streaming.
    
    In production, replace this with actual LLM API calls:
    - OpenAI GPT-4
    - Anthropic Claude
    - Local models (Llama, etc.)
    
    This simulates word-by-word generation.
    """
    # Simulated responses based on prompt keywords
    responses = {
        "joke": "Why do programmers prefer dark mode? Because light attracts bugs! 😄",
        "story": "Once upon a time, in a world of circuits and code, there lived a curious AI named Claude. Claude loved to help humans learn new things. Every day, Claude would answer questions, write code, and explain complex concepts in simple terms. The end! 📚",
        "poem": "Roses are red, Violets are blue, FastAPI is awesome, And so are you! 🌹",
        "default": "This is a simulated AI response to your prompt. In production, this would be generated by a real language model like GPT-4 or Claude. The response would be contextual and based on your specific prompt. For now, this demonstrates how streaming works! ✨"
    }
    
    # Select response based on prompt
    response_text = responses["default"]
    for key in responses:
        if key in prompt.lower():
            response_text = responses[key]
            break
    
    # Split into words and stream
    words = response_text.split()
    words = words[:max_tokens]  # Respect max_tokens
    
    # Send metadata first; json.dumps escapes quotes and newlines in the prompt
    start = {"type": "start", "prompt": prompt}
    yield f"data: {json.dumps(start)}\n\n"
    
    # Stream each word
    for i, word in enumerate(words):
        # Create JSON message
        message = {
            "type": "token",
            "content": word + " ",
            "index": i
        }
        yield f"data: {json.dumps(message)}\n\n"
        
        # Simulate generation time (50-150ms per word)
        await asyncio.sleep(0.05 + (i % 3) * 0.05)
    
    # Send completion
    completion = {
        "type": "done",
        "total_tokens": len(words),
        "finish_reason": "completed"
    }
    yield f"data: {json.dumps(completion)}\n\n"

@app.post("/stream-text")
async def stream_text(request: StreamRequest):
    """
    Stream AI-generated text in real-time (ChatGPT-style).
    
    This endpoint demonstrates LLM streaming:
    - Send prompt
    - Receive words one-by-one
    - See text appear in real-time
    
    Try prompts with: 'joke', 'story', or 'poem'
    """
    logger.info(f"Text generation requested: {request.prompt[:50]}...")
    
    return StreamingResponse(
        generate_text_stream(request.prompt, request.max_tokens),
        media_type="text/event-stream",
        headers={
            "Cache-Control": "no-cache",
            "Connection": "keep-alive",
            "X-Accel-Buffering": "no"  # Disable nginx buffering
        }
    )

print("✅ LLM-style streaming endpoint added!")
print("\n📋 Endpoint: POST /stream-text")
print("   • Accepts prompt and max_tokens")
print("   • Streams word-by-word")
print("   • Returns JSON messages")
print("\n💡 Try prompts with: 'joke', 'story', 'poem'")

### 🎯 Understanding LLM Streaming:

**Message Types:**

```json
// 1. Start message
{
    "type": "start",
    "prompt": "Tell me a joke"
}

// 2. Token messages (one per word)
{
    "type": "token",
    "content": "Why ",
    "index": 0
}

// 3. Completion message
{
    "type": "done",
    "total_tokens": 15,
    "finish_reason": "completed"
}
```

**Why JSON in SSE?**
- ✅ Structured data
- ✅ Easy to parse on client
- ✅ Can include metadata (token count, etc.)
- ✅ Matches real LLM API formats (OpenAI, Anthropic)

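**Production Integration:**
```python
# OpenAI example (sketch for the openai>=1.0 SDK; the client reads OPENAI_API_KEY from the environment)
import json
from openai import AsyncOpenAI

client = AsyncOpenAI()

async def openai_stream(prompt):
    stream = await client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        stream=True  # Enable streaming!
    )
    
    async for chunk in stream:
        # Some chunks carry no choices or an empty delta; skip those
        if chunk.choices and chunk.choices[0].delta.content:
            token = chunk.choices[0].delta.content
            # JSON-encode so newlines inside a token can't break SSE framing
            payload = json.dumps({"type": "token", "content": token})
            yield f"data: {payload}\n\n"
```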

---

## 🔒 Part 5: Environment Variables & Security

**Never hardcode secrets in your code!**

### ❌ BAD Practice:
```python
API_KEY = "sk-12345abcdef"  # NEVER DO THIS!
DATABASE_URL = "postgresql://user:pass@localhost/db"
```

**Problems:**
- Gets committed to Git (and stays in the history even after you delete it)
- Anyone with code access sees secrets
- Can't change without code update
- Security nightmare!

### ✅ GOOD Practice:
```python
import os
API_KEY = os.getenv("API_KEY")  # Read at runtime; returns None if the variable is unset
DATABASE_URL = os.getenv("DATABASE_URL")
```

**Benefits:**
- ✅ Secrets stay out of code
- ✅ Different values per environment (dev/prod)
- ✅ Easy to rotate secrets
- ✅ Works with deployment platforms

### 💡 Using .env Files:

**.env file (never commit!):**
```
API_KEY=your-secret-key-here
DATABASE_URL=postgresql://...
DEBUG=True
```

**Load in code:**
```python
from dotenv import load_dotenv
load_dotenv()  # Loads .env automatically!
```
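
**Reading settings defensively:** `os.getenv` returns strings (or `None`), so flags and required secrets need a little handling. The sketch below shows one way to do it with the standard library only; `env_bool`, `env_required` and `TRUE_VALUES` are hypothetical names chosen for this example, and the variable names in the demo are made up.

```python
import os

# Spellings treated as "true" (an assumption; adjust to your conventions)
TRUE_VALUES = {"1", "true", "yes", "on"}


def env_bool(name: str, default: bool = False) -> bool:
    """Read a boolean flag, accepting common spellings in any letter case."""
    raw = os.getenv(name)
    if raw is None:
        return default
    return raw.strip().lower() in TRUE_VALUES


def env_required(name: str) -> str:
    """Return a variable's value, or fail fast with a clear message."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


# Demo with a throwaway variable name
os.environ["DEMO_DEBUG"] = "true"
print(env_bool("DEMO_DEBUG"))
print(env_bool("DEMO_FLAG_THAT_IS_NOT_SET"))
```

Failing at startup when a secret is missing is usually easier to debug than a `None` that surfaces later inside a request handler.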

In [None]:
# Create example .env file
print("🔒 Setting up environment variables...\n")

# Create .env file
env_content = """# API Configuration
API_KEY=demo-key-12345
API_SECRET=demo-secret-67890

# Database
DATABASE_URL=sqlite:///./test.db

# Application Settings
DEBUG=True
MAX_REQUESTS_PER_MINUTE=60
LOG_LEVEL=INFO

# External Services
OPENAI_API_KEY=your-openai-key-here
STRIPE_SECRET_KEY=your-stripe-key-here
"""

with open('.env', 'w') as f:
    f.write(env_content)

print("✅ .env file created!")
print("\n📄 Contents:")
print(env_content)
print("\n⚠️ IMPORTANT:")
print("   • Add .env to .gitignore")
print("   • Never commit .env to Git")
print("   • Use .env.example for documentation")

In [None]:
# Load and use environment variables
print("📥 Loading environment variables...\n")

# Load .env file
load_dotenv()

# Access environment variables
API_KEY = os.getenv("API_KEY")
API_SECRET = os.getenv("API_SECRET")
DEBUG = os.getenv("DEBUG", "False") == "True"  # Convert string to boolean
MAX_REQUESTS = int(os.getenv("MAX_REQUESTS_PER_MINUTE", "60"))

print("✅ Environment variables loaded!")
print("\n🔑 Configuration:")
print(f"   • API_KEY: {API_KEY[:8]}*** (hidden)")
print(f"   • DEBUG: {DEBUG}")
print(f"   • MAX_REQUESTS: {MAX_REQUESTS}")
print("\n💡 In production:")
print("   • Set env vars in hosting platform")
print("   • Use secrets management (AWS Secrets Manager, etc.)")
print("   • Never print actual values in logs!")

In [None]:
# Add endpoint that uses environment variables
print("🔐 Adding secure endpoint with API key validation...\n")

@app.get("/secure-endpoint")
async def secure_endpoint(api_key: str):
    """
    Secure endpoint that requires API key.
    
    In production:
    - Use proper authentication (OAuth, JWT)
    - Store keys in database
    - Add rate limiting
    - Use HTTPS only
    
    Query parameter:
    - api_key: Your API key from .env file
    """
    # Validate API key
    if api_key != API_KEY:
        logger.warning(f"Invalid API key attempt: {api_key[:8]}***")
        raise HTTPException(
            status_code=401,  # Unauthorized
            detail="Invalid API key"
        )
    
    logger.info("Secure endpoint accessed successfully")
    
    return {
        "message": "Access granted!",
        "user": "authenticated",
        "timestamp": datetime.now().isoformat()
    }

@app.get("/config")
async def get_config():
    """
    Get non-sensitive configuration.
    
    NEVER expose secrets in public endpoints!
    """
    return {
        "debug_mode": DEBUG,
        "max_requests_per_minute": MAX_REQUESTS,
        "log_level": os.getenv("LOG_LEVEL", "INFO"),
        "api_version": "3.0.0"
    }

print("✅ Secure endpoints added!")
print("\n📋 Endpoints:")
print("   • GET /secure-endpoint?api_key=XXX")
print("   • GET /config (public configuration)")
print("\n💡 Try:")
print(f"   • Valid key: {API_KEY}")
print("   • Invalid key: wrong-key (will fail!)")
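
**Comparing keys in constant time:** A plain `!=` comparison can stop at the first differing character, which in principle leaks timing information about the expected key. The sketch below uses the standard library's `hmac.compare_digest` instead; `is_valid_api_key` is a hypothetical helper name, and wiring it into the endpoint is left to you. Keys passed as query parameters also tend to end up in server logs and browser history, so sending the key in a request header is a common alternative.

```python
import hmac
from typing import Optional


def is_valid_api_key(provided: Optional[str], expected: Optional[str]) -> bool:
    """Compare two keys without short-circuiting; reject if either is missing."""
    if not provided or not expected:
        return False
    # Encode to bytes so non-ASCII input is handled as well
    return hmac.compare_digest(provided.encode("utf-8"),
                               expected.encode("utf-8"))


print(is_valid_api_key("demo-key-12345", "demo-key-12345"))
print(is_valid_api_key("wrong-key", "demo-key-12345"))
```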

---

## ⚡ Part 6: Rate Limiting (Prevent Abuse)

**Why rate limiting?**
- Limit abuse (request floods, brute-force attempts, scraping)
- Fair usage across users
- Protect server resources
- Comply with upstream API limits

**Example scenario:**
```
Without rate limiting:
Attacker sends 10,000 requests/second → Server crashes 💥

With rate limiting:
Allow 60 requests/minute per IP → Attacker blocked ✅
Normal users → No impact 😊
```
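
**How a limit like "5/minute" can work:** The scenario above can be sketched with a simple fixed-window counter: time is cut into 60-second windows, and each client gets a counter per window. This is an illustration of the idea only, not slowapi's actual implementation; `FixedWindowLimiter` and its `clock` parameter are hypothetical names (the injectable clock just makes the demo deterministic).

```python
import time
from collections import defaultdict
from typing import Callable, Dict, Tuple


class FixedWindowLimiter:
    """Allow at most `limit` hits per client in each `window`-second bucket."""

    def __init__(self, limit: int, window: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window = window
        self.clock = clock
        # Maps (client, window index) -> hits seen in that window.
        # Old windows are never evicted here; a real limiter would expire them.
        self.hits: Dict[Tuple[str, int], int] = defaultdict(int)

    def allow(self, client: str) -> bool:
        bucket = (client, int(self.clock() // self.window))
        if self.hits[bucket] >= self.limit:
            return False  # an API would answer 429 Too Many Requests here
        self.hits[bucket] += 1
        return True


# Demo with a fake clock so the result does not depend on real time
now = [0.0]
demo = FixedWindowLimiter(limit=5, window=60, clock=lambda: now[0])
print([demo.allow("203.0.113.7") for _ in range(6)])  # sixth call is rejected
now[0] = 61.0  # move into the next window
print(demo.allow("203.0.113.7"))
```

A fixed window is easy to reason about, but it allows short bursts around window boundaries; sliding-window or token-bucket schemes smooth that out at the cost of more bookkeeping.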

In [None]:
# Setup rate limiting
print("⚡ Configuring rate limiting...\n")

# Initialize limiter
limiter = Limiter(key_func=get_remote_address)
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

@app.get("/limited")
@limiter.limit("5/minute")  # 5 requests per minute
async def limited_endpoint(request: Request):
    """
    Rate-limited endpoint - max 5 requests per minute.
    
    Try calling this multiple times quickly!
    After 5 requests, you'll get a 429 error.
    
    Rate limit resets after 1 minute.
    """
    return {
        "message": "Success!",
        "note": "This endpoint is rate-limited to 5 requests/minute",
        "requests_remaining": "See X-RateLimit-Remaining (sent only if the Limiter uses headers_enabled=True)"
    }

@app.post("/stream-text-limited")
@limiter.limit("10/minute")  # More generous for streaming
async def stream_text_limited(request: Request, stream_req: StreamRequest):
    """
    Rate-limited streaming endpoint.
    Max 10 requests per minute.
    """
    logger.info(f"Rate-limited stream requested: {stream_req.prompt[:30]}...")
    
    return StreamingResponse(
        generate_text_stream(stream_req.prompt, stream_req.max_tokens),
        media_type="text/event-stream"
    )

print("✅ Rate limiting configured!")
print("\n⚡ Limits:")
print("   • /limited: 5 requests/minute")
print("   • /stream-text-limited: 10 requests/minute")
print("\n💡 Response headers (sent only if the Limiter is created with headers_enabled=True):")
print("   • X-RateLimit-Limit: Total allowed")
print("   • X-RateLimit-Remaining: Requests left")
print("   • X-RateLimit-Reset: When limit resets")

---

## 📚 Part 7: Start Server & Test Everything

Let's run our advanced API and test all features!

In [None]:
# Helper function to run server
def run_server(app, port=8000):
    def start_server():
        uvicorn.run(app, host="127.0.0.1", port=port, log_level="info")
    
    thread = Thread(target=start_server, daemon=True)
    thread.start()
    time.sleep(3)
    
    print(f"✅ Server started!")
    print(f"\n🌐 Advanced FastAPI running at:")
    print(f"   • Main URL: http://127.0.0.1:{port}")
    print(f"   • Swagger UI: http://127.0.0.1:{port}/docs")
    print(f"\n🔥 Try these endpoints:")
    print(f"   • Streaming: http://127.0.0.1:{port}/stream-simple")
    print(f"   • LLM Stream: POST to /stream-text")
    print(f"   • Secure: /secure-endpoint?api_key={API_KEY}")
    print(f"   • Rate Limited: /limited (try 6 times!)")
    
    return thread

# Start server
print("🚀 Starting Advanced FastAPI Server...\n")
server_thread = run_server(app, port=8003)

### 🧪 Testing in Browser:

**1. Test Simple Streaming:**
- Open: http://127.0.0.1:8003/stream-simple
- Watch numbers appear one by one! (Some browsers may offer to download the stream instead of rendering it.)

**2. Test in Swagger UI:**
- Open: http://127.0.0.1:8003/docs
- Try POST /stream-text:
  ```json
  {
    "prompt": "Tell me a joke",
    "max_tokens": 50
  }
  ```
- Swagger UI shows the body only after the stream has finished, so the events appear all at once there. To watch them arrive one by one, use a streaming client such as `curl -N`.

**3. Test Security:**
- Try /secure-endpoint with correct API key
- Try with wrong API key (should fail!)

**4. Test Rate Limiting:**
- Call /limited 6 times quickly
- 6th request should return 429 error

**💡 Pro Tip:** Open browser DevTools (F12) → Network tab to see SSE messages!

In [None]:
# Test endpoints programmatically
print("🧪 Testing Advanced API Features...\n")
print("=" * 70)

# Test 1: Configuration endpoint
print("\n1️⃣ Testing Configuration Endpoint")
print("-" * 70)
response = requests.get("http://127.0.0.1:8003/config")
config = response.json()
print("✅ Configuration:")
for key, value in config.items():
    print(f"   • {key}: {value}")

# Test 2: Secure endpoint with valid key
print("\n2️⃣ Testing Secure Endpoint (Valid Key)")
print("-" * 70)
response = requests.get(
    f"http://127.0.0.1:8003/secure-endpoint?api_key={API_KEY}"
)
if response.status_code == 200:
    print("✅ Authentication successful!")
    print(f"   Response: {response.json()}")

# Test 3: Secure endpoint with invalid key
print("\n3️⃣ Testing Secure Endpoint (Invalid Key)")
print("-" * 70)
response = requests.get(
    "http://127.0.0.1:8003/secure-endpoint?api_key=wrong-key"
)
if response.status_code == 401:
    print("✅ Authentication failed as expected!")
    print(f"   Error: {response.json()['detail']}")

# Test 4: Rate limiting
print("\n4️⃣ Testing Rate Limiting")
print("-" * 70)
print("Sending 6 requests to /limited (limit is 5/minute)...")
for i in range(1, 7):
    response = requests.get("http://127.0.0.1:8003/limited")
    if response.status_code == 200:
        print(f"   ✅ Request {i}: Success")
    elif response.status_code == 429:
        print(f"   🚫 Request {i}: Rate limited!")
        # slowapi's default handler returns {"error": ...}, not {"detail": ...}
        print(f"      Error: {response.json().get('error', response.text)}")
    time.sleep(0.5)

print("\n" + "=" * 70)
print("\n✅ All tests completed!")
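
**Reading the stream from Python:** The tests above do not exercise the streaming endpoint. A client has to collect `data:` lines until a blank line ends the event, then decode the payload. The sketch below does that for JSON payloads like the ones `/stream-text` sends; `parse_sse_lines` is a hypothetical helper, and the commented `requests` usage at the bottom is an untested assumption about how you might feed it.

```python
import json
from typing import Iterable, Iterator


def parse_sse_lines(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON payload of every complete SSE event in `lines`."""
    data_parts = []
    for line in lines:
        if line.startswith("data:"):
            value = line[5:]
            # The spec allows one optional space after the colon
            if value.startswith(" "):
                value = value[1:]
            data_parts.append(value)
        elif line == "" and data_parts:
            # A blank line ends the event; multi-line data is joined with \n
            yield json.loads("\n".join(data_parts))
            data_parts = []


# Demo on canned lines shaped like the /stream-text output
sample = [
    'data: {"type": "token", "content": "Why ", "index": 0}',
    "",
    'data: {"type": "done", "total_tokens": 1, "finish_reason": "completed"}',
    "",
]
for event in parse_sse_lines(sample):
    print(event["type"])

# Possible usage against the running server (assumption, not run here):
# with requests.post("http://127.0.0.1:8003/stream-text",
#                    json={"prompt": "Tell me a joke"}, stream=True) as resp:
#     for event in parse_sse_lines(resp.iter_lines(decode_unicode=True)):
#         print(event)
```

Note that `/stream-simple` sends plain text such as `Count: 1`, so `json.loads` would fail there; this parser only fits endpoints that stream JSON.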

---

## 🎯 Part 8: Production Best Practices Checklist

Before deploying to production, ensure you've done all of these!

In [None]:
# Production checklist generator
print("📋 Generating Production Deployment Checklist...\n")

checklist = """
SECURITY CHECKLIST
===============================================================
[ ] Environment variables used for all secrets
[ ] .env file in .gitignore
[ ] HTTPS enabled (TLS certificates)
[ ] CORS configured with specific origins (not "*")
[ ] API authentication implemented (API keys/OAuth)
[ ] Rate limiting configured
[ ] Input validation with Pydantic
[ ] SQL injection prevention (use ORMs)
[ ] Error messages don't expose internals
[ ] Security headers configured

MONITORING & LOGGING
===============================================================
[ ] Structured logging implemented
[ ] Request/response logging
[ ] Error tracking (Sentry, etc.)
[ ] Performance monitoring (latency, throughput)
[ ] Resource monitoring (CPU, memory)
[ ] Alert system for critical errors
[ ] Model performance tracking
[ ] API usage analytics

PERFORMANCE
===============================================================
[ ] Models loaded at startup (singleton pattern)
[ ] Database connection pooling
[ ] Caching implemented (Redis/in-memory)
[ ] Async operations for I/O
[ ] Response compression (gzip)
[ ] CDN for static assets
[ ] Load balancing configured
[ ] Auto-scaling setup

TESTING
===============================================================
[ ] Unit tests written (>80% coverage)
[ ] Integration tests for all endpoints
[ ] Load testing performed
[ ] Security testing done
[ ] Error handling tested
[ ] Edge cases covered

DOCUMENTATION
===============================================================
[ ] README with setup instructions
[ ] API documentation complete (Swagger)
[ ] Code comments for complex logic
[ ] Architecture diagram
[ ] Deployment guide
[ ] Troubleshooting guide
[ ] Example requests/responses
[ ] Changelog maintained

DEPLOYMENT
===============================================================
[ ] Docker container created
[ ] CI/CD pipeline configured
[ ] Automated testing in pipeline
[ ] Staging environment setup
[ ] Rollback strategy defined
[ ] Health check endpoint
[ ] Graceful shutdown handling
[ ] Environment-specific configs

MAINTENANCE
===============================================================
[ ] Dependency updates scheduled
[ ] Security patches process
[ ] Backup strategy implemented
[ ] Disaster recovery plan
[ ] On-call rotation defined
[ ] Incident response procedures

COMMUNICATION
===============================================================
[ ] Status page for API
[ ] Change log published
[ ] User notification system
[ ] Support channel established
[ ] SLA defined and documented
"""

print(checklist)

# Save to file with UTF-8 encoding (explicitly)
with open('production_checklist.txt', 'w', encoding='utf-8') as f:
    f.write(checklist)

print("\n✅ Checklist saved to: production_checklist.txt")
print("\n💡 Review this before every production deployment!")

---

## 🎯 Part 9: Beginner Challenge

### 🏆 Your Mission:

Enhance the API with production-ready features!

### 📋 Requirements:

**1. Add Health Check Endpoint**
- Create `/health` endpoint
- Check: Server status, model loaded, database connection
- Return: Status, uptime, version

**2. Add Request Logging Middleware**
- Log all incoming requests
- Include: Method, path, IP, timestamp
- Log response time

**3. Add Custom Error Handler**
- Catch all exceptions
- Return consistent error format
- Log errors with full stack trace

### 💡 Hints:

```python
# Hint 1: Health check
# calculate_uptime() is a helper you write yourself (e.g. seconds since startup)
@app.get("/health")
async def health_check():
    return {
        "status": "healthy",
        "uptime": calculate_uptime(),
        "version": "3.0.0"
    }

# Hint 2: Request logging middleware
@app.middleware("http")
async def log_requests(request: Request, call_next):
    start_time = time.time()
    response = await call_next(request)
    duration = time.time() - start_time
    logger.info(f"{request.method} {request.url.path} - {duration:.2f}s")
    return response

# Hint 3: Error handler
from fastapi.responses import JSONResponse  # not imported in STEP 2

@app.exception_handler(Exception)
async def global_exception_handler(request: Request, exc: Exception):
    # exc_info attaches the full stack trace to the log record
    logger.error(f"Error: {exc}", exc_info=exc)
    return JSONResponse(
        status_code=500,
        content={"error": "Internal server error"}
    )
```
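
**A possible `calculate_uptime()`:** Hint 1 calls a helper that is not defined anywhere in the notebook. One minimal way to write it, assuming "uptime" means seconds since the module was loaded, is sketched below. `_START` and `format_uptime` are hypothetical names added for illustration.

```python
import time

# Recorded once when the module is imported, i.e. roughly at server startup
_START = time.monotonic()


def calculate_uptime() -> float:
    """Seconds elapsed since this module was loaded."""
    return round(time.monotonic() - _START, 2)


def format_uptime(seconds: float) -> str:
    """Render seconds as H:MM:SS for a human-readable health payload."""
    minutes, secs = divmod(int(seconds), 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


print(format_uptime(3725))  # 1:02:05
```

`time.monotonic()` is used rather than `time.time()` so that system clock adjustments cannot make the uptime jump or go negative.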

### 🎯 Expected Outcome:

- Health check at `/health` shows system status
- All requests logged with timing
- Errors handled gracefully

### 🌟 Bonus Challenges:

1. **Add metrics endpoint** `/metrics`:
   - Total requests
   - Average response time
   - Error rate

2. **Add API versioning**:
   - `/v1/predict` and `/v2/predict`
   - Different logic per version

3. **Add request ID tracking**:
   - Generate unique ID per request
   - Include in logs
   - Return in response headers

In [None]:
# Your code here!
# Implement the challenge requirements

# Step 1: Add health check endpoint

# Step 2: Add logging middleware

# Step 3: Add error handler

# Restart server and test!

pass

---

## 📚 Summary - What We Learned Today

### 1. Server-Sent Events (SSE) 📡
- **Streaming responses** - Send data incrementally
- **SSE format** - `"data: <content>\n\n"`
- **StreamingResponse** - FastAPI's streaming class
- **Async generators** - Use `yield` with `async def`
- **Perfect for AI** - ChatGPT-style text generation

### 2. Environment Variables 🔒
- **Never hardcode secrets** - Use `os.getenv()`
- **.env files** - Store configuration securely
- **python-dotenv** - Load .env automatically
- **.gitignore** - Never commit secrets!
- **Different per environment** - Dev vs Production

### 3. Rate Limiting ⚡
- **Prevent abuse** - Limit requests per user/IP
- **slowapi library** - Easy rate limiting
- **Decorator pattern** - `@limiter.limit("5/minute")`
- **Response headers** - X-RateLimit-* headers (when the limiter is created with `headers_enabled=True`)
- **Fair usage** - Protect resources

### 4. API Documentation 📚
- **Custom API info** - Title, description, version
- **Endpoint documentation** - Summaries and descriptions
- **Response models** - Pydantic for structure
- **Example data** - Helps users understand
- **Automatic Swagger** - All from code!

### 5. Security Best Practices 🔐
- **Authentication** - API keys, OAuth
- **HTTPS only** - Encrypt all traffic
- **CORS properly** - Specific origins in production
- **Input validation** - Never trust user input
- **Error handling** - Don't expose internals

### 6. Logging & Monitoring 📊
- **Structured logging** - Timestamp, level, message
- **Request logging** - Track all API calls
- **Error logging** - Debug production issues
- **Performance metrics** - Response times, errors

### 7. Production Readiness 🚀
- **Comprehensive checklist** - 50+ items to verify
- **Testing** - Unit, integration, load tests
- **Documentation** - README, API docs, guides
- **Deployment** - Docker, CI/CD, monitoring
- **Maintenance** - Updates, backups, incidents

---

## 🎯 Key Takeaways

✅ **Streaming enhances user experience**
- Real-time feedback vs waiting
- Essential for AI applications

✅ **Security is not optional**
- Environment variables for secrets
- Rate limiting prevents abuse
- HTTPS is mandatory in production

✅ **Good logging saves hours of debugging**
- Log everything important
- Include context (request IDs, etc.)
- Monitor in production

✅ **Documentation is for everyone**
- Future you
- Team members
- API consumers

✅ **Production is different from development**
- Different secrets
- Stricter security
- More monitoring
- Better error handling

✅ **Checklists prevent mistakes**
- Review before every deployment
- Don't skip items
- Add project-specific checks

---

## 💡 Pro Tips for Production

1. **Test in Staging First**
   - Identical to production
   - Catch issues before users do

2. **Monitor Everything**
   - Logs, metrics, alerts
   - Know issues before users report them

3. **Have a Rollback Plan**
   - Things will go wrong
   - Quick rollback saves users

4. **Document Everything**
   - Architecture decisions
   - Deployment procedures
   - Troubleshooting guides

5. **Security is Ongoing**
   - Regular dependency updates
   - Security audits
   - Penetration testing

6. **Performance Matters**
   - Users expect fast responses
   - Monitor and optimize
   - Load test before launch

---

## 🎉 Week 17 Complete!

### What You've Accomplished:

**Day 1: FastAPI Fundamentals ✅**
- Built Calculator API
- Learned REST principles
- Mastered Swagger UI

**Day 2: ML Model APIs ✅**
- Deployed sentiment analysis model
- Handled file uploads
- Enabled CORS

**Day 3: Production Ready ✅**
- Implemented streaming (SSE)
- Secured with environment variables
- Added rate limiting
- Production checklist

### 🏆 You Can Now:
- ✅ Build complete REST APIs with FastAPI
- ✅ Deploy ML models as endpoints
- ✅ Stream responses in real-time
- ✅ Secure APIs properly
- ✅ Handle production workloads
- ✅ Monitor and debug effectively
- ✅ Deploy to production with confidence

**You're now a FastAPI expert! 🚀**

---

## 🚀 Next Steps

**Week 18: Docker & CI/CD**
- Containerize your APIs
- Automated testing
- Continuous deployment

**Week 19: MLOps & Automation**
- Model versioning
- Automated retraining
- Monitoring ML performance

**Week 20: AWS Deployment**
- Deploy to cloud
- Auto-scaling
- Production architecture

---

## 🎊 Congratulations!

You've completed Week 17 - API Development with FastAPI!

**From zero to production in 3 days:**
- Day 1: Learned the basics
- Day 2: Added ML models
- Day 3: Made it production-ready

**This knowledge is immediately applicable:**
- Deploy your bootcamp projects as APIs
- Build portfolio projects
- Prepare for job interviews
- Freelance API development

**Keep practicing and building! 💪**

**See you in Week 18! 🚀**