# Module 04: Creating ML APIs with FastAPI

**Difficulty**: ⭐⭐ Intermediate  
**Estimated Time**: 60 minutes  
**Prerequisites**: 
- Module 03: Model Serialization
- Basic understanding of HTTP and REST APIs
- Familiarity with Python web frameworks (helpful but not required)

## Learning Objectives

By the end of this notebook, you will be able to:
1. Build REST APIs for ML model serving using FastAPI
2. Implement request validation with Pydantic models
3. Handle errors gracefully with proper HTTP status codes
4. Generate automatic API documentation with OpenAPI/Swagger
5. Add health checks and monitoring endpoints
6. Test API endpoints programmatically

## 1. Why FastAPI for ML Serving?

**FastAPI** is a modern, fast web framework for building APIs with Python 3.7+.

### Advantages for ML APIs:
- ✅ **Fast**: High performance, which the FastAPI project describes as on par with Node.js and Go
- ✅ **Type validation**: Automatic request/response validation
- ✅ **Auto documentation**: Interactive API docs (Swagger UI)
- ✅ **Async support**: Handle concurrent requests efficiently
- ✅ **Easy to learn**: Intuitive syntax, great for data scientists

### Comparison with Alternatives:

| Framework | Speed | Validation | Docs | Learning Curve |
|-----------|-------|------------|------|----------------|
| **FastAPI** | ⭐⭐⭐⭐⭐ | Automatic | Automatic | Easy |
| **Flask** | ⭐⭐⭐ | Manual | Manual | Easy |
| **Django** | ⭐⭐ | Good | Good | Steep |
| **Tornado** | ⭐⭐⭐⭐ | Manual | Manual | Moderate |

In [None]:
# Setup: Install FastAPI and dependencies
# httpx is included because FastAPI's TestClient depends on it.
# Uncomment to install:
# !pip install fastapi uvicorn pydantic scikit-learn joblib pandas httpx

import warnings
warnings.filterwarnings('ignore')

# Check if FastAPI is installed
try:
    import fastapi
    import uvicorn
    from pydantic import BaseModel, Field, ValidationError
    print("✓ FastAPI and dependencies installed")
    print(f"  FastAPI version: {fastapi.__version__}")
    fastapi_available = True
except ImportError as e:
    print(f"⚠ Missing dependencies: {e}")
    print("  Install with: pip install fastapi uvicorn pydantic")
    fastapi_available = False

In [None]:
# Import other required libraries
import numpy as np
import pandas as pd
import joblib
from pathlib import Path
import json
from typing import List, Optional, Dict, Any
from datetime import datetime
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline

np.random.seed(42)
print("✓ Libraries imported successfully")

## 2. Preparing a Model for API Deployment

First, let's train and save a model that we'll serve via API.

In [None]:
# Create and train a model
X, y = make_classification(
    n_samples=2000,
    n_features=10,
    n_informative=8,
    n_redundant=2,
    n_classes=2,
    random_state=42
)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Create pipeline with preprocessing
model_pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('classifier', RandomForestClassifier(n_estimators=100, random_state=42))
])

# Train model
model_pipeline.fit(X_train, y_train)

# Save model
model_dir = Path('api_models')
model_dir.mkdir(exist_ok=True)
model_path = model_dir / 'fraud_detector.joblib'

joblib.dump(model_pipeline, model_path)

print(f"✓ Model trained and saved to: {model_path}")
print(f"✓ Test accuracy: {model_pipeline.score(X_test, y_test):.4f}")
print(f"✓ Number of features: {X.shape[1]}")

In [None]:
# Save model metadata
metadata = {
    'model_name': 'fraud_detector',
    'version': '1.0.0',
    'n_features': X.shape[1],
    'feature_names': [f'feature_{i}' for i in range(X.shape[1])],
    'classes': ['legitimate', 'fraud'],
    'created_at': datetime.now().isoformat(),
    'description': 'Binary classifier for fraud detection'
}

metadata_path = model_dir / 'fraud_detector_metadata.json'
with open(metadata_path, 'w') as f:
    json.dump(metadata, f, indent=2)

print(f"✓ Metadata saved to: {metadata_path}")
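
The metadata file is useful beyond documentation: at serving time it can drive input checks. The sketch below shows one way to do that using only the standard library. `check_features` is a hypothetical helper (it does not appear elsewhere in this notebook), and the inline JSON is a shortened stand-in for the metadata file written above.

```python
import json
import math
from typing import Any, Dict, List


def check_features(features: List[float], metadata: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the input looks usable."""
    problems = []
    expected = metadata["n_features"]
    if len(features) != expected:
        problems.append(f"expected {expected} features, got {len(features)}")
    # Pair each value with its name so error messages point at a specific feature.
    for name, value in zip(metadata["feature_names"], features):
        if not math.isfinite(value):
            problems.append(f"{name} is not a finite number")
    return problems


# Stand-in for the saved metadata (same keys, fewer features for brevity).
demo_metadata = json.loads(
    '{"n_features": 3, "feature_names": ["feature_0", "feature_1", "feature_2"]}'
)

print(check_features([0.1, 0.2, 0.3], demo_metadata))
print(check_features([0.1, float("nan")], demo_metadata))
```

Returning a list of problems rather than raising on the first one lets an API report every issue in a single response.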

## 3. Building Your First ML API with FastAPI

Let's create a simple prediction API. In a real deployment, this code would be in a separate Python file (e.g., `main.py`).

In [None]:
if fastapi_available:
    from fastapi import FastAPI, HTTPException
    from pydantic import BaseModel, Field
    
    # Create FastAPI app
    app = FastAPI(
        title="Fraud Detection API",
        description="API for detecting fraudulent transactions using ML",
        version="1.0.0"
    )
    
    # Load model at startup
    model = joblib.load(model_path)
    with open(metadata_path, 'r') as f:
        model_metadata = json.load(f)
    
    print("✓ FastAPI app created")
    print(f"✓ Model loaded: {model_metadata['model_name']} v{model_metadata['version']}")
else:
    print("Skipping FastAPI demo (not installed)")

## 4. Defining Request and Response Models with Pydantic

**Pydantic** provides automatic validation and serialization for API requests/responses.

In [None]:
if fastapi_available:
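    # Define request model (input validation)
    # Note: min_items/max_items and Config.schema_extra are Pydantic v1 names.
    # Pydantic v2 renames them to min_length/max_length and json_schema_extra.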
    class PredictionRequest(BaseModel):
        """Request model for fraud prediction."""
        
        features: List[float] = Field(
            ...,  # Required field
            description="List of feature values",
            min_items=10,
            max_items=10,
            example=[0.5, 1.2, -0.3, 0.8, 1.5, -0.2, 0.9, 1.1, -0.5, 0.7]
        )
        
        class Config:
            schema_extra = {
                "example": {
                    "features": [0.5, 1.2, -0.3, 0.8, 1.5, -0.2, 0.9, 1.1, -0.5, 0.7]
                }
            }
    
    # Define response model (output validation)
    class PredictionResponse(BaseModel):
        """Response model for fraud prediction."""
        
        prediction: str = Field(
            ...,
            description="Predicted class (legitimate or fraud)"
        )
        confidence: float = Field(
            ...,
            description="Prediction confidence (0-1)",
            ge=0.0,
            le=1.0
        )
        probabilities: Dict[str, float] = Field(
            ...,
            description="Class probabilities"
        )
        model_version: str = Field(
            ...,
            description="Model version used for prediction"
        )
    
    print("✓ Pydantic models defined")
    print("  - PredictionRequest: validates input features")
    print("  - PredictionResponse: validates output format")
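
Pydantic models can also be exercised directly, without an API around them, which makes the validation behavior easier to see. The following is a minimal sketch: `FeatureVector` is a hypothetical, pared-down stand-in for `PredictionRequest`, kept deliberately simple so it should behave the same way under Pydantic v1 and v2.

```python
from typing import List

from pydantic import BaseModel, ValidationError


class FeatureVector(BaseModel):
    """Hypothetical stand-in for PredictionRequest with a single field."""
    features: List[float]


# Valid input: integers are coerced to floats.
ok = FeatureVector(features=[1, 2, 3])
print(ok.features)

# Invalid input: a string where a list is required.
try:
    FeatureVector(features="not a list")
except ValidationError as exc:
    first_error = exc.errors()[0]
    # "loc" names the offending field; the "msg" wording depends on the version.
    print(first_error["loc"], first_error["msg"])
```

FastAPI performs the same validation on incoming request bodies and converts a `ValidationError` into an error response.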

## 5. Creating API Endpoints

### Essential Endpoints for ML APIs:
1. **Health Check** (`/health`): Service status
2. **Model Info** (`/model/info`): Model metadata
3. **Prediction** (`/predict`): Make predictions
4. **Batch Prediction** (`/predict/batch`): Multiple predictions

In [None]:
if fastapi_available:
    # 1. Health check endpoint
    @app.get("/health")
    def health_check():
        """Check if API is running and model is loaded."""
        return {
            "status": "healthy",
            "timestamp": datetime.now().isoformat(),
            "model_loaded": model is not None
        }
    
    # 2. Model information endpoint
    @app.get("/model/info")
    def get_model_info():
        """Get information about the loaded model."""
        return model_metadata
    
    print("✓ Endpoints defined:")
    print("  GET  /health - Health check")
    print("  GET  /model/info - Model information")

In [None]:
if fastapi_available:
    # 3. Prediction endpoint
    @app.post("/predict", response_model=PredictionResponse)
    def predict(request: PredictionRequest):
        """
        Make a fraud prediction for a single transaction.
        
        - **features**: List of 10 numeric feature values
        
        Returns prediction, confidence, and class probabilities.
        """
        try:
            # Convert features to numpy array
            features_array = np.array(request.features).reshape(1, -1)
            
            # Make prediction
            prediction = model.predict(features_array)[0]
            probabilities = model.predict_proba(features_array)[0]
            
            # Get class name
            class_name = model_metadata['classes'][prediction]
            
            # Get confidence (max probability)
            confidence = float(max(probabilities))
            
            # Create response
            return PredictionResponse(
                prediction=class_name,
                confidence=confidence,
                probabilities={
                    model_metadata['classes'][0]: float(probabilities[0]),
                    model_metadata['classes'][1]: float(probabilities[1])
                },
                model_version=model_metadata['version']
            )
            
        except Exception as e:
            raise HTTPException(
                status_code=500,
                detail=f"Prediction error: {str(e)}"
            )
    
    print("✓ Prediction endpoint defined:")
    print("  POST /predict - Single prediction")
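
The endpoint above hardcodes two classes when it builds the `probabilities` dictionary. As a sketch of how that mapping could be generalized to any number of classes, the hypothetical helper below pairs class names with probabilities using `zip`. Note one assumption: it takes the highest-probability class as the prediction, rather than calling `model.predict` separately.

```python
from typing import Dict, List, Sequence


def format_prediction(probabilities: Sequence[float], classes: List[str]) -> Dict[str, object]:
    """Map one row of class probabilities to a JSON-friendly dict."""
    if len(probabilities) != len(classes):
        raise ValueError("one probability is required per class")
    # Cast to built-in float so NumPy scalars serialize cleanly as JSON.
    by_class = {name: float(p) for name, p in zip(classes, probabilities)}
    best = max(by_class, key=by_class.get)
    return {
        "prediction": best,
        "confidence": by_class[best],
        "probabilities": by_class,
    }


formatted = format_prediction([0.82, 0.18], ["legitimate", "fraud"])
print(formatted)
```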

In [None]:
if fastapi_available:
    # 4. Batch prediction endpoint
    class BatchPredictionRequest(BaseModel):
        """Request model for batch predictions."""
        transactions: List[List[float]] = Field(
            ...,
            description="List of transactions, each with 10 features"
        )
    
    @app.post("/predict/batch")
    def predict_batch(request: BatchPredictionRequest):
        """
        Make fraud predictions for multiple transactions.
        
        Returns a list of predictions.
        """
        try:
            # Validate feature count for each transaction
            for i, transaction in enumerate(request.transactions):
                if len(transaction) != model_metadata['n_features']:
                    raise HTTPException(
                        status_code=400,
                        detail=f"Transaction {i}: expected {model_metadata['n_features']} features, "
                               f"got {len(transaction)}"
                    )
            
            # Convert to numpy array
            features_array = np.array(request.transactions)
            
            # Make predictions
            predictions = model.predict(features_array)
            probabilities = model.predict_proba(features_array)
            
            # Format results
            results = []
            for i, (pred, probs) in enumerate(zip(predictions, probabilities)):
                results.append({
                    "transaction_id": i,
                    "prediction": model_metadata['classes'][pred],
                    "confidence": float(max(probs)),
                    "probabilities": {
                        model_metadata['classes'][0]: float(probs[0]),
                        model_metadata['classes'][1]: float(probs[1])
                    }
                })
            
            return {
                "model_version": model_metadata['version'],
                "n_transactions": len(results),
                "predictions": results
            }
            
        except HTTPException:
            raise
        except Exception as e:
            raise HTTPException(
                status_code=500,
                detail=f"Batch prediction error: {str(e)}"
            )
    
    print("✓ Batch prediction endpoint defined:")
    print("  POST /predict/batch - Batch predictions")

## 6. Testing API Endpoints

We can test the API with FastAPI's `TestClient`, which sends requests to the app in-process, so no server needs to be running.

In [None]:
if fastapi_available:
    from fastapi.testclient import TestClient
    
    # Create test client
    client = TestClient(app)
    
    print("✓ Test client created")
    print("\nTesting API endpoints...\n")

In [None]:
if fastapi_available:
    # Test 1: Health check
    response = client.get("/health")
    print("Test 1: Health Check")
    print("="*60)
    print(f"Status Code: {response.status_code}")
    print(f"Response: {json.dumps(response.json(), indent=2)}")
    print()

In [None]:
if fastapi_available:
    # Test 2: Model info
    response = client.get("/model/info")
    print("Test 2: Model Info")
    print("="*60)
    print(f"Status Code: {response.status_code}")
    print(f"Response: {json.dumps(response.json(), indent=2)}")
    print()

In [None]:
if fastapi_available:
    # Test 3: Single prediction
    test_features = X_test[0].tolist()
    
    response = client.post(
        "/predict",
        json={"features": test_features}
    )
    
    print("Test 3: Single Prediction")
    print("="*60)
    print(f"Status Code: {response.status_code}")
    print(f"Input features: {test_features}")
    print(f"Response: {json.dumps(response.json(), indent=2)}")
    print()

In [None]:
if fastapi_available:
    # Test 4: Batch prediction
    test_batch = X_test[:5].tolist()
    
    response = client.post(
        "/predict/batch",
        json={"transactions": test_batch}
    )
    
    print("Test 4: Batch Prediction")
    print("="*60)
    print(f"Status Code: {response.status_code}")
    print(f"Number of transactions: {len(test_batch)}")
    result = response.json()
    print(f"Model version: {result['model_version']}")
    print(f"\nFirst 3 predictions:")
    for pred in result['predictions'][:3]:
        print(f"  Transaction {pred['transaction_id']}: {pred['prediction']} "
              f"(confidence: {pred['confidence']:.2f})")
    print()

## 7. Error Handling and Validation

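FastAPI validates each request body against its Pydantic model before the endpoint function runs. Invalid requests are rejected with HTTP 422 (Unprocessable Entity) and a JSON body that describes each failed field.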

In [None]:
if fastapi_available:
    # Test invalid input (wrong number of features)
    print("Test: Invalid Input - Wrong Number of Features")
    print("="*60)
    
    response = client.post(
        "/predict",
        json={"features": [1.0, 2.0, 3.0]}  # Only 3 features instead of 10
    )
    
    print(f"Status Code: {response.status_code}")
    print(f"Error Response: {json.dumps(response.json(), indent=2)}")
    print("\n✓ FastAPI automatically validates input and returns helpful errors")
    print()

In [None]:
if fastapi_available:
    # Test invalid input (wrong data type)
    print("Test: Invalid Input - Wrong Data Type")
    print("="*60)
    
    response = client.post(
        "/predict",
        json={"features": "not a list"}  # String instead of list
    )
    
    print(f"Status Code: {response.status_code}")
    print(f"Error Response: {json.dumps(response.json(), indent=2)}")
    print()
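
A 422 response body contains a `detail` list in which each entry has `loc`, `msg`, and `type` keys. A client can flatten that structure into readable messages. The sketch below uses only the standard library. `summarize_validation_errors` is a hypothetical helper, and the sample payload is illustrative: the exact `msg` and `type` strings differ between Pydantic versions.

```python
from typing import Any, Dict, List


def summarize_validation_errors(body: Dict[str, Any]) -> List[str]:
    """Turn an error response body into one readable line per problem."""
    detail = body.get("detail", [])
    if not isinstance(detail, list):
        # HTTPException responses carry a plain string in "detail".
        return [str(detail)]
    lines = []
    for error in detail:
        # "loc" is a path such as ["body", "features", 2]; join it for display.
        location = ".".join(str(part) for part in error.get("loc", []))
        lines.append(f"{location}: {error.get('msg', 'invalid value')}")
    return lines


# Illustrative payload only; real messages vary by library version.
sample_body = {
    "detail": [
        {
            "loc": ["body", "features"],
            "msg": "value is not a valid list",
            "type": "type_error.list",
        }
    ]
}

for line in summarize_validation_errors(sample_body):
    print(line)
```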

## 8. Creating Complete API File

Let's create a production-ready API file that can be deployed.

In [None]:
# Create a complete API file for deployment
api_code = '''
"""
Fraud Detection API

A production-ready FastAPI application for fraud detection.
"""

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel, Field
from typing import List, Dict
import joblib
import numpy as np
import json
from pathlib import Path
from datetime import datetime

# Initialize FastAPI app
app = FastAPI(
    title="Fraud Detection API",
    description="ML-powered fraud detection for transactions",
    version="1.0.0",
    docs_url="/docs",
    redoc_url="/redoc"
)

# Load model at startup
MODEL_PATH = Path("api_models/fraud_detector.joblib")
METADATA_PATH = Path("api_models/fraud_detector_metadata.json")

try:
    model = joblib.load(MODEL_PATH)
    with open(METADATA_PATH, "r") as f:
        model_metadata = json.load(f)
    print(f"✓ Model loaded: {model_metadata['model_name']} v{model_metadata['version']}")
except Exception as e:
    print(f"✗ Error loading model: {e}")
    model = None
    model_metadata = {}

# Request/Response models
class PredictionRequest(BaseModel):
    features: List[float] = Field(
        ...,
        min_items=10,
        max_items=10,
        description="Transaction features"
    )

class PredictionResponse(BaseModel):
    prediction: str
    confidence: float = Field(ge=0.0, le=1.0)
    probabilities: Dict[str, float]
    model_version: str

# Endpoints
@app.get("/")
def root():
    return {
        "message": "Fraud Detection API",
        "version": "1.0.0",
        "docs": "/docs"
    }

@app.get("/health")
def health_check():
    return {
        "status": "healthy" if model is not None else "unhealthy",
        "timestamp": datetime.now().isoformat(),
        "model_loaded": model is not None
    }

@app.get("/model/info")
def get_model_info():
    if not model:
        raise HTTPException(status_code=503, detail="Model not loaded")
    return model_metadata

@app.post("/predict", response_model=PredictionResponse)
def predict(request: PredictionRequest):
    if not model:
        raise HTTPException(status_code=503, detail="Model not loaded")

    try:
        features = np.array(request.features).reshape(1, -1)
        prediction = model.predict(features)[0]
        probabilities = model.predict_proba(features)[0]

        return PredictionResponse(
            prediction=model_metadata['classes'][prediction],
            confidence=float(max(probabilities)),
            probabilities={
                model_metadata['classes'][0]: float(probabilities[0]),
                model_metadata['classes'][1]: float(probabilities[1])
            },
            model_version=model_metadata['version']
        )
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)
'''

# Save API file
api_file = Path('main.py')
# Write as UTF-8 explicitly: the file contains non-ASCII characters (✓, ✗)
with open(api_file, 'w', encoding='utf-8') as f:
    f.write(api_code)

print(f"✓ Production API file created: {api_file}")
print("\nTo run the API:")
print("  uvicorn main:app --reload")
print("\nThen visit:")
print("  http://localhost:8000/docs - Interactive API documentation")
print("  http://localhost:8000/redoc - Alternative documentation")
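
Once the server is running, any HTTP client can call it. As a sketch of what a client-side call could look like using only the standard library, the hypothetical helper below builds a `POST /predict` request for the `http://localhost:8000` address shown above. The request is only constructed here; the lines that send it are commented out because they need a live server.

```python
import json
import urllib.request
from typing import List


def build_predict_request(base_url: str, features: List[float]) -> urllib.request.Request:
    """Prepare (but do not send) a POST /predict request with a JSON body."""
    payload = json.dumps({"features": features}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


predict_request = build_predict_request(
    "http://localhost:8000",
    [0.5, 1.2, -0.3, 0.8, 1.5, -0.2, 0.9, 1.1, -0.5, 0.7],
)
print(predict_request.get_method(), predict_request.full_url)

# With the server running, the request could be sent like this:
# with urllib.request.urlopen(predict_request, timeout=5) as response:
#     print(json.load(response))
```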

## 9. Exercises

### Exercise 1: Add Logging to API

Enhance the API with comprehensive logging.

**Requirements:**
1. Add logging for all requests and responses
2. Log prediction inputs and outputs
3. Log errors with stack traces
4. Create a `/logs` endpoint to view recent logs

**Bonus**: Save logs to a file with rotation

In [None]:
# Your solution here
import logging

# TODO: Implement logging
# 1. Configure logger
# 2. Add logging to endpoints
# 3. Create logs endpoint

### Exercise 2: Add Rate Limiting

Implement rate limiting to prevent API abuse.

**Requirements:**
1. Limit requests to 100 per hour per IP address
2. Return HTTP 429 (Too Many Requests) when limit exceeded
3. Include rate limit info in response headers
4. Create a `/rate-limit/status` endpoint to check current usage

**Hint**: Use `slowapi` library or implement custom middleware

In [None]:
# Your solution here

# TODO: Implement rate limiting
# 1. Track requests per IP
# 2. Add middleware
# 3. Return 429 when exceeded

### Exercise 3: Create Model Comparison Endpoint

Build an endpoint that compares predictions from multiple model versions.

**Requirements:**
1. Load two different model versions
2. Create a `/compare` endpoint that:
   - Accepts the same input
   - Returns predictions from both models
   - Shows agreement/disagreement
3. Add visualization of prediction differences

**Bonus**: Add metrics on how often models agree/disagree

In [None]:
# Your solution here

# TODO: Implement model comparison
# 1. Load multiple models
# 2. Create comparison endpoint
# 3. Compare predictions

## 10. Summary

### Key Concepts Covered

1. **FastAPI Basics**: Created a modern ML API with automatic documentation
2. **Request Validation**: Used Pydantic models for type checking and validation
3. **Error Handling**: Implemented proper HTTP status codes and error messages
4. **API Testing**: Tested endpoints using FastAPI's TestClient
5. **Production Readiness**: Created deployable API file with health checks

### API Design Best Practices

- ✅ **Use semantic HTTP methods** (GET for retrieval, POST for predictions)
- ✅ **Validate all inputs** with Pydantic models
- ✅ **Return appropriate status codes** (200 for success, 400 or 422 for invalid input, 500 for server errors, 503 when the model is unavailable)
- ✅ **Include health check endpoint** for monitoring
- ✅ **Version your API** (in URL or headers)
- ✅ **Document everything** (FastAPI does this automatically)
- ✅ **Log requests and errors** for debugging
- ✅ **Handle errors gracefully** with helpful messages

### Essential Endpoints for ML APIs

```
GET  /health        - Health check
GET  /model/info    - Model metadata
POST /predict       - Single prediction
POST /predict/batch - Batch predictions
GET  /docs          - Interactive API documentation
```

### Common Pitfalls to Avoid

- ❌ Not validating input data
- ❌ Returning 200 OK for errors
- ❌ Loading model on every request (load at startup)
- ❌ Not handling edge cases (empty input, wrong types)
- ❌ Missing health check endpoint
- ❌ No request logging
- ❌ Exposing internal errors to users
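
To avoid the last pitfall, an endpoint can log the full exception on the server and return only a generic message to the client. The sketch below shows one way to do this with the standard library. `safe_error_detail` and the `fraud_api` logger name are hypothetical, and the short reference ID is an assumption added so that a client report can be matched to a log entry.

```python
import logging
import uuid

api_logger = logging.getLogger("fraud_api")


def safe_error_detail(exc: Exception) -> str:
    """Log the full exception server-side and return a generic client message."""
    error_id = uuid.uuid4().hex[:8]
    # exc_info attaches the traceback to the log record.
    api_logger.error("prediction failed [%s]", error_id, exc_info=exc)
    return f"Internal prediction error (reference {error_id})"


try:
    raise ValueError("X has 3 features, but the model expects 10")
except ValueError as exc:
    detail = safe_error_detail(exc)

print(detail)

# In an endpoint, this string would take the place of str(e):
# raise HTTPException(status_code=500, detail=safe_error_detail(e))
```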

### What's Next

In **Module 05: Containerization with Docker**, we'll learn:
- Creating Dockerfiles for ML applications
- Building and running Docker containers
- Docker Compose for multi-container applications
- Best practices for containerizing ML models

### Additional Resources

- **FastAPI Documentation**: https://fastapi.tiangolo.com/
- **Pydantic Documentation**: https://docs.pydantic.dev/
- **REST API Best Practices**: https://restfulapi.net/
- **API Design Guide**: https://cloud.google.com/apis/design

---

## Next Steps

Proceed to **Module 05: Containerization with Docker** to learn how to package your API in containers for consistent deployment.

**Before moving on, ensure you can:**
- ✅ Create FastAPI applications for ML model serving
- ✅ Define request/response models with Pydantic
- ✅ Implement proper error handling
- ✅ Test API endpoints programmatically
- ✅ Create health check and metadata endpoints
- ✅ Generate automatic API documentation