# Part IX: Production Deployment

## Chapter 20: Deployment Strategies

Deploying FastAPI applications to production requires careful orchestration of process management, reverse proxy configuration, and cloud infrastructure. This chapter bridges the gap between containerized applications and live production environments, covering traditional process managers, reverse proxy setup, modern cloud platforms, and automated deployment pipelines.

---

### 20.1 Process Managers: Using Gunicorn with Uvicorn Workers

In production, FastAPI applications should run under a process manager that supervises worker processes: spawning them, restarting them when they crash, and reloading them gracefully. Gunicorn (Green Unicorn) is a widely used WSGI server for Python; paired with Uvicorn's worker class, it can also manage ASGI applications such as FastAPI.

#### Understanding the Gunicorn Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                    Gunicorn Process Architecture                   │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  Master Process (Gunicorn)                                       │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │  • Manages worker processes                             │    │
│  │  • Handles signals (HUP, TERM, USR1, etc.)             │    │
│  │  • Monitors workers (restarts crashed workers)          │    │
│  │  • Workers accept from a shared socket (OS balances)    │    │
│  └─────────────────────────────────────────────────────────┘    │
│       │                                                          │
│       │ Forks                                                   │
│       ▼                                                          │
│  ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐              │
│  │ Worker 0│ │ Worker 1│ │ Worker 2│ │ Worker 3│ ...            │
│  │ (Uvicorn│ │ (Uvicorn│ │ (Uvicorn│ │ (Uvicorn│                │
│  │ Worker) │ │ Worker) │ │ Worker) │ │ Worker) │                │
│  └────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘                │
│       │           │           │           │                      │
│       └───────────┴───────────┴───────────┘                      │
│                   │                                              │
│                   ▼                                              │
│            Event Loop (per worker)                              │
│       ┌─────────────────────────┐                                │
│       │  FastAPI Application    │                                │
│       │  • Request handling     │                                │
│       │  • Database connections │                                │
│       │  • Async operations     │                                │
│       └─────────────────────────┘                                │
│                                                                  │
│  Worker Types:                                                   │
│  • sync: Traditional synchronous workers (WSGI)                  │
│  • eventlet/gevent: Green thread workers                         │
│  • uvicorn.workers.UvicornWorker: ASGI async (recommended)      │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘
```

#### Production Gunicorn Configuration

```python
# gunicorn.conf.py - Production configuration
import os
import multiprocessing
import logging

# Server socket binding
bind = os.getenv("BIND", "0.0.0.0:8000")

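# Worker processes
# Gunicorn's general starting point is (2 x CPU cores) + 1. Each Uvicorn worker
# runs its own event loop, so one worker per core is often enough for async
# code; measure under load and override with WEB_CONCURRENCY.
workers = int(os.getenv("WEB_CONCURRENCY", multiprocessing.cpu_count() * 2 + 1))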

# Worker class - CRITICAL for FastAPI/ASGI
worker_class = "uvicorn.workers.UvicornWorker"

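# Alternative worker classes:
# - uvicorn.workers.UvicornWorker: uses uvloop and httptools when installed
#   (pip install "uvicorn[standard]"), otherwise falls back to asyncio and h11
# - uvicorn.workers.UvicornH11Worker: pure-Python asyncio + h11 (slower)
# Note: recent Uvicorn releases deprecate uvicorn.workers in favor of the
# separate uvicorn-worker package; check the version you deploy.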

# Worker connections (for eventlet/gevent, ignored by UvicornWorker)
worker_connections = int(os.getenv("WORKER_CONNECTIONS", "1000"))

# Maximum requests before worker restart (prevents memory leaks)
max_requests = int(os.getenv("MAX_REQUESTS", "1000"))
max_requests_jitter = int(os.getenv("MAX_REQUESTS_JITTER", "50"))
# Jitter prevents all workers from restarting simultaneously

# Timeouts (seconds)
timeout = int(os.getenv("TIMEOUT", "30"))  # Worker silence timeout
graceful_timeout = int(os.getenv("GRACEFUL_TIMEOUT", "30"))  # Shutdown wait
keepalive = int(os.getenv("KEEPALIVE", "2"))  # Persistent connection timeout

# Logging configuration
accesslog = "-"  # stdout
errorlog = "-"   # stdout
loglevel = os.getenv("LOG_LEVEL", "info")
access_log_format = '%(h)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s "%(f)s" "%(a)s" %(D)s %(p)s'

# Process naming
proc_name = "fastapi_app"

# Server mechanics
daemon = False  # Run in foreground (Docker best practice)
pidfile = "/tmp/gunicorn.pid"

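# Proxy headers (TLS itself is usually terminated at the reverse proxy)
# "*" trusts X-Forwarded-* headers from any peer. That is only safe when this
# port is reachable solely through the proxy; otherwise list the proxy's IPs.
forwarded_allow_ips = "*"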
secure_scheme_headers = {
    'X-FORWARDED-PROTOCOL': 'ssl',
    'X-FORWARDED-PROTO': 'https',
    'X-FORWARDED-SSL': 'on'
}

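# Preload application for memory efficiency
# Loads application code before forking workers (copy-on-write).
# Avoid opening database or Redis connections at import time: sockets created
# before the fork would be shared by every worker.
preload_app = True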

# Worker temporary directory
# /dev/shm is tmpfs (RAM), faster than disk
worker_tmp_dir = "/dev/shm"

# Server hooks for lifecycle management
# Hooks log through Gunicorn's own logger (server.log / worker.log) so messages
# honor loglevel and errorlog; the root logger is not configured in this file.
def on_starting(server):
    """Called just before the master process is initialized."""
    server.log.info("Gunicorn starting...")

def on_reload(server):
    """Called when receiving SIGHUP (configuration reload)."""
    server.log.info("Configuration reload requested")

def when_ready(server):
    """Called just after the server is started."""
    server.log.info("Gunicorn ready with %s workers", server.num_workers)

def pre_fork(server, worker):
    """Called just before a worker is forked."""
    pass

def post_worker_init(worker):
    """Called just after a worker has initialized the application."""
    # Set up worker-specific resources (database connections, etc.)
    pass

def worker_int(worker):
    """Called just after a worker exited on SIGINT or SIGQUIT."""
    worker.log.warning("Worker %s interrupted", worker.pid)

def worker_abort(worker):
    """Called when a worker receives SIGABRT (typically after a timeout)."""
    worker.log.error("Worker %s aborted", worker.pid)

def on_exit(server):
    """Called just before exiting Gunicorn."""
    server.log.info("Gunicorn shutting down...")
```
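
The worker-count rule from the configuration can be checked in isolation. The sketch below is a minimal illustration, assuming a hypothetical helper named `recommended_workers` (it is not part of Gunicorn). It applies the `(2 x cores) + 1` starting point and lets `WEB_CONCURRENCY` override it, mirroring the `workers` line in the config file.

```python
import multiprocessing
import os


def recommended_workers(cpu_cores=None, env=None):
    """Return WEB_CONCURRENCY if set, else the (2 x cores) + 1 starting point.

    Hypothetical helper for illustration; Gunicorn does not ship this function.
    """
    env = os.environ if env is None else env
    override = env.get("WEB_CONCURRENCY")
    if override:
        return max(1, int(override))  # never return fewer than one worker
    cores = cpu_cores if cpu_cores is not None else multiprocessing.cpu_count()
    return 2 * cores + 1


print(recommended_workers(cpu_cores=4, env={}))                        # 9
print(recommended_workers(cpu_cores=4, env={"WEB_CONCURRENCY": "3"}))  # 3
```

In containers with CPU limits, `cpu_count()` may report the host's cores rather than the container's share, so setting `WEB_CONCURRENCY` explicitly is usually the safer choice.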

**Running Gunicorn:**

```bash
# Basic command
gunicorn -c gunicorn.conf.py app.main:app

# With auto-reload (development only; the reloader is incompatible with preload_app = True)
gunicorn -c gunicorn.conf.py app.main:app --reload

# With custom workers
gunicorn -w 4 -k uvicorn.workers.UvicornWorker app.main:app

# With SSL (if not using reverse proxy)
gunicorn -c gunicorn.conf.py --certfile=/path/to/cert.pem --keyfile=/path/to/key.pem app.main:app
```

#### Graceful Shutdown Handling

```python
# lifespan.py - Handling graceful shutdown
import logging
from contextlib import asynccontextmanager

from fastapi import FastAPI

logger = logging.getLogger(__name__)


# Hypothetical resource helpers: replace the bodies with your own setup/teardown.
async def init_database() -> None: ...
async def init_redis() -> None: ...
async def close_redis() -> None: ...
async def close_database() -> None: ...


@asynccontextmanager
async def lifespan(app: FastAPI):
    """
    Lifespan context manager with graceful shutdown support.

    On shutdown, Gunicorn sends SIGTERM to each worker. Uvicorn stops accepting
    connections, lets in-flight requests finish (bounded by graceful_timeout),
    and only then runs the code after `yield`.
    """
    # Startup
    logger.info("Starting up...")
    await init_database()
    await init_redis()

    yield  # Application runs here

    # Shutdown: release resources in reverse order of acquisition
    logger.info("Shutting down gracefully...")
    await close_redis()
    await close_database()
    logger.info("Shutdown complete")


app = FastAPI(lifespan=lifespan)
```
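
If the application starts its own background tasks, the shutdown phase may also need to wait for them. The sketch below uses only the standard library and assumes a hypothetical `TaskTracker` helper (not a FastAPI or Gunicorn API). It remembers spawned tasks, waits for them up to a deadline, and cancels whatever is still running. A lifespan function could call `drain()` after its `yield`.

```python
import asyncio


class TaskTracker:
    """Hypothetical helper: tracks background tasks so shutdown can wait for them."""

    def __init__(self):
        self._tasks = set()

    def spawn(self, coro):
        """Schedule a coroutine and remember the task until it finishes."""
        task = asyncio.create_task(coro)
        self._tasks.add(task)
        task.add_done_callback(self._tasks.discard)
        return task

    async def drain(self, timeout):
        """Wait up to `timeout` seconds, cancel leftovers, return how many were cancelled."""
        if not self._tasks:
            return 0
        _done, pending = await asyncio.wait(self._tasks, timeout=timeout)
        for task in pending:
            task.cancel()
        if pending:
            # Let the cancelled tasks unwind before reporting
            await asyncio.gather(*pending, return_exceptions=True)
        return len(pending)


async def demo():
    tracker = TaskTracker()
    finished = []

    async def job(name, seconds):
        await asyncio.sleep(seconds)
        finished.append(name)

    tracker.spawn(job("quick", 0.01))
    tracker.spawn(job("slow", 10))
    cancelled = await tracker.drain(timeout=0.2)
    return cancelled, finished


print(asyncio.run(demo()))  # (1, ['quick'])
```

The drain deadline should stay below Gunicorn's `graceful_timeout`, otherwise the master may kill the worker before cleanup finishes.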

---

### 20.2 Reverse Proxies: Nginx Configuration for FastAPI

Nginx acts as a reverse proxy, handling client connections, SSL termination, static file serving, and load balancing before forwarding requests to Gunicorn.

#### Nginx Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                    Nginx Reverse Proxy Setup                       │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  Client                                                          │
│    │                                                             │
│    │ HTTPS (443)                                                │
│    ▼                                                             │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │                    Nginx                                 │    │
│  │  ┌─────────────────────────────────────────────────────┐ │    │
│  │  │  SSL/TLS Termination                               │ │    │
│  │  │  • Certificate validation                          │ │    │
│  │  │  • TLS 1.3, HTTP/2                                │ │    │
│  │  └─────────────────────────────────────────────────────┘ │    │
│  │                          │                               │    │
│  │  ┌───────────────────────▼─────────────────────────────┐ │    │
│  │  │  Static File Serving                               │ │    │
│  │  │  • /static/* → /var/www/static                     │ │    │
│  │  │  • Cache headers, gzip compression                  │ │    │
│  │  └─────────────────────────────────────────────────────┘ │    │
│  │                          │                               │    │
│  │  ┌───────────────────────▼─────────────────────────────┐ │    │
│  │  │  Reverse Proxy to Gunicorn                          │ │    │
│  │  │  • Load balancing (upstream)                      │ │    │
│  │  │  • Rate limiting                                   │ │    │
│  │  │  • Buffering                                       │ │    │
│  │  │  • Retries, timeouts                               │ │    │
│  │  └─────────────────────────────────────────────────────┘ │    │
│  └─────────────────────────────────────────────────────────┘    │
│       │                                                          │
│       │ HTTP (8000)                                              │
│       ▼                                                          │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │                Gunicorn + Uvicorn                        │    │
│  │                FastAPI Application                       │    │
│  └─────────────────────────────────────────────────────────┘    │
│                                                                  │
│  Benefits:                                                       │
│  • Nginx handles slow clients (buffering)                        │
│  • SSL certificates managed in one place                          │
│  • Static files served efficiently (sendfile)                    │
│  • Multiple app instances load balanced                          │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘
```

#### Production Nginx Configuration

```nginx
# /etc/nginx/nginx.conf - Main configuration
user nginx;
worker_processes auto;  # Match CPU cores
error_log /var/log/nginx/error.log warn;
pid /var/run/nginx.pid;

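# Load dynamic modules
# headers-more provides the more_clear_headers directive used further down.
# It is not bundled with the stock nginx images; install it first, or remove
# this line together with the more_clear_headers directive.
load_module modules/ngx_http_headers_more_filter_module.so;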

events {
    worker_connections 4096;  # Max connections per worker
    use epoll;  # Efficient on Linux
    multi_accept on;  # Accept multiple connections per cycle
}

http {
    # Basic Settings
    include /etc/nginx/mime.types;
    default_type application/octet-stream;
    
    # Logging format
    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for" '
                    '$request_time $upstream_response_time';
    
    access_log /var/log/nginx/access.log main;
    
    # Performance
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;
    
    # Gzip compression
    gzip on;
    gzip_vary on;
    gzip_proxied any;
    gzip_comp_level 6;
    gzip_types text/plain text/css text/xml application/json application/javascript application/rss+xml application/atom+xml image/svg+xml;
    
    # Rate limiting zones
    limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;
    limit_conn_zone $binary_remote_addr zone=addr:10m;
    
    # Upstream (Gunicorn backend)
    upstream fastapi_app {
        # Load balancing method
        least_conn;  # Send to server with least connections
        
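        # One entry per Gunicorn instance. List only ports that are actually
        # listening. Under Docker Compose, use the service name instead
        # (server app:8000;) because 127.0.0.1 is the Nginx container itself.
        server 127.0.0.1:8000 weight=5;
        server 127.0.0.1:8001 weight=5;
        # server unix:/tmp/gunicorn.sock;  # Unix socket alternative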
        
        keepalive 32;  # Keep connections open
    }
    
    # HTTP to HTTPS redirect
    server {
        listen 80;
        server_name api.example.com;
        return 301 https://$server_name$request_uri;
    }
    
    # HTTPS Server
    server {
        # On Nginx 1.25.1+, prefer `listen 443 ssl;` plus a separate `http2 on;`
        listen 443 ssl http2;
        server_name api.example.com;
        
        # SSL Configuration
        ssl_certificate /etc/nginx/ssl/fullchain.pem;
        ssl_certificate_key /etc/nginx/ssl/privkey.pem;
        ssl_trusted_certificate /etc/nginx/ssl/chain.pem;
        
        # Modern SSL configuration
        ssl_protocols TLSv1.2 TLSv1.3;
        ssl_prefer_server_ciphers off;
        ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384;
        ssl_session_timeout 1d;
        ssl_session_cache shared:SSL:50m;
        ssl_stapling on;
        ssl_stapling_verify on;
        
        # Security headers
        add_header X-Frame-Options "SAMEORIGIN" always;
        add_header X-Content-Type-Options "nosniff" always;
        add_header X-XSS-Protection "1; mode=block" always;
        add_header Referrer-Policy "strict-origin-when-cross-origin" always;
        add_header Content-Security-Policy "default-src 'self'; script-src 'self';" always;
        
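        # Remove the Server header (requires the headers-more module).
        # Without that module, `server_tokens off;` at least hides the version.
        more_clear_headers Server;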
        
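        # Static files served by Nginx directly, bypassing FastAPI.
        # Note: add_header inside a location replaces the server-level
        # security headers; repeat them here if static responses need them.
        location /static/ {
            alias /var/www/static/;
            expires 1y;
            add_header Cache-Control "public, immutable";
            access_log off;
        }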
        
        # Health check endpoint (bypass rate limiting)
        location /health {
            access_log off;
            proxy_pass http://fastapi_app;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
        }
        
        # Main API location
        location / {
            # Rate limiting
            limit_req zone=api burst=20 nodelay;
            limit_conn addr 10;
            
            # Proxy to Gunicorn
            proxy_pass http://fastapi_app;
            proxy_http_version 1.1;
            
            # Headers
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
            proxy_set_header X-Request-ID $request_id;
            
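            # Timeouts: keep send/read slightly longer than Gunicorn's timeout
            # (30s) so that Gunicorn, not Nginx, decides when a request dies
            proxy_connect_timeout 30s;
            proxy_send_timeout 35s;
            proxy_read_timeout 35s;
            
            # Buffering shields workers from slow clients. Turn it off
            # (proxy_buffering off;) in locations that stream or use SSE.
            proxy_buffering on;
            proxy_buffer_size 4k;
            proxy_buffers 8 4k;
            proxy_busy_buffers_size 8k;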
            
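            # WebSocket support
            # Note: a fixed "upgrade" value is sent on every request, which
            # prevents reuse of the upstream keepalive connections. For mixed
            # traffic, derive the value with a `map` on $http_upgrade or give
            # WebSocket routes their own location.
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";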
        }
        
        # Error pages
        error_page 500 502 503 504 /50x.html;
        location = /50x.html {
            root /var/www/errors;
        }
    }
}
```
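
Because Nginx appends the peer address to `X-Forwarded-For`, the application can recover the real client IP, but only if it ignores entries supplied by untrusted peers. The sketch below shows one common approach under stated assumptions: `client_ip` is a hypothetical helper, and the trusted networks are examples. It walks the header from right to left and returns the first address that is not a known proxy.

```python
import ipaddress


def client_ip(remote_addr, x_forwarded_for, trusted_proxies):
    """Return the most plausible client address (hypothetical helper)."""
    networks = [ipaddress.ip_network(cidr) for cidr in trusted_proxies]

    def is_trusted(address):
        try:
            ip = ipaddress.ip_address(address)
        except ValueError:
            return False  # malformed entries are never treated as trusted proxies
        return any(ip in network for network in networks)

    if not is_trusted(remote_addr):
        return remote_addr  # header came from an unknown peer: ignore it

    hops = [hop.strip() for hop in (x_forwarded_for or "").split(",") if hop.strip()]
    for hop in reversed(hops):
        if not is_trusted(hop):
            return hop  # rightmost address that is not one of our proxies
    return hops[0] if hops else remote_addr


trusted = ["127.0.0.0/8", "10.0.0.0/8"]
print(client_ip("127.0.0.1", "203.0.113.7, 10.0.0.2", trusted))  # 203.0.113.7
print(client_ip("198.51.100.9", "1.2.3.4", trusted))             # 198.51.100.9
```

Reading the header right to left matters: a client can prepend arbitrary values, but it cannot control what the trusted proxies append.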

**Nginx Configuration Explained:**

1. **`upstream fastapi_app`**: Defines the backend pool. Can use Unix sockets (faster, same machine) or TCP (networked). `least_conn` distributes load to the server with fewest active connections.

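2. **`proxy_buffering on`**: Nginx reads the response from Gunicorn as quickly as it can into memory buffers (spilling to a temporary file if needed) and relays it to the client at the client's pace. This prevents slow clients from tying up Gunicorn workers. Disable it (`off`) for streaming responses such as Server-Sent Events (SSE).

3. **`proxy_read_timeout`**: The longest Nginx waits between two reads from the backend. Keep it longer than Gunicorn's `timeout` setting; otherwise Nginx gives up first and returns 504 Gateway Timeout for slow requests.

4. **`limit_req zone=api burst=20 nodelay`**: Rate limiting. The `api` zone allows a steady 10 requests per second per client IP; `burst=20` tolerates up to 20 requests above that rate, and `nodelay` serves them immediately instead of queueing them. Requests beyond the burst are rejected with 503 by default (`limit_req_status` can change this, for example to 429).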

5. **WebSocket headers**: `Upgrade` and `Connection` headers are essential for WebSocket proxying.
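
The interaction between `rate`, `burst`, and `nodelay` is easier to see in a small simulation. The sketch below is a simplified model of leaky-bucket accounting in the style of `limit_req`, not Nginx's actual implementation (which works in milliseconds on shared memory). `LeakyBucket` is a hypothetical class for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LeakyBucket:
    rate: float                   # steady rate in requests per second (rate=10r/s)
    burst: int                    # requests tolerated above the steady rate (burst=20)
    excess: float = 0.0           # how far the client currently is above the rate
    last: Optional[float] = None  # time of the last accepted request

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at `now` (seconds) is accepted."""
        if self.last is None:
            self.last = now
            return True
        elapsed = max(0.0, now - self.last)
        # The bucket drains at `rate`; each request adds one unit
        excess = max(0.0, self.excess - self.rate * elapsed + 1.0)
        if excess > self.burst:
            return False  # rejected; Nginx would answer 503 by default
        self.excess = excess
        self.last = now
        return True


bucket = LeakyBucket(rate=10, burst=20)
accepted = sum(bucket.allow(now=0.0) for _ in range(25))
print(accepted)                # 21: the first request plus a burst of 20
print(bucket.allow(now=0.05))  # False: too little has drained yet
print(bucket.allow(now=1.0))   # True: one second drains 10 units
```

In this model a client that sends 25 requests at once gets 21 through immediately and must then fall back to the steady rate.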

#### Docker Compose with Nginx

```yaml
# docker-compose.prod.yml with Nginx
version: "3.8"

services:
  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx/nginx.conf:/etc/nginx/nginx.conf:ro
      - ./nginx/ssl:/etc/nginx/ssl:ro
      - static_files:/var/www/static:ro
    depends_on:
      - app
    networks:
      - frontend
      - backend

  app:
    build: .
    expose:
      - "8000"
    environment:
      - DATABASE_URL=postgresql+asyncpg://postgres:postgres@db:5432/app
      - REDIS_URL=redis://redis:6379/0
    depends_on:
      - db
      - redis
    networks:
      - backend
    command: gunicorn -c gunicorn.conf.py app.main:app

  db:
    image: postgres:15-alpine
    volumes:
      - postgres_data:/var/lib/postgresql/data
    environment:
      - POSTGRES_PASSWORD=postgres
    networks:
      - backend

  redis:
    image: redis:7-alpine
    networks:
      - backend

volumes:
  postgres_data:
  static_files:

networks:
  frontend:
  backend:
    internal: true
```
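
The short `depends_on` form used above only controls start order; it does not wait until PostgreSQL or Redis accept connections. One option is a small wait loop in the container entrypoint. The sketch below assumes a hypothetical helper named `wait_for_port` and uses only the standard library.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused, DNS not ready yet, or connect timeout
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

An entrypoint script could call `wait_for_port("db", 5432)` and `wait_for_port("redis", 6379)` before starting Gunicorn. A successful TCP connection shows only that the port is open, so the application should still retry its first database query.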

---

### 20.3 Cloud Deployment: AWS, GCP, and Modern Platforms

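#### AWS Deployment (ECS with Fargate)

Save the following task definition as `ecs-task-definition.json`. JSON does not allow comments, so the filename is given here rather than inside the file.

```json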
{
  "family": "fastapi-app",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "512",
  "memory": "1024",
  "executionRoleArn": "arn:aws:iam::ACCOUNT:role/ecsTaskExecutionRole",
  "taskRoleArn": "arn:aws:iam::ACCOUNT:role/ecsTaskRole",
  "containerDefinitions": [
    {
      "name": "fastapi-app",
      "image": "ACCOUNT.dkr.ecr.REGION.amazonaws.com/fastapi-app:latest",
      "portMappings": [
        {
          "containerPort": 8000,
          "protocol": "tcp"
        }
      ],
      "secrets": [
        {
          "name": "DATABASE_URL",
          "valueFrom": "arn:aws:secretsmanager:REGION:ACCOUNT:secret:DATABASE_URL_SECRET"
        },
        {
          "name": "SECRET_KEY",
          "valueFrom": "arn:aws:secretsmanager:REGION:ACCOUNT:secret:SECRET_KEY_SECRET"
        }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/fastapi-app",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs"
        }
      },
      "healthCheck": {
        "command": ["CMD-SHELL", "curl -f http://localhost:8000/health || exit 1"],
        "interval": 30,
        "timeout": 5,
        "retries": 3,
        "startPeriod": 60
      }
    }
  ]
}
```
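
The `healthCheck` command above relies on `curl`, which slim Python base images may not include. A standard-library probe is one alternative. The sketch below assumes a hypothetical function named `is_healthy`; the URL mirrors the `/health` endpoint used in the task definition.

```python
import urllib.error
import urllib.request

# Bypass any configured HTTP proxy: the probe targets the local container
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def is_healthy(url="http://localhost:8000/health", timeout=3.0):
    """Return True if the endpoint answers with a 2xx status within `timeout` seconds."""
    try:
        with _opener.open(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # HTTP errors (4xx/5xx), refused connections, and timeouts all count as unhealthy
        return False
```

Saved as a script that ends with `sys.exit(0 if is_healthy() else 1)`, this could replace the `curl` call in the health check command.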

**AWS Services for FastAPI:**
- **ECS (Elastic Container Service)**: Run containers with Fargate (serverless) or EC2
- **RDS**: Managed PostgreSQL
- **ElastiCache**: Managed Redis
- **Application Load Balancer**: Distributes traffic across containers
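- **Secrets Manager**: Store database credentials and API keys; task definitions can inject them through a `secrets` block rather than plaintext `environment` values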
- **CloudWatch**: Logging and monitoring

#### Google Cloud Platform (Cloud Run)

```yaml
# cloudbuild.yaml - GCP Cloud Build
steps:
  # Build the container image
  - name: 'gcr.io/cloud-builders/docker'
    args: ['build', '-t', 'gcr.io/$PROJECT_ID/fastapi-app:$COMMIT_SHA', '.']
  
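  # Push the image to the registry
  # (gcr.io is Container Registry, which Google has deprecated in favor of
  # Artifact Registry; new projects typically use REGION-docker.pkg.dev paths)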
  - name: 'gcr.io/cloud-builders/docker'
    args: ['push', 'gcr.io/$PROJECT_ID/fastapi-app:$COMMIT_SHA']
  
  # Deploy to Cloud Run
  - name: 'gcr.io/google.com/cloudsdktool/cloud-sdk'
    entrypoint: gcloud
    args:
      - 'run'
      - 'deploy'
      - 'fastapi-app'
      - '--image'
      - 'gcr.io/$PROJECT_ID/fastapi-app:$COMMIT_SHA'
      - '--region'
      - 'us-central1'
      - '--platform'
      - 'managed'
      - '--allow-unauthenticated'
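      - '--set-env-vars'
      - 'DATABASE_URL=${_DATABASE_URL}'
      # Secrets come from Secret Manager rather than build substitutions.
      # "secret-key" is a placeholder name for a secret you create yourself.
      - '--set-secrets'
      - 'DB_PASSWORD=db-password:latest,SECRET_KEY=secret-key:latest'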
      - '--memory'
      - '1Gi'
      - '--cpu'
      - '1'
      - '--concurrency'
      - '80'
      - '--max-instances'
      - '10'

images:
  - 'gcr.io/$PROJECT_ID/fastapi-app:$COMMIT_SHA'

substitutions:
  _DATABASE_URL: postgresql+asyncpg://user@/app
```

**Cloud Run Features:**
- **Serverless**: Scales to zero (no cost when idle), auto-scales up
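- **HTTP/2, gRPC, and WebSockets**: Supported; end-to-end HTTP/2 to the container has to be enabled explicitly, and WebSocket connections are subject to the request timeout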
- **Global HTTPS**: Automatic SSL certificates
- **Traffic splitting**: A/B testing, gradual rollouts

#### Modern Platforms (Render, Fly.io, Railway)

**Render (Simplest):**
```yaml
# render.yaml - Render Blueprint
services:
  - type: web
    name: fastapi-app
    runtime: python
    buildCommand: pip install -r requirements.txt
    startCommand: gunicorn -c gunicorn.conf.py app.main:app
    envVars:
      - key: DATABASE_URL
        fromDatabase:
          name: postgres-db
          property: connectionString
      - key: SECRET_KEY
        generateValue: true
      - key: PYTHON_VERSION
        value: 3.11.0

databases:
  - name: postgres-db
    databaseName: fastapi_db
    user: fastapi_user
```

**Fly.io (Global Edge):**
```toml
# fly.toml - Fly.io Configuration
app = "fastapi-app"
primary_region = "iad"

[build]
  dockerfile = "Dockerfile"

[env]
  PORT = "8000"
  # Keep credentials out of fly.toml. Set DATABASE_URL as a secret instead, e.g.:
  #   fly secrets set DATABASE_URL="postgres://user:pass@top2.nearest.of.fastapi-db.internal:5432/app"

[http_service]
  internal_port = 8000
  force_https = true
  auto_stop_machines = true
  auto_start_machines = true
  min_machines_running = 0
  processes = ["app"]

[[vm]]
  cpu_kind = "shared"
  cpus = 1
  memory_mb = 1024

[deploy]
  strategy = "rolling"
```

**Railway (Developer Experience):**
```dockerfile
# Dockerfile (Railway uses Nixpacks by default, but Dockerfile works)
FROM python:3.11-slim

WORKDIR /app
COPY requirements.txt .
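RUN pip install --no-cache-dir -r requirements.txt

COPY . .
# Bind to all interfaces: Gunicorn's default (127.0.0.1:8000) is unreachable
# from outside the container. Railway supplies PORT; 8000 is a local fallback.
CMD ["sh", "-c", "exec gunicorn -w 4 -k uvicorn.workers.UvicornWorker --bind 0.0.0.0:${PORT:-8000} app.main:app"]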
```
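
Platforms such as Render and Railway typically tell the container which port to listen on through a `PORT` environment variable, while the `gunicorn.conf.py` from section 20.1 reads `BIND`. The two can be reconciled in the config file itself. The sketch below assumes a hypothetical helper named `resolve_bind`.

```python
import os


def resolve_bind(env=None, default_port="8000"):
    """Pick the bind address: explicit BIND wins, then PORT, then the default port."""
    env = os.environ if env is None else env
    explicit = env.get("BIND")
    if explicit:
        return explicit
    # Listen on all interfaces so the platform's router can reach the process
    return f"0.0.0.0:{env.get('PORT', default_port)}"


print(resolve_bind(env={}))                  # 0.0.0.0:8000
print(resolve_bind(env={"PORT": "10000"}))   # 0.0.0.0:10000
```

In `gunicorn.conf.py`, the line `bind = resolve_bind()` would then work unchanged on a laptop, in Docker, and on platforms that inject `PORT`.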

---

### 20.4 CI/CD Pipelines: GitHub Actions for Deployment

Automated deployment ensures consistent, tested releases with minimal manual intervention.

#### Complete GitHub Actions Workflow

```yaml
# .github/workflows/deploy.yml - Full deployment pipeline
name: CI/CD Pipeline

on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  test:
    runs-on: ubuntu-latest
    
    services:
      postgres:
        image: postgres:15
        env:
          POSTGRES_PASSWORD: postgres
          POSTGRES_DB: test_db
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
        ports:
          - 5432:5432
      
      redis:
        image: redis:7
        ports:
          - 6379:6379

    steps:
    - uses: actions/checkout@v4
    
    - name: Set up Python
      uses: actions/setup-python@v5
      with:
        python-version: '3.11'
    
    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        pip install -r requirements.txt
        pip install -r requirements-dev.txt
    
    - name: Lint with ruff
      run: |
        ruff check app/ tests/
        ruff format --check app/ tests/
    
    - name: Type check with mypy
      run: mypy app/ --strict
    
    - name: Test with pytest
      env:
        DATABASE_URL: postgresql+asyncpg://postgres:postgres@localhost:5432/test_db
        REDIS_URL: redis://localhost:6379/0
        SECRET_KEY: test-secret-key
      run: |
        pytest tests/ -v --cov=app --cov-report=xml --cov-report=term
    
    - name: Upload coverage
      uses: codecov/codecov-action@v3
      with:
        file: ./coverage.xml

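  build:
    needs: test
    # Pull requests only run the tests; images are built and pushed on pushes to main
    if: github.event_name == 'push'
    runs-on: ubuntu-latest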
    permissions:
      contents: read
      packages: write
    
    steps:
    - name: Checkout
      uses: actions/checkout@v4
    
    - name: Set up Docker Buildx
      uses: docker/setup-buildx-action@v3
    
    - name: Log in to Container Registry
      uses: docker/login-action@v3
      with:
        registry: ${{ env.REGISTRY }}
        username: ${{ github.actor }}
        password: ${{ secrets.GITHUB_TOKEN }}
    
    - name: Extract metadata
      id: meta
      uses: docker/metadata-action@v5
      with:
        images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
        tags: |
          type=ref,event=branch
          type=ref,event=pr
          type=semver,pattern={{version}}
          type=semver,pattern={{major}}.{{minor}}
          type=sha,prefix=,suffix=,format=short
    
    - name: Build and push
      uses: docker/build-push-action@v5
      with:
        context: .
        push: true
        tags: ${{ steps.meta.outputs.tags }}
        labels: ${{ steps.meta.outputs.labels }}
        cache-from: type=gha
        cache-to: type=gha,mode=max

  deploy-staging:
    needs: build
    runs-on: ubuntu-latest
    environment: staging
    if: github.ref == 'refs/heads/main'
    
    steps:
    - name: Deploy to Staging
      run: |
        echo "Deploying to staging environment..."
        # SSH into server and pull new image
        # Or use AWS ECS update-service, kubectl apply, etc.
        
  deploy-production:
    needs: deploy-staging
    runs-on: ubuntu-latest
    environment: production
    if: github.ref == 'refs/heads/main'
    
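    permissions:
      contents: read
      packages: read

    steps:
    - name: Log in to Container Registry
      uses: docker/login-action@v3
      with:
        registry: ${{ env.REGISTRY }}
        username: ${{ github.actor }}
        password: ${{ secrets.GITHUB_TOKEN }}

    - name: Run Database Migrations
      env:
        DATABASE_URL: ${{ secrets.DATABASE_URL }}
      run: |
        # Apply backward-compatible Alembic migrations before the new code goes live.
        # The secret is passed through the environment, not interpolated into the script.
        docker run --rm \
          -e DATABASE_URL \
          ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:main \
          alembic upgrade head

    - name: Deploy to Production
      run: |
        echo "Deploying to production..."
        # Production deployment steps

    - name: Verify Deployment
      run: |
        # Health check
        curl -f https://api.example.com/health || exit 1

    - name: Rollback on Failure
      if: failure()
      run: |
        echo "Deployment failed, rolling back..."
        # Implement rollback logic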
```
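
A single health request right after a deploy can fail simply because new instances are still starting. The `Verify Deployment` step can be made more tolerant with retries and backoff. The sketch below assumes a hypothetical helper named `wait_until_healthy`; `check` is any callable that returns `True` when the service is up.

```python
import time


def wait_until_healthy(check, attempts=5, delay=1.0, backoff=2.0, sleep=time.sleep):
    """Call check() until it returns True; return False if every attempt fails."""
    for attempt in range(1, attempts + 1):
        try:
            if check():
                return True
        except Exception:
            pass  # treat errors (refused connections, timeouts) as "not ready yet"
        if attempt < attempts:
            sleep(delay)      # wait before the next attempt
            delay *= backoff  # exponential backoff between attempts
    return False


# Example with a fake check that succeeds on the third call
answers = iter([False, False, True])
waits = []
print(wait_until_healthy(lambda: next(answers), sleep=waits.append))  # True
print(waits)                                                          # [1.0, 2.0]
```

Injecting `sleep` keeps the helper testable without real delays; a pipeline script would leave the default and exit non-zero when the function returns `False`.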

**Deployment Strategies:**

1. **Blue-Green Deployment**: Run two identical environments, switch traffic instantly
2. **Rolling Deployment**: Gradually replace old containers with new ones
3. **Canary Deployment**: Route small percentage of traffic to new version, monitor, then increase
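
Canary routing is usually configured at the load balancer or platform level, but the underlying idea can be shown in a few lines. The sketch below is an application-level illustration under stated assumptions: `in_canary` is a hypothetical function that hashes a user ID into one of 100 buckets, so each user consistently sees the same version while the percentage is raised.

```python
import hashlib


def in_canary(user_id, percent):
    """Return True if `user_id` falls into the first `percent` of 100 stable buckets."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # stable across processes and restarts
    return bucket < percent


users = [f"user-{i}" for i in range(10_000)]
share = sum(in_canary(user, 10) for user in users)
print(f"{share / len(users):.1%} of users routed to the canary")
```

Because the bucket depends only on the user ID, raising the percentage from 10 to 50 keeps every existing canary user in the canary group and only adds new ones.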

```bash
# Database Migration Strategy (Zero-Downtime)
# 1. Migrations must be backward compatible
# 2. Run migrations before deploying new code
# 3. Never delete columns in same deploy that stops using them

# Example safe migration pattern:
# Deploy 1: Add new column (nullable)
# Deploy 2: Update code to write to both columns
# Deploy 3: Backfill data
# Deploy 4: Update code to read from new column
# Deploy 5: Stop writing to the old column
# Deploy 6: Drop old column
```
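
The first four steps of that pattern can be walked through end to end with an in-memory database. The sketch below uses the standard library's `sqlite3` as a stand-in for PostgreSQL and Alembic; the `users` table and the `fullname` / `display_name` columns are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, fullname TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO users (fullname) VALUES (?)",
    [("Ada Lovelace",), ("Alan Turing",)],
)

# Deploy 1: add the new column as nullable so the old code keeps working
conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")


# Deploy 2: new code writes to both columns
def create_user(connection, name):
    connection.execute(
        "INSERT INTO users (fullname, display_name) VALUES (?, ?)", (name, name)
    )


create_user(conn, "Grace Hopper")

# Deploy 3: backfill rows written before the dual-write code shipped
conn.execute("UPDATE users SET display_name = fullname WHERE display_name IS NULL")
conn.commit()


# Deploy 4: reads switch to the new column
def list_display_names(connection):
    rows = connection.execute("SELECT display_name FROM users ORDER BY id")
    return [row[0] for row in rows]


print(list_display_names(conn))  # ['Ada Lovelace', 'Alan Turing', 'Grace Hopper']
```

At every step the schema works with both the previous and the next version of the code, which is what allows old and new containers to run side by side during a rolling deployment.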

---

### Summary

In this chapter, you deployed FastAPI to production:

1. **Process Managers**: Configured Gunicorn with Uvicorn workers for production ASGI serving, set worker counts based on CPU cores, implemented graceful shutdown handling, and configured logging and health checks.

2. **Reverse Proxies**: Set up Nginx for SSL termination, static file serving, rate limiting, and load balancing. Configured proxy buffering, timeouts, and WebSocket support for optimal FastAPI performance.

3. **Cloud Deployment**: Deployed to AWS ECS/Fargate for container orchestration, Google Cloud Run for serverless scaling, and modern platforms (Render, Fly.io, Railway) for simplified developer experience.

4. **CI/CD Pipelines**: Built GitHub Actions workflows for testing, building container images, pushing to registries, and automated deployment with database migrations and health verification.

**Production Deployment Checklist:**
- Use Gunicorn with Uvicorn workers (not raw Uvicorn)
- Place Nginx in front for SSL and static files
- Run database migrations before code deployment
- Implement health checks and graceful shutdowns
- Use environment variables for configuration (12-factor app)
- Set up automated rollback on failure
- Monitor logs and metrics from day one

---

### What's Next?

**Chapter 21: Performance Tuning** will cover:
- **Concurrency Settings**: Tuning worker counts, thread pools, and connection limits based on workload characteristics
- **Caching**: Implementing Redis caching strategies including memoization, response caching, and cache invalidation patterns
- **Connection Pooling**: Optimizing database connection pools, HTTP client pools, and external service connections to prevent resource exhaustion

This next chapter focuses on optimizing your production deployment for maximum throughput and efficiency.

