# Bridge M3.1 → M3.2: Readiness Checklist

## 🎯 Purpose

**What shifts:** You've containerized your application with Docker (M3.1). Now you need to verify that your local setup is production-ready before deploying to cloud platforms (M3.2).

**Why it matters:** Cloud deployment failures often stem from issues that could be caught locally. Missing environment variables, secrets leaked into Git history, or unhealthy Docker containers can each cost 30-90 minutes of remote debugging. This readiness checklist catches those issues locally in about 5 minutes.

**The gap:** Docker made your app portable, but you haven't verified it's cloud-ready. This bridge ensures your Dockerfile, environment configuration, and Git hygiene meet production standards before you push to Railway or Render.

## 📚 Concepts Covered

**Delta from M3.1:**
- Pre-deployment validation patterns (health checks, configuration audits)
- Secrets management best practices (.env.example pattern, Git history scanning)
- Multi-environment configuration strategies (staging vs. production)
- Offline-first notebook execution (graceful fallbacks for missing tools)

**Not covered:** Actual cloud deployment (that's M3.2), CI/CD pipelines, or infrastructure-as-code.

## ✅ After Completing

You will be able to verify:
- ✓ Your Docker Compose stack starts successfully and all services are healthy
- ✓ All required environment variables are documented in .env.example with no secrets leaked
- ✓ Your Git repository has no sensitive files in history (.env, credentials.json, etc.)
- ✓ (Optional) You have multi-environment configuration experience with staging compose files
- ✓ Your project meets cloud platform requirements for Railway and Render deployment

**Pass criteria:** All 3 required checks (Docker health, .env.example, Git secrets) must pass. The staging check is optional but recommended.

## 🗺️ Context in Track

**Module:** L1.M3 - Production Deployment  
**Bridge Type:** Within-Module (M3.1 → M3.2)  
**Duration:** 30-45 minutes  
**Previous:** M3.1 - Containerization with Docker (Dockerfile, docker-compose, volumes)  
**Next:** M3.2 - Cloud Deployment (Railway/Render PaaS platforms)

---

### 💻 Run Locally (Windows)

```powershell
# From your project root, in a PowerShell session:
$env:PYTHONPATH = "$PWD"; jupyter notebook
```

**Linux/Mac:**
```bash
export PYTHONPATH="$PWD" && jupyter notebook
```

---

## Section 1: RECAP - What You Accomplished

### Congratulations! You completed M3.1: Containerization with Docker

Here's what you built:

### ✓ Complete Docker containerization of your RAG system
- Created production-ready Dockerfile with optimized layer caching
- Multi-stage builds for 40% faster rebuilds
- Non-root user for security hardening

### ✓ Multi-service orchestration with docker-compose
- Integrated RAG API, Redis cache, and vector database
- Single orchestrated stack that starts with one command
- Automatic networking management

### ✓ Data persistence strategy implemented
- Volume mounts for document storage and embeddings
- Data survives container restarts
- Updates possible without rebuilding images

### ✓ Five common Docker failures debugged live
- Port conflicts
- Volume permission errors
- Networking issues
- Environment variable problems
- Image build failures

---

**Key Achievement:** Your RAG system is now portable and runs identically on any machine with Docker installed. Your Dockerfile and docker-compose.yml are production artifacts - the foundation for cloud deployment.

---

## Section 2: CHECK - Docker Compose Health

**Requirement:** Docker stack runs successfully with `docker-compose up`

**Impact:** A stack that fails locally will also fail in the cloud, and debugging it remotely wastes 30-60 minutes

**What to verify:**
- After `docker-compose up -d`, `docker-compose ps` shows every service as running (and healthy, where a health check is defined)
- All containers are in "running" state
- No restart loops or non-zero exit codes

---

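**What this check does:** Verifies that docker-compose.yml exists and that Docker and docker-compose are installed, then inspects the output of `docker-compose ps` to confirm containers are running. It does not start the stack itself, so run `docker-compose up -d` first. If Docker is unavailable, the check skips gracefully with a warning.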

In [None]:
import subprocess
import os

def check_docker_compose_health():
    """Check if docker-compose services are healthy"""
    print("🔍 Checking Docker Compose Health...\n")
    
    # Check if docker-compose.yml exists
    if not os.path.exists('docker-compose.yml'):
        print("⚠️  WARNING: docker-compose.yml not found in current directory")
        print("   This check will be skipped - ensure you have this file before cloud deployment\n")
        return False
    
    # Offline-friendly: Skip if Docker not available
    try:
        result = subprocess.run(['docker', '--version'], 
                              capture_output=True, text=True, timeout=5)
        if result.returncode != 0:
            print("⚠️  WARNING: Docker is not available on this system")
            print("   This check will be skipped - install Docker to verify locally\n")
            return False
        
        print(f"✓ Docker detected: {result.stdout.strip()}\n")
        
        # Try to check docker-compose status
        result = subprocess.run(['docker-compose', 'ps'], 
                              capture_output=True, text=True, timeout=10)
        
        if result.returncode == 0:
            print("Docker Compose Status:")
            print(result.stdout)
            
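            output = result.stdout
            # Any of these markers means a container is stopped, restart-looping,
            # or failing its health check, even if other containers are "Up"
            problem_markers = ("Exit", "Restarting", "unhealthy")
            # Rows after the header, ignoring the dashed separator some versions print
            data_lines = [line for line in output.strip().splitlines()[1:]
                          if line.strip() and not line.startswith('-')]
            
            if "Up" in output and not any(m in output for m in problem_markers):
                print("✅ PASS: Docker containers are running")
                return True
            elif not data_lines:
                print("⚠️  No containers currently running")
                print("   Run: docker-compose up -d")
                return False
            else:
                print("⚠️  WARNING: Some containers may have issues")
                return False
        else:
            print("⚠️  Could not check docker-compose status")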
            print(f"   Error: {result.stderr}")
            print("\n   Manual check: Run 'docker-compose up -d' then 'docker-compose ps'\n")
            return False
            
    except FileNotFoundError:
        print("⚠️  WARNING: docker or docker-compose command not found")
        print("   Install Docker and Docker Compose to verify locally")
        print("   For cloud deployment, this will be handled by the platform\n")
        return False
    except Exception as e:
        print(f"⚠️  Error checking Docker: {e}")
        print("   This check will be skipped\n")
        return False

# Run the check
check_docker_compose_health()
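
The substring test above cannot tell which service is unhealthy. As a rough sketch of a stricter per-service check, the helper below parses the machine-readable output of `docker compose ps --format json` (Compose v2). The function names are hypothetical, and the field names `Service`, `State`, and `Health` are assumptions about what your Compose version emits, so inspect the raw JSON once before relying on them. Some versions print a JSON array and others print one object per line, so the parser accepts both.

```python
import json


def parse_compose_ps(output):
    """Parse `docker compose ps --format json` output (array or one object per line)."""
    text = output.strip()
    if not text:
        return []
    if text.startswith("["):
        return json.loads(text)
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def unhealthy_services(containers):
    """Describe every service that is not running, or is running but not healthy."""
    problems = []
    for container in containers:
        name = container.get("Service") or container.get("Name", "<unknown>")
        state = container.get("State", "")
        health = container.get("Health", "")  # assumed empty when no healthcheck is defined
        if state != "running" or health not in ("", "healthy"):
            problems.append(f"{name}: state={state or 'unknown'}, health={health or 'n/a'}")
    return problems


# Sample input shaped like the assumed JSON-lines output
sample = (
    '{"Service": "api", "State": "running", "Health": "healthy"}\n'
    '{"Service": "redis", "State": "exited", "Health": ""}'
)
print(unhealthy_services(parse_compose_ps(sample)))
```

To use it against a live stack, pass in the `stdout` of `subprocess.run(['docker', 'compose', 'ps', '--all', '--format', 'json'], capture_output=True, text=True)`. Note that a service whose health is still `starting` is reported as a problem, which is usually what you want right before a deployment.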

## Section 3: CHECK - Environment Variables Documentation

**Requirement:** All environment variables documented in .env.example

**Impact:** Missing environment variables cause production startup failures; documenting them up front prevents 40% of deployment issues

**What to verify:**
- `.env.example` file exists with placeholder values (no secrets)
- All required variables are documented
- Each variable has a descriptive comment or example value

---

**What this check does:** Looks for .env.example, compares its variable names with .env (if present), and scans the contents of .env.example for values that look like real secrets. Uses pattern matching to detect common secret formats such as OpenAI API keys.

In [None]:
import os
import re

def check_env_example():
    """Check if .env.example exists and is properly documented"""
    print("🔍 Checking .env.example Completeness...\n")
    
    env_example_path = '.env.example'
    env_path = '.env'
    
    # Check if .env.example exists
    if not os.path.exists(env_example_path):
        print("❌ FAIL: .env.example not found")
        print("\n📋 Action Required:")
        print("   1. Create .env.example with all required environment variables")
        print("   2. Use placeholder values (NO SECRETS)")
        print("   3. Add comments explaining each variable\n")
        print("Example .env.example content:")
        print("   # API Keys")
        print("   OPENAI_API_KEY=sk-your-key-here")
        print("   # Database")
        print("   DATABASE_URL=postgresql://user:pass@localhost:5432/dbname")
        print("   # Redis")
        print("   REDIS_URL=redis://localhost:6379\n")
        return False
    
    # Read .env.example
    with open(env_example_path, 'r') as f:
        example_content = f.read()
    
    # Parse environment variables from .env.example
    example_vars = set()
    for line in example_content.split('\n'):
        line = line.strip()
        if line and not line.startswith('#') and '=' in line:
            var_name = line.split('=')[0].strip()
            example_vars.add(var_name)
    
    print(f"✓ .env.example found with {len(example_vars)} variables\n")
    
    if len(example_vars) == 0:
        print("⚠️  WARNING: .env.example exists but has no variables defined")
        return False
    
    print("Variables in .env.example:")
    for var in sorted(example_vars):
        print(f"   • {var}")
    print()
    
    # Check if .env exists and compare
    if os.path.exists(env_path):
        with open(env_path, 'r') as f:
            env_content = f.read()
        
        env_vars = set()
        for line in env_content.split('\n'):
            line = line.strip()
            if line and not line.startswith('#') and '=' in line:
                var_name = line.split('=')[0].strip()
                env_vars.add(var_name)
        
        # Find variables in .env but not in .env.example
        missing_in_example = env_vars - example_vars
        if missing_in_example:
            print("⚠️  WARNING: Variables in .env but NOT in .env.example:")
            for var in sorted(missing_in_example):
                print(f"   • {var}")
            print("\n   These should be added to .env.example (with placeholder values)\n")
            return False
        
        # Find variables in .env.example but not in .env
        missing_in_env = example_vars - env_vars
        if missing_in_env:
            print("ℹ️  INFO: Variables in .env.example but not in .env:")
            for var in sorted(missing_in_env):
                print(f"   • {var}")
            print("\n   (This is OK if these are optional or environment-specific)\n")
    
    # Check for common security issues in .env.example
    security_issues = []
    common_secret_patterns = [
        # Hyphens and underscores are allowed so that prefixed key formats also match
        (r'sk-[a-zA-Z0-9_-]{32,}', 'OpenAI API key'),
        (r'[a-f0-9]{32,}', 'Potential secret token'),
        # Skip obvious placeholders (your-..., changeme, <...>, ${...}) so that a
        # documented PASSWORD variable does not fail the check
        (r'password["\']?\s*[:=]\s*["\']?(?!your|change|example|placeholder|xxx|<|\$\{)[^\s"\']+', 'Password value')
    ]
    
    for pattern, desc in common_secret_patterns:
        if re.search(pattern, example_content, re.IGNORECASE):
            security_issues.append(desc)
    
    if security_issues:
        print("⚠️  WARNING: Potential secrets detected in .env.example:")
        for issue in security_issues:
            print(f"   • {issue}")
        print("\n   .env.example should only contain PLACEHOLDER values, not real secrets\n")
        return False
    
    print("✅ PASS: .env.example is properly documented")
    print("   All variables have placeholder values, no secrets detected\n")
    return True

# Run the check
check_env_example()
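
The parsing loop in the check treats `export NAME=value` lines as a variable called `export NAME` and keeps surrounding quotes. A minimal sketch of a slightly more tolerant parser follows; `parse_env_text` and `compare_env` are hypothetical helper names, and the sketch deliberately ignores inline comments and multi-line values, which a dedicated dotenv library would handle.

```python
def parse_env_text(text):
    """Return {name: value} from dotenv-style text, skipping comments and blank lines."""
    variables = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):].lstrip()
        name, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        variables[name.strip()] = value
    return variables


def compare_env(example_text, env_text):
    """Return (undocumented, unset): names only in .env, names only in .env.example."""
    example = parse_env_text(example_text)
    actual = parse_env_text(env_text)
    undocumented = sorted(set(actual) - set(example))
    unset = sorted(set(example) - set(actual))
    return undocumented, unset


example_text = "# API\nOPENAI_API_KEY=sk-your-key-here\nexport REDIS_URL='redis://localhost:6379'\n"
env_text = "OPENAI_API_KEY=not-a-real-key\nDEBUG=1\n"
print(compare_env(example_text, env_text))
```

Here `DEBUG` is reported as undocumented (present in .env but missing from .env.example) and `REDIS_URL` as unset, which mirrors the two comparisons the check performs.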

## Section 4: CHECK - Git History Secrets Scan

**Requirement:** GitHub repository with clean history (no .env committed)

**Impact:** Secrets leaked into Git history must be rotated and the history rewritten (or the repository recreated), which costs 2-4 hours of cleanup

**What to verify:**
- `git log --all --full-history -- .env` returns nothing (file never committed)
- Check that .env is in .gitignore
- Verify no secrets in any committed files

---

**What this check does:** Searches Git history for sensitive files (.env, secrets.json, credentials.json), validates .gitignore patterns, and scans committed files for suspicious names. Skips gracefully if Git is not available.

In [None]:
import subprocess
import os

def check_git_secrets():
    """Check if secrets have been committed to git history"""
    print("🔍 Checking Git History for Secrets...\n")
    
    # Offline-friendly: Skip if not a git repository
    if not os.path.exists('.git'):
        print("⚠️  WARNING: Not a git repository")
        print("   Initialize git with: git init")
        print("   This check will be skipped\n")
        return False
    
    try:
        # Offline-friendly: Skip if git not available
        result = subprocess.run(['git', '--version'], 
                              capture_output=True, text=True, timeout=5)
        if result.returncode != 0:
            print("⚠️  WARNING: Git is not available")
            print("   This check will be skipped\n")
            return False
        
        print(f"✓ Git detected: {result.stdout.strip()}\n")
        
        # Check if .env is in git history
        sensitive_files = ['.env', '.env.local', '.env.production', 'secrets.json', 'credentials.json']
        found_secrets = []
        
        for file in sensitive_files:
            result = subprocess.run(
                ['git', 'log', '--all', '--full-history', '--', file],
                capture_output=True, text=True, timeout=10
            )
            
            if result.stdout.strip():
                found_secrets.append(file)
        
        if found_secrets:
            print("❌ FAIL: Sensitive files found in git history:")
            for file in found_secrets:
                print(f"   • {file}")
            print("\n🚨 CRITICAL: These files contain secrets and should NEVER be in git!")
            print("\n📋 Action Required:")
            print("   Option 1 (Recommended if repo not shared):")
            print("      1. Use git filter-repo or BFG Repo-Cleaner to remove from history")
            print("      2. Force push to remote (WARNING: Destructive)")
            print("\n   Option 2 (If already shared publicly):")
            print("      1. Rotate ALL secrets immediately")
            print("      2. Create new repository")
            print("      3. Migrate code without sensitive files\n")
            return False
        
        print("✓ No sensitive files (.env, secrets.json, etc.) found in git history\n")
        
        # Check if .gitignore exists and includes .env
        if os.path.exists('.gitignore'):
            with open('.gitignore', 'r') as f:
                gitignore_content = f.read()
            
            # Check for common patterns, matching whole lines so that
            # "!.env.example" is not mistaken for ".env"
            gitignore_lines = {line.strip() for line in gitignore_content.splitlines()}
            patterns_to_check = ['.env', '*.env', '.env.*', '.env*']
            found_patterns = [p for p in patterns_to_check if p in gitignore_lines]
            
            if found_patterns:
                print("✓ .gitignore includes environment file patterns:")
                for pattern in found_patterns:
                    print(f"   • {pattern}")
                print()
            else:
                print("⚠️  WARNING: .gitignore exists but doesn't include .env patterns")
                print("   Add these lines to .gitignore:")
                print("      .env")
                print("      .env.*")
                print("      !.env.example\n")
                return False
        else:
            print("⚠️  WARNING: .gitignore not found")
            print("   Create .gitignore with at minimum:")
            print("      .env")
            print("      .env.*")
            print("      !.env.example")
            print("      __pycache__/")
            print("      *.pyc")
            print("      venv/")
            print("      .venv/\n")
            return False
        
        # Check for accidentally committed secrets in current files
        print("🔍 Scanning current committed files for potential secrets...\n")
        
        result = subprocess.run(
            ['git', 'ls-files'],
            capture_output=True, text=True, timeout=10
        )
        
        if result.returncode == 0:
            committed_files = result.stdout.strip().split('\n')
            suspicious_files = [f for f in committed_files if any(
                keyword in f.lower() for keyword in ['secret', 'password', 'credential', 'key', 'token']
            )]
            
            if suspicious_files:
                print("⚠️  WARNING: Files with suspicious names found in git:")
                for file in suspicious_files[:10]:  # Show first 10
                    print(f"   • {file}")
                if len(suspicious_files) > 10:
                    print(f"   ... and {len(suspicious_files) - 10} more")
                print("\n   Review these files to ensure they don't contain real secrets\n")
        
        print("✅ PASS: Git history is clean, no secrets detected")
        print("   .gitignore properly configured\n")
        return True
        
    except FileNotFoundError:
        print("⚠️  WARNING: git command not found")
        print("   Install Git to verify history\n")
        return False
    except subprocess.TimeoutExpired:
        print("⚠️  WARNING: Git command timed out")
        print("   Repository might be too large, check manually\n")
        return False
    except Exception as e:
        print(f"⚠️  Error checking git history: {e}")
        print("   Manual check: Run 'git log --all --full-history -- .env'\n")
        return False

# Run the check
check_git_secrets()
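
The check above looks only at file names, while the "What to verify" list also asks for no secrets inside committed files. The sketch below scans file contents for a few secret-shaped strings. The pattern list is illustrative rather than exhaustive, the function names are hypothetical, and the scan covers only the current working tree (files listed by `git ls-files`), not older commits.

```python
import re
import subprocess

# Illustrative patterns only; extend them for the providers you actually use
SECRET_PATTERNS = [
    ("OpenAI-style API key", re.compile(r"sk-[A-Za-z0-9_-]{32,}")),
    ("AWS access key ID", re.compile(r"AKIA[0-9A-Z]{16}")),
    ("Private key block", re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----")),
]


def find_secret_lines(text):
    """Return (line_number, label) for every line that matches a secret-like pattern."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SECRET_PATTERNS:
            if pattern.search(line):
                hits.append((number, label))
    return hits


def scan_tracked_files():
    """Scan each file listed by `git ls-files`; return {path: hits} for files with matches."""
    listing = subprocess.run(["git", "ls-files", "-z"],
                             capture_output=True, text=True, check=True)
    findings = {}
    for path in filter(None, listing.stdout.split("\0")):
        try:
            with open(path, "r", encoding="utf-8", errors="ignore") as handle:
                hits = find_secret_lines(handle.read())
        except OSError:
            continue  # deleted, unreadable, or a directory entry such as a submodule
        if hits:
            findings[path] = hits
    return findings
```

Call `scan_tracked_files()` from the repository root and review every reported line by hand. A placeholder such as `sk-your-key-here` is too short to match, so a well-formed .env.example should not be flagged.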

## Section 5: OPTIONAL - Staging Configuration

**Requirement:** At least Easy challenge completed (staging configuration)

**Impact:** Understanding multi-environment setup prevents production configuration mistakes - saves 60-90 minutes of trial-and-error

**What to verify:**
- Separate `docker-compose.staging.yml` exists and works locally
- Environment-specific configurations are properly separated
- Production and staging differences are documented

---

**What this check does:** Searches for a staging compose file (docker-compose.staging.yml, docker-compose.stage.yml, or docker-compose.dev.yml), lists any environment-related keys it contains, and prints a template if none exists. This check is optional.

In [None]:
import os

def check_staging_config():
    """Check if staging configuration exists"""
    print("🔍 Checking Staging Configuration (Optional)...\n")
    
    staging_files = [
        'docker-compose.staging.yml',
        'docker-compose.stage.yml',
        'docker-compose.dev.yml'
    ]
    
    found_staging = None
    for file in staging_files:
        if os.path.exists(file):
            found_staging = file
            break
    
    if found_staging:
        print(f"✓ Staging configuration found: {found_staging}\n")
        
        # Read and display some info
        with open(found_staging, 'r') as f:
            content = f.read()
        
        lines = len(content.split('\n'))
        print(f"   File size: {lines} lines\n")
        
        # Check for environment-specific configurations
        env_indicators = ['environment:', 'env_file:', 'NODE_ENV', 'ENVIRONMENT', 'STAGE']
        found_indicators = [ind for ind in env_indicators if ind in content]
        
        if found_indicators:
            print("✓ Environment-specific configurations detected:")
            for ind in found_indicators:
                print(f"   • {ind}")
            print()
        
        print("✅ PASS: Staging configuration exists")
        print("   You have experience with multi-environment setup\n")
        
        print("💡 TIP: For production deployment, consider these differences:")
        print("   • Use managed databases instead of local containers")
        print("   • Enable HTTPS/SSL certificates")
        print("   • Set appropriate resource limits")
        print("   • Configure logging and monitoring")
        print("   • Use secrets management (not .env files)\n")
        
        return True
    else:
        print("ℹ️  INFO: No staging configuration found")
        print("   This is optional but recommended\n")
        
        print("📋 Create docker-compose.staging.yml with this template:\n")
        template = """version: '3.8'

services:
  app:
    build: .
    ports:
      - "8000:8000"
    environment:
      - NODE_ENV=staging
      - LOG_LEVEL=info
    env_file:
      - .env.staging
    depends_on:
      - redis
      - db

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"

  db:
    image: postgres:15-alpine
    environment:
      - POSTGRES_DB=myapp_staging
      - POSTGRES_USER=staging_user
      - POSTGRES_PASSWORD=${DB_PASSWORD}
    volumes:
      - staging_db_data:/var/lib/postgresql/data

volumes:
  staging_db_data:
"""
        print(template)
        print("\n⚠️  OPTIONAL: Complete this for better production readiness")
        print("   Skip if you're comfortable learning multi-environment setup during cloud deployment\n")
        
        return None  # Optional, so not a failure

# Run the check
check_staging_config()
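
The template references `${DB_PASSWORD}`, which Compose substitutes when it reads the file. As far as the Compose documentation describes it, that substitution draws on the shell environment and the project-level `.env` file (or a file passed with `--env-file`), not on a service's `env_file:` entry, so it is worth confirming that every referenced name is defined somewhere Compose will look. The sketch below lists `${VAR}` references that have no inline default. The helper names are hypothetical, and the bare `$VAR` form is not handled.

```python
import re

# Matches ${VAR}, ${VAR:-default}, ${VAR:?error} and similar; "$${...}" escapes are skipped
_INTERPOLATION = re.compile(r"(?<!\$)\$\{([A-Za-z_][A-Za-z0-9_]*)([^}]*)\}")


def required_compose_vars(compose_text):
    """Names referenced as ${VAR} without a default value (":-" or "-")."""
    required = set()
    for name, modifier in _INTERPOLATION.findall(compose_text):
        if not modifier.startswith((":-", "-")):
            required.add(name)
    return required


def missing_compose_vars(compose_text, defined_names):
    """Required names that do not appear in defined_names."""
    return sorted(required_compose_vars(compose_text) - set(defined_names))


compose_text = """
services:
  db:
    environment:
      - POSTGRES_PASSWORD=${DB_PASSWORD}
      - LOG_LEVEL=${LOG_LEVEL:-info}
"""
print(missing_compose_vars(compose_text, ["LOG_LEVEL"]))
```

In practice, `defined_names` would be the variable names parsed from the env file you pass to Compose plus the keys of `os.environ`. For an end-to-end check that uses Compose itself, `docker compose -f docker-compose.staging.yml config` prints the fully resolved file.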

## Section 6: CALL-FORWARD - What's Next in M3.2

### M3.2: Cloud Deployment (Railway/Render)

You're ready to deploy your containerized RAG system to production cloud platforms!

---

### What You'll Deploy

#### 1. Railway Deployment (Fast & Developer-Friendly)

**Key Features:**
- Deploy your containerized stack in under 10 minutes
- Automatic PostgreSQL and Redis provisioning
- Zero infrastructure configuration required
- Continuous deployment from GitHub

**Trade-offs:**
- Free tier services cold-start after 15 minutes of inactivity (30-60 second wake time)
- Acceptable for development
- Paid tier ($7/month) required for always-on production use

---

#### 2. Render Deployment (Production-Grade with Great Docs)

**Key Features:**
- Deploy the same containers with custom domain configuration
- Managed services with excellent documentation
- More configuration options for fine-tuning
- Production-grade reliability

**Comparison:** Platform comparison included to help you choose the right fit

---

#### 3. Automatic Deployments from GitHub

**Modern DevOps Without Complex CI/CD:**
- Every push to main automatically triggers deployment
- Change code, push, and production updates in 3-5 minutes
- No manual deployment steps
- No complex pipeline configuration

---

### The Core Question M3.2 Answers

**"How do I deploy Docker containers to production without managing servers, configuring load balancers, or handling SSL certificates?"**

**Answer:** Use platform-as-a-service (PaaS) providers that:
- Detect your Dockerfile automatically
- Build images in the cloud
- Provision managed databases
- Handle HTTPS and custom domains
- Scale on demand

**Railway** specializes in speed and developer experience  
**Render** specializes in stability and documentation

---

### What You'll Have After M3.2

By the end of M3.2, you will have:
- Two production deployments (Railway + Render)
- Public URLs with HTTPS enabled
- Automatic deployments on every git push
- Knowledge to choose the right platform for your needs
- Experience with modern cloud deployment workflows

---

### Estimated Time
- Video: 35 minutes
- Hands-on deployment and testing: 90 minutes
- Total: ~2 hours

---

### Before You Continue

**Verify all checks above passed:**
- ✅ Docker Compose health check
- ✅ .env.example completeness
- ✅ Git history secrets scan
- ✅ (Optional) Staging configuration

**If any required checks failed, fix them before proceeding to M3.2!**
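
One way to apply the pass criteria mechanically is to collect the return values of the four check functions and reduce them to a single verdict. This is a minimal sketch: the dictionary keys are hypothetical labels, and it relies on the convention used in this notebook that required checks return `True` on success and `False` otherwise, while the optional staging check may return `None`.

```python
REQUIRED = ("docker", "env_example", "git_secrets")
OPTIONAL = ("staging",)


def readiness_verdict(results):
    """Map check name -> True/False/None into a ready flag plus lists of what is missing."""
    failed = [name for name in REQUIRED if results.get(name) is not True]
    optional_missing = [name for name in OPTIONAL if results.get(name) is not True]
    return {"ready": not failed, "failed": failed, "optional_missing": optional_missing}


# Literal values stand in for the check functions' return values
print(readiness_verdict({"docker": True, "env_example": True,
                         "git_secrets": False, "staging": None}))
```

Because the required checks return `False` both when they fail and when they are skipped (for example, when Docker is not installed), a skipped check counts as not ready here. That is the conservative reading of the pass criteria.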

---

**See you in M3.2: Cloud Deployment (Railway/Render)!** 🚀