# Chapter 47: CI/CD for Databases

Continuous integration and continuous deployment for databases demand stricter safety guarantees than application code because databases are stateful. An application deployment can be rolled back by redeploying a previous container image; a committed database change cannot, because data written under the new schema has to be preserved or migrated back. This chapter covers common patterns for validating, testing, and deploying database changes with minimal downtime and rehearsed rollback procedures.

## 47.1 Migration Validation and Testing in CI

### 47.1.1 Pre-deployment Validation Pipeline

Before any migration reaches production, it must pass through a multi-stage validation pipeline that checks syntax, conflicts, and compatibility.

```yaml
# .github/workflows/database-ci.yml
name: Database CI Validation

on:
  pull_request:
    paths:
      - 'migrations/**'
      - 'schema/**'
      - '.github/workflows/database-ci.yml'

jobs:
  validate-migrations:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:16-alpine
        env:
          POSTGRES_PASSWORD: postgres
          POSTGRES_DB: migration_test
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
        ports:
          - 5432:5432

    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # Full history for conflict detection

      - name: Setup PostgreSQL Client
        run: |
          sudo apt-get update
          sudo apt-get install -y postgresql-client

      - name: Check Migration Naming Convention
        run: |
          # Enforce timestamp prefixes to prevent ordering conflicts
          for file in migrations/*.sql; do
            if [[ ! "$(basename "$file")" =~ ^[0-9]{14}_.*\.sql$ ]]; then
              echo "ERROR: $file does not follow naming convention YYYYMMDDHHMMSS_description.sql"
              exit 1
            fi
          done
          echo "✓ Migration naming convention validated"

      - name: Detect Migration Conflicts
        run: |
          # Fail if a migration added in this PR reuses a timestamp prefix
          # that already exists on main
          git fetch origin main
          MAIN_MIGRATIONS=$(git ls-tree -r --name-only origin/main migrations/ | sort)
          # Only files added by this PR; files already on main would
          # otherwise collide with themselves
          NEW_MIGRATIONS=$(git diff --name-only --diff-filter=A origin/main...HEAD -- migrations/ | sort)

          # Check for duplicate timestamps
          for file in $NEW_MIGRATIONS; do
            timestamp=$(basename "$file" | cut -d'_' -f1)
            matches=$(echo "$MAIN_MIGRATIONS" | grep "^migrations/${timestamp}_" || true)
            if [ -n "$matches" ]; then
              echo "ERROR: Migration timestamp collision detected"
              echo "Main branch has: $matches"
              echo "PR branch has: $file"
              exit 1
            fi
          done

      - name: Syntax Validation (Dry Run)
        env:
          PGPASSWORD: postgres
        run: |
          # Apply every migration in timestamp order inside one transaction,
          # then roll back. ON_ERROR_STOP makes psql exit non-zero on the
          # first failing statement.
          # Statements that cannot run inside a transaction block
          # (e.g. CREATE INDEX CONCURRENTLY) need a separate check.
          {
            echo "BEGIN;"
            for file in $(ls migrations/*.sql | sort); do
              printf '\\i %s\n' "$file"
            done
            echo "SELECT 'All migrations valid' AS status;"
            echo "ROLLBACK;"
          } | psql -h localhost -U postgres -d migration_test -v ON_ERROR_STOP=1
```

**Validation Checklist:**

1. **Naming Convention**: Enforce `YYYYMMDDHHMMSS_description.sql` to ensure ordering and prevent merge conflicts
2. **Idempotency Check**: Verify `IF NOT EXISTS` or `CREATE OR REPLACE` where applicable
3. **Transaction Safety**: Ensure DDL statements are transactional (PostgreSQL supports transactional DDL)
4. **Conflict Detection**: Prevent two developers from creating migrations with identical timestamps
5. **Syntax Validation**: Parse SQL without executing permanently (dry run in transaction)
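
The naming and conflict checks in this list can also be written as a small function that is easy to unit-test outside the CI shell. The following is a minimal Python sketch rather than part of the workflow above; the function name `find_naming_problems` and the sample filenames are illustrative assumptions.

```python
import re
from typing import Iterable, List

# YYYYMMDDHHMMSS_description.sql, as required by the naming convention
MIGRATION_NAME = re.compile(r"^(\d{14})_.+\.sql$")


def find_naming_problems(existing: Iterable[str], added: Iterable[str]) -> List[str]:
    """Return human-readable problems for newly added migration filenames.

    `existing` holds filenames already on the main branch; `added` holds
    filenames introduced by the change under review.
    """
    problems = []
    seen = {}
    for name in existing:
        match = MIGRATION_NAME.match(name)
        if match:
            seen[match.group(1)] = name

    for name in added:
        match = MIGRATION_NAME.match(name)
        if not match:
            problems.append(f"{name}: does not match YYYYMMDDHHMMSS_description.sql")
            continue
        timestamp = match.group(1)
        if timestamp in seen:
            problems.append(f"{name}: timestamp collides with {seen[timestamp]}")
        else:
            seen[timestamp] = name  # also catches collisions within the same PR
    return problems


if __name__ == "__main__":
    main_branch = ["20240101120000_create_users.sql"]
    pull_request = [
        "20240101120000_add_orders.sql",  # same timestamp as a file on main
        "20240102090000_add_index.sql",   # fine
        "003_bad_name.sql",               # wrong prefix
    ]
    for problem in find_naming_problems(main_branch, pull_request):
        print(problem)
```

Keeping the rule in one function means the same logic can back both a pre-commit hook and the CI job.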

### 47.1.2 Shadow Database Testing

Shadow databases validate migrations against production-like data volumes and schemas without affecting production.

```bash
#!/bin/bash
# scripts/shadow-migration-test.sh

set -euo pipefail

# Configuration
PROD_DUMP_S3="s3://backups/production/anonymized-latest.dump"
SHADOW_DB="shadow_test_$(date +%s)"
SHADOW_HOST="${SHADOW_HOST:-localhost}"
SHADOW_USER="postgres"

echo "🔄 Creating shadow database: $SHADOW_DB"

# Create temporary database
psql -h $SHADOW_HOST -U $SHADOW_USER -c "CREATE DATABASE $SHADOW_DB;"

cleanup() {
    echo "🧹 Cleaning up shadow database..."
    psql -h $SHADOW_HOST -U $SHADOW_USER -c "DROP DATABASE IF EXISTS $SHADOW_DB;"
}
trap cleanup EXIT

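# Restore the anonymized production dump (schema and data, so that
# migration timings reflect production-like volumes)
echo "📥 Restoring anonymized production dump..."
DUMP_FILE=$(mktemp)
aws s3 cp "$PROD_DUMP_S3" "$DUMP_FILE"
# Restore errors are not suppressed: a broken restore must fail the test
pg_restore -h "$SHADOW_HOST" -U "$SHADOW_USER" -d "$SHADOW_DB" --no-owner "$DUMP_FILE"
rm -f "$DUMP_FILE"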

# Run migrations and capture timing
echo "🚀 Running migrations..."
START_TIME=$(date +%s)

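# Without ON_ERROR_STOP, psql keeps going after a failed statement and exits 0
if ! psql -h "$SHADOW_HOST" -U "$SHADOW_USER" -d "$SHADOW_DB" \
    -v ON_ERROR_STOP=1 \
    --single-transaction \
    --file=migrations/all_migrations.sql; then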
    
    echo "❌ Migration failed on shadow database"
    exit 1
fi

END_TIME=$(date +%s)
DURATION=$((END_TIME - START_TIME))

echo "✅ Migrations completed in ${DURATION}s"

# Performance assertions
if [ $DURATION -gt 300 ]; then
    echo "⚠️ WARNING: Migrations took >5 minutes. Consider breaking into smaller transactions."
fi

# Verify no constraints were left NOT VALID (never checked against existing rows)
echo "🔍 Validating constraints..."
UNVALIDATED=$(psql -h "$SHADOW_HOST" -U "$SHADOW_USER" -d "$SHADOW_DB" -At -c "
    SELECT count(*)
    FROM pg_constraint
    WHERE NOT convalidated;
")
if [ "$UNVALIDATED" -ne 0 ]; then
    echo "❌ $UNVALIDATED unvalidated constraint(s) detected"
    exit 1
fi
```

### 47.1.3 Backward Compatibility Enforcement

Database changes must maintain backward compatibility during deployment to support rolling application updates (where old and new code run simultaneously).

```yaml
# .github/workflows/backward-compat-check.yml
name: Backward Compatibility

on: [pull_request]

jobs:
  check-breaking-changes:
    runs-on: ubuntu-latest
    steps:
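      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # origin/main must be available for the diff below

      - name: Check for Breaking Changes
        run: |
          # Breaking-change patterns (extended regex, matched case-insensitively)
          BREAKING_PATTERNS=(
            "DROP TABLE"
            "DROP COLUMN"
            "ALTER TABLE.*DROP"
            "RENAME COLUMN"
            "RENAME TO"              # ALTER TABLE ... RENAME TO (table rename)
            "ALTER.*TYPE.*USING"     # Type changes that fail on old data
            "ADD COLUMN.*NOT NULL"   # Adding a NOT NULL column (review: needs a default)
          )

          EXIT_CODE=0

          # Only added or modified SQL files; deleted files cannot be grepped
          for file in $(git diff --name-only --diff-filter=AM origin/main...HEAD | grep '\.sql$'); do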
            echo "Checking $file..."
            
            for pattern in "${BREAKING_PATTERNS[@]}"; do
              if grep -iE "$pattern" "$file" > /dev/null 2>&1; then
                echo "❌ BREAKING CHANGE detected in $file:"
                grep -n -iE "$pattern" "$file"
                echo ""
                echo "Remediation:"
                echo "- Use 'Expand and Contract' pattern for column drops"
                echo "- Add new columns as NULL first, populate, then add constraint"
                echo "- Create new table/column, migrate data, drop old in separate release"
                EXIT_CODE=1
              fi
            done
          done
          
          exit $EXIT_CODE

      - name: Verify Migration Order Safety
        run: |
          # Ensure new migrations don't reference objects created in same PR
          # unless they are in the same file
          echo "Checking cross-migration dependencies..."
```

**Backward Compatibility Patterns:**

| Change Type | Breaking? | Safe Approach |
|-------------|-----------|---------------|
| Add column | No | Add as nullable or with default |
| Add `NOT NULL` | Yes | Add nullable → Backfill → Add constraint in next PR |
| Drop column | Yes | Stop using in app → Deploy → Drop column in next PR |
| Rename column | Yes | Add new column → Dual write → Migrate → Drop old |
| Change type | Yes | Add new column → Migrate → Update app → Drop old |
| Drop table | Yes | Stop writes → Archive data → Drop in next PR |
| Add index | No (mostly) | Use `CONCURRENTLY` to avoid locking |
| Add FK constraint | Yes (if existing data invalid) | Validate data first, add `NOT VALID` then validate separately |

## 47.2 SQL Linting and Formatting

### 47.2.1 SQLFluff Configuration

SQLFluff is a widely used SQL linter that supports the PostgreSQL dialect and per-rule configuration.

```ini
# .sqlfluff
[sqlfluff]
dialect = postgres
templater = jinja
runaway_limit = 10
max_line_length = 88

[sqlfluff:indentation]
tab_space_size = 4
indented_joins = false
indented_using_on = true
indented_ctes = false

[sqlfluff:layout:type:comma]
line_position = trailing

[sqlfluff:rules]
# Exclude specific rules if needed. Comments go on their own line because
# the config parser does not strip inline comments from values.
exclude_rules = L016,L031,L034

[sqlfluff:rules:aliasing.table]
# Require the 'AS' keyword
aliasing = explicit

[sqlfluff:rules:aliasing.column]
aliasing = explicit

[sqlfluff:rules:capitalisation.keywords]
capitalisation_policy = upper

[sqlfluff:rules:capitalisation.identifiers]
capitalisation_policy = lower

[sqlfluff:rules:capitalisation.functions]
extended_capitalisation_policy = upper

[sqlfluff:rules:convention.select_trailing_comma]
select_clause_trailing_comma = forbid

[sqlfluff:rules:convention.quoted_literals]
preferred_quoted_literal_style = single_quotes

[sqlfluff:rules:convention.casting_style]
preferred_type_casting_style = cast

[sqlfluff:rules:structure.subquery]
forbid_subquery_in = join
```

**GitHub Actions Integration:**

```yaml
# .github/workflows/sql-lint.yml
name: SQL Lint

on: [pull_request]

jobs:
  lint-sql:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      
      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'
          
      - name: Install SQLFluff
        run: pip install sqlfluff==2.3.0
        
      - name: Lint SQL Files
        run: |
          sqlfluff lint migrations/ schema/ --format github-annotation \
            --annotation-level failure
          
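      - name: Check Formatting (diff)
        run: |
          # Apply fixes in the CI workspace, then fail if any file changed.
          # `sqlfluff fix` exits non-zero when unfixable violations remain;
          # those are already reported by the lint step above.
          sqlfluff fix --force migrations/ schema/ || true
          if ! git diff --exit-code -- migrations/ schema/; then
            echo "❌ SQL files need formatting. Run: sqlfluff fix migrations/ schema/"
            exit 1
          fi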
```

### 47.2.2 Naming Convention Enforcement

Consistent naming prevents confusion and enables automated tooling.

```ini
# .sqlfluff - naming convention rules
# NOTE: SQLFluff has no built-in rules for these conventions. The sections
# below are illustrative and assume a custom plugin that registers them;
# otherwise use a check script such as the one that follows.
#
# Tables: plural, snake_case
# Columns: singular, snake_case
# Indexes: idx_{table}_{columns}
# Constraints: {type}_{table}_{columns} (pk_, fk_, chk_, uq_)

[sqlfluff:rules:convention.naming.table]
# Plural, starts with a lowercase letter
pattern = ^[a-z][a-z0-9_]*s$

[sqlfluff:rules:convention.naming.columns]
# Singular is implied by context
pattern = ^[a-z][a-z0-9_]*$
```

**Custom Naming Check Script:**

```bash
#!/bin/bash
# scripts/check-naming-conventions.sh

shopt -s nocasematch  # SQL keywords may be written in any case

SP='[[:space:]]+'
IDENT='([a-z_][a-z0-9_]*)'
# Optional groups cover IF NOT EXISTS and CONCURRENTLY so they are not
# mistaken for the object name
TABLE_RE="CREATE${SP}TABLE${SP}(IF${SP}NOT${SP}EXISTS${SP})?${IDENT}"
INDEX_RE="CREATE${SP}INDEX${SP}(CONCURRENTLY${SP})?(IF${SP}NOT${SP}EXISTS${SP})?${IDENT}"
FK_RE="CONSTRAINT${SP}${IDENT}${SP}FOREIGN${SP}KEY"

check_naming() {
    local file=$1
    local errors=0

    while IFS= read -r line; do
        # Check table names (should be plural; simple check: ends with 's')
        if [[ $line =~ $TABLE_RE ]]; then
            table="${BASH_REMATCH[2]}"
            if [[ ! $table =~ s$ ]]; then
                echo "ERROR: $file: table '$table' should be plural (e.g., ${table}s)"
                errors=$((errors + 1))
            fi
        fi

        # Check index naming
        if [[ $line =~ $INDEX_RE ]]; then
            idx="${BASH_REMATCH[3]}"
            if [[ ! $idx =~ ^idx_ ]]; then
                echo "ERROR: $file: index '$idx' should start with 'idx_'"
                errors=$((errors + 1))
            fi
        fi

        # Check foreign key naming
        if [[ $line =~ $FK_RE ]]; then
            fk="${BASH_REMATCH[1]}"
            if [[ ! $fk =~ ^fk_ ]]; then
                echo "ERROR: $file: FK constraint '$fk' should start with 'fk_'"
                errors=$((errors + 1))
            fi
        fi
    done < "$file"

    [ "$errors" -eq 0 ]
}

# Run on all SQL files
failed_files=0
for file in migrations/*.sql; do
    if ! check_naming "$file"; then
        failed_files=$((failed_files + 1))
    fi
done

# Exit statuses wrap at 256, so report failure as a plain 1
if [ "$failed_files" -gt 0 ]; then
    exit 1
fi
```

## 47.3 Schema Drift Detection

### 47.3.1 Automated Drift Monitoring

Schema drift occurs when manual changes (hotfixes, emergency DBA interventions) modify production schema outside the migration pipeline.

```python
# scripts/drift_detector.py
import json
import os
import sys
from dataclasses import asdict, dataclass
from typing import List

import psycopg2


@dataclass
class SchemaObject:
    type: str
    name: str
    definition: str


def get_expected_schema(migration_files: List[str]) -> List[SchemaObject]:
    """Apply migrations to a temporary database and extract its schema."""
    # Conceptual: apply the migrations to an ephemeral database (for example
    # a temporary Docker container), then call get_actual_schema() on it.
    raise NotImplementedError("apply migrations to an ephemeral database first")


def get_actual_schema(conn_string: str) -> List[SchemaObject]:
    """Extract the current schema from a live database."""
    objects = []
    conn = psycopg2.connect(conn_string)
    try:
        cur = conn.cursor()

        # Tables: PostgreSQL 16 has no built-in function that returns a
        # table's DDL, so each table is summarized as an ordered column list.
        cur.execute("""
            SELECT c.table_name::text,
                   string_agg(
                       c.column_name::text || ' ' || c.data_type::text ||
                       CASE WHEN c.is_nullable = 'NO' THEN ' NOT NULL' ELSE '' END,
                       ', ' ORDER BY c.ordinal_position)
            FROM information_schema.columns c
            JOIN information_schema.tables t
              ON t.table_schema = c.table_schema
             AND t.table_name = c.table_name
            WHERE c.table_schema = 'public'
              AND t.table_type = 'BASE TABLE'
            GROUP BY c.table_name
        """)
        for name, definition in cur.fetchall():
            objects.append(SchemaObject('table', name, definition))

        # Indexes
        cur.execute("""
            SELECT indexname, indexdef
            FROM pg_indexes
            WHERE schemaname = 'public'
        """)
        for name, definition in cur.fetchall():
            objects.append(SchemaObject('index', name, definition))

        # Constraints: names are only unique per table, so qualify them
        cur.execute("""
            SELECT conrelid::regclass::text, conname, pg_get_constraintdef(oid)
            FROM pg_constraint
            WHERE connamespace = 'public'::regnamespace
        """)
        for table, name, definition in cur.fetchall():
            objects.append(SchemaObject('constraint', f"{table}.{name}", definition))
    finally:
        conn.close()

    return objects


def detect_drift(expected: List[SchemaObject], actual: List[SchemaObject]) -> dict:
    drift = {
        'missing_in_prod': [],  # In migrations but not production
        'extra_in_prod': [],    # In production but not migrations
        'modified': []          # Definition differs
    }

    expected_dict = {f"{o.type}:{o.name}": o for o in expected}
    actual_dict = {f"{o.type}:{o.name}": o for o in actual}

    # Check for missing or modified objects
    for key, obj in expected_dict.items():
        if key not in actual_dict:
            drift['missing_in_prod'].append(obj)
        elif actual_dict[key].definition != obj.definition:
            drift['modified'].append({
                'name': key,
                'expected': obj.definition,
                'actual': actual_dict[key].definition
            })

    # Check for extra objects
    for key, obj in actual_dict.items():
        if key not in expected_dict:
            drift['extra_in_prod'].append(obj)

    return drift


if __name__ == "__main__":
    # Run in CI against staging/production
    drift = detect_drift(
        get_expected_schema(["migrations/"]),
        get_actual_schema(os.environ['DATABASE_URL'])
    )

    if any(drift.values()):
        print("❌ Schema drift detected!")
        # SchemaObject is a dataclass, so serialize it through asdict
        print(json.dumps(drift, indent=2, default=asdict))
        sys.exit(1)
    else:
        print("✅ Schema matches migrations")
```

### 47.3.2 Remediation Strategies

When drift is detected, remediation must be careful to avoid data loss.

```bash
#!/bin/bash
# scripts/reconcile-drift.sh
# Emergency script to bring production back to the expected state

set -euo pipefail

DRIFT_REPORT="drift-report.json"
BACKUP_DIR="backups/$(date +%Y%m%d_%H%M%S)"

echo "Creating safety backup..."
mkdir -p "$BACKUP_DIR"
pg_dump "$DATABASE_URL" --schema-only > "$BACKUP_DIR/schema_before.sql"
pg_dump "$DATABASE_URL" --data-only --format=custom > "$BACKUP_DIR/data.dump"

# Strategy 1: Extra objects in production (safe to drop if confirmed unused)
echo "Checking for extra objects..."
# Manual review required - never auto-drop in production

# Strategy 2: Missing objects (apply missing migrations only)
echo "Applying missing migrations..."
# Run only specific migrations that are missing, not all

# Strategy 3: Modified definitions (most dangerous)
echo "Modified objects detected:"
# Require manual ALTER statements to reconcile
# Example: If column type differs, create migration to alter
```
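
The three strategies in the script comments can be turned into an explicit review plan. The following Python sketch assumes a drift report shaped like the dictionary with `missing_in_prod`, `extra_in_prod`, and `modified` keys used earlier in this section, with object names given as plain strings; the function name `plan_remediation` is hypothetical. It only produces instructions for a human and never touches the database.

```python
from typing import Dict, List


def plan_remediation(drift: Dict[str, list]) -> List[str]:
    """Turn a drift report into an ordered list of manual review steps."""
    steps = []

    # Strategy 2: missing objects -> apply only the migrations that are missing
    for name in drift.get("missing_in_prod", []):
        steps.append(f"APPLY: run only the migration that creates {name}")

    # Strategy 1: extra objects -> manual review, never auto-drop
    for name in drift.get("extra_in_prod", []):
        steps.append(f"REVIEW: confirm {name} is unused before any drop")

    # Strategy 3: modified definitions -> explicit ALTER migration
    for item in drift.get("modified", []):
        steps.append(
            f"RECONCILE: write an ALTER migration for {item['name']} "
            f"(expected {item['expected']!r}, actual {item['actual']!r})"
        )
    return steps


if __name__ == "__main__":
    report = {
        "missing_in_prod": ["index:idx_users_email_address"],
        "extra_in_prod": ["table:tmp_hotfix"],
        "modified": [
            {"name": "table:users", "expected": "id integer", "actual": "id bigint"}
        ],
    }
    for step in plan_remediation(report):
        print(step)
```

Missing objects are listed first because applying an already-reviewed migration is the lowest-risk action; anything that could remove data stays a manual decision.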

## 47.4 Deployment Strategies and Release Playbooks

### 47.4.1 Blue/Green Database Deployments

Blue/green deployment minimizes downtime by running two identical environments and switching traffic atomically.

```yaml
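# Deployment architecture
# Blue (Current): Active production
# Green (New): New version being prepared
# Migration Strategy:
# 1. Copy Blue's schema and data to Green (dump/restore or the initial
#    sync of logical replication; a physical clone cannot cross major versions)
# 2. Apply migrations to Green
# 3. Sync data changes from Blue to Green (using logical replication)
# 4. Switch traffic to Green
# 5. Keep Blue as instant rollback

# docker-compose.blue-green.yml
version: "3.8"

services:
  postgres-blue:
    image: postgres:15-alpine  # Current version
    environment:
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}  # Required by the official image
    command: ["postgres", "-c", "wal_level=logical"]  # Needed for logical replication
    volumes:
      - blue_data:/var/lib/postgresql/data
    ports:
      - "5432:5432"  # Current production

  postgres-green:
    image: postgres:16-alpine  # New version (if major upgrade)
    environment:
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    volumes:
      - green_data:/var/lib/postgresql/data
    ports:
      - "5433:5432"  # New instance, different port

  # Logical replication for zero-downtime cutover.
  # Illustrative placeholder: pglogical is a PostgreSQL extension installed on
  # both servers, not a standalone service, so treat this image name and its
  # environment variables as hypothetical.
  pglogical:
    image: pglogical/pglogical:latest
    environment:
      SOURCE_DB: postgres://user:pass@postgres-blue:5432/app
      TARGET_DB: postgres://user:pass@postgres-green:5432/app

# Named volumes must be declared at the top level
volumes:
  blue_data:
  green_data: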
```

**Cutover Procedure:**

```bash
#!/bin/bash
# scripts/blue-green-cutover.sh

set -e

BLUE_HOST="postgres-blue"
GREEN_HOST="postgres-green"
DATABASE="app"

echo "Phase 1: Preparing Green environment"
# Apply migrations to Green (offline, no traffic)
psql -h $GREEN_HOST -d $DATABASE -f migrations/pending.sql

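echo "Phase 2: Setting up replication"
# Using pglogical (native logical replication follows the same direction).
# The subscription is created on the subscriber (Green) and points at the
# provider (Blue). Assumes pglogical nodes and replication sets already exist.
psql -h $GREEN_HOST -d $DATABASE -c "
    SELECT pglogical.create_subscription(
        subscription_name := 'sync_to_green',
        provider_dsn := 'host=$BLUE_HOST dbname=$DATABASE'
    );
"

echo "Phase 3: Waiting for initial sync..."
# This call blocks until the initial table copy has finished
psql -h $GREEN_HOST -d $DATABASE -c "SELECT pglogical.wait_for_subscription_sync_complete('sync_to_green');"

echo "Phase 4: Switching traffic (The Cutover)"
# 1. Set Blue to read-only (optional, for absolute safety).
#    This applies to new sessions only; existing connections must be recycled.
psql -h $BLUE_HOST -d $DATABASE -c "ALTER DATABASE $DATABASE SET default_transaction_read_only = on;"

# 2. Final sync check: wait until Green has confirmed everything Blue wrote
TARGET_LSN=$(psql -h $BLUE_HOST -d $DATABASE -At -c "SELECT pg_current_wal_lsn();")
until [ "$(psql -h $BLUE_HOST -d $DATABASE -At -c "
    SELECT coalesce(bool_and(confirmed_flush_lsn >= '$TARGET_LSN'::pg_lsn), false)
    FROM pg_replication_slots
    WHERE slot_type = 'logical';")" = "t" ]; do
    echo "Waiting for replication to catch up..."
    sleep 1
done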

# 3. Update connection pooler (PgBouncer) to point to Green
#    Or update DNS/Service Discovery
./update-connection-target.sh $GREEN_HOST

# 4. Verify Green is receiving traffic
sleep 5
if ./health-check.sh $GREEN_HOST; then
    echo "✅ Cutover successful"
    
    # 5. Keep Blue running but read-only for safety period (1 hour)
    (sleep 3600 && docker-compose stop postgres-blue) &
else
    echo "❌ Cutover failed, rolling back..."
    ./update-connection-target.sh $BLUE_HOST
    exit 1
fi
```
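
The cutover should proceed only once Green has received everything Blue has written. PostgreSQL reports WAL positions as log sequence numbers (LSNs) of the form `high/low`, two hexadecimal halves of a 64-bit byte position, so lag can be measured by converting both sides to integers. The sketch below assumes the two LSN strings have already been read, for example from `pg_current_wal_lsn()` on the provider and `confirmed_flush_lsn` in `pg_replication_slots`; the function names are illustrative.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert an LSN such as '16/B374D848' to an absolute byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def replication_lag_bytes(provider_lsn: str, confirmed_lsn: str) -> int:
    """Bytes of WAL the subscriber has not yet confirmed (never negative)."""
    return max(0, lsn_to_int(provider_lsn) - lsn_to_int(confirmed_lsn))


def safe_to_cut_over(provider_lsn: str, confirmed_lsn: str, max_lag_bytes: int = 0) -> bool:
    """True when the subscriber is within max_lag_bytes of the provider."""
    return replication_lag_bytes(provider_lsn, confirmed_lsn) <= max_lag_bytes


if __name__ == "__main__":
    provider = "0/3000000"
    subscriber = "0/2000000"
    print("lag bytes:", replication_lag_bytes(provider, subscriber))
    print("safe:", safe_to_cut_over(provider, subscriber))
```

With writes on Blue stopped, a threshold of zero is the natural choice: any remaining lag means rows that exist on Blue but not yet on Green.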

### 47.4.2 Expand and Contract Pattern

The expand-contract pattern is the standard way to change a schema without downtime: expand (add the new structure), migrate (dual-write and backfill), then contract (remove the old structure).

```sql
-- Example: Rename column 'email' to 'email_address'

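-- Step 1: EXPAND - Add new column (Deploy 1)
ALTER TABLE users ADD COLUMN email_address TEXT;
-- CONCURRENTLY cannot run inside a transaction block, so execute this
-- statement outside any migration-level transaction
CREATE INDEX CONCURRENTLY idx_users_email_address ON users(email_address);

-- Step 2: DUAL WRITE - Application writes to both (Deploy 2)
-- Application code:
-- INSERT INTO users (email, email_address) VALUES (?, ?)
-- UPDATE users SET email = ?, email_address = ? WHERE id = ?

-- Step 3: BACKFILL - Migrate existing data (Background job)
-- Run repeatedly, advancing $last_processed_id by the batch size each time
-- (assumes a numeric, increasing id)
UPDATE users 
SET email_address = email 
WHERE email_address IS NULL 
  AND id > $last_processed_id
  AND id <= $last_processed_id + 10000;  -- Batch processing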

-- Step 4: SWITCH READS - Start reading from new column (Deploy 3)
-- Application code:
-- SELECT email_address as email FROM users

-- Step 5: CONTRACT - Remove old column (Deploy 4)
-- After confirming no old code references 'email'
ALTER TABLE users DROP COLUMN email;
```
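
The backfill step is the part of expand-contract most likely to cause long locks if it runs as one statement. The sketch below shows one way to process it in small, id-ordered batches with a commit after each batch. It uses Python's built-in `sqlite3` purely so the example runs anywhere; against PostgreSQL the same loop would go through a driver such as `psycopg2`. The function name and batch size are assumptions.

```python
import sqlite3


def backfill_in_batches(conn: sqlite3.Connection, batch_size: int = 1000) -> int:
    """Copy email into email_address batch by batch; return the batch count."""
    last_id = 0
    batches = 0
    while True:
        # Pick the next batch of rows that still need the new column filled
        rows = conn.execute(
            "SELECT id FROM users "
            "WHERE id > ? AND email_address IS NULL "
            "ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            return batches
        first_id, last_id = rows[0][0], rows[-1][0]
        conn.execute(
            "UPDATE users SET email_address = email "
            "WHERE id BETWEEN ? AND ? AND email_address IS NULL",
            (first_id, last_id),
        )
        conn.commit()  # one short transaction per batch keeps locks brief
        batches += 1


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, email_address TEXT)"
    )
    conn.executemany(
        "INSERT INTO users (id, email, email_address) VALUES (?, ?, ?)",
        [
            (1, "a@example.com", None),
            (2, "b@example.com", None),
            (3, "c@example.com", "c@example.com"),  # already dual-written
            (4, "d@example.com", None),
            (5, "e@example.com", None),
        ],
    )
    print("batches run:", backfill_in_batches(conn, batch_size=2))
```

The `email_address IS NULL` filter makes the job safe to re-run and leaves rows alone that the dual-write path has already populated.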

### 47.4.3 Feature Flags with Database Schema

Schema changes can be hidden behind feature flags to enable gradual rollout.

```sql
-- Add new feature schema behind flag
CREATE TABLE IF NOT EXISTS new_feature_data (
    id UUID PRIMARY KEY,
    user_id UUID REFERENCES users(user_id),
    data JSONB
);

-- Application checks flag before using new table
-- if (featureFlags.isEnabled('new-billing')) {
--     use new_feature_data table
-- } else {
--     use old billing table
-- }

-- Migration safety: New tables don't affect old code
-- Can be dropped if feature is cancelled
```
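
A gradual rollout needs each user to land on the same side of the flag on every request. One common approach, sketched below in Python, hashes the flag name and user id into a stable bucket from 0 to 99 and compares it with the rollout percentage. The flag name `new-billing` comes from the comment above; the table name `billing` for the old path and the function names are assumptions for illustration.

```python
import hashlib


def in_rollout(flag: str, user_id: str, percent: int) -> bool:
    """Deterministically place a user inside or outside a percentage rollout."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable bucket 0..99
    return bucket < percent


def billing_table_for(user_id: str, percent: int) -> str:
    """Choose which table the application should use for this user."""
    if in_rollout("new-billing", user_id, percent):
        return "new_feature_data"
    return "billing"  # hypothetical name for the old table


if __name__ == "__main__":
    for user in ("user-1", "user-2", "user-3"):
        print(user, "->", billing_table_for(user, percent=50))
```

Because the bucket depends only on the flag and the user id, raising the percentage only ever moves users onto the new table, and setting it to zero sends everyone back to the old path without a schema change.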

## 47.5 Rollback and Disaster Recovery

### 47.5.1 Roll-Forward vs Rollback Decision Matrix

| Scenario | Strategy | Implementation |
|----------|----------|----------------|
| Migration fails mid-way | Rollback | Transaction rolls back automatically (if single transaction) |
| Migration succeeds but app fails | Rollback | Run `down` migration or restore from backup |
| Data corruption detected | Roll-forward | Fix data with new migration, don't rollback (data loss) |
| Performance regression | Roll-forward | Add indexes, optimize queries in hotfix |
| Schema incompatible with old app | Roll-forward | Deploy app fix, schema stays |

**Golden Rule**: Never roll back a migration that has been running in production for more than X minutes (where X is your backup RPO), because restoring to the pre-migration state may discard data created since the deployment.
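
The matrix and the golden rule can be condensed into a small decision helper. This is a simplified Python sketch: the `Incident` fields and the function name `choose_strategy` are assumptions, and a real incident would weigh more factors than these three.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    migration_committed: bool    # False when the transaction rolled back mid-way
    minutes_live: float          # Time since the migration reached production
    data_corrupted: bool = False


def choose_strategy(incident: Incident, rpo_minutes: float) -> str:
    """Return 'rollback' or 'roll-forward' following the decision matrix."""
    if not incident.migration_committed:
        # A failed single-transaction migration has already undone itself
        return "rollback"
    if incident.data_corrupted:
        # Rolling back would lose data; fix it with a new migration instead
        return "roll-forward"
    if incident.minutes_live > rpo_minutes:
        # Golden rule: too much new data exists to restore safely
        return "roll-forward"
    return "rollback"


if __name__ == "__main__":
    print(choose_strategy(Incident(True, minutes_live=3), rpo_minutes=15))
    print(choose_strategy(Incident(True, minutes_live=45), rpo_minutes=15))
```

Encoding the threshold as a parameter keeps the rule tied to the team's actual RPO instead of a hard-coded number.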

### 47.5.2 Migration Downgrade Procedures

Always maintain `down` migrations, and test them as thoroughly as the `up` migrations they reverse.

```python
# Alembic downgrade example (Python)
from alembic import op
import sqlalchemy as sa


def upgrade():
    # Add as nullable, backfill, then enforce NOT NULL
    op.add_column('users', sa.Column('tier', sa.String(), nullable=True))
    op.execute("UPDATE users SET tier = 'free'")
    op.alter_column('users', 'tier', nullable=False)


def downgrade():
    # Exact inverse of upgrade(): remove the column it added.
    # Dangerous: this loses data written to the column since the upgrade.
    op.drop_column('users', 'tier')

    # Safer: keep the column and just stop using it (soft delete),
    # then remove it in a later release after the data is archived.
```

### 47.5.3 Emergency Runbooks

````markdown
# Database Emergency Runbook

## Scenario 1: Migration Stuck/Locking Tables

1. **Identify blocking queries**:
   ```sql
   SELECT pid, state, query, pg_blocking_pids(pid) AS blocked_by
   FROM pg_stat_activity 
   WHERE state = 'active' 
     AND query ILIKE '%ALTER TABLE%';
   ```

2. **If safe, cancel migration** (replace `pid` with the value from step 1):
   ```sql
   SELECT pg_cancel_backend(pid);
   -- If that fails:
   SELECT pg_terminate_backend(pid);
   ```

3. **Check lock status**:
   ```sql
   SELECT * FROM pg_locks WHERE NOT granted;
   ```

4. **Decision**:
   - If the transaction rolled back: retry with `lock_timeout` set
   - If the migration was partially applied across several transactions: **do not roll back**; assess data manually

## Scenario 2: Accidental Data Loss

1. **Immediate**: Stop all writes to table
   ```sql
   -- app_role is a placeholder for the role(s) the application connects as.
   -- Disabling triggers does not block writes; revoking privileges does.
   REVOKE INSERT, UPDATE, DELETE, TRUNCATE ON critical_table FROM app_role;
   ```

2. **Assess**: Determine time of incident
   ```sql
   -- Check Point-in-Time Recovery capability
   SHOW archive_mode;
   ```

3. **Restore**: Create clone from PITR
   ```bash
   # pg_restore cannot replay to a point in time. Restore a base backup
   # into a new instance, then set the recovery target before starting it:
   #   recovery_target_time = '2024-01-15 14:30:00'   (postgresql.conf)
   #   touch "$PGDATA/recovery.signal"
   # A restore_command that fetches archived WAL must also be configured.
   ```

4. **Reconcile**: Compare and merge data, don't just overwrite
````

## 47.6 Pipeline Security and Governance

### 47.6.1 Database Access in CI

Never use long-lived credentials in CI pipelines.

```yaml
# .github/workflows/deploy.yml
jobs:
  deploy-database:
    runs-on: ubuntu-latest
    permissions:
      id-token: write  # For OIDC
      contents: read
    
    steps:
      - uses: actions/checkout@v4
      
      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::ACCOUNT:role/DatabaseDeployRole
          aws-region: us-east-1
      
      - name: Run Migrations with Short-Lived Credentials
        env:
          # Assumes Amazon RDS/Aurora with IAM database authentication.
          # DB_HOST and DB_USER are assumed repository variables.
          DB_HOST: ${{ vars.DB_HOST }}
          DB_USER: ${{ vars.DB_USER }}
          PGSSLMODE: require
        run: |
          # IAM auth tokens expire 15 minutes after they are generated.
          # The token stays inside this step instead of being written to
          # $GITHUB_ENV, and psql reads it from PGPASSWORD (not from stdin).
          PGPASSWORD=$(aws rds generate-db-auth-token \
            --hostname "$DB_HOST" --port 5432 \
            --username "$DB_USER" --region us-east-1)
          echo "::add-mask::$PGPASSWORD"  # Mask in logs
          export PGPASSWORD

          psql -h "$DB_HOST" -U "$DB_USER" -d app \
            -v ON_ERROR_STOP=1 -f migrations/deploy.sql
```

### 47.6.2 Approval Gates

Database changes to production should require human approval.

```yaml
# .github/workflows/deploy.yml
jobs:
  preview-changes:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Generate Migration Preview
        run: |
          # `sqitch status` reports the changes not yet deployed to the
          # target without modifying it. (`deploy --log-only` would instead
          # record changes as deployed without running them.)
          sqitch status > migration-preview.txt
          echo "### Migration Preview" >> $GITHUB_STEP_SUMMARY
          echo '```text' >> $GITHUB_STEP_SUMMARY
          cat migration-preview.txt >> $GITHUB_STEP_SUMMARY
          echo '```' >> $GITHUB_STEP_SUMMARY
  
  deploy:
    needs: preview-changes
    environment: production  # Manual approval if the environment has required reviewers
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Deploy
        run: sqitch deploy
```

---

## Chapter Summary

In this chapter, you learned:

1. **Migration Validation**: Implement multi-stage CI pipelines that check naming conventions (timestamps), detect merge conflicts, validate syntax via dry runs, and test against shadow databases with production-like data volumes; never allow migrations with duplicate timestamps or breaking changes without explicit "expand-contract" documentation.

2. **SQL Linting**: Configure SQLFluff with the PostgreSQL dialect to enforce consistent casing (UPPER keywords, lowercase identifiers), comma placement, and explicit aliasing; run the linter as a CI check so that style violations do not reach the main branch.

3. **Drift Detection**: Deploy automated monitors that compare actual production schema against migration-defined expected state; detect manual hotfixes (extra objects, modified definitions) and alert immediately; maintain reconciliation runbooks that prioritize data preservation over schema purity.

4. **Deployment Strategies**: Implement blue/green deployments using logical replication for near-zero-downtime major version upgrades; use the expand-contract pattern for all breaking changes (add new column → dual write → backfill → switch reads → drop old); hide schema changes behind feature flags to enable gradual rollouts and instant rollback of application logic without reverting schema.

5. **Rollback Procedures**: Prefer roll-forward (fixing data with new migrations) over rollback for any change that has been live longer than your backup RPO, to prevent data loss; maintain tested `down` migrations but treat them as emergency-only; create explicit runbooks for stuck migrations (cancel backends, set lock timeouts) and data recovery (PITR clones, stopping writes to the affected table).

6. **Pipeline Security**: Use OIDC and temporary credentials (15-minute expiry) for database access in CI rather than static passwords; implement approval gates for production deployments that require human review of migration previews; maintain audit trails of who deployed what and when via structured logging in migration tools.

---

**Next:** In Chapter 48, we will explore Documentation and Standards—the "handbook within the handbook"—covering SQL style guides, schema review checklists, query review procedures, and operational runbook templates that ensure organizational consistency and knowledge retention.

