# Production Deployment Patterns

This notebook walks through deploying the Self-Critique pipeline to production with Docker and Kubernetes. It covers containerization, configuration, secrets handling, scaling, and high availability.

## Learning Objectives

- **Containerization**: Learn multi-stage Docker build strategies for smaller, more secure images.
- **Kubernetes Manifests**: Understand how to define deployments, services, and configurations.
- **Secrets Management**: Implement secure handling of sensitive data like API keys.
- **High Availability**: Configure rolling updates, health probes, and autoscaling.
- **Infrastructure as Code**: Manage infrastructure declaratively for reproducibility.

---


## Section 1: Containerization Strategies (Dockerfile)

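A multi-stage Dockerfile separates the build environment from the runtime environment: dependency wheels are built in the first stage, and only the resulting packages are installed into the final image. Build tooling never reaches the production image, which reduces its size and attack surface.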


In [None]:
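%%writefile ../../Dockerfile.prod
# Stage 1: Build wheels for all dependencies
FROM python:3.9-slim AS builder

WORKDIR /app

# Upgrade pip before building wheels
RUN pip install --no-cache-dir --upgrade pip

# Copy requirements and build wheels (nothing is installed in this stage)
COPY requirements.txt .
RUN pip wheel --no-cache-dir --wheel-dir /app/wheels -r requirements.txt

# Stage 2: Final production stage
FROM python:3.9-slim

WORKDIR /app

# Copy wheels from the builder stage and install them while still root,
# so console scripts such as uvicorn land in /usr/local/bin (on PATH)
COPY --from=builder /app/wheels /wheels
RUN pip install --no-cache-dir /wheels/*

# Create a non-root user; switch to it only after installation
RUN useradd --create-home appuser

# Copy application code, owned by the non-root user
COPY --chown=appuser:appuser . .
USER appuser

# Health check. python:3.9-slim does not ship curl, so use the standard
# library; urlopen raises (non-zero exit) on connection errors and 4xx/5xx.
HEALTHCHECK --interval=30s --timeout=10s --retries=3 \
  CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health', timeout=5)" || exit 1

EXPOSE 8000

# Run the application
CMD ["uvicorn", "api.main:app", "--host", "0.0.0.0", "--port", "8000"]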
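
The `HEALTHCHECK` above, and the Kubernetes probes later in this notebook, assume the application answers `GET /health` with a 2xx status. The project's `api.main` module is not shown here, so the following is only a minimal sketch of what such an endpoint could look like, written as a bare ASGI callable with no framework. The names `health_app` and `call`, and the JSON body, are illustrative assumptions.

```python
import asyncio
import json


async def health_app(scope, receive, send):
    """Hypothetical ASGI app: 200 on GET /health, 404 for anything else."""
    if scope["type"] != "http":
        return  # non-HTTP events are ignored in this sketch
    healthy = scope["method"] == "GET" and scope["path"] == "/health"
    status = 200 if healthy else 404
    body = json.dumps({"status": "ok" if healthy else "not found"}).encode()
    await send({
        "type": "http.response.start",
        "status": status,
        "headers": [
            (b"content-type", b"application/json"),
            (b"content-length", str(len(body)).encode()),
        ],
    })
    await send({"type": "http.response.body", "body": body})


async def call(app, method, path):
    """Drive the ASGI app in-process and return (status, body)."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": method, "path": path, "headers": []}
    await app(scope, receive, send)
    return sent[0]["status"], sent[1]["body"]


status, body = asyncio.run(call(health_app, "GET", "/health"))
print(status, body.decode())
```

Keeping the health handler free of downstream calls (for example, to the model API) means a slow dependency does not cause the container to be restarted.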


## Section 2: Kubernetes Manifests

These YAML files define the desired state for our application in Kubernetes.

### 2.1 ConfigMap (for non-sensitive configuration)


In [None]:
%%writefile ../../k8s/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: self-critique-pipeline-config
data:
  DEFAULT_MODEL: "claude-sonnet-4-20250514"
  LOG_LEVEL: "info"
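
The Deployment in Section 2.3 injects these ConfigMap keys into the container as environment variables via `envFrom`. How the pipeline actually reads them is not shown in this notebook; the sketch below is one hedged way to do it, failing fast at startup on missing or invalid values. The `Settings` class, `load_settings` function, and the set of allowed log levels are assumptions for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    default_model: str
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the ConfigMap-provided variables, raising on bad values."""
    model = env.get("DEFAULT_MODEL", "").strip()
    if not model:
        raise ValueError("DEFAULT_MODEL is not set")

    # Assumed set of valid levels; adjust to whatever the logger accepts.
    allowed = {"debug", "info", "warning", "error", "critical"}
    level = env.get("LOG_LEVEL", "info").strip().lower()
    if level not in allowed:
        raise ValueError(f"LOG_LEVEL must be one of {sorted(allowed)}, got {level!r}")

    return Settings(default_model=model, log_level=level)


# Simulate the environment the ConfigMap would produce inside the pod.
settings = load_settings({
    "DEFAULT_MODEL": "claude-sonnet-4-20250514",
    "LOG_LEVEL": "info",
})
print(settings)
```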


### 2.2 Secret (for sensitive data)

**Note**: In a real production environment, use a more secure method like HashiCorp Vault, AWS Secrets Manager, or Sealed Secrets. For this example, we create a Kubernetes secret directly.


In [None]:
# Create the secret from the command line (replace the placeholder with your actual key).
# Note: --from-literal leaves the key in your shell history.
# kubectl create secret generic anthropic-api-key --from-literal=api-key='YOUR_API_KEY_HERE'
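
One reason the note above recommends a dedicated secrets manager: the `data` field of a Kubernetes Secret holds values that are base64-encoded, which is an encoding and not encryption. The sketch below shows how trivially such a value is reversed by anyone who can read the Secret object. The key is a made-up placeholder, and `encode_secret_value` is an illustrative helper name.

```python
import base64


def encode_secret_value(value: str) -> str:
    """Encode a string the way it appears in a Secret's `data` field."""
    return base64.b64encode(value.encode("utf-8")).decode("ascii")


placeholder_key = "YOUR_API_KEY_HERE"  # placeholder, not a real key

encoded = encode_secret_value(placeholder_key)
decoded = base64.b64decode(encoded).decode("utf-8")

print(encoded)
print(decoded == placeholder_key)
```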


### 2.3 Deployment (manages pods and updates)


In [None]:
%%writefile ../../k8s/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: self-critique-pipeline
spec:
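  # Initial replica count. Once the HPA is active it owns this value, and
  # re-applying this manifest resets the count to 3 until the HPA rescales.
  replicas: 3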
  selector:
    matchLabels:
      app: self-critique-pipeline
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
  template:
    metadata:
      labels:
        app: self-critique-pipeline
    spec:
      containers:
      - name: api
        image: your-repo/self-critique-pipeline:v1.0.0
        ports:
        - containerPort: 8000
        envFrom:
        - configMapRef:
            name: self-critique-pipeline-config
        env:
        - name: ANTHROPIC_API_KEY
          valueFrom:
            secretKeyRef:
              name: anthropic-api-key
              key: api-key
        resources:
          requests:
            cpu: "250m"
            memory: "512Mi"
          limits:
            cpu: "1000m"
            memory: "2Gi"
        livenessProbe:
          httpGet:
            path: /health
            port: 8000
          initialDelaySeconds: 15
          periodSeconds: 20
        readinessProbe:
          httpGet:
            path: /health
            port: 8000
          initialDelaySeconds: 5
          periodSeconds: 10
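
The `rollingUpdate` settings above bound how many pods may exist, and how many may be missing, at any point during a rollout. The arithmetic can be sketched as follows. This covers only absolute integer settings (Kubernetes also accepts percentages, which are not handled here), and `rollout_bounds` is an illustrative name.

```python
def rollout_bounds(replicas: int, max_surge: int, max_unavailable: int):
    """Return (min_available, max_total) pod counts during a rolling update."""
    if max_surge == 0 and max_unavailable == 0:
        # Kubernetes rejects this combination: the rollout could never progress.
        raise ValueError("maxSurge and maxUnavailable cannot both be 0")
    min_available = max(replicas - max_unavailable, 0)
    max_total = replicas + max_surge
    return min_available, max_total


min_available, max_total = rollout_bounds(replicas=3, max_surge=1, max_unavailable=1)
print(min_available, max_total)  # 2 4
```

With the values in this manifest, at least 2 pods keep serving traffic and at most 4 pods exist at once while an update is in progress.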


### 2.4 Service (exposes the deployment)


In [None]:
%%writefile ../../k8s/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: self-critique-pipeline-svc
spec:
  type: ClusterIP
  selector:
    app: self-critique-pipeline
  ports:
  - protocol: TCP
    port: 80
    targetPort: 8000
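
A `ClusterIP` Service is reachable only from inside the cluster, under a DNS name derived from the Service name and namespace. The sketch below shows how another in-cluster client might build the URL. It assumes the `default` namespace and the common `cluster.local` cluster domain, neither of which is stated in this notebook, and `service_url` is an illustrative helper.

```python
def service_url(name: str, namespace: str = "default", port: int = 80,
                path: str = "/", cluster_domain: str = "cluster.local") -> str:
    """Build the in-cluster HTTP URL for a Service (assumed DNS layout)."""
    host = f"{name}.{namespace}.svc.{cluster_domain}"
    netloc = host if port == 80 else f"{host}:{port}"
    if not path.startswith("/"):
        path = "/" + path
    return f"http://{netloc}{path}"


print(service_url("self-critique-pipeline-svc", path="/health"))
```

Clients connect to the Service port (80); the Service forwards the traffic to `targetPort` 8000 on the pods.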


## Section 3: Horizontal Pod Autoscaler (HPA)

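The HPA automatically scales the number of pods in the Deployment. The manifest below targets 80% average CPU utilization, measured relative to each pod's CPU request (250m in the Deployment above, so roughly 200m per pod), and keeps the replica count between 3 and 10. CPU-based scaling requires the resource metrics API, typically provided by metrics-server. Other resource or custom metrics can be added under `metrics`.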


In [None]:
%%writefile ../../k8s/hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: self-critique-pipeline-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: self-critique-pipeline
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
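
To make the scaling behavior concrete, here is a simplified sketch of the replica calculation described in the Kubernetes HPA documentation: the desired count is the current count scaled by the ratio of observed to target utilization, rounded up, then clamped to `minReplicas` and `maxReplicas`. The 10% tolerance is the documented default. The sketch ignores stabilization windows, unready pods, and missing metrics, so treat it as an approximation rather than the controller's full algorithm.

```python
import math


def desired_replicas(current_replicas: int, current_utilization: float,
                     target_utilization: float = 80,
                     min_replicas: int = 3, max_replicas: int = 10,
                     tolerance: float = 0.1) -> int:
    """Approximate the HPA replica calculation for a utilization target."""
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target: no scaling
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


print(desired_replicas(3, 120))   # 5: load is 1.5x the target
print(desired_replicas(3, 85))    # 3: within tolerance
print(desired_replicas(5, 20))    # 3: clamped to minReplicas
print(desired_replicas(10, 200))  # 10: clamped to maxReplicas
```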


## Section 4: Applying the Manifests

Deploy the application with `kubectl`, run from the directory that contains `k8s/`:

```bash
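# Apply the configuration first
kubectl apply -f k8s/configmap.yaml
# Confirm the secret from Section 2.2 exists before deploying
kubectl get secret anthropic-api-key

# Apply the application manifests
kubectl apply -f k8s/deployment.yaml
kubectl apply -f k8s/service.yaml
kubectl apply -f k8s/hpa.yaml

# Wait for the rollout to finish, then check the status
kubectl rollout status deployment/self-critique-pipeline
kubectl get all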
```
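
Since the rest of this notebook is driven from Python cells, the same sequence could also be scripted. The sketch below only builds the `kubectl` argument lists in dependency order, including a check that the `anthropic-api-key` secret exists and a final `kubectl rollout status` wait. It does not execute anything, and `build_commands` and `APPLY_ORDER` are illustrative names.

```python
from pathlib import PurePosixPath

# ConfigMap first, because the Deployment references it via envFrom.
APPLY_ORDER = ["configmap.yaml", "deployment.yaml", "service.yaml", "hpa.yaml"]


def build_commands(manifest_dir: str = "k8s",
                   deployment: str = "self-critique-pipeline"):
    """Return the kubectl invocations as argument lists, in order."""
    base = PurePosixPath(manifest_dir)
    commands = [["kubectl", "get", "secret", "anthropic-api-key"]]
    commands += [["kubectl", "apply", "-f", str(base / name)] for name in APPLY_ORDER]
    commands.append(["kubectl", "rollout", "status", f"deployment/{deployment}"])
    return commands


for cmd in build_commands():
    print(" ".join(cmd))
```

To run the commands against a real cluster, each list can be passed to `subprocess.run(cmd, check=True)`, which stops at the first failing step.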