# Chapter 31: Kubernetes Deployments

While Chapter 30 established deployment strategies conceptually, this chapter examines the Kubernetes-native mechanisms that implement these patterns. The Kubernetes Deployment resource provides declarative updates for Pods and ReplicaSets: you declare the desired state, and the Deployment controller drives the cluster's actual state toward it while maintaining application availability.

Understanding the Deployment controller's internal mechanics—from ReplicaSet management to revision history retention—is essential for troubleshooting stuck rollouts, implementing canary releases, and executing safe rollbacks in production environments.

## 31.1 Deployments Deep Dive

The Deployment controller manages the lifecycle of stateless applications, providing declarative rollouts, scaling, and rollback capabilities through abstraction layers that separate user intent from execution mechanics.

### Architecture and Relationships

```mermaid
graph TD
    A[Deployment] -->|manages| B[ReplicaSet v1]
    A -->|manages| C[ReplicaSet v2]
    B -->|owns| D[Pod 1.1]
    B -->|owns| E[Pod 1.2]
    C -->|owns| F[Pod 2.1]
    C -->|owns| G[Pod 2.2]
    
    style A fill:#f9f,stroke:#333,stroke-width:2px
    style B fill:#bbf,stroke:#333
    style C fill:#bbf,stroke:#333,stroke-dasharray: 5 5
```

**Deployment → ReplicaSet → Pod Hierarchy:**
- **Deployment**: User-facing abstraction defining desired state (image, replicas, strategy)
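- **ReplicaSet**: Snapshot of one version of the pod template; the Deployment creates a new ReplicaSet for each template change instead of editing an existing one
- **Pod**: Actual running containers, created by the ReplicaSet controller and placed on nodes by the scheduler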

### Deployment Specification

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-service
  namespace: production
  labels:
    app: payment-service
    version: v2.3.1
    tier: backend
spec:
  replicas: 5
  revisionHistoryLimit: 10  # Retain 10 old ReplicaSets for rollback
  progressDeadlineSeconds: 600  # 10 minutes to progress before marked failed
  minReadySeconds: 30  # Pod must be ready for 30s before considered available
  
  strategy:
    type: RollingUpdate  # or Recreate
    rollingUpdate:
      maxSurge: 25%      # Can exceed desired count by 25%
      maxUnavailable: 0  # Never drop below desired count
  
  selector:
    matchLabels:
      app: payment-service
    matchExpressions:
      - {key: tier, operator: In, values: [backend, api]}
  
  template:
    metadata:
      labels:
        app: payment-service
        version: v2.3.1
        tier: backend
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
        # deployment.kubernetes.io/revision is written by the controller on the Deployment and its ReplicaSets; do not hard-code it here
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - payment-service
              topologyKey: kubernetes.io/hostname
      
      containers:
      - name: payment-api
        image: registry.company.com/payment-service:v2.3.1
        imagePullPolicy: IfNotPresent
        
        ports:
        - name: http
          containerPort: 8080
          protocol: TCP
        
        resources:
          requests:
            memory: "512Mi"
            cpu: "500m"
          limits:
            memory: "1Gi"
            cpu: "1000m"
        
        readinessProbe:  # Critical for rolling updates
          httpGet:
            path: /health/ready
            port: 8080
            httpHeaders:
            - name: Accept
              value: application/json
          initialDelaySeconds: 10
          periodSeconds: 5
          successThreshold: 2  # Must pass twice to be ready
          failureThreshold: 3
        
        livenessProbe:  # Restart if deadlocked
          httpGet:
            path: /health/live
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 3
        
        startupProbe:  # For slow-starting applications
          httpGet:
            path: /health/started
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 5
          failureThreshold: 30  # 30 * 5 = 150s of probing after the 10s initial delay
        
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_TEMPLATE_HASH  # Identifies the ReplicaSet (template version) that created this pod
          valueFrom:
            fieldRef:
              fieldPath: metadata.labels['pod-template-hash']
        
        volumeMounts:
        - name: tmp
          mountPath: /tmp
      
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        fsGroup: 2000
        seccompProfile:
          type: RuntimeDefault
      
      volumes:
      - name: tmp
        emptyDir: {}
      
      terminationGracePeriodSeconds: 60  # Time for graceful shutdown
```

## 31.2 ReplicaSets

ReplicaSets ensure a specified number of pod replicas are running at any given time. While Deployments manage ReplicaSets, understanding their behavior is crucial for advanced operations.

### ReplicaSet Mechanics

```yaml
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: payment-service-7c9b8f4d5  # Generated by Deployment
  ownerReferences:  # Links to parent Deployment
  - apiVersion: apps/v1
    kind: Deployment
    name: payment-service
    uid: a3f5c8d2-1234-5678-9abc-def012345678
    controller: true
    blockOwnerDeletion: true
spec:
  replicas: 5
  selector:
    matchLabels:
      app: payment-service
      pod-template-hash: 7c9b8f4d5  # Unique hash of pod template
  template:
    # Pod template identical to Deployment's template
    metadata:
      labels:
        app: payment-service
        pod-template-hash: 7c9b8f4d5
    spec:
      containers:
      - name: payment-api
        image: registry.company.com/payment-service:v2.3.1
```

**Key Behaviors:**
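- **Template Edits Are Not Retroactive**: Changing a ReplicaSet's pod template affects only pods created afterward; existing pods keep running the old spec. This is why a Deployment creates a new ReplicaSet for each template change
- **Label Matching**: ReplicaSets find pods through their label selector. A matching pod with no controlling owner is adopted, and a pod whose labels stop matching is released and replaced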
- **Hash Generation**: The `pod-template-hash` label ensures ReplicaSets manage only pods created from their specific template
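
The selector logic can be illustrated with a minimal sketch. The helper `labels_match` is hypothetical (it is not part of kubectl) and handles only equality-based `matchLabels`, with labels written as comma-separated `key=value` pairs:

```bash
# labels_match SELECTOR LABELS
# Succeeds when every key=value pair in SELECTOR also appears in LABELS.
# Hypothetical helper for illustration; covers matchLabels only.
labels_match() {
  local selector=$1 labels=$2 pair
  local IFS=,
  for pair in $selector; do
    case ",$labels," in
      *",$pair,"*) ;;      # this selector pair is present on the pod
      *) return 1 ;;       # missing or different value: no match
    esac
  done
  return 0
}

rs_selector="app=payment-service,pod-template-hash=7c9b8f4d5"

labels_match "$rs_selector" "app=payment-service,pod-template-hash=7c9b8f4d5,tier=backend" \
  && echo "selected by this ReplicaSet"
labels_match "$rs_selector" "app=payment-service,pod-template-hash=5d4f8b9c7,tier=backend" \
  || echo "belongs to a different ReplicaSet"
```

Both pods carry `app=payment-service`, so only the `pod-template-hash` pair keeps the two ReplicaSets from claiming each other's pods.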

### Manual ReplicaSet Management (Anti-Pattern)

While Deployments automate ReplicaSet management, manual intervention is occasionally required:

```bash
# View ReplicaSets created by Deployment
kubectl get rs -l app=payment-service

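# Scale a specific ReplicaSet directly (not recommended: the Deployment controller
# resets the replica count on its next sync)
kubectl scale rs payment-service-7c9b8f4d5 --replicas=10

# Release a pod from its ReplicaSet by removing the selector labels
# (the ReplicaSet then creates a replacement pod)
kubectl label pod payment-service-xxx app- pod-template-hash-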
```

**Warning**: Manual ReplicaSet manipulation breaks Deployment's declarative model. Use only for emergency debugging.

## 31.3 Rolling Update Configuration

Rolling updates replace pods gradually so that the application stays available throughout the transition, provided readiness probes and the surge and unavailability limits are configured appropriately.

### Update Mechanics

When a Deployment's pod template (`.spec.template`) changes, for example a new image tag, label, or container setting, the controller:

1. Creates a new ReplicaSet with the updated template
2. Scales up the new ReplicaSet while scaling down the old one
3. Respects `maxSurge` and `maxUnavailable` constraints
4. Counts a new pod as available only once it has been `Ready` (readiness probe) for `minReadySeconds`, and removes old pods only as far as `maxUnavailable` allows
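
The interplay of these steps can be modeled roughly as follows. This is a simplified sketch, not the controller's actual code: the function name `simulate_rollout` is hypothetical, the limits are absolute numbers, and every new pod is assumed to become available immediately.

```bash
# simulate_rollout DESIRED MAX_SURGE MAX_UNAVAILABLE
# Prints the size of the old and new ReplicaSet after each reconciliation step.
simulate_rollout() {
  local desired=$1 surge=$2 unavailable=$3
  local old=$desired new=0 step=0 room up down
  if [ "$surge" -eq 0 ] && [ "$unavailable" -eq 0 ]; then
    echo "invalid: maxSurge and maxUnavailable cannot both be 0" >&2
    return 1
  fi
  while [ "$old" -gt 0 ] || [ "$new" -lt "$desired" ]; do
    step=$(( step + 1 ))
    # Scale up the new ReplicaSet without exceeding desired + surge pods in total.
    room=$(( desired + surge - old - new ))
    up=$(( desired - new ))
    if [ "$up" -gt "$room" ]; then up=$room; fi
    new=$(( new + up ))
    # Scale down the old ReplicaSet without dropping below desired - unavailable.
    down=$(( old + new - (desired - unavailable) ))
    if [ "$down" -gt "$old" ]; then down=$old; fi
    old=$(( old - down ))
    echo "step $step: old=$old new=$new"
  done
}

# 5 replicas, maxSurge resolved to 2 pods, maxUnavailable 0
simulate_rollout 5 2 0
```

With these inputs the model finishes in three steps and never has fewer than five pods in total. In a real cluster each step also waits on readiness probes and `minReadySeconds`.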

### Configuration Parameters

**maxSurge:**
- **Definition**: Maximum number of pods that can be created above the desired replica count during update
- **Values**: Absolute number or percentage (rounded up)
- **Default**: 25%
- **Example**: With `replicas: 10` and `maxSurge: 25%`, up to 13 pods may exist temporarily (10 desired + 3 surge)

**maxUnavailable:**
- **Definition**: Maximum number of pods that can be unavailable during update
- **Values**: Absolute number or percentage (rounded down)
- **Default**: 25%
- **Example**: With `replicas: 10` and `maxUnavailable: 0`, all 10 pods must remain available (requires surge capacity)
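
The rounding rules are easy to get wrong for small replica counts, so a short worked calculation may help. The helper `resolve_count` is a hypothetical name used only for this sketch; it rounds percentages up for `maxSurge` and down for `maxUnavailable`, as described above:

```bash
# resolve_count VALUE REPLICAS up|down
# Turns an absolute number or a percentage into a pod count.
resolve_count() {
  local value=$1 replicas=$2 mode=$3 pct
  case "$value" in
    *%)
      pct=${value%\%}
      if [ "$mode" = "up" ]; then
        echo $(( (replicas * pct + 99) / 100 ))   # ceiling, used for maxSurge
      else
        echo $(( replicas * pct / 100 ))          # floor, used for maxUnavailable
      fi
      ;;
    *)
      echo "$value"                               # absolute numbers pass through
      ;;
  esac
}

replicas=10
surge=$(resolve_count 25% "$replicas" up)
unavailable=$(resolve_count 25% "$replicas" down)
echo "maxSurge=$surge maxUnavailable=$unavailable"
echo "pod count may range from $(( replicas - unavailable )) available to $(( replicas + surge )) total"
```

For `replicas: 10` with both defaults, this gives a surge of 3 and an unavailability budget of 2, so the rollout runs with between 8 available pods and 13 pods in total.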

**Optimization Scenarios:**

```yaml
# Zero-downtime conservative (slower, safer)
strategy:
  rollingUpdate:
    maxSurge: 1        # Add one pod at a time
    maxUnavailable: 0  # Never drop below desired count
  type: RollingUpdate
---
# Fast rollout (acceptable brief capacity reduction)
strategy:
  rollingUpdate:
    maxSurge: 100%     # Double capacity temporarily
    maxUnavailable: 50%  # Allow half to be down
  type: RollingUpdate
---
# Recreate (for versions that cannot run side by side; causes downtime)
strategy:
  type: Recreate  # Terminates all old pods before creating new ones
```

### Monitoring Rollout Progress

```bash
# Watch rollout in real-time
kubectl rollout status deployment/payment-service --timeout=5m

# Detailed view of ReplicaSets
kubectl get rs -l app=payment-service -w

# Check pod distribution between versions
kubectl get pods -l app=payment-service -L pod-template-hash

# Events during rollout
kubectl get events --field-selector involvedObject.name=payment-service --watch
```

## 31.4 Revision History

Deployments maintain a history of ReplicaSets to enable rollback to previous versions. Each unique pod template generates a revision.

### Revision Tracking

```bash
# View revision history
kubectl rollout history deployment/payment-service

# Output:
# REVISION  CHANGE-CAUSE
# 1         kubectl apply --record --filename=payment-v1.yaml
# 2         kubectl set image deployment/payment-service payment-api=v2.0.0
# 3         kubectl apply --filename=payment-v2.3.1.yaml --record
```

**Change-Cause Annotation:**
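To record why a change occurred, set the `kubernetes.io/change-cause` annotation on the Deployment. The `--record` flag, which filled in this annotation automatically and produced the sample output above, is deprecated: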

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    kubernetes.io/change-cause: "Upgraded to v2.3.1 for CVE-2024-1234 fix"
```

### Revision Retention

```yaml
spec:
  revisionHistoryLimit: 5  # Keep 5 old ReplicaSets (default 10)
```

**Storage Considerations:**
- Each retained ReplicaSet consumes etcd storage
- Large clusters with frequent updates may need lower limits
- Setting to 0 prevents rollback but cleans up old ReplicaSets immediately
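
A small sketch of the retention rule, assuming cleanup removes the oldest revisions first and that revision order matches creation order. The function name `prune_revisions` is hypothetical:

```bash
# prune_revisions LIMIT REVISION...
# Prints the revisions of old ReplicaSets that exceed revisionHistoryLimit.
prune_revisions() {
  local limit=$1
  shift
  local excess=$(( $# - limit ))
  [ "$excess" -gt 0 ] || return 0          # within the limit: nothing to delete
  printf '%s\n' "$@" | sort -n | head -n "$excess"
}

# Seven old ReplicaSets (revisions 1..7) with revisionHistoryLimit: 5
prune_revisions 5 1 2 3 4 5 6 7
```

Here revisions 1 and 2 are removed, which also means they can no longer be used as rollback targets.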

## 31.5 Rollback Procedures

When deployments fail, Kubernetes provides mechanisms to revert to previous stable versions quickly.

### Rollback Commands

```bash
# Rollback to previous revision
kubectl rollout undo deployment/payment-service

# Rollback to specific revision
kubectl rollout undo deployment/payment-service --to-revision=2

# Verify rollback
kubectl rollout status deployment/payment-service
kubectl get pods -l app=payment-service
```

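**Rollback Mechanics:**
1. Deployment controller copies the pod template of the target revision back into `.spec.template` (other fields, such as `replicas`, are not reverted)
2. The retained ReplicaSet for that template is reused and scaled up; it receives a new, higher revision number and becomes the current revision
3. Scale transition follows the configured update strategy
4. The failed revision's ReplicaSet is scaled to zero and retained for analysis, subject to `revisionHistoryLimit`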
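
One consequence that often surprises operators is the renumbering: the rolled-back template is assigned the next revision number, and the number it previously held disappears from `kubectl rollout history`. The sketch below models only that bookkeeping; `rollback_history` is a hypothetical helper, and its input is a simplified "REVISION TEMPLATE" listing rather than real kubectl output:

```bash
# rollback_history TARGET_REVISION
# Reads "REVISION TEMPLATE" lines on stdin and prints the history after the rollback.
rollback_history() {
  awk -v target="$1" '
    $1 == target { template = $2; next }    # the target leaves its old slot
    { print; if ($1 > max) max = $1 }       # other revisions are unchanged
    END { if (template != "") print max + 1, template }
  '
}

printf '%s\n' "1 v1.0.0" "2 v2.0.0" "3 v2.3.1" | rollback_history 2
```

After rolling back to revision 2, the history lists revisions 1, 3, and 4, with revision 4 running the `v2.0.0` template.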

### Rollback Verification

```bash
# Check if rollback completed
kubectl get deployment payment-service -o jsonpath='{.metadata.annotations.deployment\.kubernetes\.io/revision}'

# Confirm which image the pod template now references
kubectl get deployment payment-service -o jsonpath='{.spec.template.spec.containers[*].image}'

# View rollback events
kubectl describe deployment payment-service | grep -A 10 "Events"
```

### Database Rollback Considerations

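A Kubernetes rollback reverts only the pod template (image and configuration). Schema changes already applied by the failed release stay in place, so the older code must be compatible with the current schema: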

```yaml
# Pre-rollback job for database compatibility check
apiVersion: batch/v1
kind: Job
metadata:
  name: pre-rollback-db-check
spec:
  template:
    spec:
      containers:
      - name: check
        image: migrate/migrate
        command:
        - sh
        - -c
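        - |
          # Print the schema version currently applied so it can be compared
          # with the version the rollback target expects.
          # Assumes DB_URL is injected (for example from a Secret) and the
          # migration files are mounted at /migrations.
          migrate -path /migrations -database "$DB_URL" version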
      restartPolicy: Never
```

## 31.6 Paused and Resumed Deployments

Pausing deployments enables manual canary validation and staged rollouts.

### Pause Mechanism

```bash
# Pause deployment mid-rollout
kubectl rollout pause deployment/payment-service

# While paused, the Progressing condition reports the reason "DeploymentPaused"
kubectl get deployment payment-service -o jsonpath='{.status.conditions[?(@.type=="Progressing")].reason}'

# Resume after validation
kubectl rollout resume deployment/payment-service
```

**Use Cases:**
- **Canary Validation**: Start the rollout, pause once the first new pods exist (how many depends on `maxSurge`), test them manually, then resume
- **Maintenance Windows**: Pause before peak traffic, resume during low-traffic period
- **Coordinated Changes**: Pause multiple deployments, resume simultaneously

### Scripted Canary with Pause

```bash
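#!/bin/bash
# manual-canary.sh

DEPLOYMENT="payment-service"
NAMESPACE="production"
IMAGE="registry.company.com/payment-service:v2.4.0"

# Start rollout
kubectl set image "deployment/$DEPLOYMENT" "payment-api=$IMAGE" -n "$NAMESPACE"

# Wait until the controller has observed the change and created at least one new pod
echo "Waiting for first canary pod..."
while true; do
  read -r OBSERVED GENERATION UPDATED <<< "$(kubectl get deployment "$DEPLOYMENT" -n "$NAMESPACE" \
    -o jsonpath='{.status.observedGeneration} {.metadata.generation} {.status.updatedReplicas}')"
  [[ "$OBSERVED" == "$GENERATION" && "${UPDATED:-0}" -ge 1 ]] && break
  sleep 1
done

# Pause so the rollout stops after the first batch of new pods
kubectl rollout pause "deployment/$DEPLOYMENT" -n "$NAMESPACE"

# Pick the newest pod as the canary (assumes no old pod was recreated in the meantime)
CANARY_POD=$(kubectl get pods -n "$NAMESPACE" -l "app=$DEPLOYMENT" \
  --sort-by=.metadata.creationTimestamp -o name | tail -n 1)
kubectl wait --for=condition=Ready "$CANARY_POD" -n "$NAMESPACE" --timeout=180s

# Port-forward for manual testing
kubectl port-forward "$CANARY_POD" 8080:8080 -n "$NAMESPACE" &
PF_PID=$!

# Wait for user confirmation
read -p "Test canary at http://localhost:8080. Proceed? (y/n) " -n 1 -r
echo

kill "$PF_PID"

# A paused Deployment cannot be rolled back, so resume in both branches
kubectl rollout resume "deployment/$DEPLOYMENT" -n "$NAMESPACE"

if [[ $REPLY =~ ^[Yy]$ ]]; then
  echo "Resuming rollout..."
  kubectl rollout status "deployment/$DEPLOYMENT" -n "$NAMESPACE"
else
  echo "Rolling back..."
  kubectl rollout undo "deployment/$DEPLOYMENT" -n "$NAMESPACE"
fi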
```

## 31.7 Deployment Status and Conditions

Deployments expose conditions that reflect their current state, essential for automated tooling and monitoring.

### Condition Types

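- **Available**: `True` when at least the minimum number of replicas (desired count minus `maxUnavailable`) has been Ready for `minReadySeconds`
- **Progressing**: `True` while a rollout is under way and after it completes (reason `NewReplicaSetAvailable`); `False` with reason `ProgressDeadlineExceeded` when the rollout stalls
- **ReplicaFailure**: `True` when pods cannot be created or deleted, for example because a resource quota is exhausted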

```yaml
status:
  conditions:
  - type: Available
    status: "True"
    lastUpdateTime: "2024-01-15T10:30:00Z"
    reason: MinimumReplicasAvailable
    message: Deployment has minimum availability.
  
  - type: Progressing
    status: "True"
    lastUpdateTime: "2024-01-15T10:35:00Z"
    reason: NewReplicaSetAvailable
    message: ReplicaSet "payment-service-7c9b8f4d5" has successfully progressed.
```

### Progress Deadline

```yaml
spec:
  progressDeadlineSeconds: 600  # 10 minutes
```

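If the rollout makes no progress within this window (for example, new pods never become ready), the `Progressing` condition changes to `status: "False"` with reason `ProgressDeadlineExceeded`. Kubernetes only reports the stall; it does not roll back on its own: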

```bash
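# Check for stuck deployments (prints namespace/name; tolerates Deployments without conditions)
kubectl get deployments --all-namespaces -o json | \
  jq -r '.items[] | select(any(.status.conditions[]?; .reason == "ProgressDeadlineExceeded")) | "\(.metadata.namespace)/\(.metadata.name)"'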
```
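
For automation that does not have `jq` available, the same check can be done on plain text. The sketch below is an assumption-laden illustration: `rollout_state` is a hypothetical helper, and it expects one condition per line in the form "TYPE STATUS REASON".

```bash
# rollout_state
# Reads "TYPE STATUS REASON" lines on stdin and prints a one-word summary.
rollout_state() {
  awk '
    $1 != "Progressing" { next }                      # only this condition matters here
    { state = "progressing" }
    $3 == "ProgressDeadlineExceeded" { state = "stuck" }
    $3 == "NewReplicaSetAvailable" && $2 == "True" { state = "complete" }
    END { print (state == "" ? "unknown" : state) }   # no Progressing condition seen
  '
}

printf '%s\n' \
  "Available True MinimumReplicasAvailable" \
  "Progressing False ProgressDeadlineExceeded" | rollout_state
```

Input in this shape can be produced with `kubectl get deployment payment-service -o jsonpath='{range .status.conditions[*]}{.type} {.status} {.reason}{"\n"}{end}'` and piped into the function, which lets a CI job fail fast on `stuck` instead of waiting for a timeout.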

## 31.8 Advanced Deployment Controllers

Native Deployments support only RollingUpdate and Recreate strategies. For Blue/Green, Canary, or A/B testing, advanced controllers are required.

### Argo Rollouts

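Argo Rollouts provides a `Rollout` custom resource that stands in for a Deployment and adds canary and blue-green strategies. The manifest below shows only the strategy; a complete Rollout also needs a `selector` and a pod `template` (or a `workloadRef`), as a Deployment does: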

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: payment-service
spec:
  replicas: 5
  strategy:
    canary:
      canaryService: payment-service-canary  # Separate service for canary
      stableService: payment-service-stable  # Stable service
      trafficRouting:
        istio:
          virtualService:
            name: payment-service-vs
            routes:
            - primary
      steps:
      - setWeight: 10
      - pause: {duration: 2m}  # Wait 2 minutes
      - setWeight: 25
      - analysis:
          templates:
          - templateName: success-rate
          args:
          - name: service-name
            value: payment-service
      - setWeight: 50
      - pause: {duration: 2m}
      - setWeight: 100
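  analysis:  # Spec-level setting: how many successful AnalysisRuns to keep
    successfulRunHistoryLimit: 10
    # Check intervals and failure limits are defined per metric in the AnalysisTemplate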
```

**Key Features:**
- Automated metric analysis (Prometheus, Datadog, CloudWatch)
- Automated promotion/rollback based on SLOs
- Blue/Green and Canary strategies
- Integration with Ingress controllers and Service Meshes

### Flagger

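Flagger leaves the Deployment in place and drives the canary through a `Canary` resource that references it, shifting traffic in steps and checking metrics (Prometheus for the built-in checks) at each interval: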

```yaml
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: payment-service
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: payment-service
  service:
    port: 80
    targetPort: 8080
    gateways:
    - istio-gateway
    hosts:
    - api.company.com
  analysis:
    interval: 30s
    threshold: 5
    maxWeight: 50
    stepWeight: 10
    metrics:
    - name: request-success-rate
      thresholdRange:
        min: 99
      interval: 1m
    - name: request-duration
      thresholdRange:
        max: 500
      interval: 1m
    webhooks:
    - name: load-test
      url: http://flagger-loadtester.test/
      timeout: 5s
      metadata:
        cmd: "hey -z 1m -q 10 -c 2 http://payment-service-canary/"
```
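
The `interval`, `stepWeight`, `maxWeight`, and `threshold` values together determine how long a canary takes. The sketch below assumes the rule of thumb from Flagger's documentation: minimum promotion time is `interval * (maxWeight / stepWeight)`, and a failing canary is rolled back after `interval * threshold`. The function name `canary_timing` is hypothetical.

```bash
# canary_timing INTERVAL_SECONDS MAX_WEIGHT STEP_WEIGHT THRESHOLD
# Prints the traffic weights the canary passes through and the resulting timings.
canary_timing() {
  local interval=$1 max_weight=$2 step_weight=$3 threshold=$4
  local weight=$step_weight weights=""
  while [ "$weight" -le "$max_weight" ]; do
    weights="$weights $weight"
    weight=$(( weight + step_weight ))
  done
  echo "weights:$weights"
  echo "min promotion time: $(( interval * max_weight / step_weight ))s"
  echo "rollback after failures: $(( interval * threshold ))s"
}

# Values from the Canary above: interval 30s, maxWeight 50, stepWeight 10, threshold 5
canary_timing 30 50 10 5
```

For the manifest above, the canary steps through 10, 20, 30, 40, and 50 percent and needs at least 150 seconds before promotion, so shortening `interval` or raising `stepWeight` speeds up releases at the cost of less observation time per step.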

## 31.9 Integration with Service Meshes

Service meshes enable fine-grained traffic splitting beyond what Kubernetes Services provide.

### Istio Integration

```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: payment-service
spec:
  hosts:
  - payment-service.production.svc.cluster.local
  http:
  - match:
    - headers:
        end-user:
          exact: "canary-tester"
    route:
    - destination:
        host: payment-service
        subset: v2
      weight: 100
  - route:
    - destination:
        host: payment-service
        subset: v1
      weight: 90
    - destination:
        host: payment-service
        subset: v2
      weight: 10
---
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: payment-service
spec:
  host: payment-service
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100
      http:
        http1MaxPendingRequests: 50
    outlierDetection:
      consecutive5xxErrors: 5  # Replaces the deprecated consecutiveErrors field
      interval: 30s
      baseEjectionTime: 30s
```

### Linkerd Integration

```yaml
apiVersion: split.smi-spec.io/v1alpha2  # Requires the linkerd-smi extension on recent Linkerd versions
kind: TrafficSplit
metadata:
  name: payment-service-canary
spec:
  service: payment-service
  backends:
  - service: payment-service-v1
    weight: 90
  - service: payment-service-v2
    weight: 10
```

---

## Chapter Summary and Preview

In this chapter, we examined the Kubernetes Deployment controller as the foundational mechanism for stateless application management. We detailed the hierarchical relationship where Deployments manage ReplicaSets, which in turn manage Pods, enabling the immutable infrastructure pattern central to Kubernetes operations. The RollingUpdate strategy configuration through `maxSurge` and `maxUnavailable` parameters allows fine-tuning of the availability-vs-speed trade-off, with zero-downtime deployments requiring careful probe configuration—particularly readiness probes that prevent premature traffic routing to initializing containers. Revision history retention enables rapid rollbacks to previous stable states, though operators must remember that Kubernetes rollbacks affect only application code, not database schemas, requiring backward-compatible migration strategies. We explored the pause/resume functionality for manual canary validation and staged rollouts, alongside the progress deadline mechanism that detects stuck deployments. For advanced patterns beyond native capabilities, we introduced Argo Rollouts and Flagger as controllers enabling automated canary analysis and Blue/Green deployments, and demonstrated service mesh integration with Istio and Linkerd for sophisticated traffic splitting and resilience patterns.

**Key Takeaways:**
- Configure readiness probes correctly—without them, Kubernetes considers pods ready immediately on container start, causing 502 errors during rolling updates as traffic routes to initializing applications
- Set `maxUnavailable: 0` for critical services to ensure zero-downtime deployments, accepting the infrastructure cost of `maxSurge` capacity requirements
- Retain revision history (default 10) for rapid rollback capability, but monitor etcd storage usage in large clusters with frequent deployments
- Use `kubectl rollout pause` to implement manual canary releases with native Deployments, validating a single new pod before permitting full rollout
- Deployments only manage stateless applications; for stateful workloads requiring ordered deployment, use StatefulSets (Chapter 16)
- Integrate Argo Rollouts or Flagger for production canary deployments requiring automated metric-based promotion/rollback rather than manual verification

**Next Chapter Preview:**
Chapter 32: Helm - Kubernetes Package Manager introduces templating and packaging for Kubernetes manifests. We will explore Helm charts as reusable, configurable deployment packages, examining values files for environment-specific configuration, chart dependencies for complex application stacks, and Helm hooks for pre/post-deployment automation. This chapter bridges the gap between raw Kubernetes manifests and production-grade deployment automation, enabling the sophisticated deployment patterns described in this chapter to be packaged, versioned, and distributed as reusable components across multiple environments and teams.