K8s ‐ Verify statefulsets readiness

yogeshahiray edited this page Apr 16, 2026 · 1 revision

K8s - Verify all statefulsets are ready

Description

This rule validates that every StatefulSet in every namespace is ready, meaning the number of ready replicas equals the desired replica count. StatefulSets are used for stateful applications that require stable network identities, persistent storage, and ordered deployment and scaling.

Prerequisites

  • Access to the OpenShift cluster with permissions to list StatefulSets across all namespaces
  • The oc command-line tool configured and authenticated
  • The jq command-line tool, used by the JSON filter in the diagnostics

Impact

StatefulSets that are not ready can lead to:

  • Data unavailability: Databases and storage systems may be inaccessible
  • Application failures: Services depending on stateful components will fail
  • Lost monitoring/alerting: Prometheus, Alertmanager, and other monitoring tools may be down
  • Cluster instability: Critical infrastructure components may be affected
  • Data corruption risk: Replicas left incomplete during an update can put data integrity at risk
  • Split-brain scenarios: Distributed systems without sufficient replicas may experience inconsistent state

StatefulSets often manage critical infrastructure components, making their health essential for overall cluster stability.

Root Cause

Common scenarios that may cause StatefulSets to not be ready include:

  1. Persistent storage issues

    • PersistentVolumeClaim (PVC) binding failures
    • StorageClass unavailability or misconfiguration
    • Insufficient storage capacity
    • PersistentVolume provisioning failures
    • Volume attachment errors
    • Storage backend (Ceph, NFS, etc.) unavailability
  2. Ordered startup failures

    • Previous pod not ready, blocking next pod creation
    • Init container failures in sequential pods
    • Long initialization times exceeding timeouts
    • Dependencies between pods not being met
  3. Resource constraints

    • Insufficient CPU or memory on nodes
    • Resource quotas preventing pod creation
    • Node capacity exhausted
    • Resource limits too low for stateful application
  4. Pod failures

    • Application crashes during startup
    • Failed readiness or liveness probes
    • Database initialization failures
    • Configuration errors preventing startup
    • Data migration or recovery issues
  5. Image-related issues

    • Image pull failures
    • Missing or deleted container images
    • Registry authentication problems
    • Wrong image tags or versions
  6. Network issues

    • Headless service misconfiguration
    • DNS resolution failures for pod identity
    • Network policy blocking pod-to-pod communication
    • Port conflicts or binding failures
  7. StatefulSet update issues

    • Rolling update stuck on a failing pod
    • Pod disruption budgets preventing updates
    • Partition updates not progressing
    • OnDelete update strategy in use, but the pods have not been deleted manually

Diagnostics

1. Identify StatefulSets that are not ready

# List all StatefulSets with replica counts
oc get statefulsets --all-namespaces -o wide

# Find StatefulSets where ready != desired
# (.status.readyReplicas is omitted when no pod is ready, so default it to 0)
oc get statefulsets --all-namespaces -o json | jq '.items[] | select((.status.readyReplicas // 0) != .spec.replicas) | {namespace: .metadata.namespace, name: .metadata.name, desired: .spec.replicas, ready: (.status.readyReplicas // 0), current: .status.currentReplicas, updated: .status.updatedReplicas}'
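
Where jq is not installed, roughly the same check can be sketched with custom-columns output and awk. This is a minimal sketch, not part of the original rule: the function name not_ready_statefulsets is illustrative, and it assumes oc is already logged in.

```shell
# Print every StatefulSet whose ready replica count differs from the desired count.
# Illustrative helper; assumes `oc` is configured and authenticated.
not_ready_statefulsets() {
  oc get statefulsets --all-namespaces --no-headers \
    -o custom-columns='NS:.metadata.namespace,NAME:.metadata.name,DESIRED:.spec.replicas,READY:.status.readyReplicas' |
    awk '{
      # custom-columns prints "<none>" when .status.readyReplicas is unset (no ready pods)
      ready = ($4 == "<none>") ? 0 : $4 + 0
      if (ready != $3 + 0) print $1 "/" $2 ": " ready "/" $3 " ready"
    }'
}
```

Each output line has the form namespace/name: ready/desired ready. No output means every StatefulSet matched its desired count.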

2. Describe the problematic StatefulSet

# Replace <namespace> and <statefulset-name> with actual values
oc describe statefulset <statefulset-name> -n <namespace>

Look for:

  • Replicas section showing desired/current/ready counts
  • Update strategy and partition information
  • Pod Management Policy
  • Events showing errors or warnings
  • Volume claim templates

3. Check StatefulSet pods in order

# List pods for the StatefulSet (named <statefulset-name>-0, <statefulset-name>-1, etc.)
# Note: sorting by name is lexicographic, so ordinal 10 is listed before ordinal 2
oc get pods -n <namespace> -l app=<statefulset-label> --sort-by=.metadata.name

# Check which specific pod is not ready
oc get pods -n <namespace> -l app=<statefulset-label> -o wide

4. Investigate the first non-ready pod

# StatefulSets create pods in order, so check the lowest ordinal that's not ready
# Describe the problematic pod
oc describe pod <statefulset-name-0> -n <namespace>

# Check pod logs
oc logs <statefulset-name-0> -n <namespace>

# Check previous logs if crashed
oc logs <statefulset-name-0> -n <namespace> --previous
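
Finding the lowest non-ready ordinal by eye gets error-prone once a StatefulSet has ten or more pods, because names sort lexicographically. The following is a hedged sketch that compares ordinals numerically. The function name first_unready_pod is hypothetical, and it assumes pods follow the standard <statefulset-name>-<ordinal> naming.

```shell
# Print the lowest-ordinal pod of a StatefulSet whose Ready condition is not "True".
# $1 = namespace, $2 = StatefulSet name. Illustrative helper, not part of the rule.
first_unready_pod() {
  oc get pods -n "$1" --no-headers \
    -o custom-columns='NAME:.metadata.name,READY:.status.conditions[?(@.type=="Ready")].status' |
    awk -v prefix="$2-" '
      index($1, prefix) == 1 {
        ord = substr($1, length(prefix) + 1)
        # Keep only names of the form <prefix><number>, and track the smallest unready ordinal
        if (ord ~ /^[0-9]+$/ && $2 != "True" && (best == "" || ord + 0 < best + 0)) best = ord
      }
      END { if (best != "") print prefix best }'
}
```

The printed pod name can then be passed to the oc describe pod and oc logs commands in this step. Nothing is printed when every pod is ready.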

5. Check PersistentVolumeClaim status

# List PVCs for the StatefulSet
oc get pvc -n <namespace> -l app=<statefulset-label>

# Describe problematic PVCs
oc describe pvc <pvc-name> -n <namespace>

# Check if PV is bound
oc get pv | grep <namespace>
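
To narrow a long PVC listing down to the claims that need attention, the phase can be filtered directly. A minimal sketch, assuming oc is logged in. The function name unbound_pvcs is illustrative.

```shell
# List PVCs in a namespace whose phase is anything other than Bound.
# $1 = namespace. Illustrative helper, not part of the rule.
unbound_pvcs() {
  oc get pvc -n "$1" --no-headers \
    -o custom-columns='NAME:.metadata.name,PHASE:.status.phase' |
    awk '$2 != "Bound" { print $1 " is " $2 }'
}
```

Any claim it prints (for example one in the Pending phase) is a candidate for the oc describe pvc command above.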

6. Check storage provisioning

# Check StorageClass availability
oc get storageclass

# Describe the StorageClass used by the StatefulSet
oc describe storageclass <storage-class-name>

# Check PV provisioner events
oc get events -n <namespace> --sort-by='.lastTimestamp' | grep -i "persistentvolume\|pvc"

7. Check headless service

# StatefulSets require a headless service for pod identity
# Get the service name from the StatefulSet
oc get statefulset <statefulset-name> -n <namespace> -o jsonpath='{.spec.serviceName}'

# Verify the headless service exists
oc get service <service-name> -n <namespace>

# Ensure it's headless (ClusterIP should be None)
oc get service <service-name> -n <namespace> -o jsonpath='{.spec.clusterIP}'
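
The three checks in this step can be chained so that the service name never has to be copied by hand. This is a sketch under the assumption that oc is logged in. The function name check_headless_service is hypothetical.

```shell
# Look up the StatefulSet's governing service and report whether it is headless.
# $1 = namespace, $2 = StatefulSet name. Illustrative helper, not part of the rule.
check_headless_service() {
  svc=$(oc get statefulset "$2" -n "$1" -o jsonpath='{.spec.serviceName}') || return 1
  ip=$(oc get service "$svc" -n "$1" -o jsonpath='{.spec.clusterIP}') || {
    echo "service $svc not found in $1"
    return 1
  }
  if [ "$ip" = "None" ]; then
    echo "service $svc is headless"
  else
    echo "service $svc is not headless (clusterIP=$ip)"
    return 1
  fi
}
```

A non-zero exit status means the service is missing or has a cluster IP assigned.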

Solution

General troubleshooting steps:

  1. For PVC/storage binding issues:

    # Check PVC status
    oc get pvc -n <namespace>
    
    # If PVC is pending, check events
    oc describe pvc <pvc-name> -n <namespace>
    
    # Verify the StorageClass exists
    oc get storageclass
    # If the PVC relies on a default StorageClass and none is set, mark one as default
    oc patch storageclass <storage-class-name> -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
    
    # If PV is not available, check storage backend
    # For manual provisioning, create PV
    oc create -f persistentvolume.yaml
    
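    # Delete the pending PVC so it can be recreated (use with caution)
    # The StatefulSet controller creates missing PVCs when it creates the pod, so the stuck pod may also need to be deleted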
    oc delete pvc <pvc-name> -n <namespace>
  2. For pod startup failures:

    # Check pod logs for errors
    oc logs <statefulset-name-0> -n <namespace>
    
    # If init container failed, check its logs
    oc logs <statefulset-name-0> -n <namespace> -c <init-container-name>
    
    # Check readiness probe configuration
    oc get statefulset <statefulset-name> -n <namespace> -o yaml | grep -A 10 readinessProbe
    
    # Adjust readiness probe timing if needed
    oc edit statefulset <statefulset-name> -n <namespace>
    # Increase initialDelaySeconds, periodSeconds, or failureThreshold
  3. For resource constraint issues:

    # Check resource requests and limits
    oc get statefulset <statefulset-name> -n <namespace> -o yaml | grep -A 10 resources
    
    # Check available node resources
    oc describe nodes | grep -A 5 "Allocated resources"
    
    # Adjust resource requests if too high
    oc edit statefulset <statefulset-name> -n <namespace>
    
    # Or increase namespace quota
    oc edit resourcequota <quota-name> -n <namespace>
  4. For headless service issues:

    # Verify headless service exists
    oc get service <service-name> -n <namespace>
    
    # Create headless service if missing
    cat <<EOF | oc apply -f -
    apiVersion: v1
    kind: Service
    metadata:
      name: <service-name>
      namespace: <namespace>
    spec:
      clusterIP: None
      selector:
        app: <statefulset-label>
      ports:
      - port: <port>
        name: <port-name>
    EOF
  5. For stuck rolling updates:

    # Check update strategy
    oc get statefulset <statefulset-name> -n <namespace> -o jsonpath='{.spec.updateStrategy}'
    
    # If using OnDelete strategy, manually delete pods in reverse order
    # Get highest ordinal first
    oc get pods -n <namespace> -l app=<statefulset-label> --sort-by=.metadata.name
    oc delete pod <statefulset-name-2> -n <namespace>
    
    # Wait for it to be recreated and ready before deleting the next
    oc wait --for=condition=ready pod/<statefulset-name-2> -n <namespace> --timeout=300s
    
    # If RollingUpdate is stuck, check if partition is set
    oc get statefulset <statefulset-name> -n <namespace> -o jsonpath='{.spec.updateStrategy.rollingUpdate.partition}'
    
    # Remove or adjust partition to allow updates
    oc patch statefulset <statefulset-name> -n <namespace> -p '{"spec":{"updateStrategy":{"rollingUpdate":{"partition":0}}}}'
  6. For image pull errors:

    # Check the image being used
    oc get statefulset <statefulset-name> -n <namespace> -o jsonpath='{.spec.template.spec.containers[*].image}'
    
    # Verify image pull secrets
    oc get secrets -n <namespace> | grep docker
    
    # Update to correct image
    oc set image statefulset/<statefulset-name> <container-name>=<image>:<tag> -n <namespace>
  7. For ordered startup issues:

    # StatefulSets create pods in order (0, 1, 2, ...)
    # Identify which pod is blocking the sequence
    oc get pods -n <namespace> -l app=<statefulset-label> --sort-by=.metadata.name
    
    # Fix or delete the blocking pod
    oc delete pod <statefulset-name-N> -n <namespace>
    
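    # podManagementPolicy is immutable, so it cannot be patched on an existing StatefulSet.
    # If order doesn't matter, recreate the StatefulSet with podManagementPolicy: Parallel
    # while leaving its pods and PVCs in place (statefulset-parallel.yaml is a placeholder file name)
    oc delete statefulset <statefulset-name> -n <namespace> --cascade=orphan
    oc apply -f statefulset-parallel.yaml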

Force recreation of StatefulSet pods (use with caution):

# Delete pods in reverse order (highest ordinal first)
# This maintains StatefulSet guarantees
oc delete pod <statefulset-name-2> -n <namespace>

# Wait for pod to be recreated and ready
oc wait --for=condition=ready pod/<statefulset-name-2> -n <namespace> --timeout=300s

# Continue with next pod
oc delete pod <statefulset-name-1> -n <namespace>
oc wait --for=condition=ready pod/<statefulset-name-1> -n <namespace> --timeout=300s

# Finally the first pod
oc delete pod <statefulset-name-0> -n <namespace>
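
For more than a few replicas, the delete-and-wait sequence above can be sketched as a loop. This is an illustrative helper, not part of the original rule. The function name restart_pods_in_reverse and the 60-second recreation window are assumptions. The loop also polls for the recreated pod before waiting on it, because oc wait fails if the pod does not exist yet.

```shell
# Delete StatefulSet pods one at a time, highest ordinal first, waiting for each
# replacement to become Ready. $1 = namespace, $2 = StatefulSet name, $3 = replica count.
# Uses the global variables i, pod and tries (plain POSIX sh, no `local`).
restart_pods_in_reverse() {
  i=$(( $3 - 1 ))
  while [ "$i" -ge 0 ]; do
    pod="$2-$i"
    oc delete pod "$pod" -n "$1" || return 1
    # Poll (up to roughly 60s, an assumed limit) until the controller has recreated the pod
    tries=0
    until oc get pod "$pod" -n "$1" >/dev/null 2>&1; do
      tries=$(( tries + 1 ))
      if [ "$tries" -ge 30 ]; then return 1; fi
      sleep 2
    done
    oc wait --for=condition=Ready "pod/$pod" -n "$1" --timeout=300s || return 1
    i=$(( i - 1 ))
  done
}
```

For a three-replica StatefulSet this would be called as restart_pods_in_reverse <namespace> <statefulset-name> 3. It stops at the first pod that fails to come back, leaving the lower ordinals untouched.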

Complete StatefulSet restart (use with extreme caution):

# This should only be done for non-production or after proper backup
# Scale down to 0
oc scale statefulset <statefulset-name> --replicas=0 -n <namespace>

# Wait for all pods to terminate
oc get pods -n <namespace> -l app=<statefulset-label> -w

# Scale back up
oc scale statefulset <statefulset-name> --replicas=<desired-count> -n <namespace>

Verify the fix:

# Check StatefulSet status
oc get statefulset <statefulset-name> -n <namespace>

# Verify the READY column, which is shown as ready/desired, has matching numbers
# Example: 3/3 means all 3 replicas are ready

# Check all pods are running and ready
oc get pods -n <namespace> -l app=<statefulset-label> --sort-by=.metadata.name

# Verify replica counts match
oc get statefulset <statefulset-name> -n <namespace> -o jsonpath='{.spec.replicas}{" desired, "}{.status.readyReplicas}{" ready, "}{.status.currentReplicas}{" current, "}{.status.updatedReplicas}{" updated\n"}'

# Verify PVCs are all bound
oc get pvc -n <namespace> -l app=<statefulset-label>

# For critical stateful apps, verify application-level health
# Example for a database:
oc exec <statefulset-name-0> -n <namespace> -- <health-check-command>
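
Pods can take a while to become ready after a fix, so the replica comparison is often repeated until it passes. The following is a minimal polling sketch. The function name wait_statefulset_ready and its attempt and interval parameters are assumptions for illustration.

```shell
# Poll until ready replicas equal desired replicas, or give up after a number of attempts.
# $1 = namespace, $2 = StatefulSet name, $3 = max attempts, $4 = seconds between attempts.
# Illustrative helper, not part of the rule.
wait_statefulset_ready() {
  attempt=0
  while [ "$attempt" -lt "$3" ]; do
    counts=$(oc get statefulset "$2" -n "$1" \
      -o jsonpath='{.spec.replicas} {.status.readyReplicas}') || return 1
    desired=${counts%% *}
    ready=${counts#* }
    # .status.readyReplicas is omitted while no pod is ready; treat that as 0
    if [ "${ready:-0}" -eq "$desired" ]; then return 0; fi
    attempt=$(( attempt + 1 ))
    sleep "$4"
  done
  return 1
}
```

For example, wait_statefulset_ready <namespace> <statefulset-name> 60 5 checks every 5 seconds for up to 5 minutes and returns non-zero if the counts never match.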

Resources
