K8s - Verify internal image registry

K8s - Verify internal image registry is configured and available

Description

This rule validates that the OpenShift internal image registry is properly configured and available for use. The internal image registry is a built-in container image registry that runs within the OpenShift cluster, allowing users to push and pull container images without requiring an external registry.

The rule performs a two-step validation:

1. Prerequisite check: verifies that the image registry management state is set to "Managed" (not "Removed" or "Unmanaged").
2. Availability check: if the registry is managed, ensures that all registry pods in the openshift-image-registry namespace are running and ready.
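The two steps above can be sketched as a pair of shell functions. This is an illustrative sketch, not the rule's actual implementation: the function names and return codes are assumptions, and it relies on the docker-registry=default pod label used in the Diagnostics section.

```shell
# Sketch of the two-step validation. Function names are hypothetical.
NS=openshift-image-registry

# Step 1: prerequisite check. Succeeds only when managementState is "Managed".
registry_is_managed() {
    state=$(oc get config.imageregistry.operator.openshift.io cluster \
        -o jsonpath='{.spec.managementState}')
    [ "$state" = "Managed" ]
}

# Step 2: availability check. Reads one "<phase> <ready flags>" line per pod
# and fails if no pods exist or any pod is not Running with all containers ready.
registry_pods_ready() {
    pods=$(oc get pods -n "$NS" -l docker-registry=default \
        -o jsonpath='{range .items[*]}{.status.phase}{" "}{.status.containerStatuses[*].ready}{"\n"}{end}')
    [ -n "$pods" ] || return 1
    echo "$pods" | while read -r phase ready; do
        [ "$phase" = "Running" ] || exit 1
        [ -n "$ready" ] || exit 1
        case " $ready " in *" false "*) exit 1 ;; esac
    done
}

# Combine both steps. Return codes (assumed): 0 = available,
# 1 = pods not ready, 2 = prerequisite not met.
check_internal_registry() {
    if ! registry_is_managed; then
        echo "prerequisite not met: managementState is not Managed"
        return 2
    fi
    if registry_pods_ready; then
        echo "registry is available"
    else
        echo "registry pods are not all Running and ready"
        return 1
    fi
}
```

Calling check_internal_registry after logging in with oc prints a one-line verdict; the separate return code for the unmet prerequisite lets a caller distinguish "intentionally removed" from "broken".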

The internal registry is a critical component for CI/CD pipelines, image builds, and development workflows within OpenShift.

Prerequisites

- The oc command-line tool, installed and authenticated against the cluster
- Permission to view the image registry configuration and the pods in the openshift-image-registry namespace
- Cluster administrator privileges for registry configuration changes

Impact

An unavailable or misconfigured internal image registry can lead to:

- Deployment failures: applications configured to pull from the internal registry cannot deploy
- Template deployment failures: templates that reference internal registry images cannot be instantiated

Root Cause

Common scenarios that may cause the internal registry to be unavailable include:

Management state configuration
- Registry intentionally set to "Removed" state
- Registry set to "Unmanaged" state (for external registry use)
- Fresh cluster installation on a platform that provides no shareable object storage, where the Image Registry Operator bootstraps itself as "Removed"
- Registry disabled during cluster upgrade or maintenance
Storage configuration issues
- No storage backend configured for registry
- PersistentVolumeClaim (PVC) binding failures
- StorageClass unavailable or misconfigured
- Insufficient storage capacity
- Storage backend (S3, Azure Blob, GCS, Swift, etc.) unavailable
- Incorrect storage credentials or permissions
Pod failures
- Registry pods crashing or failing to start
- Failed readiness or liveness probes
- Container image pull failures
- Insufficient node resources to schedule registry pods
- Configuration errors in registry deployment
Resource constraints
- Insufficient CPU or memory on nodes
- Registry pods evicted due to resource pressure
- Node capacity exhausted
- Resource quotas blocking pod creation

Diagnostics

1. Check image registry management state

# Get image registry configuration
oc get config.imageregistry.operator.openshift.io cluster -o yaml

# Check management state specifically
oc get config.imageregistry.operator.openshift.io cluster -o jsonpath='{.spec.managementState}'

# Should return: Managed
# Other values: Removed, Unmanaged

2. Check registry pod status

# List all pods in the registry namespace
oc get pods -n openshift-image-registry

# Look for image-registry pods specifically
oc get pods -n openshift-image-registry -l docker-registry=default

# Check pod details
oc describe pod -n openshift-image-registry -l docker-registry=default
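When pods are crash-looping, the restart count is often the quickest signal. The following is a minimal sketch that lists registry pods whose first container has restarted more than a given number of times; the function name and the default threshold of 3 are assumptions chosen for illustration.

```shell
# Print registry pods whose first container restarted more than $1 times
# (default threshold: 3). Function name and threshold are hypothetical.
registry_restarting_pods() {
    threshold=${1:-3}
    oc get pods -n openshift-image-registry -l docker-registry=default \
        -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.containerStatuses[0].restartCount}{"\n"}{end}' |
        awk -v t="$threshold" '$2 + 0 > t { print $1 " restarts=" $2 }'
}
```

An empty result means no pod exceeded the threshold. Any pod that is listed is a good candidate for oc describe and oc logs.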

3. Check registry pod logs

# View logs from registry pods
oc logs -n openshift-image-registry deployment/image-registry

# Check for errors
oc logs -n openshift-image-registry deployment/image-registry | grep -iE "error|fail|fatal"

# Follow logs in real-time
oc logs -n openshift-image-registry deployment/image-registry -f

4. Verify registry configuration

# Get complete registry configuration
oc get config.imageregistry.operator.openshift.io cluster -o yaml

# Check replica count
oc get config.imageregistry.operator.openshift.io cluster -o jsonpath='{.spec.replicas}'

# Check rollout strategy
oc get config.imageregistry.operator.openshift.io cluster -o jsonpath='{.spec.rolloutStrategy}'
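The replica count in the operator configuration is only the desired state. A possible way to compare it with what is actually running is sketched below; the function name is an assumption, and the deployment name image-registry is taken from the log commands in this page.

```shell
# Compare the desired replica count from the operator config with the
# number of ready replicas reported by the deployment.
# The function name is hypothetical.
registry_replicas_match() {
    want=$(oc get config.imageregistry.operator.openshift.io cluster \
        -o jsonpath='{.spec.replicas}')
    have=$(oc get deployment image-registry -n openshift-image-registry \
        -o jsonpath='{.status.readyReplicas}')
    # readyReplicas is omitted from the status when it is zero.
    have=${have:-0}
    echo "desired=${want:-unset} ready=$have"
    [ -n "$want" ] && [ "$want" -eq "$have" ]
}
```

A non-zero exit status indicates that fewer (or more) pods are ready than the configuration asks for, which usually points at the storage or scheduling causes listed under Root Cause.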

Solution

General troubleshooting steps:

For registry in Removed or Unmanaged state:

# Check current management state
oc get config.imageregistry.operator.openshift.io cluster -o jsonpath='{.spec.managementState}'

# Set management state to Managed
oc patch config.imageregistry.operator.openshift.io cluster --type merge -p '{"spec":{"managementState":"Managed"}}'

# Verify the change
oc get config.imageregistry.operator.openshift.io cluster -o jsonpath='{.spec.managementState}'
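For scripted remediation, the check and the patch can be combined so that the patch is applied only when needed. This is a sketch under the assumption that switching to "Managed" is the intended outcome; the function name is hypothetical.

```shell
# Patch managementState to "Managed" only if it is not already set.
# The function name is hypothetical.
ensure_registry_managed() {
    current=$(oc get config.imageregistry.operator.openshift.io cluster \
        -o jsonpath='{.spec.managementState}')
    if [ "$current" = "Managed" ]; then
        echo "already Managed; nothing to do"
        return 0
    fi
    echo "managementState is '${current}'; patching to Managed"
    oc patch config.imageregistry.operator.openshift.io cluster \
        --type merge -p '{"spec":{"managementState":"Managed"}}'
}
```

If the registry was deliberately set to "Removed" or "Unmanaged" (for example because an external registry is in use), confirm with the cluster owner before running this.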

For missing storage configuration:

# For empty storage configuration, set storage backend

# Option 1: EmptyDir (for testing only, not for production)
oc patch config.imageregistry.operator.openshift.io cluster --type merge -p '{"spec":{"storage":{"emptyDir":{}}}}'

# Option 2: PVC (recommended for on-premises); an empty claim lets the operator create the image-registry-storage PVC
oc patch config.imageregistry.operator.openshift.io cluster --type merge -p '{"spec":{"storage":{"pvc":{"claim":""}}}}'

# Option 3: S3 (for AWS)
oc patch config.imageregistry.operator.openshift.io cluster --type merge -p '{"spec":{"storage":{"s3":{"bucket":"my-registry-bucket","region":"us-east-1"}}}}'

# Option 4: Azure Blob
oc patch config.imageregistry.operator.openshift.io cluster --type merge -p '{"spec":{"storage":{"azure":{"accountName":"myaccount","container":"registry"}}}}'

# Option 5: GCS (for Google Cloud)
oc patch config.imageregistry.operator.openshift.io cluster --type merge -p '{"spec":{"storage":{"gcs":{"bucket":"my-registry-bucket"}}}}'

For PVC binding issues:

# Check PVC status
oc get pvc -n openshift-image-registry

# If PVC is pending, check StorageClass
oc get storageclass

# Set default StorageClass if needed
oc patch storageclass <storage-class-name> -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

# Delete the PVC only if it is corrupted; stored images may be lost, depending on the volume's reclaim policy
oc delete pvc -n openshift-image-registry image-registry-storage

# Manually create PVC with specific StorageClass
cat <<EOF | oc apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: image-registry-storage
  namespace: openshift-image-registry
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 100Gi
  storageClassName: <storage-class-name>
EOF
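After creating the claim, it helps to wait until it is actually bound before restarting anything. The polling loop below is a minimal sketch; the function name, the 120-second default timeout, and the 5-second interval are assumptions.

```shell
# Poll the registry PVC until its phase is "Bound" or the timeout expires.
# Usage: wait_for_pvc_bound [timeout_seconds] [interval_seconds]
# The function name and defaults are hypothetical.
wait_for_pvc_bound() {
    timeout=${1:-120}
    interval=${2:-5}
    elapsed=0
    phase=""
    while [ "$elapsed" -lt "$timeout" ]; do
        phase=$(oc get pvc image-registry-storage -n openshift-image-registry \
            -o jsonpath='{.status.phase}' 2>/dev/null)
        if [ "$phase" = "Bound" ]; then
            echo "PVC is Bound"
            return 0
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    echo "PVC still '${phase:-missing}' after ${timeout}s" >&2
    return 1
}
```

Note that a StorageClass with volumeBindingMode WaitForFirstConsumer keeps the claim Pending until a pod that uses it is scheduled, so a timeout here is not always an error.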

For pod failures:

# Check pod logs for errors
oc logs -n openshift-image-registry deployment/image-registry

# Describe pods to see events
oc describe pod -n openshift-image-registry -l docker-registry=default

# Delete problematic pods to force recreation
oc delete pod -n openshift-image-registry -l docker-registry=default

# Wait for new pods to start
oc get pods -n openshift-image-registry -w
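The watch command above runs until interrupted, which is inconvenient in scripts. One alternative, sketched here, is to delete the pods and then block on oc rollout status with a timeout; the function name and the 300s default are assumptions.

```shell
# Delete the registry pods, then wait (bounded) for the deployment to
# report all replicas available again.
# Usage: restart_registry_pods [timeout, e.g. 300s]
# The function name and default timeout are hypothetical.
restart_registry_pods() {
    oc delete pod -n openshift-image-registry -l docker-registry=default || return 1
    oc rollout status deployment/image-registry \
        -n openshift-image-registry --timeout="${1:-300s}"
}
```

A non-zero exit status means the replacement pods did not become available in time, in which case the pod events and logs are the next place to look.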

Complete registry reconfiguration:

# For a fresh start, remove and recreate registry configuration

# 1. Set to Removed
oc patch config.imageregistry.operator.openshift.io cluster --type merge -p '{"spec":{"managementState":"Removed"}}'

# 2. Wait for pods to be deleted
oc get pods -n openshift-image-registry -w

# 3. Configure storage and set to Managed
oc patch config.imageregistry.operator.openshift.io cluster --type merge -p '{"spec":{"managementState":"Managed","storage":{"pvc":{"claim":""}}}}'

# 4. Verify registry comes up
oc get pods -n openshift-image-registry -w

Verify the fix:

# Check management state
oc get config.imageregistry.operator.openshift.io cluster -o jsonpath='{.spec.managementState}'
# Should return: Managed

# Check operator status
oc get clusteroperator image-registry
# Expected: AVAILABLE=True, PROGRESSING=False, DEGRADED=False

# Check registry pods are running
oc get pods -n openshift-image-registry
# Should show image-registry pods in Running state with READY 1/1

# Check storage is configured
oc get config.imageregistry.operator.openshift.io cluster -o jsonpath='{.spec.storage}'
# Should show storage backend configuration

# Sanity-check that imagestreams can be listed (this queries the API server; it does not exercise push or pull)
oc get imagestreams -n openshift
# Should list imagestreams without errors
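The operator status check can also be automated by reading each condition with a JSONPath filter instead of inspecting the table by eye. This is a sketch; the function name and message wording are assumptions.

```shell
# Verify that the image-registry cluster operator reports
# Available=True, Progressing=False, Degraded=False.
# The function name is hypothetical.
registry_operator_healthy() {
    for pair in Available=True Progressing=False Degraded=False; do
        cond=${pair%%=*}
        want=${pair#*=}
        got=$(oc get clusteroperator image-registry \
            -o jsonpath="{.status.conditions[?(@.type==\"$cond\")].status}")
        if [ "$got" != "$want" ]; then
            echo "$cond is '${got:-unknown}', expected $want"
            return 1
        fi
    done
    echo "image-registry operator is healthy"
}
```

The function stops at the first condition that does not match, so its output names the condition to investigate.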
