K8s ‐ Verify internal image registry
This rule validates that the OpenShift internal image registry is properly configured and available for use. The internal image registry is a built-in container image registry that runs within the OpenShift cluster, allowing users to push and pull container images without requiring an external registry.
The rule performs a two-step validation:
- Prerequisite check: Verifies that the image registry management state is set to "Managed" (not "Removed" or "Unmanaged")
- Availability check: If managed, ensures that all registry pods in the openshift-image-registry namespace are running and ready
The internal registry is a critical component for CI/CD pipelines, image builds, and development workflows within OpenShift.
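The two-step validation can be approximated from the command line. The following is a minimal sketch, not the rule's actual implementation: the function names are hypothetical, the `docker-registry=default` label selector is assumed to identify the registry pods, and the text does not say how the rule reports a registry that is not Managed, so the sketch uses a distinct exit code for that case.

```shell
#!/bin/sh
# Sketch of the two-step validation; all function names are hypothetical.

# Step 1 (prerequisite): the management state must be "Managed".
registry_is_managed() {
    state=$(oc get config.imageregistry.operator.openshift.io cluster \
        -o jsonpath='{.spec.managementState}')
    [ "$state" = "Managed" ]
}

# Step 2 (availability): every registry pod is Running with all containers ready.
registry_pods_ready() {
    # One line per pod: "<phase> <ready flag per container>"
    pods=$(oc get pods -n openshift-image-registry -l docker-registry=default \
        -o jsonpath='{range .items[*]}{.status.phase}{" "}{.status.containerStatuses[*].ready}{"\n"}{end}')
    # No pods at all means the registry is not available.
    [ -n "$pods" ] || return 1
    # Flag any pod that is not Running, has no ready container, or has an unready one.
    printf '%s\n' "$pods" |
        awk '$1 != "Running" || $0 !~ /true/ || /false/ { bad = 1 } END { exit bad + 0 }'
}

# Returns 0 when available, 1 when pods are not ready, 2 when not Managed (assumed convention).
check_internal_registry() {
    if ! registry_is_managed; then
        echo "NOT MANAGED: availability check skipped"
        return 2
    fi
    if registry_pods_ready; then
        echo "PASS: registry pods are running and ready"
    else
        echo "FAIL: registry pods are not all running and ready"
        return 1
    fi
}
```

After sourcing the file in a logged-in session, `check_internal_registry` prints a one-line result and its exit code can be used in scripts.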
Prerequisites:
- The oc command-line tool configured and authenticated
- Access to view image registry configuration and pods
- Cluster administrator privileges for registry configuration changes
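These prerequisites can be confirmed up front. The sketch below is one possible preflight check, assuming a POSIX shell; `check_registry_prereqs` is a hypothetical helper name, and treating missing patch permission as a warning rather than a failure is a design choice of this example, since patching is only needed for remediation.

```shell
#!/bin/sh
# Preflight sketch; check_registry_prereqs is a hypothetical helper name.
check_registry_prereqs() {
    # The oc client must be on PATH.
    command -v oc >/dev/null 2>&1 || { echo "oc not found"; return 1; }
    # An authenticated session is required.
    oc whoami >/dev/null 2>&1 || { echo "not logged in"; return 1; }
    # Read access to the registry configuration and pods.
    oc auth can-i get configs.imageregistry.operator.openshift.io >/dev/null 2>&1 \
        || { echo "cannot read registry config"; return 1; }
    oc auth can-i get pods -n openshift-image-registry >/dev/null 2>&1 \
        || { echo "cannot read registry pods"; return 1; }
    # Patch access is only needed for configuration changes, so warn instead of failing.
    oc auth can-i patch configs.imageregistry.operator.openshift.io >/dev/null 2>&1 \
        || echo "warning: cannot patch registry config"
    echo "prerequisites ok"
}
```

Running `check_registry_prereqs` before the diagnostic commands makes permission problems visible early instead of surfacing as confusing errors later.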
An unavailable or misconfigured internal image registry can lead to:
- Deployment failures: Applications configured to pull from internal registry cannot deploy
- Template deployment failures: Templates referencing internal registry images cannot instantiate
Common scenarios that may cause the internal registry to be unavailable include:
- Management state configuration
  - Registry intentionally set to "Removed" state
  - Registry set to "Unmanaged" state (for external registry use)
  - Fresh cluster installation without registry configuration
  - Registry disabled during cluster upgrade or maintenance
- Storage configuration issues
  - No storage backend configured for registry
  - PersistentVolumeClaim (PVC) binding failures
  - StorageClass unavailable or misconfigured
  - Insufficient storage capacity
  - Storage backend (S3, Azure Blob, GCS, Swift, etc.) unavailable
  - Incorrect storage credentials or permissions
- Pod failures
  - Registry pods crashing or failing to start
  - Failed readiness or liveness probes
  - Container image pull failures
  - Insufficient node resources to schedule registry pods
  - Configuration errors in registry deployment
- Resource constraints
  - Insufficient CPU or memory on nodes
  - Registry pods evicted due to resource pressure
  - Node capacity exhausted
  - Resource quotas blocking pod creation
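To narrow down which of the categories above applies, it can help to look at three observed values in order: the management state, the PVC phase, and the pod phase. The sketch below is an illustrative triage helper only. `classify_registry_issue` and its output labels are hypothetical, and the mapping is a rough heuristic rather than a definitive diagnosis.

```shell
#!/bin/sh
# Triage sketch; classify_registry_issue and its output labels are hypothetical.
# Arguments: <managementState> <pvc phase, empty if no PVC is used> <pod phase>
classify_registry_issue() {
    state=$1
    pvc_phase=$2
    pod_phase=$3
    if [ "$state" != "Managed" ]; then
        echo "management-state"
    elif [ -n "$pvc_phase" ] && [ "$pvc_phase" != "Bound" ]; then
        echo "storage"
    elif [ "$pod_phase" = "Pending" ]; then
        # Pending usually points at scheduling (node resources, quotas),
        # although a failing image pull also leaves a pod in Pending.
        echo "scheduling"
    elif [ "$pod_phase" != "Running" ]; then
        echo "pod-failure"
    else
        echo "none"
    fi
}
```

For example, `classify_registry_issue Managed Pending Pending` prints `storage`, because an unbound PVC is checked before the pod phase. The inputs could be gathered with `oc get ... -o jsonpath` queries such as `{.spec.managementState}` and `{.items[*].status.phase}`.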
# Get image registry configuration
oc get config.imageregistry.operator.openshift.io cluster -o yaml
# Check management state specifically
oc get config.imageregistry.operator.openshift.io cluster -o jsonpath='{.spec.managementState}'
# Should return: Managed
# Other values: Removed, Unmanaged

# List all pods in the registry namespace
oc get pods -n openshift-image-registry
# Look for image-registry pods specifically
oc get pods -n openshift-image-registry -l docker-registry=default
# Check pod details
oc describe pod -n openshift-image-registry -l docker-registry=default

# View logs from registry pods
oc logs -n openshift-image-registry deployment/image-registry
# Check for errors
oc logs -n openshift-image-registry deployment/image-registry | grep -iE 'error|fail|fatal'
# Follow logs in real-time
oc logs -n openshift-image-registry deployment/image-registry -f

# Get complete registry configuration
oc get config.imageregistry.operator.openshift.io cluster -o yaml
# Check replica count
oc get config.imageregistry.operator.openshift.io cluster -o jsonpath='{.spec.replicas}'
# Check rollout strategy
oc get config.imageregistry.operator.openshift.io cluster -o jsonpath='{.spec.rolloutStrategy}'

- For registry in Removed or Unmanaged state:
# Check current management state
oc get config.imageregistry.operator.openshift.io cluster -o jsonpath='{.spec.managementState}'
# Set management state to Managed
oc patch config.imageregistry.operator.openshift.io cluster --type merge -p '{"spec":{"managementState":"Managed"}}'
# Verify the change
oc get config.imageregistry.operator.openshift.io cluster -o jsonpath='{.spec.managementState}'
- For missing storage configuration:
# For empty storage configuration, set storage backend
# Option 1: EmptyDir (for testing only, not for production)
oc patch config.imageregistry.operator.openshift.io cluster --type merge -p '{"spec":{"storage":{"emptyDir":{}}}}'
# Option 2: PVC (recommended for on-premises)
oc patch config.imageregistry.operator.openshift.io cluster --type merge -p '{"spec":{"storage":{"pvc":{"claim":""}}}}'
# Option 3: S3 (for AWS)
oc patch config.imageregistry.operator.openshift.io cluster --type merge -p '{"spec":{"storage":{"s3":{"bucket":"my-registry-bucket","region":"us-east-1"}}}}'
# Option 4: Azure Blob
oc patch config.imageregistry.operator.openshift.io cluster --type merge -p '{"spec":{"storage":{"azure":{"accountName":"myaccount","container":"registry"}}}}'
# Option 5: GCS (for Google Cloud)
oc patch config.imageregistry.operator.openshift.io cluster --type merge -p '{"spec":{"storage":{"gcs":{"bucket":"my-registry-bucket"}}}}'
- For PVC binding issues:
# Check PVC status
oc get pvc -n openshift-image-registry
# If PVC is pending, check StorageClass
oc get storageclass
# Set default StorageClass if needed
oc patch storageclass <storage-class-name> -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
# Delete and recreate PVC if corrupted
oc delete pvc -n openshift-image-registry image-registry-storage
# Manually create PVC with specific StorageClass
cat <<EOF | oc apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: image-registry-storage
  namespace: openshift-image-registry
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 100Gi
  storageClassName: <storage-class-name>
EOF
- For pod failures:
# Check pod logs for errors
oc logs -n openshift-image-registry deployment/image-registry
# Describe pods to see events
oc describe pod -n openshift-image-registry -l docker-registry=default
# Delete problematic pods to force recreation
oc delete pod -n openshift-image-registry -l docker-registry=default
# Wait for new pods to start
oc get pods -n openshift-image-registry -w
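The watch command (`-w`) used above runs until interrupted, which is awkward in scripts. As an alternative, the sketch below blocks until the registry is back or a timeout expires, using `oc rollout status` and `oc wait`. `wait_for_registry` is a hypothetical helper name, and the 300s default timeout is an assumption, not a recommended value.

```shell
#!/bin/sh
# wait_for_registry is a hypothetical helper; the default timeout is an assumption.
wait_for_registry() {
    timeout=${1:-300s}
    # Block until the image-registry deployment finishes rolling out.
    oc rollout status deployment/image-registry \
        -n openshift-image-registry --timeout="$timeout" || return 1
    # Then confirm every registry pod reports the Ready condition.
    oc wait pod -n openshift-image-registry -l docker-registry=default \
        --for=condition=Ready --timeout="$timeout"
}
```

Calling `wait_for_registry 120s` after any of the fixes above returns a non-zero exit code if the registry does not recover within two minutes.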
# For a fresh start, remove and recreate registry configuration
# 1. Set to Removed
oc patch config.imageregistry.operator.openshift.io cluster --type merge -p '{"spec":{"managementState":"Removed"}}'
# 2. Wait for pods to be deleted
oc get pods -n openshift-image-registry -w
# 3. Configure storage and set to Managed
oc patch config.imageregistry.operator.openshift.io cluster --type merge -p '{"spec":{"managementState":"Managed","storage":{"pvc":{"claim":""}}}}'
# 4. Verify registry comes up
oc get pods -n openshift-image-registry -w

# Check management state
oc get config.imageregistry.operator.openshift.io cluster -o jsonpath='{.spec.managementState}'
# Should return: Managed
# Check operator status
oc get clusteroperator image-registry
# Expected: AVAILABLE=True, PROGRESSING=False, DEGRADED=False
# Check registry pods are running
oc get pods -n openshift-image-registry
# Should show image-registry pods in Running state with READY 1/1
# Check storage is configured
oc get config.imageregistry.operator.openshift.io cluster -o jsonpath='{.spec.storage}'
# Should show storage backend configuration
# Test registry functionality
oc get imagestreams -n openshift
# Should list imagestreams without errors
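The operator status check above can also be scripted instead of read by eye. The following is a minimal sketch that queries the three conditions of the image-registry cluster operator with a JSONPath filter and succeeds only for Available=True, Progressing=False, Degraded=False. The helper names `operator_condition` and `registry_operator_healthy` are hypothetical.

```shell
#!/bin/sh
# operator_condition and registry_operator_healthy are hypothetical helper names.

# Print the status ("True"/"False"/"Unknown") of one cluster operator condition.
operator_condition() {
    oc get clusteroperator image-registry \
        -o jsonpath="{.status.conditions[?(@.type==\"$1\")].status}"
}

# Succeed only when the operator is available, settled, and not degraded.
registry_operator_healthy() {
    available=$(operator_condition Available)
    progressing=$(operator_condition Progressing)
    degraded=$(operator_condition Degraded)
    echo "Available=$available Progressing=$progressing Degraded=$degraded"
    [ "$available" = "True" ] && [ "$progressing" = "False" ] && [ "$degraded" = "False" ]
}
```

`registry_operator_healthy` prints the three values on one line, so a failing run still shows which condition is off.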