This repository contains Kubernetes/OpenShift deployment configurations for the CloudKit platform, providing comprehensive cluster lifecycle management through multiple integrated components.
CloudKit provides:
- Cluster Lifecycle Management - Automated cluster provisioning, scaling, and decommissioning
- Event Driven Automation - Responds to cluster events and webhook notifications
- Service Management - Fulfillment services for cluster operations
- Configuration as Code - Manages cluster configurations through Ansible playbooks and Kubernetes manifests
The CloudKit platform consists of three main components:
- CloudKit AAP (Ansible Automation Platform) - Automated cluster provisioning and lifecycle management
- CloudKit Operator - Kubernetes operator for cluster order management and HyperShift integration
- Fulfillment Service - Backend service for cluster fulfillment operations with PostgreSQL database
Before deploying CloudKit, ensure you have:
- OpenShift cluster with admin access (version 4.17+ recommended)
- oc CLI configured with cluster admin privileges
- kustomize CLI tool (or use oc apply -k)
- cert-manager operator installed and configured
- Certificate issuers configured for TLS certificate management
- Required for secure communication between components
- MultiCluster Engine (MCE) installed with HyperShift support
- HyperShift operator deployed and configured
- Required for hosted cluster management capabilities
- Red Hat Ansible Automation Platform Operator installed
- Red Hat Advanced Cluster Management (ACM) installed (for cluster provisioning)
- Valid AAP license manifest - Download from Red Hat Customer Portal as License.zip
- Container registry credentials for execution environments
- HyperShift CRDs (HostedCluster, NodePool) available
- ClusterOrder CRDs deployed
- Proper RBAC permissions for cluster-wide operations
- PostgreSQL for database storage
- TLS certificates for secure database connections
- Private registry access for container images
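The CLI and CRD prerequisites above can be checked before deploying. The following is a minimal sketch, not part of this repository: the function names `missing_tools` and `missing_crds` are hypothetical, and the CRD names in the example assume the standard HyperShift plural names. The ClusterOrder CRD name is not stated in this README, so it is left out.

```shell
# Print the name of every command-line tool that is not found on PATH.
missing_tools() (
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
)

# Print the name of every CRD that the current cluster does not serve.
missing_crds() (
  for crd in "$@"; do
    oc get crd "$crd" >/dev/null 2>&1 || echo "$crd"
  done
)

# Example (CRD names are assumptions, adjust to your cluster):
#   missing_tools oc kustomize
#   missing_crds hostedclusters.hypershift.openshift.io nodepools.hypershift.openshift.io
```

Both functions print nothing when every prerequisite is present, so empty output means the check passed.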
cloudkit-installer/
├── base/ # Base Kustomize configurations
│ ├── shared/ # Shared namespace and resources
│ ├── cloudkit-aap/ # Ansible Automation Platform
│ ├── cloudkit-operator/ # CloudKit Operator
│ └── fulfillment-service/ # Fulfillment Service
│ ├── ca/ # Certificate Authority setup
│ ├── database/ # PostgreSQL database
│ ├── service/ # Main service deployment
│ ├── controller/ # Controller component
│ ├── admin/ # Admin service account
│ └── client/ # Client service account
├── overlays/ # Environment-specific overlays
│ └── development/ # Development environment
│ ├── cloudkit-aap/
│ ├── cloudkit-operator/
│ └── fulfillment-service/
└── README.md
Provides automated cluster provisioning and lifecycle management through:
- Controller: Job template management and execution
- EDA (Event Driven Automation): Webhook processing and event handling
- Bootstrap Job: Initial configuration of AAP resources
Kubernetes operator that manages:
- ClusterOrder CRDs: Custom resources for cluster provisioning requests
- HyperShift Integration: Management of hosted clusters
- Namespace Management: Automatic namespace creation and RBAC
- Service Account Management: Cluster-specific service accounts
Backend service providing:
- Database: PostgreSQL for persistent storage
- Service: Main fulfillment service with gRPC API
- Controller: Fulfillment operation management
- Gateway: HTTP/gRPC gateway with Envoy proxy
# Install cert-manager operator
oc apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.0/cert-manager.yaml
# Wait for cert-manager to be ready
oc wait --for=condition=Available deployment/cert-manager -n cert-manager --timeout=300s
# Create MCE namespace
oc new-project multicluster-engine
# Install MCE operator
cat << EOF | oc apply -f -
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
name: multicluster-engine
namespace: multicluster-engine
spec:
targetNamespaces:
- multicluster-engine
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
name: multicluster-engine
namespace: multicluster-engine
spec:
channel: stable-2.8
name: multicluster-engine
source: redhat-operators
sourceNamespace: openshift-marketplace
EOF
# Create MCE instance
cat << EOF | oc apply -f -
apiVersion: multicluster.openshift.io/v1
kind: MultiClusterEngine
metadata:
name: multiclusterengine
namespace: multicluster-engine
spec:
availabilityConfig: Basic
targetNamespace: multicluster-engine
EOF
# Wait for MCE to be ready
oc wait --for=condition=Available multiclusterengine/multiclusterengine -n multicluster-engine --timeout=600s
# Set required environment variables
export AAP_USERNAME="admin"
export AAP_PASSWORD="your-aap-password"
export LICENSE_MANIFEST_PATH="/path/to/License.zip"
# Deploy all components
oc apply -k overlays/development/
# Wait for deployment to complete
oc wait --for=condition=Available deployment/dev-fulfillment-service -n foobar --timeout=300s
oc wait --for=condition=Available deployment/dev-controller-manager -n foobar --timeout=300s
For AAP configuration, set these environment variables:
export AAP_USERNAME="admin" # AAP administrator username
export AAP_PASSWORD="your-password" # AAP administrator password
export LICENSE_MANIFEST_PATH="/path/to/License.zip" # Path to AAP license
Note: The AAP license file must be named exactly License.zip (with capital L) and can be downloaded from the Red Hat Customer Portal. Navigate to your AAP subscription and download the license manifest.
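Because a wrongly named license file or an unset variable may only show up later in the deployment, it can help to check the settings first. This is a sketch under the assumptions stated in this README (the three variable names and the License.zip naming rule); the function name `check_aap_env` is hypothetical.

```shell
# Report problems with the AAP settings.
# Prints nothing and succeeds when all three variables are set and the
# license manifest exists and is named exactly License.zip.
# The body runs in a subshell, so "exit" only ends the function.
check_aap_env() (
  status=0
  for name in AAP_USERNAME AAP_PASSWORD LICENSE_MANIFEST_PATH; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name is not set"
      status=1
    fi
  done
  if [ -n "${LICENSE_MANIFEST_PATH:-}" ]; then
    [ -f "$LICENSE_MANIFEST_PATH" ] ||
      { echo "license manifest not found: $LICENSE_MANIFEST_PATH"; status=1; }
    [ "$(basename "$LICENSE_MANIFEST_PATH")" = "License.zip" ] ||
      { echo "license manifest must be named License.zip"; status=1; }
  fi
  exit "$status"
)

# Example:
#   check_aap_env && oc apply -k overlays/development/
```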
Update container registry credentials in:
- overlays/development/dockerconfig.json for development
- Include credentials for accessing private registries (quay.io, registry.redhat.io, etc.)
The fulfillment service uses cert-manager for TLS certificate management:
- CA certificates are automatically generated
- Service certificates are issued for database connections
- API certificates are issued for service endpoints
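To see whether cert-manager has actually issued the certificates listed above, one option is to filter the certificate listing for entries that are not ready. This sketch assumes the default `oc get certificates` columns (NAME, READY, SECRET, AGE); the function name `unready_certificates` is hypothetical.

```shell
# Print the names of Certificate resources in a namespace whose READY
# column is anything other than "True".
# Relies on the default column order: NAME READY SECRET AGE.
unready_certificates() (
  oc get certificates -n "$1" --no-headers | awk '$2 != "True" { print $1 }'
)

# Example:
#   unready_certificates foobar
```

Empty output means every certificate in the namespace reports Ready. To block until that is the case, `oc wait --for=condition=Ready certificate --all -n foobar --timeout=120s` is an alternative.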
# Check all pods in the deployment namespace
oc get pods -n foobar
# Check specific components
oc get pods -n foobar -l app=fulfillment-service
oc get pods -n foobar -l app.kubernetes.io/name=cloudkit-operator
oc get ansibleautomationplatform -n foobar
# Check certificates
oc get certificates -n foobar
# CloudKit Operator
oc logs -n foobar deployment/dev-controller-manager -f
# Fulfillment Service
oc logs -n foobar deployment/dev-fulfillment-service -c server -f
# Database
oc logs -n foobar statefulset/dev-fulfillment-database -f
# AAP Bootstrap Job
oc logs -n foobar job/dev-aap-bootstrap -f
# Check cert-manager
oc get pods -n cert-manager
# Check HyperShift CRDs
oc get crd | grep hypershift
oc get crd | grep clusterorder
# Check MultiCluster Engine
oc get multiclusterengine -n multicluster-engine
To create a new environment overlay:
- Create new directory under overlays/
- Copy and modify kustomization.yaml from development
- Create environment-specific patch files
- Update environment variables and secrets
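The first two steps can be sketched as a small helper, run from the repository root. The function name `new_overlay` and the overlay name `staging` are hypothetical. The copied kustomization.yaml still has to be edited by hand, since it may contain development-specific settings such as a name prefix or relative resource paths.

```shell
# Create overlays/<name>/ next to overlays/development/ and seed it with
# the development kustomization.yaml. Refuses to overwrite an existing
# overlay. The body runs in a subshell, so "exit" only ends the function.
new_overlay() (
  name=${1:-}
  src=overlays/development
  dest=overlays/$name
  [ -n "$name" ] || { echo "usage: new_overlay <name>" >&2; exit 2; }
  [ -f "$src/kustomization.yaml" ] ||
    { echo "missing $src/kustomization.yaml" >&2; exit 1; }
  [ ! -e "$dest" ] || { echo "$dest already exists" >&2; exit 1; }
  mkdir -p "$dest" && cp "$src/kustomization.yaml" "$dest/kustomization.yaml"
)

# Example:
#   new_overlay staging
```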
Each component can be customized by:
- Editing base configurations in base/component-name/
- Creating overlay patches for environment-specific changes
- Testing changes in development overlay first
- Validating with kustomize build before applying
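The validation step can be sketched as a render followed by a dry run, so that nothing on the cluster changes. The function name `validate_overlay` is hypothetical. A client-side dry run is used here as an assumption about what is convenient; `--dry-run=server` is stricter but requires the target namespace to exist already.

```shell
# Render an overlay with kustomize, then ask oc to process the result
# without persisting anything. Fails early if the overlay does not build.
# The body runs in a subshell, so "exit" only ends the function.
validate_overlay() (
  dir=${1:-overlays/development}
  rendered=$(kustomize build "$dir") ||
    { echo "kustomize build failed for $dir" >&2; exit 1; }
  printf '%s\n' "$rendered" | oc apply --dry-run=client -f -
)

# Example:
#   validate_overlay overlays/development/
```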
- All inter-service communication uses TLS
- Certificates are managed by cert-manager
- CA certificates are automatically rotated
- Each component has minimal required permissions
- Service accounts are created per component
- Cluster-wide permissions are limited to necessary operations
- Services communicate over TLS
- Database connections use SSL/TLS
- Network policies can be applied for additional isolation
- cert-manager not ready: Ensure cert-manager operator is installed and running
- HyperShift CRDs missing: Verify MultiCluster Engine is deployed with HyperShift enabled
- Certificate issues: Check cert-manager logs and certificate status
- Database connection failures: Verify database certificates and connectivity
- cloudkit-operator CrashLoopBackOff: Usually indicates missing HyperShift permissions or CRDs not available
- ImagePullBackOff errors: Verify registry credentials in dockerconfig.json and dev-quay-pull-secret
- namePrefix conflicts: Certificate and secret name mismatches due to kustomize namePrefix application
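One way to spot a namePrefix mismatch is to list every secret name the rendered manifests refer to and compare that list with the secrets that exist in the namespace. This is a rough sketch: the function name `secret_names` is hypothetical, it only looks at `secretName:` fields, and it assumes unquoted values as `kustomize build` normally emits them.

```shell
# Read YAML manifests on stdin and print the unique values of all
# secretName fields, whether written as "secretName: x" or "- secretName: x".
secret_names() (
  awk '$1 == "secretName:" || $2 == "secretName:" { print $NF }' | sort -u
)

# Example:
#   kustomize build overlays/development/ | secret_names
#   oc get secrets -n foobar
```

A name that appears both with and without the overlay prefix in the first listing, or that is missing from the second, is a likely candidate for the mismatch described above.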
# Check certificate status
oc describe certificate -n foobar
# Check certificate issuer status
oc describe issuer -n foobar
# Check pod events
oc describe pod -n foobar <pod-name>
# Check service endpoints
oc get endpoints -n foobar
# Check secrets
oc get secrets -n foobar
# Check HyperShift CRDs and permissions
oc get crd | grep hypershift
oc get clusterrole dev-manager-role -o yaml | grep -A 10 hypershift
oc get clusterrolebinding dev-manager-rolebinding -o yaml
# Check MultiCluster Engine status
oc get multiclusterengine -n multicluster-engine
oc get pods -n multicluster-engine
oc get pods -n hypershift
# Get all events in namespace
oc get events -n foobar --sort-by=.metadata.creationTimestamp
# Check resource usage
oc top pods -n foobar
# Component-specific logs
oc logs -n foobar deployment/dev-fulfillment-service -c server --tail=100
oc logs -n foobar deployment/dev-controller-manager --tail=100
oc logs -n foobar statefulset/dev-fulfillment-database --tail=100
- Understanding of Kubernetes/OpenShift
- Familiarity with Kustomize
- Knowledge of cert-manager and HyperShift
- Experience with PostgreSQL and gRPC services
- Test in development environment first
- Validate with kustomize build overlays/development/
- Check for resource conflicts
- Verify certificate generation
- Test service connectivity
For issues and questions:
- Check the troubleshooting section above
- Review component logs for error messages
- Verify prerequisites are properly installed
- Consult cert-manager and HyperShift documentation
- Open an issue in the component repository
This project is licensed under the Apache License, Version 2.0.