Battle-Tested Kubernetes Configurations for Enterprise Workloads
Features • Installation • Patterns • Documentation • Contributing
- Problem Statement
- Solution
- Features
- Technology Stack
- Architecture
- Prerequisites
- Installation
- Production Patterns
- Helm Charts
- Deployment Strategies
- Best Practices
- Monitoring & Observability
- Security
- Privacy
- Troubleshooting
- Contributing
- Roadmap
- License
- Support
Teams deploying Kubernetes in production face critical challenges:
- Configuration Complexity: Kubernetes has 50+ resource types with intricate relationships and dependencies
- Security Vulnerabilities: Default configurations lack essential security controls like network policies and pod security standards
- Reliability Issues: Applications fail under load due to improper resource limits, missing health checks, and poor autoscaling
- Operational Overhead: Managing deployments, rollbacks, and scaling manually is error-prone and time-consuming
- Inconsistent Environments: Dev/staging/prod drift leads to "works on my machine" scenarios
- Knowledge Gap: Production-grade Kubernetes requires deep expertise in networking, storage, security, and observability
- Cost Overruns: Poor resource management leads to over-provisioning and wasted cloud spend
Kubernetes Production Patterns provides proven, production-ready configurations and Helm charts that enable teams to:
✅ Deploy cloud-native applications with production-grade reliability from day one
✅ Implement security best practices including RBAC, network policies, and pod security
✅ Achieve high availability with proper resource management and autoscaling
✅ Standardize deployments across all environments with GitOps workflows
✅ Monitor and troubleshoot applications with integrated observability
✅ Accelerate Kubernetes adoption with comprehensive examples and documentation
✅ Reduce operational costs through optimized resource allocation
- Deployment Patterns: Blue-green, canary, rolling updates, and progressive delivery
- Autoscaling: HPA (Horizontal Pod Autoscaler), VPA (Vertical Pod Autoscaler), and cluster autoscaling
- Security: NetworkPolicies, Pod Security Standards, RBAC, and secret management
- Storage: StatefulSets with persistent volumes and dynamic provisioning
- Networking: Ingress controllers, service mesh integration, and DNS configurations
- Observability: Prometheus metrics, logging, tracing, and Grafana dashboards
- GitOps: ArgoCD and Flux deployment workflows
- Performance: Resource optimization and cost-efficient configurations
- Self-Healing: Liveness/readiness probes, pod disruption budgets, and restart policies
- Microservices: REST APIs, gRPC services, message queues
- Databases: PostgreSQL, MySQL, MongoDB, Redis, Cassandra
- Message Brokers: Kafka, RabbitMQ, NATS, Pulsar
- Caching: Redis, Memcached, Hazelcast
- Web Applications: Node.js, Python, Java, Go, .NET
- CI/CD Tools: Jenkins, GitLab Runner, Tekton, Argo Workflows
| Category | Technologies |
|---|---|
| Orchestration | Kubernetes 1.28+, kubectl, kubeadm |
| Package Management | Helm 3.0+, Kustomize |
| GitOps | ArgoCD, Flux CD |
| Service Mesh | Istio, Linkerd, Consul |
| Ingress | NGINX Ingress, Traefik, Kong, HAProxy |
| Storage | Rook-Ceph, Longhorn, OpenEBS, AWS EBS, Azure Disk, GCP PD |
| Monitoring | Prometheus, Grafana, Alertmanager, Thanos |
| Logging | Loki, Elasticsearch, Fluentd, Fluent Bit |
| Tracing | Jaeger, Zipkin, OpenTelemetry, Tempo |
| Security | OPA Gatekeeper, Falco, Trivy, Kyverno, Vault |
| CI/CD | Tekton, GitHub Actions, GitLab CI, Jenkins X |
| Load Testing | k6, Locust, JMeter |
┌──────────────────────────────────────────────────────────────┐
│                      Kubernetes Cluster                       │
│                                                              │
│  ┌────────────────────────────────────────────────────────┐  │
│  │                     Ingress Layer                      │  │
│  │   [NGINX Ingress]  │  [Service Mesh (Istio)]           │  │
│  │   [TLS Termination]  │  [Rate Limiting]                │  │
│  └────────────────────────────┬───────────────────────────┘  │
│                               │                              │
│  ┌────────────────────────────┴───────────────────────────┐  │
│  │                   Application Layer                    │  │
│  │   ┌──────────┐   ┌──────────┐   ┌──────────┐           │  │
│  │   │   Pod    │   │   Pod    │   │   Pod    │           │  │
│  │   │  (API)   │   │  (Web)   │   │ (Worker) │           │  │
│  │   │  + HPA   │   │  + HPA   │   │  + VPA   │           │  │
│  │   └────┬─────┘   └────┬─────┘   └────┬─────┘           │  │
│  │        │              │              │                 │  │
│  │   [ConfigMap]    [Secrets]    [ServiceAccount]         │  │
│  └────────────────────────────┬───────────────────────────┘  │
│                               │                              │
│  ┌────────────────────────────┴───────────────────────────┐  │
│  │                       Data Layer                       │  │
│  │   ┌───────────┐   ┌──────────┐   ┌──────────┐          │  │
│  │   │StatefulSet│   │   PVC    │   │  Redis   │          │  │
│  │   │(Postgres) │   │(Storage) │   │ (Cache)  │          │  │
│  │   │ Multi-AZ  │   │Encrypted │   │ Sentinel │          │  │
│  │   └───────────┘   └──────────┘   └──────────┘          │  │
│  └─────────────────────────────────────────────────────────┘ │
│                                                              │
│  ┌─────────────────────────────────────────────────────────┐ │
│  │                   Observability Layer                   │ │
│  │   [Prometheus] [Grafana] [Loki] [Jaeger] [Falco]        │ │
│  └─────────────────────────────────────────────────────────┘ │
│                                                              │
│  ┌─────────────────────────────────────────────────────────┐ │
│  │                     Security Layer                      │ │
│  │   [OPA] [Network Policies] [Pod Security] [Vault]       │ │
│  └─────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────┘
Before using this project, ensure you have:
- Kubernetes Cluster (v1.28 or higher)
  - Managed: EKS, AKS, GKE, or
  - Self-hosted: kubeadm, k3s, RKE, or
  - Local: kind, minikube, k3d (for development)
- kubectl v1.28+ (Installation Guide)
- Helm 3.0+ (Installation Guide)
- Git for version control
- Basic Kubernetes knowledge: Pods, Deployments, Services, ConfigMaps
- Minimum: 3 nodes, 8GB RAM per node, 4 vCPUs per node
- Recommended: 5+ nodes, 16GB RAM per node, 8 vCPUs per node
- Storage: Dynamic provisioning enabled (StorageClass)
- Networking: CNI plugin installed (Calico, Flannel, Cilium)
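For local development with kind, a multi-node cluster that loosely mirrors the minimum node count can be created from a small config file. This is a sketch for testing only (the filename and cluster name are illustrative), not a substitute for the production sizing above:

```yaml
# kind-cluster.yaml -- illustrative local test cluster
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
  - role: worker
  - role: worker
  - role: worker
```

Create it with `kind create cluster --name patterns-dev --config kind-cluster.yaml`.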
- k9s: Terminal UI for Kubernetes cluster management
- kubectx/kubens: Context and namespace switcher
- stern: Multi-pod log tailing
- kustomize: Template-free Kubernetes configuration
- dive: Docker image layer analyzer
- kubeval: Kubernetes manifest validator
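If Homebrew is available (macOS or Linux), most of these helpers can be installed in one step; the package names below are assumptions and may differ in other package managers:

```bash
# kubectx also installs kubens
brew install k9s kubectx stern kustomize dive
```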
git clone https://github.com/cloud-premises/kubernetes-production-patterns.git
cd kubernetes-production-patterns
# Check cluster connection
kubectl cluster-info
# Verify nodes are ready
kubectl get nodes
# Check current context
kubectl config current-context
# View cluster resources
kubectl get all --all-namespaces
# Install Helm (if not already installed)
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
# Verify Helm installation
helm version
# Install kubectl (if needed)
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
chmod +x kubectl
sudo mv kubectl /usr/local/bin/
# Add common Helm repositories
helm repo add stable https://charts.helm.sh/stable
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo add jetstack https://charts.jetstack.io
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
# Create dedicated namespace for patterns
kubectl create namespace production-patterns
# Set as default namespace (optional)
kubectl config set-context --current --namespace=production-patterns
Complete deployment with autoscaling, health checks, and resource management.
# patterns/web-app/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: web-app
namespace: production
labels:
app: web-app
version: v1.0.0
spec:
replicas: 3
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1
maxUnavailable: 0
selector:
matchLabels:
app: web-app
template:
metadata:
labels:
app: web-app
version: v1.0.0
annotations:
prometheus.io/scrape: "true"
prometheus.io/port: "8080"
prometheus.io/path: "/metrics"
spec:
# Anti-affinity to spread pods across nodes
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchExpressions:
- key: app
operator: In
values:
- web-app
topologyKey: kubernetes.io/hostname
# Security context
securityContext:
runAsNonRoot: true
runAsUser: 1000
fsGroup: 1000
containers:
- name: app
image: myapp:v1.0.0
imagePullPolicy: IfNotPresent
ports:
- name: http
containerPort: 8080
protocol: TCP
# Resource management
resources:
requests:
memory: "256Mi"
cpu: "250m"
limits:
memory: "512Mi"
cpu: "500m"
# Health checks
livenessProbe:
httpGet:
path: /health/live
port: http
initialDelaySeconds: 30
periodSeconds: 10
timeoutSeconds: 5
failureThreshold: 3
readinessProbe:
httpGet:
path: /health/ready
port: http
initialDelaySeconds: 5
periodSeconds: 5
timeoutSeconds: 3
failureThreshold: 3
# Graceful shutdown
lifecycle:
preStop:
exec:
command: ["/bin/sh", "-c", "sleep 15"]
# Environment variables
env:
- name: APP_ENV
value: "production"
- name: LOG_LEVEL
value: "info"
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: db-credentials
key: url
- name: REDIS_HOST
valueFrom:
configMapKeyRef:
name: app-config
key: redis.host
# Container security
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
capabilities:
drop:
- ALL
# Volume mounts
volumeMounts:
- name: tmp
mountPath: /tmp
- name: cache
mountPath: /app/cache
volumes:
- name: tmp
emptyDir: {}
- name: cache
emptyDir: {}
---
# patterns/web-app/service.yaml
apiVersion: v1
kind: Service
metadata:
name: web-app
namespace: production
labels:
app: web-app
spec:
type: ClusterIP
ports:
- port: 80
targetPort: http
protocol: TCP
name: http
selector:
app: web-app
---
# patterns/web-app/hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: web-app
namespace: production
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: web-app
minReplicas: 3
maxReplicas: 10
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 80
behavior:
scaleDown:
stabilizationWindowSeconds: 300
policies:
- type: Percent
value: 50
periodSeconds: 60
scaleUp:
stabilizationWindowSeconds: 0
policies:
- type: Percent
value: 100
periodSeconds: 30
- type: Pods
value: 2
periodSeconds: 30
selectPolicy: Max
---
# patterns/web-app/pdb.yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
name: web-app
namespace: production
spec:
minAvailable: 2
selector:
matchLabels:
app: web-app
Deploy:
kubectl apply -f patterns/web-app/
Progressive traffic shifting for safe releases.
# patterns/canary/deployment-stable.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: app-stable
namespace: production
spec:
replicas: 9
selector:
matchLabels:
app: myapp
version: stable
template:
metadata:
labels:
app: myapp
version: stable
spec:
containers:
- name: app
image: myapp:v1.0.0
ports:
- containerPort: 8080
resources:
requests:
memory: "256Mi"
cpu: "250m"
limits:
memory: "512Mi"
cpu: "500m"
---
# patterns/canary/deployment-canary.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: app-canary
namespace: production
spec:
replicas: 1 # 10% of traffic
selector:
matchLabels:
app: myapp
version: canary
template:
metadata:
labels:
app: myapp
version: canary
spec:
containers:
- name: app
image: myapp:v2.0.0
ports:
- containerPort: 8080
resources:
requests:
memory: "256Mi"
cpu: "250m"
limits:
memory: "512Mi"
cpu: "500m"
---
# patterns/canary/service.yaml
apiVersion: v1
kind: Service
metadata:
name: myapp
namespace: production
spec:
selector:
app: myapp # Selects both stable and canary
ports:
- port: 80
targetPort: 8080
Progressive Rollout:
# Start with 10% canary
kubectl apply -f patterns/canary/
# Monitor metrics
kubectl top pods -l app=myapp
watch kubectl get pods -l app=myapp
# Increase to 50%
kubectl scale deployment app-stable --replicas=5
kubectl scale deployment app-canary --replicas=5
# Full rollout
kubectl scale deployment app-stable --replicas=0
kubectl scale deployment app-canary --replicas=10
# Cleanup old version
kubectl delete deployment app-stable
Database deployment with ordered scaling and persistent volumes.
# patterns/statefulset/postgresql.yaml
apiVersion: v1
kind: Service
metadata:
name: postgres
namespace: production
spec:
ports:
- port: 5432
name: postgres
clusterIP: None
selector:
app: postgres
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: postgres
namespace: production
spec:
serviceName: postgres
replicas: 3
selector:
matchLabels:
app: postgres
template:
metadata:
labels:
app: postgres
spec:
terminationGracePeriodSeconds: 30
containers:
- name: postgres
image: postgres:15-alpine
ports:
- containerPort: 5432
name: postgres
env:
- name: POSTGRES_PASSWORD
valueFrom:
secretKeyRef:
name: postgres-secret
key: password
- name: PGDATA
value: /var/lib/postgresql/data/pgdata
resources:
requests:
memory: "512Mi"
cpu: "500m"
limits:
memory: "1Gi"
cpu: "1000m"
volumeMounts:
- name: data
mountPath: /var/lib/postgresql/data
livenessProbe:
exec:
command:
- /bin/sh
- -c
- pg_isready -U postgres
initialDelaySeconds: 30
periodSeconds: 10
readinessProbe:
exec:
command:
- /bin/sh
- -c
- pg_isready -U postgres
initialDelaySeconds: 5
periodSeconds: 5
volumeClaimTemplates:
- metadata:
name: data
spec:
accessModes: ["ReadWriteOnce"]
storageClassName: fast-ssd
resources:
requests:
storage: 100Gi
Microsegmentation for enhanced security.
# patterns/network-policy/default-deny.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: default-deny-all
namespace: production
spec:
podSelector: {}
policyTypes:
- Ingress
- Egress
---
# patterns/network-policy/allow-web-to-api.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-web-to-api
namespace: production
spec:
podSelector:
matchLabels:
app: api
policyTypes:
- Ingress
ingress:
- from:
- podSelector:
matchLabels:
app: web
ports:
- protocol: TCP
port: 8080
---
# patterns/network-policy/allow-api-to-database.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-api-to-database
namespace: production
spec:
podSelector:
matchLabels:
app: postgres
policyTypes:
- Ingress
ingress:
- from:
- podSelector:
matchLabels:
app: api
ports:
- protocol: TCP
port: 5432
---
# patterns/network-policy/allow-dns.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-dns
namespace: production
spec:
podSelector: {}
policyTypes:
- Egress
egress:
- to:
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: kube-system
ports:
- protocol: UDP
port: 53
- web-application: Generic web app with autoscaling
- api-gateway: Kong/NGINX API gateway
- microservice: REST/gRPC microservice template
- database: PostgreSQL, MySQL, MongoDB
- cache: Redis with Sentinel
- message-queue: Kafka, RabbitMQ
- monitoring-stack: Prometheus + Grafana
- logging-stack: Loki + Promtail
# Install a chart
helm install myapp ./charts/web-application \
--namespace production \
--values values-production.yaml
# Upgrade a release
helm upgrade myapp ./charts/web-application \
--namespace production \
--values values-production.yaml
# Rollback
helm rollback myapp 1
# Uninstall
helm uninstall myapp --namespace production
# values-production.yaml
replicaCount: 3
image:
repository: myapp
tag: v1.0.0
pullPolicy: IfNotPresent
resources:
requests:
memory: "256Mi"
cpu: "250m"
limits:
memory: "512Mi"
cpu: "500m"
autoscaling:
enabled: true
minReplicas: 3
maxReplicas: 10
targetCPUUtilizationPercentage: 70
service:
type: ClusterIP
port: 80
ingress:
enabled: true
className: nginx
hosts:
- host: myapp.example.com
paths:
- path: /
pathType: Prefix
tls:
- secretName: myapp-tls
hosts:
- myapp.example.com
postgresql:
enabled: true
auth:
username: myapp
database: myapp
redis:
enabled: true
master:
persistence:
enabled: true
size: 8Gi
Gradual replacement of old pods with new ones.
spec:
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 25%
maxUnavailable: 25%
Use when: Standard deployments, minimal downtime acceptable
Two identical environments, instant traffic switch.
# Deploy green (new version)
kubectl apply -f deployment-green.yaml
# Test green environment
kubectl port-forward svc/app-green 8080:80
# Switch traffic
kubectl patch service app -p '{"spec":{"selector":{"version":"green"}}}'
# Cleanup blue
kubectl delete deployment app-blue
Use when: Zero-downtime requirement, instant rollback needed
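The `kubectl patch` step above assumes a Service whose selector includes a `version` label so traffic can be flipped between the blue and green Deployments; a minimal sketch (names assumed to match the commands above):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: app
  namespace: production
spec:
  selector:
    app: myapp
    version: blue   # patched to "green" to switch traffic
  ports:
    - port: 80
      targetPort: 8080
```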
Gradual traffic shift to new version.
Use when: Risk mitigation, gradual validation needed
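Besides adjusting replica counts as in the canary pattern shown earlier, ingress-level traffic splitting gives finer control. Below is a sketch using the NGINX Ingress Controller's canary annotations; the host and the `myapp-canary` Service (selecting only `version: canary` pods) are assumptions, and a primary Ingress for the stable Service must already exist for the same host:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp-canary
  namespace: production
  annotations:
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-weight: "10"   # route ~10% of requests to the canary
spec:
  ingressClassName: nginx
  rules:
    - host: myapp.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: myapp-canary
                port:
                  number: 80
```

With this approach the rollout is advanced by raising the weight annotation rather than rescaling Deployments.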
Delete all old pods before creating new ones.
spec:
strategy:
type: Recreate
Use when: Shared resources, no concurrent versions allowed
resources:
requests:
memory: "256Mi" # Guaranteed
cpu: "250m" # 0.25 cores
limits:
memory: "512Mi" # Maximum
cpu: "500m" # 0.5 coresGuidelines:
- Set requests based on average usage
- Set limits 2x requests for burstable workloads
- Use VPA to right-size resources
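The VPA guideline above assumes the Vertical Pod Autoscaler components are installed in the cluster; in recommendation-only mode it suggests right-sized requests without evicting pods. A minimal sketch targeting the web-app Deployment from earlier:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-app
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  updatePolicy:
    updateMode: "Off"   # recommendations only; inspect with `kubectl describe vpa web-app`
```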
livenessProbe:
httpGet:
path: /health/live
port: 8080
initialDelaySeconds: 30
periodSeconds: 10
readinessProbe:
httpGet:
path: /health/ready
port: 8080
initialDelaySeconds: 5
periodSeconds: 5
Guidelines:
- Liveness: Restart unhealthy pods
- Readiness: Remove from service when not ready
- Startup: For slow-starting applications
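For the slow-start case, a startupProbe defers liveness checking until the application has come up; a sketch reusing the `/health/live` endpoint assumed above:

```yaml
startupProbe:
  httpGet:
    path: /health/live
    port: 8080
  failureThreshold: 30   # allow up to 30 x 10s = 300s for startup
  periodSeconds: 10
```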
# ConfigMaps for non-sensitive data
kubectl create configmap app-config --from-file=config.yaml
# Secrets for sensitive data
kubectl create secret generic db-credentials \
--from-literal=username=admin \
--from-literal=password=secret123
# Use external secret managers
# - AWS Secrets Manager
# - HashiCorp Vault
# - Azure Key Vault
metadata:
labels:
app: myapp
version: v1.0.0
environment: production
team: backend
cost-center: engineering
Standard Labels:
- app.kubernetes.io/name
- app.kubernetes.io/instance
- app.kubernetes.io/version
- app.kubernetes.io/component
- app.kubernetes.io/part-of
- app.kubernetes.io/managed-by
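Applied to a workload, the recommended labels might look like the following; all values here are illustrative:

```yaml
metadata:
  labels:
    app.kubernetes.io/name: myapp
    app.kubernetes.io/instance: myapp-production
    app.kubernetes.io/version: "1.0.0"
    app.kubernetes.io/component: api
    app.kubernetes.io/part-of: web-platform
    app.kubernetes.io/managed-by: helm
```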
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: myapp
namespace: production
spec:
selector:
matchLabels:
app: myapp
endpoints:
- port: metrics
interval: 30s
path: /metrics
apiVersion: v1
kind: ConfigMap
metadata:
name: promtail-config
data:
promtail.yaml: |
server:
http_listen_port: 9080
clients:
- url: http://loki:3100/loki/api/v1/push
scrape_configs:
- job_name: kubernetes-pods
kubernetes_sd_configs:
- role: pod
# OpenTelemetry Collector
apiVersion: v1
kind: ConfigMap
metadata:
name: otel-collector-config
data:
config.yaml: |
receivers:
otlp:
protocols:
grpc:
http:
exporters:
jaeger:
endpoint: jaeger:14250
service:
pipelines:
traces:
receivers: [otlp]
exporters: [jaeger]
Pre-built Grafana dashboards included:
- Kubernetes cluster overview
- Pod resource usage
- Application performance
- Network traffic analysis
- Storage metrics
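If Grafana is deployed via the kube-prometheus-stack chart with its dashboard sidecar enabled, custom dashboards can be added as labeled ConfigMaps; a sketch in which the namespace, label key, and dashboard JSON are placeholders to adapt to your installation:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: myapp-dashboard
  namespace: monitoring
  labels:
    grafana_dashboard: "1"   # label the sidecar is configured to watch
data:
  myapp-dashboard.json: |
    {
      "title": "My App Overview",
      "panels": []
    }
```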
- Pod Security Standards: Baseline, restricted, and privileged policies
- Network Policies: Microsegmentation and traffic control
- RBAC: Role-based access control for users and service accounts
- Secrets Management: Integration with Vault, AWS Secrets Manager, Azure Key Vault
- Image Scanning: Trivy integration for vulnerability scanning
- Runtime Security: Falco for threat detection
- Policy Enforcement: OPA Gatekeeper for admission control
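Note: PodSecurityPolicy, used in the manifest below, was removed in Kubernetes v1.25, so it will not apply on the cluster versions this project targets; it is kept for reference on older clusters. On current clusters, the equivalent baseline can be enforced with Pod Security Admission namespace labels, sketched here:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/warn: restricted
```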
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
name: restricted
spec:
privileged: false
allowPrivilegeEscalation: false
requiredDropCapabilities:
- ALL
volumes:
- 'configMap'
- 'emptyDir'
- 'projected'
- 'secret'
- 'downwardAPI'
- 'persistentVolumeClaim'
runAsUser:
rule: 'MustRunAsNonRoot'
seLinux:
rule: 'RunAsAny'
fsGroup:
rule: 'RunAsAny'
readOnlyRootFilesystem: true
apiVersion: v1
kind: ServiceAccount
metadata:
name: myapp
namespace: production
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: myapp
namespace: production
rules:
- apiGroups: [""]
resources: ["configmaps", "secrets"]
verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: myapp
namespace: production
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: myapp
subjects:
- kind: ServiceAccount
name: myapp
namespace: production
# Scan container images with Trivy
trivy image myapp:v1.0.0
# Scan Kubernetes manifests
trivy k8s --report summary deployment.yaml
# Continuous scanning
kubectl apply -f https://raw.githubusercontent.com/aquasecurity/trivy-operator/main/deploy/static/trivy-operator.yaml
- Run as non-root user
- Use read-only root filesystem
- Drop all capabilities
- Enable network policies
- Scan images for vulnerabilities
- Encrypt secrets at rest
- Use namespaces for isolation
- Implement RBAC policies
- Enable audit logging
- Regular security updates
If you discover a security vulnerability, please email: security@cloudpremises.org
Do NOT create public GitHub issues for security vulnerabilities.
Response Timeline:
- Initial acknowledgment: Within 24 hours
- Status updates: Every 48 hours
- Resolution target: Within 14 days for critical issues
We follow responsible disclosure and will credit security researchers who report vulnerabilities responsibly.
This project does NOT collect, store, or transmit any personal or sensitive data. All configurations are deployed in your Kubernetes cluster under your complete control.
The only external services potentially accessed are:
- Container Registries: Docker Hub, Quay.io, or private registries (required)
- Helm Chart Repositories: For dependency management (optional)
- Monitoring Backends: If configured (Grafana Cloud, Datadog, etc.)
- All workloads run in your Kubernetes cluster
- No data leaves your infrastructure by default
- You control all observability and logging endpoints
- Full compliance with data residency requirements
Our patterns support compliance with:
- GDPR (General Data Protection Regulation)
- HIPAA (Health Insurance Portability and Accountability Act)
- PCI-DSS (Payment Card Industry Data Security Standard)
- SOC 2 Type II
- ISO 27001
- FedRAMP (Federal Risk and Authorization Management Program)
No telemetry or usage data is collected by this project. All metrics and logs remain in your infrastructure.
Issue: Pods stuck in Pending state
# Check pod events
kubectl describe pod <pod-name>
# Common causes:
# - Insufficient resources
kubectl top nodes
kubectl describe nodes
# - No available storage
kubectl get pvc
kubectl get storageclass
# - Pod affinity/anti-affinity conflicts
kubectl get pod <pod-name> -o yaml | grep -A 10 affinity
Issue: Pods in CrashLoopBackOff
# Check logs
kubectl logs <pod-name> --previous
# Check events
kubectl describe pod <pod-name>
# Common causes:
# - Application crashes
# - Missing dependencies
# - Incorrect environment variables
# - Health check failures
Issue: Service not accessible
# Check service endpoints
kubectl get endpoints <service-name>
# Verify pod labels match service selector
kubectl get pods --show-labels
kubectl describe service <service-name>
# Test from within cluster
kubectl run test --rm -it --image=busybox -- wget -O- http://<service-name>
Issue: High memory usage / OOMKilled
# Check resource limits
kubectl describe pod <pod-name> | grep -A 5 Limits
# View actual usage
kubectl top pod <pod-name>
# Solution: Increase memory limits or optimize application
kubectl set resources deployment <name> -c=<container> --limits=memory=1Gi
Issue: ImagePullBackOff
# Check image pull errors
kubectl describe pod <pod-name>
# Common causes:
# - Image doesn't exist
# - Wrong image name/tag
# - Private registry authentication
# - Rate limiting
# Create image pull secret
kubectl create secret docker-registry regcred \
--docker-server=<registry-url> \
--docker-username=<username> \
--docker-password=<password>
Issue: Network policy blocking traffic
# List network policies
kubectl get networkpolicies
# Describe policy
kubectl describe networkpolicy <policy-name>
# Temporarily disable for testing
kubectl delete networkpolicy <policy-name>
# Test connectivity
kubectl run test --rm -it --image=busybox -- nc -zv <service> <port>
# Get detailed pod information
kubectl get pod <pod-name> -o yaml
# Execute commands in container
kubectl exec -it <pod-name> -- /bin/sh
# Port forward for local testing
kubectl port-forward pod/<pod-name> 8080:8080
# View cluster events
kubectl get events --sort-by='.lastTimestamp'
# Check resource usage
kubectl top nodes
kubectl top pods --all-namespaces
# View logs with timestamps
kubectl logs <pod-name> --timestamps=true
# Stream logs from multiple pods
kubectl logs -f -l app=myapp
# Check API server logs
kubectl logs -n kube-system kube-apiserver-<node>
# Analyze resource requests vs limits
kubectl describe nodes | grep -A 5 "Allocated resources"
# Find pods with no resource limits
kubectl get pods --all-namespaces -o json | \
jq '.items[] | select(.spec.containers[].resources.limits == null) | .metadata.name'
# Identify pods using most resources
kubectl top pods --all-namespaces --sort-by=memory
kubectl top pods --all-namespaces --sort-by=cpu
View Full Troubleshooting Guide →
We welcome contributions from the community! Here's how you can help:
- Report bugs and issues
- Suggest new patterns or improvements
- Improve documentation
- Submit bug fixes
- Add new deployment patterns
- Create Helm charts
- Write tests
- Share production experiences
- Fork the repository
- Create a feature branch (`git checkout -b feature/new-pattern`)
- Commit your changes (`git commit -m 'Add canary deployment pattern'`)
- Push to the branch (`git push origin feature/new-pattern`)
- Open a Pull Request
- Follow Kubernetes best practices
- Include comprehensive documentation
- Add examples for new patterns
- Test configurations in a real cluster
- Update relevant documentation
- Sign commits with GPG key (recommended)
- Pattern addresses a real production use case
- Includes complete YAML manifests
- Has deployment instructions
- Includes monitoring/observability configuration
- Documents resource requirements
- Provides troubleshooting guide
- Tested in at least one environment
# Clone your fork
git clone https://github.com/YOUR_USERNAME/kubernetes-production-patterns.git
cd kubernetes-production-patterns
# Add upstream remote
git remote add upstream https://github.com/cloud-premises/kubernetes-production-patterns.git
# Create local cluster for testing
kind create cluster --name test-patterns
# Set context
kubectl cluster-info --context kind-test-patterns
# Test pattern
kubectl apply -f patterns/your-pattern/
# Validate
kubectl get all -n your-namespace
- Automated checks run on all PRs
- Pattern validation in test cluster
- Documentation review
- Maintainer review within 48-72 hours
- Address feedback
- Two approvals required for merge
- Squash and merge to main
Follow Conventional Commits:
feat: add blue-green deployment pattern
fix: correct resource limits in web-app pattern
docs: update installation instructions
test: add validation for StatefulSet pattern
chore: update Helm chart dependencies
Read Full Contributing Guide →
- ✅ High-availability deployment patterns
- ✅ Autoscaling configurations (HPA, VPA)
- ✅ StatefulSet patterns with persistent storage
- ✅ Network policy templates
- ✅ Security best practices
- 🚧 Service mesh integration (Istio, Linkerd)
- 🚧 GitOps patterns (ArgoCD, Flux)
- 🚧 Progressive delivery with Flagger
- 📋 Multi-cluster deployments
- 📋 Disaster recovery patterns
- 📋 Cost optimization strategies
- 📋 AI/ML workload patterns
- 📋 Serverless Kubernetes (Knative)
- 📋 Advanced monitoring dashboards
- 📋 Chaos engineering patterns
- Multi-cloud Kubernetes patterns
- Edge computing deployments
- IoT workload patterns
- Batch processing frameworks
- Hybrid cloud configurations
- Kubernetes operator development guide
- Performance tuning playbook
Vote on features at: GitHub Discussions
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Copyright 2025 Cloud Premises
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
This project includes patterns and examples for various open-source projects:
- Kubernetes (Apache 2.0)
- Helm (Apache 2.0)
- Prometheus (Apache 2.0)
- Grafana (AGPL 3.0)
See THIRD_PARTY_LICENSES.md for complete details.
- Email: opensource@cloudpremises.org
- Discord: Join our community
- Documentation: docs.cloudpremises.org
- Issue Tracker: GitHub Issues
- Discussions: GitHub Discussions
- YouTube: Tutorials and walkthroughs
For enterprise support, consulting, or custom pattern development:
- Email: consulting@cloudpremises.org
- Website: cloudpremises.org
- Schedule: Book a consultation
Enterprise Support Includes:
- 24/7 critical issue support
- Dedicated Slack/Teams channel
- Custom pattern development
- Architecture review and optimization
- On-site/remote training workshops
- Migration assistance
- SLA-backed response times
- Critical Production Issues: 24 hours
- Bug Reports: 2-3 business days
- Pattern Requests: 1 week
- Questions: 3-5 business days
We offer:
- Kubernetes fundamentals
- Production deployment strategies
- Security best practices
- Observability and monitoring
- GitOps workflows
- Custom team training
Contact: training@cloudpremises.org
This project wouldn't be possible without:
- Kubernetes Community for the amazing platform
- CNCF for fostering cloud-native technologies
- Helm Community for package management
- All Contributors who have shared patterns and improvements
- Production Users providing real-world feedback
- Kubernetes - Container Orchestration
- Helm - Package Manager
- Prometheus - Monitoring
- Grafana - Visualization
- ArgoCD - GitOps
- Istio - Service Mesh
- Kubernetes Official Documentation
- Google's Site Reliability Engineering practices
- Netflix's container orchestration patterns
- Weaveworks' GitOps methodology
- CNCF best practices
Thanks to all our contributors!
Made with ❤️ by Cloud Premises
⭐ If this project helps you, please star it on GitHub!
Website • Documentation • Blog • Twitter
Empowering teams to run Kubernetes confidently in production