A complete, production-ready CI/CD workflow demonstrating AI-assisted auto-fixing, GitOps with ArgoCD, canary deployments with Argo Rollouts, and an MTTR dashboard.
- Architecture Overview
- Key Features
- Workflow Explanation
- Project Structure
- Prerequisites
- Setup Instructions
- Running the CI/CD Pipeline
- Monitoring & Dashboards
- Troubleshooting
- Client Presentation Points
    Developer Push to 'demo' Branch
                   ↓
        GitHub Actions Triggers
                   ↓
┌─────────────────────────────────────┐
│ 1. Security Scan (Trivy)            │
│ 2. Run Tests (Jest)                 │
└─────────────────────────────────────┘
                   ↓
             Tests Failed?
            /             \
        YES                     NO
         ↓                       ↓
┌──────────────────┐   ┌───────────────────┐
│ 3a. AI Auto-Fix  │   │ 3b. Skip Auto-    │
│     (GPT-4)      │   │     fix, Continue │
│ - Analyze Errors │   └───────────────────┘
│ - Generate Fix   │             ↓
│ - Create PR      │   ┌──────────────────────┐
│ - Commit Changes │   │ 4. Build Docker      │
└──────────────────┘   │    Image             │
         ↓             │ - Multi-stage build  │
    Tests Pass         │ - Scan for vulns     │
         ↓             │ - Push to GHCR       │
                       └──────────────────────┘
                   ↓
┌──────────────────────────────────────┐
│ 5. Update Kubernetes Manifest        │
│ - Update image tag in rollout.yaml   │
│ - Update MTTR dashboard JSON         │
│ - Commit changes to demo branch      │
└──────────────────────────────────────┘
                   ↓
┌──────────────────────────────────────┐
│ 6. ArgoCD Detects Changes            │
│ - Polls repository (or webhook)      │
│ - Syncs manifests to Kubernetes      │
│ - Creates/Updates Rollout resource   │
└──────────────────────────────────────┘
                   ↓
┌──────────────────────────────────────┐
│ 7. Canary Rollout Progresses         │
│ Step 1: 10% traffic (5 min)          │
│ Step 2: 25% traffic (5 min)          │
│ Step 3: 50% traffic (5 min)          │
│ Step 4: 75% traffic (5 min)          │
│ Step 5: 100% traffic (Complete)      │
│                                      │
│ Automatic rollback if metrics fail   │
└──────────────────────────────────────┘
                   ↓
┌──────────────────────────────────────┐
│ 8. MTTR Dashboard Updates            │
│ - Record deployment metrics          │
│ - Track canary progression           │
│ - Calculate MTTR (Mean Time to       │
│   Recovery)                          │
└──────────────────────────────────────┘
- Automatically detects test failures
- Calls OpenAI GPT-4 API to analyze issues
- Generates code fixes
- Creates pull request with fixes
- Allows team review before merge
Example Flow:
Test Fails → AI Analyzes → PR Created → Team Reviews → Merge → Deploy
- Single source of truth: Git repository
- ArgoCD continuously syncs manifests
- Changes to Git automatically deploy
- Built-in rollback capabilities
- Declarative infrastructure
- Gradual traffic shifting (10% → 25% → 50% → 75% → 100%)
- Automatic validation at each step
- Metrics-based rollback on failure
- Zero-downtime deployments
- Risk mitigation for critical apps
- Tracks deployment metrics
- Calculates Mean Time To Recovery
- Shows AI fix effectiveness
- Visual representation of canary progress
- Historical data for trend analysis
- Multi-stage Docker builds (optimized images)
- Non-root container execution
- Security scanning with Trivy
- RBAC controls
- Resource limits and requests
- Health probes (liveness, readiness, startup)
Developer pushes code to demo branch → GitHub Actions workflow triggers automatically.
- Security Scan - Trivy scans Docker image for vulnerabilities
- Unit Tests - Jest runs all tests in /app/tests/
  - Tests check: /status endpoint, health checks, error handling
  - Coverage enforced at 70% minimum
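The 70% coverage gate mentioned above can be expressed with Jest's `coverageThreshold` option. The following is a minimal sketch of what `jest.config.js` might contain, not the project's actual file; which metrics the 70% applies to is an assumption, since the text only states a single minimum.

```javascript
// Hypothetical jest.config.js sketch: fail the test run when coverage drops below 70%.
// Applying 70 to all four metrics is an assumption; the README only says "70% minimum".
const config = {
  testEnvironment: "node",
  collectCoverage: true,
  coverageThreshold: {
    global: {
      branches: 70,
      functions: 70,
      lines: 70,
      statements: 70,
    },
  },
};

module.exports = config;
```

With a threshold in place, `npm test` exits non-zero when coverage falls short, which is what lets the workflow treat low coverage like a failing test.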
- AI Analysis - GitHub Actions calls OpenAI API
  - Analyzes test failure stack traces
  - Generates root cause analysis
  - Proposes code fixes
- Auto-Fix Branch - Creates new branch: ai-fix/test-failure-{timestamp}
  - Applies AI-generated fixes
  - Adds documentation of changes
  - Commits to new branch
- Pull Request - Creates PR with:
  - Detailed AI analysis
  - Fix explanation
  - Link to workflow run
  - Request for review
- Manual Review - Team reviews and approves/rejects
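To make the branch and pull request steps above concrete, here is a small sketch of how the branch name and PR description could be assembled. The function names, the field names, and the choice of Unix seconds for `{timestamp}` are all assumptions for illustration; the workflow itself may format these differently.

```javascript
// Hypothetical helpers: build the auto-fix branch name and the PR description.
// Using Unix seconds for {timestamp} is an assumption; it keeps the name free of
// characters (such as ':') that Git does not allow in branch names.
function aiFixBranchName(now = new Date()) {
  const timestamp = Math.floor(now.getTime() / 1000);
  return `ai-fix/test-failure-${timestamp}`;
}

// Assemble the four items the PR is said to contain.
function buildPullRequestBody({ analysis, fixExplanation, runUrl }) {
  return [
    "AI Analysis:",
    analysis,
    "",
    "Fix Explanation:",
    fixExplanation,
    "",
    `Workflow run: ${runUrl}`,
    "",
    "Please review before merging.",
  ].join("\n");
}
```

Keeping the PR as the only output of the AI step means nothing generated reaches the demo branch without the manual review described above.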
- Builds Docker image with multi-stage approach
- Scans Docker image for vulnerabilities
- Pushes to GitHub Container Registry (GHCR)
- Tags with: branch-{commit_sha_short} (e.g., demo-abc123de)
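The tagging scheme above can be sketched as a small function. The 8-character short SHA is inferred from the `demo-abc123de` example, and the branch sanitizing step is an assumption added so that branch names containing `/` still yield a valid image tag.

```javascript
// Hypothetical helper: derive an image tag of the form branch-{commit_sha_short}.
// SHORT_SHA_LENGTH = 8 is inferred from the example tag "demo-abc123de".
const SHORT_SHA_LENGTH = 8;

function imageTag(branch, commitSha) {
  // Image tags may only contain letters, digits, '_', '.' and '-',
  // so any other character in the branch name is replaced with '-'.
  const safeBranch = branch.replace(/[^A-Za-z0-9_.-]/g, "-");
  return `${safeBranch}-${commitSha.slice(0, SHORT_SHA_LENGTH)}`;
}
```

For example, `imageTag("demo", "abc123de4567890f")` returns `demo-abc123de`.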
- Updates manifests/ai-showcase/rollout.yaml with new image tag
- Updates dashboard/mttr.json with deployment record
- Commits changes back to demo branch
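The image tag update above amounts to a text substitution in the Rollout manifest. The sketch below shows one way to do it on the file contents as a string; the function names are hypothetical, and it assumes the manifest has an `image: <repository>:<tag>` line, which the source does not show.

```javascript
// Escape characters that have special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: replace the tag of every "image: <repository>:<tag>" line.
// Assumes the tag only contains word characters, dots and hyphens.
function setImageTag(manifestText, repository, newTag) {
  const pattern = new RegExp(`(image:\\s*${escapeRegExp(repository)}):[\\w.-]+`, "g");
  return manifestText.replace(pattern, `$1:${newTag}`);
}
```

In a workflow step this would be wrapped with a file read and write around `manifests/ai-showcase/rollout.yaml`, followed by the commit to the demo branch. A YAML-aware tool is more robust than a regex if the manifest layout varies.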
- Change Detection - ArgoCD polls repository (default: every 3 minutes)
  - Alternatively: Webhook triggers instant sync
- Manifest Sync - ArgoCD applies manifests to Kubernetes cluster
  - Creates/Updates resources in ai-showcase namespace
  - Matches desired state (Git) with live state (cluster)
- Rollout Initiation - Argo Rollouts resource created/updated
  - New ReplicaSet created with updated image
  - Old ReplicaSet kept for rollback
  - Traffic routing configured
Step 1: 10% Canary (5 minutes)
- 1 out of 10 pods runs new version
- 10% of traffic routed to canary
- Prometheus metrics collected
- Error rate checked: must be ≤ 1%
- Response time checked: p95 must be ≤ 500ms
Active Pods: [NEW, OLD, OLD, OLD, OLD, OLD, OLD, OLD, OLD, OLD]
Traffic: 10% → NEW, 90% → OLD
Step 2: 25% Canary (5 minutes)
- 2-3 out of 10 pods run new version
- 25% of traffic routed to canary
- Validation repeated
Step 3: 50% Canary (5 minutes)
- Half the cluster on new version
- 50% traffic split
- Higher confidence in new version
Step 4: 75% Canary (5 minutes)
- 7-8 out of 10 pods run new version
- 75% traffic routed to canary
- Final validation
Step 5: 100% Rollout (Complete)
- All pods running new version
- 100% traffic to new version
- Old ReplicaSet retained for quick rollback
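The pod counts quoted in the steps above follow from the canary weight and the replica count. Below is a rough sketch of that arithmetic for 10 replicas; rounding up is an assumption here (the text gives ranges such as "2-3" and "7-8"), so check the Argo Rollouts documentation for the exact behavior in your setup.

```javascript
// Canary weights from the five steps described above.
const CANARY_WEIGHTS = [10, 25, 50, 75, 100];

// Hypothetical helper: number of pods running the new version at a given weight.
// Rounding up is an assumption made for this sketch.
function canaryPodCount(totalReplicas, weightPercent) {
  return Math.ceil((totalReplicas * weightPercent) / 100);
}

const podsPerStep = CANARY_WEIGHTS.map((weight) => canaryPodCount(10, weight));
console.log(podsPerStep); // [ 1, 3, 5, 8, 10 ]
```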
Automatic Rollback Example:
- If error rate > 1% at any step → automatic rollback
- If response time p95 > 500ms → automatic rollback
- Manual rollback available anytime
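The rollback rules above reduce to two threshold checks per step. The sketch below illustrates that decision in plain JavaScript; in the cluster this logic would live in the Rollout's analysis configuration, not in application code, and the metric field names are hypothetical.

```javascript
// Thresholds taken from the rules above.
const MAX_ERROR_RATE = 0.01; // 1%, expressed as a fraction
const MAX_P95_LATENCY_MS = 500;

// Hypothetical helper: decide whether a canary step may proceed.
// Values exactly at the threshold pass, matching the "must be ≤" wording.
function evaluateCanaryStep({ errorRate, p95LatencyMs }) {
  const reasons = [];
  if (errorRate > MAX_ERROR_RATE) {
    reasons.push(`error rate ${(errorRate * 100).toFixed(2)}% is above 1%`);
  }
  if (p95LatencyMs > MAX_P95_LATENCY_MS) {
    reasons.push(`p95 latency ${p95LatencyMs}ms is above 500ms`);
  }
  return { action: reasons.length === 0 ? "promote" : "rollback", reasons };
}
```

Returning the reasons alongside the action makes it easy to record why a rollback happened, which is useful input for the MTTR dashboard.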
- Records deployment start/end times
- Calculates MTTR (minutes)
- Stores: commit hash, author, test status, AI fix status, canary progress
- Updates aggregate statistics
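As an illustration of the MTTR calculation above, the sketch below averages the time between failure and recovery across a list of incidents. The record shape (`failedAt`, `recoveredAt`) is hypothetical and is not the schema of dashboard/mttr.json, which the source does not show.

```javascript
// Minutes elapsed between two ISO 8601 timestamps.
function minutesBetween(startIso, endIso) {
  return (new Date(endIso).getTime() - new Date(startIso).getTime()) / 60000;
}

// Hypothetical helper: mean time to recovery in minutes.
// Returns null when there are no incidents, so an empty history is not reported as 0.
function meanTimeToRecovery(incidents) {
  if (incidents.length === 0) return null;
  const totalMinutes = incidents.reduce(
    (sum, incident) => sum + minutesBetween(incident.failedAt, incident.recoveredAt),
    0
  );
  return totalMinutes / incidents.length;
}

const mttr = meanTimeToRecovery([
  { failedAt: "2024-01-01T10:00:00Z", recoveredAt: "2024-01-01T10:10:00Z" },
  { failedAt: "2024-01-02T08:00:00Z", recoveredAt: "2024-01-02T08:20:00Z" },
]);
console.log(mttr); // 15
```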
agentic-ai-demo/
├── app/                          # Node.js Express Application
│   ├── app.js                    # Main Express server
│   ├── package.json              # Dependencies
│   ├── jest.config.js            # Jest test configuration
│   ├── Dockerfile                # Multi-stage production build
│   ├── .dockerignore             # Docker build exclusions
│   ├── .env.example              # Environment variables template
│   └── tests/
│       └── app.test.js           # Unit tests (Jest)
│
├── manifests/
│   └── ai-showcase/              # Kubernetes manifests (GitOps)
│       ├── namespace.yaml        # ai-showcase namespace
│       ├── configmap.yaml        # App config & MTTR schema
│       ├── secret.yaml           # API keys (reference only)
│       ├── deployment.yaml       # Standard Kubernetes Deployment
│       ├── service.yaml          # LoadBalancer Service
│       ├── rollout.yaml          # Argo Rollout (canary config)
│       ├── rbac.yaml             # ServiceAccount, Role, RoleBinding
│       └── argocd-app.yaml       # ArgoCD Application & Project
│
├── dashboard/
│   └── mttr.json                 # MTTR Dashboard data
│
├── .github/
│   └── workflows/
│       └── ci-cd.yml             # GitHub Actions Workflow
│           ├─ Job 1: Security Scan (Trivy)
│           ├─ Job 2: Run Tests (Jest)
│           ├─ Job 3: AI Auto-Fix (on test failure)
│           ├─ Job 4: Build Docker Image
│           ├─ Job 5: Update Kubernetes Manifest
│           ├─ Job 6: Watch ArgoCD Sync
│           └─ Job 7: Finalize Deployment
│
└── README.md                     # This file
- Node.js 18.x or later
- npm 8.x or later
- Docker (for building images)
- kubectl (for Kubernetes operations)
- Kubernetes Cluster (1.24+)
- Options: EKS, GKE, AKS, or local (Minikube/Kind)
- Argo Rollouts installed in cluster
kubectl create namespace argo-rollouts
kubectl apply -n argo-rollouts -f https://github.com/argoproj/argo-rollouts/releases/latest/download/install.yaml
- ArgoCD installed in cluster
kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
- Prometheus (optional, for metrics-based canary validation)
- Repository with GitHub Actions enabled
- Secrets configured in repository settings:
  - OPENAI_API_KEY - For AI-assisted fixes
  - GITHUB_TOKEN - For creating PRs (default provided)
  - DOCKER_REGISTRY_TOKEN - For GHCR push (if not using GitHub token)
- GitHub Container Registry (GHCR) OR Docker Hub
- Credentials configured in GitHub Actions
git clone https://github.com/your-org/your-repo.git
cd agentic-ai-demo

git checkout demo
# Or create if doesn't exist:
git checkout -b demo

cd app
npm install
npm install --save-dev # dev dependencies for testing
# Verify installation
npm test

cp app/.env.example app/.env
# Edit .env with your values
nano app/.env

cd app
docker build -t ghcr.io/your-org/ai-showcase-app:local .
docker run -p 3000:3000 ghcr.io/your-org/ai-showcase-app:local
# Test: curl http://localhost:3000/status

# Go to: Settings → Secrets and variables → Actions
# Add:
# - OPENAI_API_KEY=sk-...
# - (GITHUB_TOKEN is provided by default)

Replace these in all manifests:
- your-org → Your GitHub organization
- your-repo → Your repository name
- your-domain.com → Your domain
# Batch replace (macOS/BSD sed; with GNU sed on Linux, use -i without the empty ''):
find manifests -name "*.yaml" -exec sed -i '' 's|your-org|myorg|g' {} \;
find manifests -name "*.yaml" -exec sed -i '' 's|your-repo|myrepo|g' {} \;

# Create namespace
kubectl create namespace ai-showcase
# Create registry secret for GHCR
kubectl create secret docker-registry ghcr-secret \
--docker-server=ghcr.io \
--docker-username=<github-username> \
--docker-password=<github-token> \
--docker-email=<email> \
-n ai-showcase
# Create app secrets
kubectl apply -f manifests/ai-showcase/secret.yaml

# Create ArgoCD app
kubectl apply -f manifests/ai-showcase/argocd-app.yaml
# Verify
kubectl get application -n argocd

git add .
git commit -m "feat: Add new feature"
git push origin demo
# GitHub Actions automatically triggers
# Check progress: GitHub → Actions tab

# In GitHub UI:
# Actions → CI-CD Workflow → Run workflow → Select 'demo' branch

In GitHub:
- Go to the Actions tab
- Click on latest workflow run
- View real-time logs:
- Security Scan
- Test Results
- Docker Build
- Manifest Updates
In Kubernetes:
# Watch manifest updates
kubectl get rollout -n ai-showcase -w
# View pod status
kubectl get pods -n ai-showcase -w
# Check canary progress
kubectl get rollout ai-showcase-app -n ai-showcase -o yaml | grep -A 10 'status'
# View events
kubectl get events -n ai-showcase --sort-by='.lastTimestamp'

In ArgoCD UI:
# Port-forward to ArgoCD
kubectl port-forward -n argocd svc/argocd-server 8080:443
# Open: https://localhost:8080
# Username: admin
# Password: $(kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d)
# Watch application sync

Access dashboard/mttr.json to see deployment metrics:
https://devopssessionsjvr.github.io/agentic-ai-demo/
# Helm chart for monitoring setup (optional):
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install prometheus prometheus-community/kube-prometheus-stack -n monitoring --create-namespace
# View metrics
kubectl port-forward -n monitoring svc/prometheus-operated 9090:9090
# Open: http://localhost:9090

Cause: OpenAI API key not configured
Solution:
# Verify secret exists
kubectl get secret -n ai-showcase ai-showcase-secrets
# Check GitHub secret
# Settings → Secrets → OPENAI_API_KEY

Cause: ArgoCD application not found or repo not accessible
Solution:
# Check ArgoCD app
kubectl get application -n argocd
# Check ArgoCD logs
kubectl logs -n argocd statefulset/argocd-application-controller
# Manually sync
argocd app sync ai-showcase --grpc-web

Cause: Metrics validation failing
Solution:
# Check Argo Rollout status
kubectl describe rollout ai-showcase-app -n ai-showcase
# View analysis results
kubectl get analysisrun -n ai-showcase

For questions or issues:
- Check Troubleshooting section
- Review logs: kubectl logs -n ai-showcase -l app=ai-showcase
- Check GitHub Actions workflow runs
- Open GitHub issue with logs
Built with ❤️ for DevOps Excellence
Version: 1.0.0 | Status: Production Ready ✅