Phase 2 Plan

Captain Dany edited this page Jun 11, 2026 · 1 revision

Phase 2 Plan — Production Readiness

Overview

The first CD pipeline deployment to dev succeeded: the app is running and healthy. This plan covers the remaining work to take the project from that first deployment to production readiness.

Status Legend

  • ✅ Done
  • 🔜 Next (this session)
  • ⏳ Short-term (next few sessions)
  • 🗓️ Medium-term
  • 🔮 Future

🔜 Step 1: DNS Configuration (Blocking Everything)

Goal: Make oscar-crm.cc resolve to the cluster.

| Task | Status | Notes |
|---|---|---|
| Import BIND zone file to Cloudflare | 🔜 | File at `deploy/dns/oscar-crm.cc.zone` |
| Wait for DNS propagation | 🔜 | 5-30 minutes |
| Verify dev.oscar-crm.cc resolves | 🔜 | `dig dev.oscar-crm.cc` should return 159.54.137.54 |
| Re-run CD pipeline smoke test | 🔜 | Should pass once DNS resolves |
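
The resolution check in this step could also be scripted rather than run by hand. The following is a minimal sketch, not part of the existing tooling: `system_resolver` and `resolves_to` are hypothetical helper names, and the expected address is the one given in the table above.

```python
import socket
from typing import Callable, List

EXPECTED_IP = "159.54.137.54"  # address the plan expects dev.oscar-crm.cc to return


def system_resolver(hostname: str) -> List[str]:
    """Return the IPv4 addresses the local resolver reports for hostname."""
    try:
        _name, _aliases, addresses = socket.gethostbyname_ex(hostname)
    except OSError:
        # Covers socket.gaierror / socket.herror: the name does not resolve yet.
        return []
    return addresses


def resolves_to(
    hostname: str,
    expected_ip: str,
    resolver: Callable[[str], List[str]] = system_resolver,
) -> bool:
    """True when expected_ip is among the addresses returned for hostname."""
    return expected_ip in resolver(hostname)
```

Calling `resolves_to("dev.oscar-crm.cc", EXPECTED_IP)` in a loop would cover the propagation wait. Note that this asks the local resolver, which may cache a negative answer for a while. If the record is proxied through Cloudflare, the answer would be a Cloudflare edge address rather than the cluster IP, so the comparison would need adjusting.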

⏳ Step 2: TLS Certificate Issuance

Goal: HTTPS works without browser warnings.

| Task | Status | Notes |
|---|---|---|
| Wait for cert-manager auto-reconciliation | | Automatically retries once DNS resolves |
| Monitor certificate status | | `kubectl get certificate -n oscar-dev -w` |
| Verify HTTPS endpoint | | `curl https://dev.oscar-crm.cc/health` returns 200 |
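
For an unattended check, the certificate status can be read from the JSON that `kubectl get certificate <name> -n oscar-dev -o json` prints. cert-manager marks an issued certificate with a `Ready` condition whose status is the string `"True"`. The sketch below assumes that JSON has already been parsed into a dict; `certificate_ready` is a hypothetical helper name, and the certificate name is left as a placeholder because the plan does not state it.

```python
from typing import Any, Dict


def certificate_ready(cert: Dict[str, Any]) -> bool:
    """True when the Certificate carries a Ready condition with status "True"."""
    conditions = cert.get("status", {}).get("conditions", [])
    return any(
        cond.get("type") == "Ready" and cond.get("status") == "True"
        for cond in conditions
    )


# Illustrative input only: the shape of a Certificate that is still being issued.
pending = {
    "kind": "Certificate",
    "status": {
        "conditions": [
            {"type": "Issuing", "status": "True"},
            {"type": "Ready", "status": "False"},
        ]
    },
}
print(certificate_ready(pending))  # False
```

A Certificate with no `status` block at all (not yet reconciled) is also treated as not ready.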

⏳ Step 3: Smoke Test Passes

Goal: CD pipeline green from end to end.

| Task | Status | Notes |
|---|---|---|
| CD pipeline smoke test passes | | `curl https://dev.oscar-crm.cc/health` from runner |
| CD pipeline completes without warnings | | All jobs green |
| Enable deployment protection on dev | | Add required reviewers? |
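
Because DNS and the certificate may take several minutes to settle, a smoke test that retries is less likely to fail spuriously than a single `curl`. The following is one possible sketch, not the pipeline's actual smoke test: the function names, attempt count, and delay are assumptions, and the expected body `{"status":"healthy"}` is taken from the Immediate Next Actions list.

```python
import json
import time
import urllib.error
import urllib.request
from typing import Callable, Tuple

HEALTH_URL = "https://dev.oscar-crm.cc/health"


def fetch_health(url: str, timeout: float = 5.0) -> Tuple[int, str]:
    """Return (status_code, body); (0, "") when the request fails outright."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        return err.code, ""
    except OSError:
        # URLError, TLS failures and timeouts are all OSError subclasses.
        return 0, ""


def is_healthy(status: int, body: str) -> bool:
    """True for HTTP 200 with a JSON object whose "status" is "healthy"."""
    if status != 200:
        return False
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("status") == "healthy"


def wait_for_healthy(
    fetch: Callable[[str], Tuple[int, str]] = fetch_health,
    url: str = HEALTH_URL,
    attempts: int = 10,          # assumed value, tune to the pipeline
    delay_seconds: float = 15.0,  # assumed value, tune to the pipeline
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the health endpoint until it is healthy or attempts run out."""
    for attempt in range(attempts):
        status, body = fetch(url)
        if is_healthy(status, body):
            return True
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False
```

In a CI job, the script would exit non-zero when `wait_for_healthy()` returns `False`, which fails the smoke-test step.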

⏳ Step 4: Postgres Automation

Goal: Database lifecycle managed by Helm or operator.

| Task | Status | Notes |
|---|---|---|
| Decide: Helm-managed (simple) vs CloudNativePG operator (production) | | |
| If Helm: enable `postgresql.enabled=true` with persistent volumes | | |
| If CNPG: deploy operator, create Cluster resource | 🗓️ | |
| Configure automated backups | 🗓️ | |
| Create app database user (not postgres superuser) | | |
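
The last task, a dedicated non-superuser role for the app, comes down to a handful of SQL statements. The sketch below generates them; it is illustrative only. The database name `oscar` and role name `oscar_app` in the usage note are hypothetical, the privilege set assumes the app only reads and writes rows (a role that runs migrations would need more), and the password is left as the psql variable `:'app_password'` so that it never appears in the script.

```python
import re
from typing import List

# Conservative whitelist so names can be interpolated into SQL safely.
_IDENTIFIER = re.compile(r"[a-z_][a-z0-9_]*")


def app_role_statements(database: str, role: str, schema: str = "public") -> List[str]:
    """Build SQL that creates a login role limited to row-level access."""
    for name in (database, role, schema):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    dml = "SELECT, INSERT, UPDATE, DELETE"
    return [
        # :'app_password' is expanded by psql into a quoted literal.
        f"CREATE ROLE {role} LOGIN PASSWORD :'app_password';",
        f"GRANT CONNECT ON DATABASE {database} TO {role};",
        f"GRANT USAGE ON SCHEMA {schema} TO {role};",
        f"GRANT {dml} ON ALL TABLES IN SCHEMA {schema} TO {role};",
        # Needed for INSERTs into tables with serial/identity columns.
        f"GRANT USAGE, SELECT ON ALL SEQUENCES IN SCHEMA {schema} TO {role};",
        # Applies to tables created later by the role running this statement.
        f"ALTER DEFAULT PRIVILEGES IN SCHEMA {schema} GRANT {dml} ON TABLES TO {role};",
    ]
```

The output of `"\n".join(app_role_statements("oscar", "oscar_app"))` could be saved to a file and run with `psql -v app_password=... -f <file>` as the superuser, after which the app connects as the new role.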

⏳ Step 5: Staging Environment

Goal: Manual gate → deploy to oscar-staging namespace.

| Task | Status | Notes |
|---|---|---|
| Create oscar-staging namespace | | |
| Create secrets (ghcr-pull, oscar-app-secret) | | |
| Deploy Postgres | | |
| Approve staging gate in CD pipeline | | Requires manual approval on GitHub |
| Verify staging smoke test | | |
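
To make the secrets task repeatable across namespaces, the `ghcr-pull` image pull secret can be generated rather than typed by hand. The sketch below builds the same `kubernetes.io/dockerconfigjson` Secret that `kubectl create secret docker-registry` produces. The function name is hypothetical, and the username and token are placeholders to be supplied from wherever the credentials are stored.

```python
import base64
import json
from typing import Any, Dict


def ghcr_pull_secret(
    namespace: str, username: str, token: str, name: str = "ghcr-pull"
) -> Dict[str, Any]:
    """Build a dockerconfigjson Secret manifest for pulling from ghcr.io."""
    auth = base64.b64encode(f"{username}:{token}".encode()).decode()
    docker_config = {
        "auths": {
            "ghcr.io": {"username": username, "password": token, "auth": auth}
        }
    }
    encoded = base64.b64encode(json.dumps(docker_config).encode()).decode()
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "kubernetes.io/dockerconfigjson",
        "data": {".dockerconfigjson": encoded},
    }
```

Since kubectl accepts JSON manifests, the result of `json.dumps(ghcr_pull_secret("oscar-staging", user, token))` can be piped to `kubectl apply -f -`. The same function would serve the production namespace later by changing the `namespace` argument.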

🗓️ Step 6: Production Environment

Goal: Production deployment with manual approval.

| Task | Status | Notes |
|---|---|---|
| Create oscar-production namespace | 🗓️ | |
| Production Postgres (dedicated DBaaS recommended) | 🗓️ | OCI Base Database Service or similar |
| Create production secrets | 🗓️ | Different app secret, DB creds |
| Run CD pipeline with production gate | 🗓️ | |
| Configure monitoring & alerts | 🗓️ | |

🗓️ Step 7: CI Improvements

| Task | Status | Notes |
|---|---|---|
| Integration tests with test DB | 🗓️ | Docker Compose in CI |
| E2E smoke tests | 🗓️ | Playwright/Cypress |
| Dependency updates automation (Renovate) | 🗓️ | |
| SARIF upload for CodeQL results | 🗓️ | Already partially configured |

🗓️ Step 8: Observability

| Task | Status | Notes |
|---|---|---|
| Install Prometheus + Grafana | 🗓️ | |
| Configure structured logging | 🗓️ | |
| Set up alerting rules | 🗓️ | |
| Dashboard for DORA metrics | 🔮 | |

🗓️ Step 9: Security Hardening

| Task | Status | Notes |
|---|---|---|
| Switch to OIDC for ghcr.io auth | 🗓️ | |
| Pod Security Standards (restricted) | 🗓️ | |
| Network policies | 🗓️ | |
| Secrets rotation workflow | 🗓️ | |

🔮 Step 10: Infrastructure Improvements

| Task | Status | Notes |
|---|---|---|
| Install OCI CCM (LoadBalancer support) | 🔮 | Enables proper LB IP instead of hostNetwork |
| Multi-node cluster | 🔮 | For HA |
| Cluster autoscaling | 🔮 | |
| Terraform/Pulumi for infra-as-code | 🔮 | |
| Disaster recovery plan | 🔮 | |

Immediate Next Actions (Today)

  1. Import DNS zone file to Cloudflare
  2. Wait for propagation (check with ping or dig)
  3. Verify curl https://dev.oscar-crm.cc/health returns {"status":"healthy"}
  4. Re-run CD pipeline from GitHub Actions to pass smoke test

Risk Items

| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| cert-manager HTTP-01 fails after DNS config | Medium | Medium | Fallback to DNS-01 with Cloudflare API token |
| Postgres crashes on OKE VM (no persistence) | High | High | Add PVC to postgres pod, or switch to DBaaS |
| Single node = single point of failure | High | High | Accept for MVP; multi-node in Phase 3 |
| Let's Encrypt rate limits | Low | Medium | Use staging issuer for testing |