feat(deploy): docker + helm chart + CI/CD for staging #35
Merged
Installed via: npm install --workspace=apps/api @fastify/static
Used by the production runtime to serve the built apps/web/dist as a fallthrough for non-/api/* routes: one image, one process, per architecture.md's "single Docker image bundles API + static web" claim.
Generated by: asdf set helm 4.1.0
Helm is used by the deploy-staging.yml / deploy-production.yml workflows and for local `helm lint` / `helm template` validation against the chart under deploy/charts/codeforphilly.
Adds the boot-path surfaces the deploy plan needs in production:
- New plugin apps/api/src/plugins/static-web.ts mounts the built SPA at CFP_WEB_DIST_PATH and installs a notFoundHandler that returns the JSON envelope for unknown /api/* paths and serves index.html with no-cache for everything else (SPA fallback for React Router v7 routes). When CFP_WEB_DIST_PATH is unset (dev / tests) the plugin still installs the JSON-envelope 404 handler so the API contract is consistent.
- New env var CFP_WEB_DIST_PATH (optional): set in the production image to /app/apps/web/dist; unset in dev, where Vite owns port 5173.
- New route GET /api/health/ready: readiness probe for k8s. Returns 200 only after the store + FTS decorators are present (which happens during plugin registration, before fastify.listen()). Returns 503 otherwise so ingress never routes to a pod whose in-memory state hasn't loaded.
- Tests in apps/api/tests/deploy.test.ts cover the readiness payload, the SPA fallback / no-cache header, the /api/* JSON-404 envelope with and without the SPA bundled, and the boot-time failure when CFP_WEB_DIST_PATH points at a missing directory.
Per specs/architecture.md's "single Docker image bundles API + static web" claim and the deploy plan's readiness-probe + SPA-fallthrough requirements.
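The not-found and readiness decisions described above can be modelled as two pure functions. This is a minimal sketch, not the code in static-web.ts: the names `resolveNotFound` and `readinessStatus`, the envelope shape, and the exact `cache-control` value are assumptions made for illustration, and it assumes that every unknown path gets the JSON 404 when no SPA is bundled.

```typescript
// Sketch only. All names and the envelope shape are hypothetical.

type NotFoundAction =
  | { kind: "json-404"; status: number; body: { error: string; path: string } }
  | { kind: "spa-index"; status: number; headers: Record<string, string> };

/** Decide how a request that matched no route should be answered. */
function resolveNotFound(urlPath: string, spaMounted: boolean): NotFoundAction {
  // "/apiary" must not count as an API path, so match "/api" exactly or "/api/".
  const isApi = urlPath === "/api" || urlPath.startsWith("/api/");

  // Unknown API paths always get the JSON envelope. Assumption: when no SPA is
  // bundled (CFP_WEB_DIST_PATH unset), every unknown path gets it too.
  if (isApi || !spaMounted) {
    return { kind: "json-404", status: 404, body: { error: "not_found", path: urlPath } };
  }

  // Everything else falls through to index.html so the client router can take over.
  return { kind: "spa-index", status: 200, headers: { "cache-control": "no-cache" } };
}

interface ReadinessDeps {
  store?: unknown; // stands in for the store decorator
  fts?: unknown;   // stands in for the FTS decorator
}

/** 200 only when both decorators are present, 503 otherwise. */
function readinessStatus(deps: ReadinessDeps): { status: number; body: { ready: boolean } } {
  const ready = deps.store !== undefined && deps.fts !== undefined;
  return { status: ready ? 200 : 503, body: { ready } };
}
```

Keeping the decision separate from the Fastify handler makes the `/api/*` versus SPA split testable without booting a server.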
- Dockerfile: three stages (deps / build / runtime) on node:22.22-alpine.
Final image is non-root (uid 1000), bundles git + ca-certificates +
tini + openssh-client, ships apps/api/dist plus apps/web/dist for the
single-image SPA-co-served deploy.
- .dockerignore keeps secrets (.env, private-storage/, codeforphilly-data/)
and dev-only artifacts (node_modules, dist, tests, plans, specs) out of
the build context.
- deploy/docker/entrypoint.sh handles the working-tree-on-startup pattern
from specs/architecture.md: clone CFP_DATA_REMOTE on first boot, fetch +
reset --hard on subsequent boots, then exec node. Uses GIT_SSH_COMMAND
rendered by Helm when a deploy key Secret is mounted.
Build:
docker build -t cfp:dev .
Smoke test:
docker run --rm -p 3001:3001 \
-e CFP_DATA_REMOTE=https://github.com/CodeForPhilly/codeforphilly-data-snapshot.git \
-e STORAGE_BACKEND=filesystem \
-e CFP_PRIVATE_STORAGE_PATH=/app/private-storage \
-e CFP_JWT_SIGNING_KEY="$(openssl rand -base64 48)" cfp:dev
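The working-tree-on-startup pattern that deploy/docker/entrypoint.sh follows (clone on first boot, fetch + reset --hard afterwards) can be sketched as a function that plans the git commands. The real entrypoint is a shell script; TypeScript is used here only for illustration. The function name, the `hasWorkingTree` flag, and the `main` branch default are assumptions, since the commit message does not say which ref is reset to.

```typescript
// Sketch only. The function name and the branch default are hypothetical.

type GitCommand = readonly string[];

interface SyncInput {
  remote: string;          // value of CFP_DATA_REMOTE
  repoPath: string;        // value of CFP_DATA_REPO_PATH
  hasWorkingTree: boolean; // true when repoPath already holds a clone
  branch?: string;         // assumption: the source does not name the branch
}

/** Return the git invocations to run before exec'ing node. */
function planDataRepoSync({ remote, repoPath, hasWorkingTree, branch = "main" }: SyncInput): GitCommand[] {
  if (!hasWorkingTree) {
    // First boot: nothing on the volume yet, so clone from the remote.
    return [["git", "clone", "--branch", branch, remote, repoPath]];
  }
  // Later boots: discard local state and match the remote branch exactly.
  return [
    ["git", "-C", repoPath, "fetch", "origin", branch],
    ["git", "-C", repoPath, "reset", "--hard", `origin/${branch}`],
  ];
}
```

Because the later-boot path resets hard to the remote, whatever is on the volume is treated as a cache rather than as authoritative state.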
Minimal chart at deploy/charts/codeforphilly/ following the layout from the plan: Deployment / Service / Ingress / PVC (data) / PVC (private, staging only) / ConfigMap / ServiceAccount.
Architectural constraints baked in:
- replicas: 1 + strategy.type: Recreate, both hard requirements per specs/architecture.md (the in-process write mutex serializes mutations; concurrent old/new pods would corrupt the gitsheets working tree).
- Liveness probe hits /api/health every 10s; readiness probe hits /api/health/ready every 5s, so ingress doesn't route traffic until both stores have loaded.
- Data-repo PVC mounted at CFP_DATA_REPO_PATH so the working tree survives pod restarts; the entrypoint refreshes from CFP_DATA_REMOTE on each boot anyway (PVC is an optimization, not the source of truth).
- Secrets are never templated by the chart: values reference a caller-provided Secret (default name codeforphilly-secrets) via envFrom, and a separate Secret for the SSH deploy key.
values.staging.yaml: filesystem private store + PVC, points at the public scrubbed-snapshot data remote so staging never serves real PII until the cutover-prep plan wires it up.
values.production.yaml: S3 private store, real data remote with SSH deploy-key auth, larger resource budget, NODE_OPTIONS heap tuning.
`helm lint` clean against all three values files.
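The two hard constraints above (one replica, Recreate strategy) are easy to regress in a values override. A guard along these lines could check a rendered Deployment spec. It is a sketch under assumptions: the `DeploymentSpec` type is a trimmed stand-in for the Kubernetes object, and `checkSingleWriterInvariants` is a hypothetical helper, not part of the chart or the PR.

```typescript
// Sketch only. The type and function name are hypothetical.

interface DeploymentSpec {
  replicas: number;
  strategy: { type: string };
}

/** Return a list of violations; an empty list means the spec is safe to roll out. */
function checkSingleWriterInvariants(spec: DeploymentSpec): string[] {
  const problems: string[] = [];
  // The write mutex is in-process, so a second pod would bypass it.
  if (spec.replicas !== 1) {
    problems.push(`replicas must be 1, got ${spec.replicas}`);
  }
  // A rolling update briefly runs the old and new pods side by side.
  if (spec.strategy.type !== "Recreate") {
    problems.push(`strategy.type must be Recreate, got ${spec.strategy.type}`);
  }
  return problems;
}
```

Returning every violation, rather than throwing on the first, lets a CI step report all problems in one run.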
- deploy-staging.yml: on push to main, builds the image (tagged sha-<short> + staging-latest), pushes to GHCR, and runs helm upgrade --install against namespace codeforphilly-staging. Gated by GitHub Environment "staging": first run requires manual approval; secrets (KUBECONFIG_STAGING) are scoped per-environment.
- deploy-production.yml: on push of tags matching v*.*.*, same build + helm upgrade against namespace codeforphilly. Gated by Environment "production". Also exposes workflow_dispatch with a tag input for promoting an already-built image.
Both jobs use --atomic --wait --timeout 5m so a failed rollout auto-reverts. A post-deploy smoke check hits /api/health on the public ingress to catch ingress / cert misconfiguration before declaring the deploy successful.
Action versions checked against upstream READMEs:
- actions/checkout@v6 (matches existing ci.yml)
- docker/setup-buildx-action@v3
- docker/login-action@v3
- docker/build-push-action@v6
- azure/setup-kubectl@v4
- azure/setup-helm@v4
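The tagging and rollout steps described above can be sketched as three small helpers. The workflows themselves are YAML; this only illustrates the logic. Several details are assumptions: the 7-character short SHA, the `image.tag` values key, the release name and image path used in the usage line, and the regex, which approximates the `v*.*.*` filter on the assumption that `*` matches any run of characters other than `/`.

```typescript
// Sketch only. Helper names, the short-SHA length, and the image.tag key are hypothetical.

/** Approximation of the v*.*.* tag filter ("*" = any characters except "/"). */
const RELEASE_TAG = /^v[^/]*\.[^/]*\.[^/]*$/;

/** Image references pushed by the staging workflow for one commit. */
function stagingImageTags(image: string, commitSha: string): string[] {
  const short = commitSha.slice(0, 7); // assumption: 7-character abbreviation
  return [`${image}:sha-${short}`, `${image}:staging-latest`];
}

interface HelmUpgradeInput {
  release: string;
  chart: string;
  namespace: string;
  valuesFile: string;
  imageTag: string;
}

/** Arguments for `helm`, using the flags named in the workflow description. */
function helmUpgradeArgs(input: HelmUpgradeInput): string[] {
  return [
    "upgrade", "--install", input.release, input.chart,
    "--namespace", input.namespace,
    "--values", input.valuesFile,
    "--set", `image.tag=${input.imageTag}`, // assumption: the chart reads image.tag
    "--atomic", "--wait", "--timeout", "5m",
  ];
}
```

For example, `helmUpgradeArgs({ release: "codeforphilly", chart: "deploy/charts/codeforphilly", namespace: "codeforphilly-staging", valuesFile: "deploy/charts/codeforphilly/values.staging.yaml", imageTag: "sha-0123456" })` yields the argument list for a staging rollout.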
Three new docs under docs/operations/, satisfying the deploy plan's "Operational docs" validation criterion:
- deploy.md: implementation companion to specs/architecture.md's Deploy section. Image anatomy, boot sequence, Helm install/upgrade commands, bucket-provisioning checklist (R2 / B2 / S3 / MinIO options, with versioning + lifecycle rules + IAM scoping), environment-variable reference table.
- secrets.md: inventory of every runtime secret with generation + rotation procedure: CFP_JWT_SIGNING_KEY, GITHUB_OAUTH_CLIENT_SECRET, S3_* keys, SAML key+cert, the data-repo SSH deploy key. Includes the bootstrap-a-new-environment recipe using sealed-secrets.
- runbook.md: "API won't boot" playbook with log-grep table mapping common log lines to causes and fixes, plus rollback procedure.
Summary
Implements the deploy plan (plans/deploy.md) so the team can stand up a staging environment and follow the same template into production.
- Dockerfile plus `deploy/docker/entrypoint.sh`, which clones or refreshes `CFP_DATA_REMOTE` then exec's `node`. Final image is non-root alpine with `git`, `ca-certificates`, `tini`, `openssh-client`.
- `deploy/charts/codeforphilly/` with `values.yaml` / `values.staging.yaml` / `values.production.yaml`. One replica, `Recreate` strategy, PVC for the data working tree, optional PVC for the filesystem private-store (staging), readiness probe at `/api/health/ready`.
- `deploy-staging.yml` (push to main → build + helm upgrade) and `deploy-production.yml` (tag push → build + helm upgrade), both gated by per-environment GitHub Environments + `KUBECONFIG_*` secrets.
- `apps/api/src/plugins/static-web.ts` mounts the built SPA at `CFP_WEB_DIST_PATH` with SPA fallback + JSON 404 envelope for `/api/*`; new `GET /api/health/ready` returns 503 until stores have loaded.
- `docs/operations/`: `deploy.md` (image anatomy, boot sequence, bucket provisioning), `secrets.md` (every runtime secret with generation + rotation), `runbook.md` ("API won't boot" playbook).
Test plan
- `docker build .` produces an image: Dockerfile + .dockerignore in place; not runnable in this CI, but the build steps mirror what the action does (verified by inspection; daemon not available in this env).
- […] `/api/*` and the static SPA: tested in `apps/api/tests/deploy.test.ts` (static-web plugin SPA fallback + `/api/*` JSON 404 envelope).
- `helm install` to a staging namespace boots the deployment cleanly: chart lints clean for all three values files; first stand-up requires cluster access not held by this agent (see Follow-ups in plan closeout).
- […] `curl`): same, requires cluster access.
- […] `values.staging.yaml`.
- […] `apps/api/tests/deploy.test.ts`.
- […] `ci.yml` is untouched.
- […] `docs/operations/secrets.md`, requires cluster.
- […] `docs/operations/`.
Verification before push
- `npm run type-check`: clean across api / web / shared
- `npm run lint`: clean
- `npm test`: clean across api / web / shared
- `npm run build`: clean (apps/web/dist + apps/api/dist)
- `helm lint deploy/charts/codeforphilly`: clean for default, staging, production values
- `helm template ...`: renders valid YAML for both environments