Add DEPLOY.md — document deploy process and infrastructure#24
Conversation
WalkthroughAdds Changes
Sequence Diagram(s)sequenceDiagram
participant Dev as Developer
participant GH as GitHub Actions (ecr.yml)
participant Sentry as Sentry
participant ECR as Amazon ECR
participant CP as AWS CodePipeline
participant CD as AWS CodeDeploy (ECS)
participant ECS as ECS Fargate
participant ALB as ALB
participant Client as Client
Dev->>GH: manual dispatch "push to ecr: production"
GH->>Sentry: upload source maps
GH->>ECR: push multi-arch Docker image (tags: latest, branch, sha)
GH->>Dev: post-build verification (aws ecr describe-images)
Dev->>CP: aws codepipeline start-pipeline-execution (thumbnail-api-pipeline)
CP->>ECR: fetch image/artifact metadata
CP->>CD: trigger blue/green deploy (ECSAllAtOnce)
CD->>ECS: deploy new task set (green)
ECS->>ALB: register green targets (90s slow start)
ALB->>ECS: health checks
Client->>ALB: curl /health
ALB-->>Client: 200 OK or 404/502/400 (failure indicators)
Estimated code review effort🎯 1 (Trivial) | ⏱️ ~2 minutes 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Documents the manual two-phase deploy (ECR build via GH Actions, then CodePipeline), blue/green CodeDeploy setup, 90s ALB slow-start on both target groups, and the preferred combined deploy script for deploying alongside the DPLA API in one maintenance window. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
62bdf71 to
73c745a
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@DEPLOY.md`:
- Around line 94-95: Update the DEPLOY.md health-check text for the /thumb/*
endpoint: replace the phrase "Expect HTTP 200 or 302 (redirect to upstream image
URL)" with "Expect HTTP 200" and mention that failures may return 404, 502, or
400 depending on path/input; ensure the doc reflects that upstream redirects are
translated to 502 rather than returning 302 for /thumb/*.
- Around line 79-83: The doc claims specific deployment settings (ECSAllAtOnce,
5-minute blue task termination, 90-second slow start for
thumbnail-api-tg-blue/thumbnail-api-tg-green, and auto-rollback) but there is no
source-of-truth in code; update either by (a) adding a link to the authoritative
IaC or AWS resource (CloudFormation/Terraform stack or console permalink) with
last-verified date that shows these settings (referencing appspec_template.yaml
and the ALB target groups thumbnail-api-tg-blue/thumbnail-api-tg-green and
deployment config ECSAllAtOnce), or (b) change the text to state these were
"verified in AWS console on [date]" and note that appspec_template.yaml does not
enforce them; also apply the same change for the similar statements in the later
block (lines 110–112).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
- Fix health check: service never returns 302 (fetch follows redirects internally); document actual failure codes (404, 502, 400) - Add verification dates for blue/green settings (ECSAllAtOnce, 5-min blue termination, 90s slow start, auto-rollback) — these are managed in AWS console, not enforced by appspec_template.yaml Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
DEPLOY.md (2)
89-95: Use/healthas the primary readiness check; keep/thumb/*as functional smoke test.The current check relies on a specific content ID that can legitimately change behavior over time, which can cause noisy false alarms after deploy.
Suggested doc tweak
+curl -sf "https://thumb.dp.la/health" | jq . +# Expect HTTP 200 + curl -sf -o /dev/null -w "HTTP %{http_code}\n" \ "https://thumb.dp.la/thumb/f293d15b0515ac8a5478cbd9c02af79c" -Expect HTTP 200. Failures return 404 (item not found or upstream 404/410), 502 (other upstream error), or 400 (malformed ID). The service follows upstream redirects internally — 302 is never returned to clients. +Then run the `/thumb/*` call as a functional smoke test. +Expect HTTP 200 for a healthy hit; failures may return 404/502/400 depending on item/upstream/input. The service follows upstream redirects internally — 302 is never returned to clients.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@DEPLOY.md` around lines 89 - 95, Update the deployment docs to use /health as the primary readiness check and keep the existing /thumb/* example as a secondary functional smoke test: replace the curl readiness example that targets the specific content ID URI ("https://thumb.dp.la/thumb/f293d15b0515ac8a5478cbd9c02af79c") with a curl against /health, document that /health should return 200 for readiness, and retain the current /thumb/* curl (and its expected HTTP codes/behaviour) as a separate smoke-test paragraph so operators use /health for automated readiness and /thumb/* only for manual functional verification.
11-15: Document where~/bin/deploy-api-servicescomes from.This path is machine-local and may not exist for other operators. Add a source/location (repo path or setup doc) to keep this runbook reproducible.
Suggested doc tweak
~/bin/deploy-api-services # deploy both ~/bin/deploy-api-services thumb # thumbnail-api only ~/bin/deploy-api-services api # api only + +If this script is not installed locally, follow: <link to script/setup instructions>.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@DEPLOY.md` around lines 11 - 15, The README currently references the machine-local command `~/bin/deploy-api-services` without saying where that script comes from; update DEPLOY.md to state the source/location of that script (e.g., repository path or setup doc that installs it), include the repo file or package name (or the setup script that places `deploy-api-services` into `~/bin`), and add brief install or symlink instructions so operators can reproduce `~/bin/deploy-api-services` on new machines (mention the exact filename `deploy-api-services` and any required permissions or environment variables).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@DEPLOY.md`:
- Around line 89-95: Update the deployment docs to use /health as the primary
readiness check and keep the existing /thumb/* example as a secondary functional
smoke test: replace the curl readiness example that targets the specific content
ID URI ("https://thumb.dp.la/thumb/f293d15b0515ac8a5478cbd9c02af79c") with a
curl against /health, document that /health should return 200 for readiness, and
retain the current /thumb/* curl (and its expected HTTP codes/behaviour) as a
separate smoke-test paragraph so operators use /health for automated readiness
and /thumb/* only for manual functional verification.
- Around line 11-15: The README currently references the machine-local command
`~/bin/deploy-api-services` without saying where that script comes from; update
DEPLOY.md to state the source/location of that script (e.g., repository path or
setup doc that installs it), include the repo file or package name (or the setup
script that places `deploy-api-services` into `~/bin`), and add brief install or
symlink instructions so operators can reproduce `~/bin/deploy-api-services` on
new machines (mention the exact filename `deploy-api-services` and any required
permissions or environment variables).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
DEPLOY.md (1)
89-97: Add a deterministic/healthcheck before/thumb/*smoke test.Line 92’s
/thumb/*check can fail for upstream/content reasons even when deployment is healthy. Recommend documenting/healthas the primary readiness check, then/thumb/*as functional smoke coverage.Suggested doc update
## Post-deploy health check ```bash +curl -sf -o /dev/null -w "HTTP %{http_code}\n" "https://thumb.dp.la/health" +``` + +Expect HTTP 200. + +Functional smoke check: + +```bash curl -sf -o /dev/null -w "HTTP %{http_code}\n" \ "https://thumb.dp.la/thumb/f293d15b0515ac8a5478cbd9c02af79c"-Expect HTTP 200. Failures return 404 (item not found or upstream 404/410), 502 (other upstream error), or 400 (malformed ID). The service follows upstream redirects internally — 302 is never returned to clients.
+For/thumb/*, expect HTTP 200 on success. Failures return 404 (item not found or upstream 404/410), 502 (other upstream error), or 400 (malformed ID). The service follows upstream redirects internally — 302 is never returned to clients.</details> <details> <summary>🤖 Prompt for AI Agents</summary>Verify each finding against the current code and only fix it if needed.
In
@DEPLOY.mdaround lines 89 - 97, Update DEPLOY.md to add a deterministic
readiness check: document a primary curl against "https://thumb.dp.la/health"
that should return HTTP 200, and keep the existing "/thumb/*" curl as a
secondary functional smoke test; ensure the doc shows the /health check first
with its expected 200 response and then the /thumb/f293... example with the
existing explanation of possible 404/502/400 outcomes and the note about
upstream redirects.</details> </blockquote></details> </blockquote></details> <details> <summary>🤖 Prompt for all review comments with AI agents</summary>Verify each finding against the current code and only fix it if needed.
Inline comments:
In@DEPLOY.md:
- Around line 39-48: The doc currently verifies the image using a mutable tag
(imageTag=latest) which risks checking the wrong image; update the aws ecr
describe-images invocation for repository "thumbnail-api" to use the immutable
commit SHA tag (replace imageTag=latest with imageTag=<COMMIT_SHA>) and expand
the --query to return pushedAt, imageDigest and imageTags (for example change
--query 'imageDetails[0].imagePushedAt' to --query
'imageDetails[0].{pushedAt:imagePushedAt,digest:imageDigest,tags:imageTags}') so
the verification checks the exact image pushed by the workflow.
Nitpick comments:
In@DEPLOY.md:
- Around line 89-97: Update DEPLOY.md to add a deterministic readiness check:
document a primary curl against "https://thumb.dp.la/health" that should return
HTTP 200, and keep the existing "/thumb/*" curl as a secondary functional smoke
test; ensure the doc shows the /health check first with its expected 200
response and then the /thumb/f293... example with the existing explanation of
possible 404/502/400 outcomes and the note about upstream redirects.</details> <details> <summary>🪄 Autofix (Beta)</summary> Fix all unresolved CodeRabbit comments on this PR: - [ ] <!-- {"checkboxId": "4b0d0e0a-96d7-4f10-b296-3a18ea78f0b9"} --> Push a commit to this branch (recommended) - [ ] <!-- {"checkboxId": "ff5b1114-7d8c-49e6-8ac1-43f82af23a33"} --> Create a new PR with the fixes </details> --- <details> <summary>ℹ️ Review info</summary> <details> <summary>⚙️ Run configuration</summary> **Configuration used**: Organization UI **Review profile**: CHILL **Plan**: Pro **Run ID**: `1d76e548-97c6-42c9-a85a-efc537e6f428` </details> <details> <summary>📥 Commits</summary> Reviewing files that changed from the base of the PR and between b9993ebb32228edea585bce3291e552ecfba2a42 and 33abb6baac68780d9b3006dc2bfba8bc24c0558b. </details> <details> <summary>📒 Files selected for processing (1)</summary> * `DEPLOY.md` </details> </details> <!-- This is an auto-generated comment by CodeRabbit for review status -->
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
♻️ Duplicate comments (1)
DEPLOY.md (1)
39-50:⚠️ Potential issue | 🟠 MajorUse commit SHA as the default verification path (not
latest).The runbook still defaults to a mutable tag, which can validate the wrong artifact if another run updates
latestbefore Phase 2. Make SHA-tag verification the primary command, and keeplatestonly as optional context.Suggested doc update
-Verify the new image after the action completes: +Verify the exact image after the action completes (use the workflow commit SHA tag): ```bash aws ecr describe-images \ --repository-name thumbnail-api \ - --image-ids imageTag=latest \ + --image-ids imageTag=<COMMIT_SHA> \ --region us-east-1 \ - --query 'imageDetails[0].imagePushedAt' \ + --query 'imageDetails[0].{pushedAt:imagePushedAt,digest:imageDigest,tags:imageTags}' \ --output text
-The timestamp should match the current deployment. If you need to confirm the exact image (e.g. if another workflow may have updated
latestconcurrently), use the commit SHA tag from the workflow run output instead ofimageTag=latest.
+Uselatestonly for a quick sanity check, not as deployment proof.</details> <details> <summary>🤖 Prompt for AI Agents</summary>Verify each finding against the current code and only fix it if needed.
In
@DEPLOY.mdaround lines 39 - 50, Update the verification step so the default
uses the immutable commit SHA tag instead of the mutable imageTag=latest: in the
aws ecr describe-images command replace imageTag=latest with
imageTag=<COMMIT_SHA> (sourced from the workflow run) and expand the --query
from 'imageDetails[0].imagePushedAt' to return pushedAt, imageDigest and
imageTags (e.g.
imageDetails[0].{pushedAt:imagePushedAt,digest:imageDigest,tags:imageTags});
keep the imageTag=latest example only as an optional quick sanity check and
update the surrounding text to state that SHA is the primary verification
method.</details> </blockquote></details> </blockquote></details> <details> <summary>🤖 Prompt for all review comments with AI agents</summary>Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In@DEPLOY.md:
- Around line 39-50: Update the verification step so the default uses the
immutable commit SHA tag instead of the mutable imageTag=latest: in the aws ecr
describe-images command replace imageTag=latest with imageTag=<COMMIT_SHA>
(sourced from the workflow run) and expand the --query from
'imageDetails[0].imagePushedAt' to return pushedAt, imageDigest and imageTags
(e.g.
imageDetails[0].{pushedAt:imagePushedAt,digest:imageDigest,tags:imageTags});
keep the imageTag=latest example only as an optional quick sanity check and
update the surrounding text to state that SHA is the primary verification
method.</details> --- <details> <summary>ℹ️ Review info</summary> <details> <summary>⚙️ Run configuration</summary> **Configuration used**: Organization UI **Review profile**: CHILL **Plan**: Pro **Run ID**: `97df460c-554d-47d6-89e8-c0904de1be4c` </details> <details> <summary>📥 Commits</summary> Reviewing files that changed from the base of the PR and between b9993ebb32228edea585bce3291e552ecfba2a42 and 156ebc91e07dd7acd5e5c3c193273755cb88edf2. </details> <details> <summary>📒 Files selected for processing (1)</summary> * `DEPLOY.md` </details> </details> <!-- This is an auto-generated comment by CodeRabbit for review status -->
Summary
thumbnail-api-tg-blueandthumbnail-api-tg-green(prevents cold-start errors on new tasks)~/bin/deploy-api-servicesas the preferred way to deploy both API services together in one maintenance windowTest plan
🤖 Generated with Claude Code
Summary by CodeRabbit