-
Notifications
You must be signed in to change notification settings - Fork 0
18 continuous delivery and deployment
📖 This page is generated from
modules/18-continuous-delivery-and-deployment/README.md. Edit the source, not the wiki; edits here are overwritten on the next sync. Run the hands-on labs from the repo, linked inline.
⬅ Previous: Module 17: Secrets, Config, and Environments
Merged isn't running. This module closes the last gap in the pipeline: getting approved code from
mainto something actually serving traffic, automatically, with a way back when it's wrong.
- Module 10: Reviewing Code You Didn't Write. The PR review gate. Auto-deploy is only safe because a human (or an agent under supervision) signed off on the diff first.
- Module 14: Continuous Integration. You already have a pipeline that lints, builds, and tests on every push. CD is not a new system; it's more stages on that same pipeline, after the checks pass.
- Module 15: Security Scanning. Dependency, secret, and static-analysis gates on the same pushes. These are part of what makes shipping without a human in the loop survivable.
-
Module 16: Containers and Reproducible Environments. The container image is what you ship.
CD takes that image and runs it somewhere. This module assumes you can already build and tag an
image of the
tasks-app. - Module 17: Secrets, Config, and Environments. A running service needs configuration and secrets at runtime, what it needs to run. CD wires those into the deploy step instead of baking them into the image.
If you've done 14–17, you have all the parts. This module is the assembly.
By the end of this module you can:
- State the precise difference between continuous delivery and continuous deployment, and decide which one a given project should use.
- Extend your CI pipeline with build-and-publish stages that turn a merge into a versioned, deployable artifact.
- Wire a deploy step that takes that artifact, injects runtime config/secrets, and brings up the new version, provider-neutrally.
- Add a health check and an automatic rollback so a bad deploy reverts itself instead of staying down.
- Reason about the deploy gate the way this audience already reasons about change windows: what's automated, what's manual, and where the stop button is.
Walk the pipeline you've built so far. A change gets proposed (Module 9), implemented on a branch
(Module 6), reviewed as a PR (Module 10), checked by CI (Module 14), scanned for vulnerabilities
(Module 15). It merges. main is now correct, tested, and clean.
And then nothing happens. The code that's "done" is sitting in a Git history. The thing your users touch is still running last week's version. Somebody (usually you, usually at 6pm) has to SSH in, pull, build, restart, and pray. That manual last mile is where most outages are actually born: inconsistent steps, a forgotten config flag, a half-restarted service, "wait, which version is in prod right now?"
CI answered "is this change good?" CD answers the next question: "now get the good change running, the same way every time." It's the same instinct that made CI worth it, the one that replaces an error-prone manual ritual with an automated, repeatable one, now pointed at the last step.
These two terms get used interchangeably and they are not the same thing. The difference is exactly one decision: who pushes the button to prod.
-
Continuous Delivery: every merge to
mainautomatically produces a deployable artifact (a built, tagged, tested container image, sitting in a registry) and deploys it as far as a staging/pre-prod environment. Production deploy is one click by a human. The pipeline guarantees the artifact is ready to ship at any moment; a person decides when. -
Continuous Deployment: same pipeline, but there's no button. If it passes every gate, it goes all the way to production automatically. Merge is the last human action.
merge to main
│
┌─────────────┴──────────────┐
CONTINUOUS DELIVERY CONTINUOUS DEPLOYMENT
│ │
build + test + scan build + test + scan
│ │
publish artifact publish artifact
│ │
deploy to staging deploy to staging
│ │
[human clicks "ship"] ──► deploy to prod (automatic)
│ │
deploy to prod done
Both are "CD." When someone says "we do CD," ask which one; the operational risk is completely different. Continuous deployment is not the more advanced/better option you graduate to; it's a different risk posture that's appropriate for some systems and reckless for others. A blog, internal dashboard, or stateless web service with good tests is a fine candidate. A billing engine, a database migration, or anything with a regulatory change-control requirement usually is not, and "a human clicks deploy" is a perfectly mature answer there, not a failure to automate.
The honest default for most teams adopting this: start with continuous delivery. Get the artifact and the deploy step fully automated and trustworthy, keep the human on the prod button, and remove that button only once you trust the gates more than you trust the click.
Here's the discipline that makes CD reliable, and it comes straight from Module 16: you deploy a
built image, not a Git ref. "Deploy main" is ambiguous; it means "go to the prod box, pull,
and rebuild," and that rebuild can pull a different base image or dependency version than CI tested.
"Deploy tasks-app:9f3a2c1" is not ambiguous. It's the exact bytes CI built and tested.
So the build-and-publish stage does this once, centrally:
- Build the image from the merged code.
- Tag it with something immutable and traceable: the Git commit SHA is the standard choice
(
tasks-app:9f3a2c1). Optionally also a moving tag like:latestor:stagingfor convenience, but the SHA tag is the one you trust. - Push it to a container registry, the durable home for images the same way a Git remote (Module 8) is the durable home for commits.
Every later deploy (to staging, to prod, a rollback) just says "run this tag." Build once, run the identical artifact everywhere. That single property is what kills "works on my machine" at the deploy layer.
The shape of a deploy is the same everywhere, whatever the target (a cloud platform, a Kubernetes cluster, a single VM, a PaaS):
- Pull the specific image tag onto the target.
- Inject runtime config and secrets (Module 17): environment variables, mounted secret files, a secrets-manager lookup. Never baked into the image; supplied at run time so the same image runs in staging and prod with different config.
- Start the new version alongside or in place of the old one.
- Health-check it before sending real traffic.
- Cut over if healthy; roll back if not.
This module is deliberately provider-agnostic on where, the same way Module 8 stayed neutral on
hosts. The mechanics differ (a kubectl apply, a platform CLI, a docker run, a compose up), but
the five steps don't. The lab does the simplest possible real version: a local container run. The
logic is identical at scale.
A deploy that can't tell whether it worked isn't a deploy, it's a gamble. The single most important thing CD adds over "SSH in and restart" is that the pipeline verifies the new version is alive before trusting it, and reverses itself when it isn't.
A health check is a cheap, honest signal that the new version is actually serving: typically an
endpoint like /health that returns 200 only when the app has started clean. The deploy step
hits it after starting the new version and waits for green before cutting over.
Rollback is the other half. If the health check fails, the deploy stops the broken new version and
brings the previous known-good image tag back up. Because you deploy immutable tags, rollback is
trivial: you still have tasks-app:<previous-sha>, so "go back" is just "run the old tag again."
No rebuild, no git revert race, no scramble. (Reverting the source is still Module 12's job for the
code; rollback here is about the running artifact.) The strategies have names you'll meet:
blue-green (run old and new side by side, flip a switch) and canary (send 5% of traffic to new,
watch, ramp). They're all variations on "keep the old one ready until the new one proves itself."
Reframe for the ops reader: you already know this instinct. It's the deployment equivalent of a maintenance window with a back-out plan, except the back-out plan is automated, tested on every single deploy, and takes seconds instead of a panicked hour. CD doesn't remove the discipline you already have; it encodes it so it runs every time instead of only when someone remembers.
CI existed long before AI, and so did CD. What changed is the rate, and rate is everything for the merged-to-prod gate.
AI writes and ships changes dramatically faster. More PRs open, more merge, and they merge sooner. That's the upside, and it means the volume of code flowing toward production goes up, while the human attention available to babysit each deploy stays flat. The gap between "merged" and "in prod" stops being a quiet formality and becomes the place where that speed either pays off or hurts you.
Two consequences follow, and they pull in opposite directions:
- Automating the deploy matters more. If a human has to hand-deploy every AI-generated change, the manual last mile becomes the bottleneck that eats all the speed AI just gave you. CD is what lets the throughput actually reach users.
- The gate matters more. Faster shipping of code that looks right (the recurring AI failure mode from Modules 1 and 14) means a bad change reaches prod faster too, unless something catches it. This is the crucial point: continuous deployment is only survivable because of the gates in front of it. Review (Module 10), CI tests (Module 14), and security scanning (Module 15) are not bureaucracy you tolerate. They are the entire reason you're allowed to remove the human from the deploy button. Take auto-deploy without those gates and you've built a machine that ships AI mistakes to production at full speed.
So the AI-era posture is specific: strengthen the early gates, then automate the late ones. The more you trust review + CI + scanning, the further right you can safely push automation, up to and including no human on the prod button. The strength of the gates is the dial that decides whether continuous deployment is responsible or reckless for a given repo. And when an agent itself is the one merging (Unit 5), this stops being theoretical: the deploy gate is the last thing standing between an autonomous contributor and your users.
Starting point (this lab is skip-friendly). This lab is self-contained and does not depend on the earlier labs. Its files live in
modules/18-continuous-delivery-and-deployment/lab/. Copy them into a working folder and make a first commit so you start clean:cp -r ~/ai-workflow-course/modules/18-continuous-delivery-and-deployment/lab ~/ai-workflow-course/18-continuous-delivery-and-deployment-lab cd ~/ai-workflow-course/18-continuous-delivery-and-deployment-lab && git init -b main && git add -A && git commit -m "start: module 18"
Lab language: shell, driving the container tooling from Module 16. You'll extend the tasks-app
into a tiny running service, then build a deploy script that ships it locally with a health check and
automatic rollback, the whole CD motion simulated on your own machine.
This lab simulates deployment with a local container run so it works on any machine with no cloud account. The five deploy steps are real; only the target is your laptop instead of a server.
You'll need:
- A container runtime from Module 16: Docker or Podman. (Commands below use
docker; if you run Podman,alias docker=podmanor substitute.) As in Module 16, the engine must be running before you build or deploy. On macOS/Windows start Docker Desktop (orpodman machine start);docker --versionsucceeds even when the engine is stopped, so confirm it's live withdocker infofirst, ordeploy.sh's build step fails with "Cannot connect to the Docker daemon." - The
tasks-appfrom Modules 1–2, now a Git repo. -
curl(for the health check) and a bash-capable shell. On Windows, use WSL or Git Bash. - Claude Code (sub your own agent), editor-integrated as of Module 4. From here you direct it to do the setup, commit, build, and deploy work, then you verify the result; you don't type those commands by hand.
Starter files are in this module's lab/ folder:
-
serve.py: turns thetasks-appinto a minimal HTTP service with a/healthendpoint, using only the Python standard library (no dependencies). This is the long-running thing CD deploys. -
Dockerfile: the Module 16 container image, adjusted to run the service. -
deploy.sh: the deploy step: build, tag, run, health-check, cut over or roll back. -
cd-starter.yml: the CD pipeline stages, written as GitHub Actions and extending the Module 14 CI file. GitLab/other-forge notes are in the comments.
A CLI that exits immediately is awkward to "deploy." Give the app a long-running face.
-
Direct Claude Code to bring the starter files into your
tasks-appfolder next totasks.pyandcli.py: "Copyserve.py,Dockerfile, anddeploy.shfrom this module'slab/into the tasks-app folder." Then readserve.pyyourself; it's ~40 lines wrapping theTaskListyou already have in a stdlib HTTP server with two routes,/healthand/tasks. Verify the three files landed next totasks.py/cli.py. -
Run the service locally first, no container, to see it work:
python3 serve.py # serves on http://localhost:8000In another terminal:
curl localhost:8000/health # {"status": "ok", "version": "dev"} curl localhost:8000/tasks # your tasks as JSON
Stop it with Ctrl-C. Now have Claude Code commit the new files: "Stage and commit the HTTP service and Dockerfile with a clear message." Verify the commit before moving on: read the diff it staged and confirm no secret, state file, or junk got swept in (it should be just
serve.py,Dockerfile, anddeploy.sh).
-
Have Claude Code build the image and tag it with the current commit SHA, the immutable, traceable tag: "Build the container image and tag it with the short commit SHA and also
:latest." Getting the SHA is git work the agent drives. Verify the result yourself:docker images tasks-app # both tags point at one image; note the SHAThat
:<sha>tag is the unit of deploy. Everything downstream refers to this exact image.
-
Read
lab/deploy.shyourself before running it. It does the five steps: stops any runningtasks-appcontainer, starts the new image with runtime config injected as env vars (Module 17, note theAPP_VERSIONand the absence of any secret baked into the image), polls/healthuntil green, and on failure rolls back to the previous tag it recorded.Now direct Claude Code to run the deploy against the SHA you just built: "Run
deploy.shfor the current commit SHA and report whether it came up healthy." The agent makes the script executable and runs it. Verify the deploy yourself:curl localhost:8000/health # now reports the SHA you deployedAsk the agent to commit a trivial change and deploy again, then read back what it recorded as the rollback target. You now have continuous delivery in miniature: one command turns a commit into a running, version-tagged service.
-
Now prove the net works. The service honors a
BREAK=1env var that makes/healthreturn500, a stand-in for "this build starts but is actually broken." First have the agent deploy a healthy version so there's a known-good to fall back to, then trigger the broken one yourself so you watch it happen:./deploy.sh # healthy baseline (defaults to the current commit SHA) BREAK=1 ./deploy.sh # same image, but the new instance fails its health check
The script starts the "new" version, the health check fails, and it automatically stops the broken instance and brings the previous good one back up. Confirm you're still serving:
curl localhost:8000/health # ok, the bad deploy reverted itselfThat automatic reversal, not the build and not the run, is the part that makes auto-deploy something you can sleep through.
-
Open
lab/cd-starter.ymland compare it to the Module 14ci-starter.yml. It's the same pipeline with stages appended: the lint/test/scan gates run first (unchanged), and onlyon: pushtomain(a merge) do the build-publish-deploy stages run. Trace theneeds:/dependency chain that makes deploy run only after the checks pass. -
Find the one line that is the delivery-vs-deployment switch: the deploy-to-prod step gated behind a manual approval (
environment:with a required reviewer, commented in the file). Decide, for thetasks-app, which side you'd choose and why, and ask Claude Code to make the case for the other choice. The goal isn't a "right" answer; it's being able to articulate the risk posture either way.
A note on running the full pipeline: actually executing
cd-starter.ymlend to end needs a forge with a container registry and a deploy target wired up; that's environment-specific and partly Module 19's territory (the runners and compute underneath). Parts A–D give you the deploy logic runnable today on your own machine; the YAML shows how it slots into the automated pipeline you already started in Module 14.
Be honest about the edges: this is where teams get burned.
- The deploy is only as safe as the gates in front of it. Continuous deployment with weak tests and no review isn't "moving fast," it's an automated mistake-shipping machine. If you haven't done the Module 10/14/15 work, do delivery (human on the button), not deployment. Auto-deploy is a reward you earn by trusting your gates, not a default you turn on.
-
Health checks lie. A
200from/healthmeans "the process started," not "the feature works." A shallow health check passes while the app returns garbage to users. Make the check meaningful (does it reach its database? can it serve a real request?) and lean on canary/gradual rollout for anything important, but know that no health check replaces real tests and real monitoring. - Rollback isn't free, and some things don't roll back. Reverting the running image is cheap. Reverting a database migration, a sent email, a charged credit card, or a published message is not. Those are forward-only. The cleaner the separation between code deploys and irreversible state changes, the more rollback actually saves you. Don't assume "we can always roll back" covers data.
-
This lab simulates the target. A local
docker runis the deploy logic, not the deploy reality. Real targets add networking, DNS cutover, load balancers, zero-downtime orchestration, and multiple instances. The five steps hold; the operational surface around them is larger. The compute that runs all of this (and why you might run your own) is Module 19. - "Build once" only holds if you actually do. The instant someone rebuilds on the prod box "just to be sure," you've lost the guarantee that prod runs what CI tested. Deploy the artifact CI built. No rebuilds downstream.
You're done when:
- You can state the difference between continuous delivery and continuous deployment in one sentence
(who clicks the prod button) and say which one
tasks-appshould use and why. -
./deploy.shbuilds, tags by commit SHA, runs the container, and reports a healthy deploy you cancurl. - You have watched a bad deploy roll itself back to the previous good version, and the service stayed up.
- You can point at the line in
cd-starter.ymlthat turns delivery into deployment, and explain what gates have to be trustworthy before you'd flip it.
When a deploy is one command, a bad one reverts itself, and you can argue the delivery-vs-deployment call for a given repo, you've closed the merged-to-running gap. Module 19 goes underneath all of this: the runners and compute actually executing your CI/CD, and why you'd own them.
This is expansion-zone material (Module 15+); some specifics drift. Re-check at build/publish time:
- Action/runner versions in
cd-starter.yml(actions/checkout,actions/setup-python, any build/login/push actions); pin to current major versions and confirm they still exist. - Registry login + push syntax: the standard build-and-push action names and auth flow change; verify against current forge docs rather than the comments here.
- Manual-approval mechanism: the way a forge gates a job behind human approval
(GitHub
environmentprotection rules, GitLabwhen: manual, others) shifts in naming/UI. Confirm the delivery-vs-deployment switch still maps to the current feature. - Container runtime commands: confirm
docker/podmanflags used indeploy.sh(run,--health-*,inspect) match current CLI behavior. - Cross-references to Modules 16, 17, and 19 still match those modules' final content.
Continue to: Module 19: Runners, the Compute Behind the Automation ➡
Generated from the ai-workflow-course repo • the model is the cheap, swappable part; the workflow is the durable skill.
Unit 1: Get out of the chat window
- 1 · The Copy-Paste Problem
- 2 · Version Control as a Safety Net
- 3 · Version Control for Words, Not Just Code
- 4 · Getting the AI Out of the Browser
- 5 · Commit the AI's Config, Not Just the Code
- 6 · Branches as Sandboxes for Experiments
- 7 · Worktrees for Running Agents in Parallel
Unit 2: Make it shareable, reviewable, recoverable
- 8 · Remotes and Hosting (GitHub, the Alternatives, and Owning Your Repo)
- 9 · Issues and the Task Layer
- 10 · Reviewing Code You Didn't Write
- 11 · Collaboration: Humans and Agents on One Repo
- 12 · When It Goes Wrong: Revert, Reset, and Recovery
Unit 3: Automate the checking and shipping
- 13 · Testing in the AI Era
- 14 · Continuous Integration
- 15 · Security Scanning for AI-Generated Code
- 16 · Containers and Reproducible Environments
- 17 · Secrets, Config, and Environments
- 18 · Continuous Delivery and Deployment
- 19 · Runners, the Compute Behind the Automation
Unit 4: Extend the AI into your systems
- 20 · MCP Servers, Giving the AI Hands
- 21 · Skills: Teaching the AI Your Playbook
- 22 · Securing Third-Party MCP Servers and Skills
- 23 · Working with Existing Codebases
Unit 5: AI in the Loop
- 24 · Assistive Agents (AI Review and Issue Triage)
- 25 · Module 25. Autonomous Agents: Issue-to-PR and Self-Healing CI
- 26 · Orchestrating Multiple Agents
- 27 · Module 27. Evals: Trusting an Agent That Acts Without You
Finale