feat: unified outpost migrate CLI (#675) #816

Merged
alexluong merged 5 commits into main from feat/unified-migration-cli
Apr 13, 2026

@alexluong (Collaborator) commented Apr 10, 2026

Summary

Implements the unified migration experience described in #675. Replaces the two separate migration systems (SQL auto-run at startup + outpost-migrate-redis standalone CLI) with a single outpost migrate command and an explicit startup gate that refuses to boot when migrations are pending.

What ships:

  • New internal/migrator/coordinator package — wraps both the SQL migrator and the Redis migration runner behind a single API (List, Plan, Apply, Verify, PendingSummary, Unlock). Redis state tracking reuses the same hash-key layout as migratorredis.Runner, so existing Redis state carries over seamlessly.
  • New outpost migrate subcommand tree — real list / plan / apply / verify / unlock commands backed by the coordinator. Replaces the old passthrough that delegated to the standalone Redis CLI.
  • Startup gate: app.PreRun() no longer auto-applies migrations. Instead it calls the coordinator and refuses to start if anything is pending, with a clear error telling the operator exactly what to run.
  • Removal of the standalone migration binaries: cmd/migratesql and cmd/outpost-migrate-redis are deleted along with every reference in the Makefile, goreleaser, Dockerfiles, and container entrypoints. The unified CLI plus the startup gate cover everything they did.

Breaking change

Deployment workflows that relied on auto-migration at startup need to move to an explicit two-step flow:

outpost migrate apply --yes   # run explicitly before deploy
outpost serve                 # refuses to start if anything is still pending

The release notes should flag this change, and the upgrade guide should document the new two-step flow.

Expected UX — what reviewers should see when testing this locally

The flow below is what an operator gets when starting from a v0.15.0-seeded deployment and upgrading to this branch. Every snippet is real output from a local run.

1. Help output

$ outpost migrate --help
NAME:
   outpost migrate - Database migration tools

USAGE:
   outpost migrate [command [command options]]

COMMANDS:
   list    List all migrations (SQL + Redis) with their status
   plan    Show what migrations would be applied
   apply   Apply all pending migrations (SQL then Redis)
   verify  Verify that migrations were applied correctly
   unlock  Force clear the Redis migration lock (use with caution)

OPTIONS:
   --config string, -c string  Path to config file [$CONFIG]
   --verbose                   Enable verbose logging

2. Startup gate fires when anything is pending

Running the server directly without applying pending migrations:

$ outpost serve
...
{"level":"error","msg":"pending migrations detected, refusing to start","sql_pending":3,"redis_pending":0}
pending migrations detected (3 SQL, 0 Redis) — run 'outpost migrate apply' before starting the server
exit status 1

3. outpost migrate list — unified view of both subsystems

$ outpost migrate list

SQL Migrations:
  [applied       ] sql/000001  init
  [applied       ] sql/000002  delivery_response
  [applied       ] sql/000003  event_delivery_index
  [applied       ] sql/000004  delivery_manual_attempt
  [applied       ] sql/000005  denormalize_attempts
  [applied       ] sql/000006  data_jsonb_to_text
  [pending       ] sql/000007  matched_destination_ids
  [pending       ] sql/000008  attempt_destination_type
  [pending       ] sql/000009  response_data_jsonb_to_text

Redis Migrations:
  [applied       ] redis/001_hash_tags   Migrate from legacy format (tenant:*) to hash-tagged format (tenant:{id}:*) for Redis Cluster support
  [applied       ] redis/002_timestamps  Convert timestamp fields from RFC3339 strings to Unix millisecond timestamps for timezone-agnostic sorting
  [applied       ] redis/003_entity      Add entity field to tenant and destination records for RediSearch filtering

Summary: 3 SQL pending, 0 Redis pending

Rows the coordinator knows aren't applicable (e.g. a Redis migration whose IsApplicable() returns false under the current config) render as [not_applicable] with a reason so list and the startup gate never disagree.
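
A rough sketch of how such rows could be rendered and counted follows. The Row type, renderRow, and countPending are hypothetical names, and the way the reason is appended is an assumption; only the status labels and the 14-column padding are taken from the output above.

```go
package main

import "fmt"

// Status values mirror the labels shown in the list output above.
type Status string

const (
	StatusApplied       Status = "applied"
	StatusPending       Status = "pending"
	StatusNotApplicable Status = "not_applicable"
)

// Row is a hypothetical view model for one line of list output.
type Row struct {
	ID     string
	Name   string
	Status Status
	Reason string // only rendered for not_applicable rows
}

// renderRow left-aligns the status in 14 columns, the length of
// "not_applicable", so the closing brackets line up.
func renderRow(r Row) string {
	line := fmt.Sprintf("  [%-14s] %s  %s", r.Status, r.ID, r.Name)
	if r.Status == StatusNotApplicable && r.Reason != "" {
		line += " (" + r.Reason + ")" // assumed format for the reason
	}
	return line
}

// countPending counts only pending rows. Rows marked not_applicable are
// excluded, so a summary and a startup check built on it agree.
func countPending(rows []Row) int {
	n := 0
	for _, r := range rows {
		if r.Status == StatusPending {
			n++
		}
	}
	return n
}

func main() {
	rows := []Row{
		{ID: "sql/000006", Name: "data_jsonb_to_text", Status: StatusApplied},
		{ID: "sql/000007", Name: "matched_destination_ids", Status: StatusPending},
		{ID: "redis/001_hash_tags", Name: "hash_tags", Status: StatusNotApplicable, Reason: "example reason"},
	}
	for _, r := range rows {
		fmt.Println(renderRow(r))
	}
	fmt.Printf("pending: %d\n", countPending(rows))
}
```

Deriving both the list summary and the gate's count from the same status values is what keeps the two views consistent.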

4. outpost migrate plan — what apply would change

$ outpost migrate plan

Planning migrations...

SQL Migrations (3 pending, v6 → v9):
  - sql/000007  matched_destination_ids
  - sql/000008  attempt_destination_type
  - sql/000009  response_data_jsonb_to_text

Redis Migrations: up to date

When Redis does have pending work, plan calls each migration's own Plan() and reports per-migration scope (tenants_need_migration, destinations_need_migration, total_need_migration).

5. outpost migrate apply --yes — applies SQL then Redis

$ outpost migrate apply --yes

Planning migrations...

SQL Migrations (3 pending, v6 → v9):
  - sql/000007  matched_destination_ids
  - sql/000008  attempt_destination_type
  - sql/000009  response_data_jsonb_to_text

Redis Migrations: up to date

{"level":"info","msg":"sql migrations applied","version":9,"count":3}

All migrations applied successfully.

apply always runs SQL first (schema must be ready before Redis data transformations) and then iterates Redis migrations in version order. Each migration goes through the standard plan → lock → apply → mark-applied loop the old Redis CLI used, so it's a drop-in replacement for the old outpost-migrate-redis apply --all --yes flow.
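
The ordering described above can be sketched as follows. This is an illustration under stated assumptions, not the coordinator's code: RedisMigration, Locker, applyAll, and the in-memory applied map are hypothetical stand-ins, and the plan step is reduced to an "already applied?" check.

```go
package main

import (
	"fmt"
	"sort"
)

// RedisMigration is a hypothetical, trimmed-down migration interface.
type RedisMigration interface {
	Version() int
	Name() string
	Apply() error
}

// Locker abstracts the Redis migration lock (hypothetical).
type Locker interface {
	Lock() error
	Unlock()
}

// applyAll runs SQL first, then each unapplied Redis migration in ascending
// version order, taking the lock around each apply and marking it afterwards.
func applyAll(applySQL func() error, redis []RedisMigration, applied map[string]bool, lock Locker) error {
	if err := applySQL(); err != nil {
		return fmt.Errorf("sql: %w", err)
	}
	sorted := append([]RedisMigration(nil), redis...) // do not reorder the caller's slice
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Version() < sorted[j].Version() })
	for _, m := range sorted {
		if applied[m.Name()] { // simplified plan step
			continue
		}
		if err := lock.Lock(); err != nil {
			return fmt.Errorf("lock for %s: %w", m.Name(), err)
		}
		err := m.Apply()
		lock.Unlock()
		if err != nil {
			return fmt.Errorf("apply %s: %w", m.Name(), err)
		}
		applied[m.Name()] = true // mark-applied
	}
	return nil
}

// step is a fake migration that records its name when applied.
type step struct {
	version int
	name    string
	log     *[]string
}

func (s step) Version() int { return s.version }
func (s step) Name() string { return s.name }
func (s step) Apply() error { *s.log = append(*s.log, s.name); return nil }

// noopLock always succeeds; it stands in for the real lock.
type noopLock struct{}

func (noopLock) Lock() error { return nil }
func (noopLock) Unlock()     {}

func main() {
	var log []string
	ms := []RedisMigration{step{2, "002_timestamps", &log}, step{1, "001_hash_tags", &log}}
	sql := func() error { log = append(log, "sql"); return nil }
	if err := applyAll(sql, ms, map[string]bool{}, noopLock{}); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(log)
}
```

Releasing the lock before checking the apply error keeps a failed migration from leaving the lock held in this sketch.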

Flags: --yes (skip confirmation), --sql-only, --redis-only.

6. outpost migrate verify — confirms post-apply state

$ outpost migrate verify

Verifying migrations...

SQL: current=9 latest=9 [OK]
Redis:
  redis/001_hash_tags:   OK (0/0 checks passed)
  redis/002_timestamps:  OK (0/0 checks passed)
  redis/003_entity:      OK (0/0 checks passed)

All migrations verified successfully.

7. Server starts cleanly after apply

$ outpost serve
... healthy in ~2s

And the API is fully functional — tenants / destinations / attempts created under the old version are still readable (including response_data that survived the JSONB → TEXT migration in 000009), and freshly published events deliver end-to-end.

8. outpost migrate unlock

Interactive by default; prompts before clearing the Redis migration lock. --yes skips the prompt for scripted recovery from a dead migration process.

What happened to init and cleanup?

The old outpost-migrate-redis binary had two subcommands that are not in the unified CLI. Here's why that's intentional:

init — split into two separate concerns:

  • init --current (preflight check) was used by the production container entrypoint to fail fast before starting the server if migrations were pending. This is now handled by the startup gate in app.PreRun() — the server itself refuses to start and prints a clear error. build/entrypoint.sh was simplified to just exec outpost serve.
  • Fresh-install marking (the other half of init — detecting an empty Redis and marking all migrations as applied without running them so a brand-new deployment doesn't waste time scanning empty keys) is not implemented in the coordinator. It's not a correctness issue: on a fresh Redis the coordinator iterates each migration and calls its Plan()/Apply(), all of which scan for records needing work, find none, and return. Slightly wasteful on first boot of a brand-new cluster; easy to add later as a fast-path.

cleanup — niche use case, deferred:

Of the three current Redis migrations, only 001_hash_tags leaves legacy keys behind after running (it renames tenant:* → tenant:{id}:*). The other two transform values in place, so there's nothing to delete. And 001_hash_tags only has work to do on deployments that don't set DEPLOYMENT_ID, which is a shrinking set.

So cleanup was mostly dead code in the old CLI too — a safety net for one specific upgrade path. If a future migration genuinely needs post-apply key deletion, it's a small, self-contained addition to the coordinator and the CLI.

Test plan

  • go build ./... passes
  • go test -short ./... passes, including new unit tests for the SQL introspection helpers and miniredis-backed coordinator tests
  • outpost migrate --help shows the new subcommand tree
  • End-to-end manual QA of the v0.15.0 → current branch upgrade path (see the manual QA comment on this PR for the full scenario and verification steps)
  • outpost serve fails cleanly when migrations are pending and starts cleanly after outpost migrate apply

Open questions for reviewers

  • Should the startup gate ship behind an env var (e.g. OUTPOST_STRICT_MIGRATIONS) for one release to give operators an escape hatch before becoming the default?
  • Lock retry logic was removed along with auto-migration (it only mattered for concurrent startup races). If operators run outpost migrate apply concurrently from multiple nodes the second will fail fast. Acceptable?

🤖 Generated with Claude Code

Add ListMigrations, LatestVersion, and PendingCount methods on the SQL
Migrator so a future unified migration CLI can enumerate available
migrations and compute pending counts without going through the
golang-migrate driver.

Stores the raw embedded filesystem and subdirectory on the Migrator at
construction time, parses NNNNNN_name.up.sql filenames, and correlates
the result with the current schema version.
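
As an illustration of the parsing and correlation step, a sketch over a plain list of filenames might look like this. The function names and the SQLMigration type are hypothetical, and the sketch skips the embedded filesystem and simply takes filenames as input.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strconv"
)

// SQLMigration is a hypothetical record for one parsed up-migration file.
type SQLMigration struct {
	Version int
	Name    string
}

// upFile matches NNNNNN_name.up.sql; down files and other names are ignored.
var upFile = regexp.MustCompile(`^(\d{6})_(.+)\.up\.sql$`)

// parseMigrations extracts version and name from matching filenames and
// returns them sorted by version.
func parseMigrations(filenames []string) []SQLMigration {
	var out []SQLMigration
	for _, f := range filenames {
		m := upFile.FindStringSubmatch(f)
		if m == nil {
			continue
		}
		v, err := strconv.Atoi(m[1])
		if err != nil {
			continue
		}
		out = append(out, SQLMigration{Version: v, Name: m[2]})
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Version < out[j].Version })
	return out
}

// latestVersion returns the highest known version, or 0 if there are none.
func latestVersion(ms []SQLMigration) int {
	if len(ms) == 0 {
		return 0
	}
	return ms[len(ms)-1].Version // ms is sorted ascending
}

// pendingCount counts migrations newer than the current schema version.
func pendingCount(ms []SQLMigration, current int) int {
	n := 0
	for _, m := range ms {
		if m.Version > current {
			n++
		}
	}
	return n
}

func main() {
	ms := parseMigrations([]string{
		"000007_matched_destination_ids.up.sql",
		"000007_matched_destination_ids.down.sql",
		"000006_data_jsonb_to_text.up.sql",
	})
	fmt.Println(ms, latestVersion(ms), pendingCount(ms, 6))
}
```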

Pure additive change — no existing callers are affected.

Refs #675

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

alexluong and others added 4 commits April 11, 2026 03:47

Introduce internal/migrator/coordinator, a thin orchestration layer over
the existing SQL migrator and the migratorredis.Migration interface that
exposes a single API for listing, planning, applying, and verifying
migrations across both subsystems.

The coordinator is designed to be driven from both a CLI (next phase)
and from app startup checks (phase after that). Redis state tracking
uses the same hash key layout and lock key as migratorredis.Runner and
cmd/outpost-migrate-redis so existing state carries over seamlessly.

Covered by miniredis-backed unit tests for the Redis path. SQL paths
delegate to the introspection helpers added in the previous commit.

Refs #675

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Stop delegating 'outpost migrate' to the outpost-migrate-redis binary
and implement a real subcommand tree backed by the unified coordinator:

  outpost migrate list      — list all SQL + Redis migrations
  outpost migrate plan      — show what would be applied
  outpost migrate apply     — apply pending migrations (SQL then Redis)
  outpost migrate verify    — verify applied migrations
  outpost migrate unlock    — clear a stuck Redis migration lock

Each subcommand constructs a coordinator from the usual Outpost config
and delegates the actual work. Output rendering lives in a separate
file so the CLI layer stays thin.

This phase does not yet change startup behaviour — auto-migration in
app.PreRun() still runs. The old outpost-migrate-redis and migratesql
binaries also keep working and will be deprecated in a later commit.

Refs #675

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Remove auto-migration from the startup sequence and replace it with a
unified pending-migration check backed by the migration coordinator.

Before: runMigration() applied SQL migrations and runRedisMigrations()
applied auto-runnable Redis migrations as side effects of app.PreRun().
Operators had no visibility into migration timing and non-auto-runnable
Redis migrations would fail startup with a cryptic error.

After: PreRun connects to Redis, then asks the coordinator whether any
SQL or Redis migrations are pending. If any are, startup fails fast
with a clear message telling the operator to run
'outpost migrate apply' first.

This is the breaking change flagged in issue #675. Deployment
workflows that previously relied on auto-migration now need an
explicit 'outpost migrate apply' step before 'outpost serve'.

Deletes the obsolete lock-retry logic and its test — retries only
mattered for concurrent auto-runs at startup, which no longer exist.

Refs #675

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Delete cmd/migratesql and cmd/outpost-migrate-redis along with every
reference in build, Makefile, docs, and container entrypoints. The
unified 'outpost migrate' command and the startup gate added in the
previous commits cover every workflow these binaries served:

  migratesql up                -> outpost migrate apply
  outpost-migrate-redis list   -> outpost migrate list
  outpost-migrate-redis plan   -> outpost migrate plan
  outpost-migrate-redis apply  -> outpost migrate apply
  outpost-migrate-redis verify -> outpost migrate verify
  outpost-migrate-redis unlock -> outpost migrate unlock

The production container entrypoint (build/entrypoint.sh) no longer
needs the explicit 'migrate init --current' preflight — the server
itself refuses to start when migrations are pending, so a single
'exec outpost serve' gives the same behaviour with a clearer error.

The dev entrypoint (build/dev/entrypoint.sh) drops the same preflight
and just execs 'air serve'. Operators run 'make migrate' from the host
to apply pending migrations before starting the container.

goreleaser, Dockerfile.example, and Dockerfile.goreleaser are updated
to stop building and shipping the removed binaries.

Refs #675

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@alexluong (Collaborator, Author) commented

Manual QA: v0.15.0 → current branch upgrade

Walked through the realistic upgrade path an operator will take to go from a released Outpost version to the unified migration CLI.

Scenario

  1. Started the hookdeck/outpost:v0.15.0 stack (old auto-migration behavior).
  2. Seeded data through the v0.15 API: tenant, webhook destination, published event, confirmed the delivery attempt landed successfully.
  3. Stopped v0.15.0 and attempted to start the current-branch server — it refused with pending migrations detected (3 SQL, 0 Redis) — run 'outpost migrate apply' before starting the server, exactly as the new startup gate is supposed to.
  4. Ran the new unified CLI end-to-end:
    • outpost migrate list — 6 SQL applied from v0.15, 3 pending (000007/8/9).
    • outpost migrate plan — reported SQL Migrations (3 pending, v6 → v9), Redis "up to date".
    • outpost migrate apply --yes — applied only the 3 SQL migrations, Redis not touched.
    • outpost migrate verify reported SQL: current=9 latest=9 [OK].
  5. Started the current-branch stack — healthy in ~2s.

Verified

  • v0.15-era tenant, destination, and attempt are still readable under the current branch.
  • The v0.15 attempt's response_data survived the JSONB → TEXT migration (000009) — body still fully deserializable.
  • A freshly published event post-upgrade delivered end-to-end against the same v0.15-era destination within ~10s.
  • Legacy cmd/migratesql and cmd/outpost-migrate-redis directories are gone; no references remain in Makefile, goreleaser, Dockerfiles, or entrypoints.

@alexluong alexluong marked this pull request as ready for review April 10, 2026 21:20

@alexbouchardd (Contributor) commented

Very good 👍

@alexluong alexluong merged commit c5f5bd2 into main Apr 13, 2026
5 checks passed
@alexluong alexluong deleted the feat/unified-migration-cli branch April 13, 2026 10:40
alexluong added a commit that referenced this pull request Apr 13, 2026
The unified migration CLI (#816) added a startup gate that refuses to
start when migrations are pending. E2e tests created fresh databases but
never applied migrations, causing all tests to fail.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
alexluong added a commit that referenced this pull request Apr 17, 2026
PR #816 added cmd/outpost/migrate.go but air.toml still built only
cmd/outpost/main.go, causing undefined symbol errors in dev.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>