
Run tests on a throwaway RAM-backed Postgres #201

Merged: passcod merged 6 commits into main from ramdisk-test-pg on May 29, 2026
passcod commented May 29, 2026

🤖 Each test creates and drops its own database (and runs every migration), and nextest runs many in parallel. Against a disk-backed Postgres the resulting CREATE DATABASE/DROP DATABASE fsync storm saturates disk I/O and can make the whole machine grind to a halt — hence the habit of prefixing local runs with nice.

This adds scripts/ramdisk-pg.sh, which spins up a disposable Postgres on tmpfs (/dev/shm) with fsync/synchronous_commit/full_page_writes off, points DATABASE_URL at it, runs the given command, then tears it all down. Nothing touches a physical disk, so there's no I/O grind and runs are dramatically faster (a full package run went from ~54s to ~2s here). It reuses the installed initdb/pg_ctl, so the server version matches the system Postgres with no container or image to manage.
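For orientation, the lifecycle described above might look roughly like the sketch below. It is not the contents of scripts/ramdisk-pg.sh: the function name, the PG_ROOT and PG_PORT variables, the fixed default port 54329, and the `test` superuser are assumptions made for illustration. The real script picks a free port and creates a separate role and database, which this sketch skips.

```shell
#!/usr/bin/env bash
# Sketch only, not the real scripts/ramdisk-pg.sh.
# PG_ROOT, PG_PORT, and the "test" superuser are illustrative assumptions.

run_on_throwaway_pg() (
  # The body is a subshell, so the EXIT trap fires as soon as the function ends.
  root="${PG_ROOT:-/dev/shm}"
  port="${PG_PORT:-54329}"
  datadir="$(mktemp -d "$root/ramdisk-pg.XXXXXX")" || exit 1

  cleanup() {
    # Immediate shutdown is fine: the cluster is about to be deleted anyway.
    pg_ctl -D "$datadir" -m immediate stop >/dev/null 2>&1
    rm -rf "$datadir"
  }
  trap cleanup EXIT

  # --no-sync skips the fsync pass at the end of initdb.
  initdb -D "$datadir" -U test --auth=trust --no-sync >/dev/null || exit 1

  # Durability off: a crash may corrupt the cluster, which is acceptable here.
  # -k keeps the Unix socket inside the throwaway directory.
  pg_ctl -D "$datadir" -l "$datadir/server.log" -w start \
    -o "-p $port -k $datadir -c fsync=off -c synchronous_commit=off -c full_page_writes=off" \
    >/dev/null || exit 1

  # Run the wrapped command with DATABASE_URL pointing at the new cluster.
  DATABASE_URL="postgres://test@127.0.0.1:$port/postgres" "$@"
)
```

With something like this, `run_on_throwaway_pg cargo nextest run` would return the wrapped command's exit status after the cluster has been stopped and removed.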

What changed

  • scripts/ramdisk-pg.sh: the harness. Picks a free port, initdbs a throwaway cluster on tmpfs, starts it with durability off, creates the role+db, exports DATABASE_URL (and CANOPY_E2E_ADMIN_DATABASE_URL), runs the wrapped command, and cleans up on exit. Finds the Postgres server tools on Arch, Debian/Ubuntu, Homebrew, and Postgres.app.
  • just test (and test-package/test-name/test-verbose/test-e2e) now run through it by default. just test takes nextest args, so just test, just test -p database, and just test <name> all work.
  • just test-system [args] is the escape hatch to run against $DATABASE_URL instead — for inspecting the DB afterwards, or where initdb isn't available.
  • CI now runs the just recipes rather than inlining cargo nextest plus its own Postgres-service setup, so it exercises the same path developers run locally.
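As a rough illustration of the tool lookup mentioned in the first bullet, the sketch below returns the first directory that holds both initdb and pg_ctl. The function name and the default directory list are assumptions based on common package layouts for those platforms, not a copy of what the script searches.

```shell
#!/usr/bin/env bash
# Sketch only: find_pg_bin and its default search list are assumptions.

# Print the first directory containing executable initdb and pg_ctl.
# With no arguments, fall back to typical install locations.
find_pg_bin() {
  local dir
  if [ "$#" -eq 0 ]; then
    # Globs that match nothing stay literal and simply fail the -x checks.
    set -- \
      /usr/bin \
      /usr/lib/postgresql/*/bin \
      /opt/homebrew/opt/postgresql@*/bin \
      /usr/local/opt/postgresql@*/bin \
      /Applications/Postgres.app/Contents/Versions/latest/bin
  fi
  for dir in "$@"; do
    # Requiring both tools skips client-only installs that ship just psql.
    if [ -x "$dir/initdb" ] && [ -x "$dir/pg_ctl" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
  done
  return 1
}
```

A caller could then prepend the result to PATH, for example `PATH="$(find_pg_bin):$PATH"`. The first match wins, so a machine with several versions installed may need a stricter ordering than this.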

Notes

  • The footprint is small: ~40MB base cluster + ~8MB per concurrently-live test DB, peak well under ~250MB, so RAM isn't a real constraint.
  • macOS has no /dev/shm and no default tmpfs, so it falls back to disk there — still fast (fsync is off), just not RAM-backed unless you point CANOPY_TEST_PG_DIR at a real ramdisk.
  • Needs the Postgres server tools (initdb/pg_ctl), not just the psql client.
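The directory fallback and the macOS ramdisk option could be sketched as follows. The function names and the fallback order are assumptions. Only CANOPY_TEST_PG_DIR, /dev/shm, and the use of hdiutil come from this PR, and whether the script treats CANOPY_TEST_PG_DIR as the data directory itself or as its parent is not stated here.

```shell
#!/usr/bin/env bash
# Sketch only: function names and fallback order are assumptions.

# Choose where the throwaway cluster lives.
pick_pg_dir() {
  if [ -n "${CANOPY_TEST_PG_DIR:-}" ]; then
    printf '%s\n' "$CANOPY_TEST_PG_DIR"   # explicit override, e.g. a mounted ramdisk
  elif [ -d /dev/shm ] && [ -w /dev/shm ]; then
    printf '%s\n' /dev/shm                # Linux tmpfs
  else
    printf '%s\n' "${TMPDIR:-/tmp}"       # macOS: on disk, but still fsync-free
  fi
}

# hdiutil sizes ram:// devices in 512-byte sectors, so 1 MiB is 2048 sectors.
ram_sectors() {
  echo $(( $1 * 2048 ))
}

# macOS only: create a RAM disk of SIZE_MB and mount it at /Volumes/NAME.
make_macos_ramdisk() {
  local size_mb="$1" name="$2" dev
  dev="$(hdiutil attach -nomount "ram://$(ram_sectors "$size_mb")")" || return 1
  # $dev is left unquoted on purpose: hdiutil pads the device path with spaces.
  diskutil erasevolume HFS+ "$name" $dev >/dev/null || return 1
  printf '/Volumes/%s\n' "$name"
}
```

On a Mac, `CANOPY_TEST_PG_DIR="$(make_macos_ramdisk 512 pgtest)" just test` would then be one way to get a RAM-backed run. The volume stays mounted until it is ejected.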

passcod and others added 4 commits May 29, 2026 18:13
Each test creates/drops its own database and runs every migration, with
nextest running many in parallel. Against a disk-backed cluster the
CREATE/DROP DATABASE fsync storm saturates disk I/O and makes the whole
machine unresponsive — hence the habit of prefixing runs with nice.

scripts/ramdisk-pg.sh spins up a throwaway Postgres on tmpfs (/dev/shm)
with fsync/synchronous_commit/full_page_writes off, points DATABASE_URL
at it, runs the given command, then tears down. It reuses the installed
initdb/pg_ctl so the server version matches the system Postgres with no
container or image to manage. On a full package this took a ~54s run down
to ~2s with no I/O grind.

Wired up as opt-in just recipes (test-fast passes args through to
nextest; fast wraps arbitrary commands); default just test is unchanged
for machines without spare RAM.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Add Homebrew (opt/postgresql@NN) and Postgres.app bundle paths to the
fallback so the script finds initdb/pg_ctl on macOS. macOS has no /dev/shm
and no default tmpfs, so it lands on disk there — but fsync is off, which
is what removes the I/O grind, so it stays fast. Reword the fallback notice
accordingly and point at hdiutil for a real ramdisk via CANOPY_TEST_PG_DIR.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Flip just test / test-package / test-name / test-verbose / test-e2e to run
through scripts/ramdisk-pg.sh, so the throwaway tmpfs + fsync-off cluster is
the default and there's no disk-fsync grind out of the box. The per-test
database footprint is tiny (~40MB base + ~8MB per concurrently-live test DB,
peak well under ~250MB), so RAM isn't a real constraint.
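To make the footprint arithmetic concrete, here is a small worked calculation using the approximate figures above. The concurrency of 16 is an assumed example value, not a measurement from this PR.

```shell
#!/usr/bin/env bash
# Worked arithmetic with the approximate figures from this PR (all values in MB).

# peak_mb BASE PER_DB LIVE_DBS: estimated peak footprint.
peak_mb() {
  echo $(( $1 + $2 * $3 ))
}

# max_live_dbs BUDGET BASE PER_DB: how many live test DBs fit in the budget.
max_live_dbs() {
  echo $(( ($1 - $2) / $3 ))
}

peak_mb 40 8 16        # prints 168: 16 concurrent test DBs (assumed) need ~168MB
max_live_dbs 250 40 8  # prints 26: about 26 live DBs fit under ~250MB
```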

just test now takes nextest args (just test -p database, just test <name>).
Add test-system as the escape hatch to run against $DATABASE_URL instead —
for inspecting the DB afterwards or where initdb/pg_ctl aren't available.

CI is unchanged: it calls cargo nextest directly against its own Postgres
service, not the just recipes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

CI was inlining 'cargo nextest run' plus its own systemctl start postgresql
+ role/db creation, which diverged from 'just test' once the recipe started
wrapping nextest in scripts/ramdisk-pg.sh. Point CI at the recipes instead so
it exercises the exact path developers run locally.

- Test job: install just, drop the Postgres-service setup, run 'just test'.
  The wrapper builds its own throwaway tmpfs cluster from the runner's
  preinstalled initdb/pg_ctl, so no system Postgres is needed.
- Playwright job: drop the Postgres-service setup and wrap the test run with
  scripts/ramdisk-pg.sh, which exports CANOPY_E2E_ADMIN_DATABASE_URL at the
  throwaway cluster for the fixture's per-worker databases. Its explicit
  build/cache steps stay (CI caching concerns, not test-runner divergence).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
passcod enabled auto-merge May 29, 2026 06:30
passcod and others added 2 commits May 29, 2026 18:44
Now that tests are fast (RAM-backed), there's little value in stopping at
the first failure — running the whole suite surfaces every failure in one
go. Bake --no-fail-fast into the test recipes (test, test-package,
test-name, test-verbose, test-system); CI inherits it via just test.
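In shell terms, a recipe with the flag baked in amounts to something like the sketch below. The run_tests function and the RUNNER variable are hypothetical stand-ins for the just recipes. Only scripts/ramdisk-pg.sh and the --no-fail-fast flag of cargo nextest run come from this PR.

```shell
#!/usr/bin/env bash
# Sketch only: run_tests and RUNNER are hypothetical, not part of the justfile.

# Run nextest through a wrapper, always with --no-fail-fast,
# passing any extra arguments straight through.
run_tests() {
  "${RUNNER:-scripts/ramdisk-pg.sh}" cargo nextest run --no-fail-fast "$@"
}
```

Here `run_tests -p database` would correspond to `just test -p database`. Setting `RUNNER=env` would run nextest against the existing $DATABASE_URL, similar in spirit to test-system.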

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The private-server tests serve the embedded React SPA through rust-embed,
so a client-side route like GET /status only returns 200 when
private-web/dist/ exists. Routing CI through just test set
SKIP_FRONTEND_BUILD=1, so build.rs no longer built it and a fresh checkout
had an empty dist/, 404ing that test. (Dev machines have a dist/ from
working on the frontend, which masked it locally.)

Add a Node setup + npm ci + npm run build step to the test job so dist/ is
populated before just test runs.
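A local equivalent of that CI step might look like the sketch below. The function name, the skip-if-present check, and the assumption that the build emits dist/index.html are illustrative. The CI job itself simply runs npm ci and npm run build.

```shell
#!/usr/bin/env bash
# Sketch only: ensure_frontend_dist and the index.html check are assumptions.

# Make sure the SPA bundle exists before tests that serve it are run.
ensure_frontend_dist() {
  local web_dir="${1:-private-web}"
  if [ -f "$web_dir/dist/index.html" ]; then
    return 0   # already built, e.g. on a dev machine
  fi
  # Subshell so the caller's working directory is left untouched.
  ( cd "$web_dir" && npm ci && npm run build )
}
```

Running `ensure_frontend_dist && just test` on a fresh checkout would avoid the 404 described above, since SKIP_FRONTEND_BUILD=1 stops build.rs from producing dist/ itself.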

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
passcod added this pull request to the merge queue May 29, 2026
Merged via the queue into main with commit 63ea699 May 29, 2026
3 checks passed
passcod deleted the ramdisk-test-pg branch May 29, 2026 06:55