Skip to content

test(integration): Postgres testcontainers + 4 critical-path tests (rate-limit race, idempotency, cascade, session)#137

Merged
MBombeck merged 3 commits intomainfrom
worktree-agent-a65a71dc389e3aa60
May 8, 2026
Merged

test(integration): Postgres testcontainers + 4 critical-path tests (rate-limit race, idempotency, cascade, session)#137
MBombeck merged 3 commits intomainfrom
worktree-agent-a65a71dc389e3aa60

Conversation

@MBombeck
Copy link
Copy Markdown
Owner

@MBombeck MBombeck commented May 8, 2026

Why this matters

Adds a real-Postgres integration test suite (pnpm test:integration) so we have automated regression guards against four classes of bugs that mocked unit tests cannot detect:

  • Rate-limit racessrc/lib/rate-limit.ts relies on a single atomic SQL UPSERT. A future refactor that splits read+write back into two queries would silently break the cap under concurrency. Test fires 6 parallel requests at a 5-cap window and asserts 5 allowed / 1 denied.
  • Idempotency replay-attack contract — mobile retries with the same Idempotency-Key must replay the cached envelope without re-executing the side-effect; the do-not-cache list (401/403/408/429/5xx) must hold so a transient 5xx isn't pinned for 24h. Verified end-to-end against the real idempotency_keys table.
  • GDPR Art. 17 cascade deleteprisma.user.delete() must wipe every personal-data row. Schema declares onDelete: Cascade on every personal table; the test seeds them all and asserts zero orphans, plus that audit/feedback rows survive with userId = null (compliance triage).
  • Session expirygetSession() silently purges stale rows before returning null. Worth proving against a real DB so a typo in the WHERE expiresAt < NOW() branch is caught.

How it's wired

  • pnpm add -D testcontainers @testcontainers/postgresql (v11.14)
  • vitest.integration.config.mtstests/integration/**, 60s/120s timeouts, single-fork so all four files share one container
  • tests/integration/global-setup.ts (new) — vitest globalSetup boots postgres:16-alpine, sets DATABASE_URL, runs prisma migrate deploy once, returns a teardown that stops the container at end-of-run
  • tests/integration/setup.ts — per-test helpers: getPrismaClient() (delegates to the app's src/lib/db singleton) and truncateAllTables() for beforeEach() isolation
  • .github/workflows/integration.yml runs the suite on every PR to main and on pushes to main
  • Default pnpm test continues to exclude tests/integration/** — the unit-test loop stays sub-second

Why globalSetup instead of beforeAll-per-file

Earlier iteration started/stopped a fresh container in each file's beforeAll/afterAll. That works in isolation but the application's Prisma singleton (src/lib/db.ts) caches the connection string at module-load time. Once test file A boots a container, sets DATABASE_URL, and that singleton is built, file A's afterAll stops the container — file B then boots a new container at a new port, but the singleton still points at the dead one. globalSetup keeps one container alive for the whole run, the URL stays stable, and beforeEach truncate gives the per-test isolation.

Test plan

  • pnpm typecheck clean
  • pnpm test — 658/658 unit tests green, no testcontainers pulled in (1.6s)
  • pnpm test:integration10/10 pass in 3.9s (was 6/10 with 4 cross-file failures before the globalSetup fix)
  • pnpm vitest list --config vitest.integration.config.mts — all 10 tests across 4 files discovered
  • CI integration job will run the suite

🤖 Generated with Claude Code

MBombeck added a commit that referenced this pull request May 8, 2026
…om default test scope

Rebase brings PR #137's testcontainers infrastructure on top of the
user-locale migration drift fix landed in #140 — the integration suite
should now bootstrap a fresh Postgres container with all 25 migrations
applied cleanly.

vitest.config.mts also gains a `.claude/worktrees/**` exclude so live
agent worktrees (which mirror `src/` while their parent agents are
running) don't double-execute the project's tests against possibly-stale
copies.
@MBombeck MBombeck force-pushed the worktree-agent-a65a71dc389e3aa60 branch from dda3538 to 599319e Compare May 8, 2026 12:31
MBombeck and others added 3 commits May 8, 2026 15:58
Adds an integration test suite that runs against a real Postgres booted
in-process via testcontainers. Lives under tests/integration/, runs via
`pnpm test:integration`, and is excluded from the default `pnpm test`
loop so the 477-test unit suite stays fast (~1.2s).

Covers four regression-prone behaviours that pure unit tests cannot
validate against a mocked client:

- rate-limit.test.ts — atomic UPSERT under concurrency (5/6 allowed)
  and window-reset semantics for src/lib/rate-limit.ts
- idempotency-replay.test.ts — cached replay with X-Idempotent-Replay,
  expired-row purge, and the do-not-cache contract for 401/403/408/429/5xx
  on top of src/lib/idempotency.ts
- cascade-delete.test.ts — GDPR Art. 17 erasure: user.delete wipes
  every onDelete:Cascade table and SetNulls AuditLog/Feedback
- auth-flow.test.ts — createSession/getSession round-trip plus
  expired-session purge for src/lib/auth/session.ts

Adds .github/workflows/integration.yml triggering on pull_request and
push to main. Local Docker is not running in this worktree environment;
the suite was verified by collecting all 10 tests via `vitest list` and
will execute in CI.

Co-Authored-By: Marc-André Bombeck <mbombeck@gmail.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…om default test scope

Rebase brings PR #137's testcontainers infrastructure on top of the
user-locale migration drift fix landed in #140 — the integration suite
should now bootstrap a fresh Postgres container with all 25 migrations
applied cleanly.

vitest.config.mts also gains a `.claude/worktrees/**` exclude so live
agent worktrees (which mirror `src/` while their parent agents are
running) don't double-execute the project's tests against possibly-stale
copies.
…balSetup

The first iteration of the testcontainers suite started/stopped a
fresh container in each test file's beforeAll/afterAll. That works
in isolation but breaks when files share a worker (`pool: forks`,
`fileParallelism: false`, `isolate: false`):

  1. Test file A boots container 1, sets DATABASE_URL.
  2. The application's Prisma singleton in src/lib/db.ts is built
     when test code dynamic-imports @/lib/auth/session — it captures
     container 1's URL.
  3. afterAll stops container 1.
  4. Test file B boots container 2, sets DATABASE_URL to a new port.
  5. Test file B imports application code that resolves to the same
     singleton — still pointing at the dead container 1. Queries fail
     with a generic PrismaClientKnownRequestError.

Fix: move the container lifecycle to vitest's globalSetup. One
container, booted once, torn down at the end of the run. Tests still
get isolation because beforeEach truncates every personal-data table
in dependency-safe CASCADE order.

  - tests/integration/global-setup.ts boots postgres:16-alpine,
    runs `pnpm db:migrate:deploy`, exports a teardown for end-of-run.
  - tests/integration/setup.ts now exports getPrismaClient() (which
    delegates to the application singleton) and truncateAllTables().
    No more startTestDb / stopTestDb.
  - All four test files lose their beforeAll/afterAll boilerplate.
  - vitest.integration.config.mts wires the new globalSetup.
  - truncate list adds refresh_tokens (added in 0025_refresh_tokens).

Result: pnpm test:integration → 10/10 pass in 3.9s (was 6/10 with
4 cross-file failures). pnpm test → 658/658 unchanged.

Co-Authored-By: Marc-André Bombeck <mbombeck@gmail.com>
@MBombeck MBombeck force-pushed the worktree-agent-a65a71dc389e3aa60 branch from 599319e to 0656694 Compare May 8, 2026 14:03
@MBombeck MBombeck merged commit accdb7d into main May 8, 2026
7 checks passed
@MBombeck MBombeck deleted the worktree-agent-a65a71dc389e3aa60 branch May 8, 2026 14:03
MBombeck added a commit that referenced this pull request May 8, 2026
Production at healthlog.bombeck.io has been 503-ing since the v1.4.1
deploys started landing on apps-01 (Coolify). The container boots —
Next.js prints "Ready" and the pg-boss workers run — but never
accepts HTTP on :3000, so the Docker healthcheck fails and Traefik
takes the upstream out of rotation. A manual restart, a Coolify
force-rebuild, and a docker-compose pin to the GHCR :1.4.0 multi-arch
image all failed to bring the site back up — Coolify rebuilds the
image from main HEAD on every deploy regardless of the compose
directives.

This commit resets the working tree to commit 21bd46d (v1.4.0
release). Same content that's been running for self-hosters since
yesterday's tag-and-release. The next Coolify deploy will build
from this tree and produce a healthy container.

The v1.4.1 work is NOT lost:
  - PRs #144, #145, #137, #146, #147, #148, #149, #150 remain in
    git history.
  - Their commits are still tagged (`v1.4.1`), still on the GHCR
    multi-arch image (`ghcr.io/mbombeck/healthlog:1.4.1`), still in
    the GitHub Release notes.
  - Self-hosters who have already pulled the v1.4.1 image keep it.
  - Local development continues from main HEAD with the v1.4.1
    code — the regression only surfaced under the Coolify build
    flow.

Re-applying v1.4.1 to production will need a separate cycle to
reproduce the runtime failure under the Coolify build path. That
work is tracked in docs/ops/v141-followup-issues.md (added back
when the tree is reapplied) and the deploy gating in
.github/workflows/e2e.yml will catch this class of bug going
forward.

No DB migration. No env-var change. No API contract change.
Co-Authored-By: Marc-André Bombeck <mbombeck@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant