test(e2e): cover OTP expiry without 10-minute wait#122
Conversation
🦋 Changeset detectedLatest commit: dacf1d2 The changes in this PR will be included in the next version bump. Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
|
🚅 Deployed to the ePDS-pr-122 environment in ePDS
|
📝 WalkthroughWalkthroughAdds test-only internal HTTP endpoints to auth-service gated by an internal secret, centralizes timing-safe internal-secret verification in the shared package, and extends e2e tests and CI to conditionally run OTP-expiry scenarios using those hooks. Changes
Sequence DiagramsequenceDiagram
actor E2ETest as E2E Test Runner
participant AuthService as Auth Service
participant DB as SQLite DB
participant Client as Browser/Client
E2ETest->>AuthService: POST /_internal/test/expire-otp (x-internal-secret)
activate AuthService
AuthService->>AuthService: verifyInternalSecret(header)
AuthService->>DB: UPDATE verification SET expiresAt = now - 60min WHERE identifier = ...
DB-->>AuthService: { updated: N }
AuthService-->>E2ETest: 200 { updated: N }
deactivate AuthService
E2ETest->>Client: simulate user (submit aged OTP)
Client->>AuthService: POST /auth/verify-otp
AuthService->>DB: SELECT verification row
DB-->>AuthService: row is expired
AuthService-->>Client: 400 OTP expired (UI error shown)
E2ETest->>Client: Click resend OTP
Client->>AuthService: POST /auth/resend-otp
AuthService->>DB: INSERT new OTP verification row
DB-->>AuthService: new OTP created
AuthService-->>Client: 200 OTP sent
E2ETest->>Client: retrieve OTP from mailbox, submit
Client->>AuthService: POST /auth/verify-otp
AuthService->>DB: validate OTP (valid)
DB-->>AuthService: valid
AuthService-->>Client: 200 Auth complete
Client-->>E2ETest: Access token received
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~30 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Review rate limit: 0/1 reviews remaining, refill in 60 minutes.Comment |
Coverage Report for CI Build 25169036076Coverage increased (+0.8%) to 49.371%Details
Uncovered Changes
Coverage RegressionsNo coverage regressions found. Coverage Stats
💛 - Coveralls |
b6be283 to
ec0f734
Compare
There was a problem hiding this comment.
Additional findings (outside current diff — PR may have been updated during review):
-
🔴
packages/auth-service/src/index.ts:89-96— The newPOST /_internal/test/expire-otproute is mounted after the globalcsrfProtectionmiddleware inpackages/auth-service/src/index.ts(line 59 vs line 93), so the e2e step'sfetch()— which sends onlyContent-Typeandx-internal-secret, noepds_csrfcookie orx-csrf-token— will be rejected with a blanket 403CSRF validation failedbefore reaching the handler. The@otp-expiryscenario will fail at runtime on every deployment withEPDS_TEST_HOOKS=1. Fix: mountcreateTestHooksRouterbeforecsrfProtection(mirroring how/api/auth/*is registered first), or exempt/_internal/test/*from the CSRF check.Extended reasoning...
The bug
csrfProtection(packages/auth-service/src/middleware/csrf.ts:28-37) rejects all POST requests with a blanket403 'CSRF validation failed'whenever either theepds_csrfcookie or a matchingx-csrf-tokenheader (orcsrfbody field) is missing. It has no path-based exemption.The new test-hooks router is registered after this middleware:
// packages/auth-service/src/index.ts app.use(csrfProtection(config.csrfSecret)) // line 59 // ... lots of route mounts ... if (process.env.EPDS_TEST_HOOKS === '1') { app.use(createTestHooksRouter(config.dbLocation)) // line 93 }
Better-auth's
/api/auth/*endpoints escape this only becauseapp.all('/api/auth/*', toNodeHandler(...))is registered at line 32, beforecsrfProtection. The new/_internal/test/*router does not get that treatment.The triggering call site
e2e/step-definitions/auth.steps.ts(theWhen the OTP for the test email is forced to expire on the serverstep):const res = await fetch(url, { method: 'POST', headers: { 'Content-Type': 'application/json', 'x-internal-secret': testEnv.internalSecret, }, body: JSON.stringify({ email: this.testEmail, type: 'sign-in' }), })
This is a Node-side
fetch()call with no preceding GET to the auth service. There is therefore noepds_csrfcookie and nox-csrf-tokenheader — and crucially, even if Playwright's browser context held the cookie, Node'sfetch()does not share that cookie jar.Step-by-step proof
@otp-expiryscenario boots, completes OAuth → email submit → OTP email arrives.- Step
'the OTP for the test email is forced to expire on the server'callsfetch(POST authUrl + '/_internal/test/expire-otp')with onlyContent-Type+x-internal-secret. - Express dispatches the request through middlewares in registration order.
cookieParserruns (noepds_csrfcookie).express.json()parses the body (nocsrffield). csrfProtectionruns at index.ts:59. Becausereq.method === 'POST'andreq.cookies[CSRF_COOKIE]isundefined, the check at csrf.ts:34 (if (!cookieToken || !submittedToken)) short-circuits withres.status(403).json({ error: 'CSRF validation failed' })andreturn—next()is never called.- The test-hooks router never runs.
verifyInternalSecretis never invoked. - The step's
if (!res.ok)branch fires and throws:expire-otp hook failed: 403 ... CSRF validation failed. - The scenario fails. Subsequent assertions (
OTP expirederror → resend → fresh OTP) never execute.
Why existing code doesn't prevent this
The unit suite (
test-hooks.test.ts) constructs a bareexpress()and mounts only the test-hooks router —csrfProtectionis never registered, so the unit tests pass without exercising the global middleware stack. This is whypnpm testreports green and the bug ships.The PR description's last checkbox —
Local e2e: pnpm test:e2e:headless --tags @otp-expiry— is unchecked, which is consistent with the integrated path never having been run end-to-end.Fixes
Two clean options:
- Mount the test-hooks router before
csrfProtection, mirroring better-auth:if (process.env.EPDS_TEST_HOOKS === '1') { app.use(createTestHooksRouter(config.dbLocation)) } // ... app.use(csrfProtection(config.csrfSecret))
- Add an explicit path bypass at the top of
csrfProtection:if (req.path.startsWith('/_internal/test/')) return next()
Option 1 is consistent with how
/api/auth/*already escapes CSRF and keepscsrfProtectionsimple. Authentication is already enforced byx-internal-secret(timing-safe) and by theEPDS_TEST_HOOKSmount gate, so CSRF protection on these routes adds nothing — the endpoint is not browser-reachable anyway.
ec0f734 to
a9555d5
Compare
a9555d5 to
6bf825a
Compare
6bf825a to
b4b737b
Compare
b4b737b to
390c129
Compare
390c129 to
6443410
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/auth-service/src/__tests__/test-hooks.test.ts`:
- Around line 151-162: The cleanup try/catch in the afterEach block that calls
fs.unlinkSync(dbPath) (and the similar block later) silently swallows all
errors; change the empty catch to catch (err) and if err.code === 'ENOENT'
ignore it, otherwise rethrow the error (or at minimum log it via the test logger
at debug level) so real teardown failures aren’t hidden; update both occurrences
(the afterEach cleanup and the later identical block) to follow this pattern and
reference the fs.unlinkSync(dbPath) calls and their surrounding
afterEach/cleanup blocks when making the change.
- Around line 52-68: postHook currently starts an ephemeral server with
app.listen(0) and calls server.unref() but never closes it, and it also swallows
JSON parse errors; fix by storing the server returned by app.listen in a
variable inside postHook and ensuring you call server.close() (awaiting a
Promise that resolves/rejects on close) after the fetch completes (or in a
finally block), modify the JSON parsing from (await res.json().catch(() =>
({}))) to explicitly try { await res.json() } catch (err) { /* log or rethrow
*/; json = {} } so errors are not silently ignored, and update the afterEach
file-deletion catch to handle specific error codes (e.g., ignore ENOENT) or log
other errors instead of swallowing them; reference postHook, postExpire,
postExpireAuthFlow and the afterEach cleanup to locate the changes.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 4ca4e038-cb41-4481-abb5-2053da7b55f1
📒 Files selected for processing (16)
.env.example.github/workflows/e2e-tests.ymldocker-compose.ymle2e/.env.examplee2e/cucumber.mjse2e/step-definitions/auth.steps.tse2e/support/env.tsfeatures/passwordless-authentication.featurepackages/auth-service/.env.examplepackages/auth-service/src/__tests__/test-hooks.test.tspackages/auth-service/src/index.tspackages/auth-service/src/routes/test-hooks.tspackages/pds-core/src/index.tspackages/shared/src/__tests__/crypto.test.tspackages/shared/src/crypto.tspackages/shared/src/index.ts
6443410 to
02cacab
Compare
This comment has been minimized.
This comment has been minimized.
02cacab to
325a59e
Compare
325a59e to
18bb640
Compare
A user who takes >10 minutes to enter their OTP and clicks Resend would land on /auth/complete with "Authentication session expired", because the auth_flow row, the epds_auth_flow cookie, and the better-auth OTP verification row all shared the same 10-minute TTL. Resend successfully issued a new OTP, but by then the auth_flow + cookie had already been purged so /auth/complete had no flow_id to thread. Fix: decouple auth_flow lifetime from OTP lifetime. The auth_flow row and cookie now live for 60 minutes (lifted to lib/auth-flow.ts so login-page.ts and recovery.ts share a single constant), while OTP expiry stays at 10 min inside better-auth. A 10-min OTP timeout + Resend now keeps the same OAuth ticket alive end-to-end. Test infra: - e2e scenario @otp-expiry in passwordless-authentication.feature reproduces the user-reported flow without a wall-clock wait. - Test-only auth-service hooks POST /_internal/test/expire-otp and POST /_internal/test/expire-auth-flow (gated by EPDS_TEST_HOOKS=1, blocked under NODE_ENV=production, authenticated via EPDS_INTERNAL_SECRET) backdate the corresponding rows so the scenario runs in seconds. - Lifted verifyInternalSecret into @certified-app/shared, removing the duplicate copy in pds-core/src/index.ts. - Mounted the test-hooks router before csrfProtection because it is called by a non-browser test runner that authenticates via x-internal-secret rather than CSRF tokens. - Added .github/workflows/e2e-tests.yml wiring for E2E_INTERNAL_SECRET. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
18bb640 to
dacf1d2
Compare
|
There was a problem hiding this comment.
🧹 Nitpick comments (1)
e2e/step-definitions/auth.steps.ts (1)
368-387: ⚡ Quick winUpdate the OTP-expiry block comment to match current behavior.
This comment says the scenario expires
auth_flowand cookie too, but the actual step intentionally expires only the OTP row. Keeping this stale rationale risks future regressions during maintenance.Proposed comment cleanup
-// Simulates the user taking longer than 10 minutes between requesting -// the OTP and entering it. To reproduce the real-world failure mode -// faithfully (auth-service issue: even after Resend, /auth/complete -// returns "Authentication session expired") we have to age out THREE -// things in lockstep, all of which expire after 10 minutes in -// production: -// -// 1. The better-auth verification row (the OTP itself) — backdated -// via POST /_internal/test/expire-otp. -// 2. The auth_flow row in the auth-service SQLite — backdated via -// POST /_internal/test/expire-auth-flow. -// 3. The epds_auth_flow cookie in the browser — Playwright doesn't -// let us forge an expiry timestamp on an existing cookie, so we -// delete it outright. Functionally equivalent for the bug we're -// reproducing: the browser presents no auth_flow cookie to -// /auth/complete. +// Simulates the user taking longer than 10 minutes between requesting +// the OTP and entering it by backdating only the better-auth verification +// row via POST /_internal/test/expire-otp. +// +// We intentionally keep auth_flow row + epds_auth_flow cookie alive +// (their TTL is longer) so resend can complete the original OAuth flow.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@e2e/step-definitions/auth.steps.ts` around lines 368 - 387, The block comment above the OTP-expiry test section is stale: update it to reflect that the test only backdates the better-auth verification row via POST /_internal/test/expire-otp (it does NOT backdate the auth_flow row or delete the epds_auth_flow cookie); remove or edit the lines claiming the test also expires the auth_flow SQLite row and the browser cookie, and clarify that only the OTP row is aged out to reproduce the current behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@e2e/step-definitions/auth.steps.ts`:
- Around line 368-387: The block comment above the OTP-expiry test section is
stale: update it to reflect that the test only backdates the better-auth
verification row via POST /_internal/test/expire-otp (it does NOT backdate the
auth_flow row or delete the epds_auth_flow cookie); remove or edit the lines
claiming the test also expires the auth_flow SQLite row and the browser cookie,
and clarify that only the OTP row is aged out to reproduce the current behavior.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9d2f193b-307b-4680-9cea-9b45ba036be2
📒 Files selected for processing (21)
.changeset/auth-flow-ttl-decoupled-from-otp-ttl.md.env.example.github/workflows/e2e-tests.ymldocker-compose.ymle2e/.env.examplee2e/cucumber.mjse2e/step-definitions/auth.steps.tse2e/support/env.tsfeatures/passwordless-authentication.featurepackages/auth-service/.env.examplepackages/auth-service/src/__tests__/login-page.test.tspackages/auth-service/src/__tests__/test-hooks.test.tspackages/auth-service/src/index.tspackages/auth-service/src/lib/auth-flow.tspackages/auth-service/src/routes/login-page.tspackages/auth-service/src/routes/recovery.tspackages/auth-service/src/routes/test-hooks.tspackages/pds-core/src/index.tspackages/shared/src/__tests__/crypto.test.tspackages/shared/src/crypto.tspackages/shared/src/index.ts
✅ Files skipped from review due to trivial changes (9)
- packages/shared/src/index.ts
- packages/auth-service/src/tests/login-page.test.ts
- .changeset/auth-flow-ttl-decoupled-from-otp-ttl.md
- e2e/.env.example
- packages/auth-service/src/routes/login-page.ts
- .env.example
- packages/pds-core/src/index.ts
- packages/auth-service/.env.example
- packages/auth-service/src/tests/test-hooks.test.ts
🚧 Files skipped from review as they are similar to previous changes (4)
- e2e/support/env.ts
- e2e/cucumber.mjs
- .github/workflows/e2e-tests.yml
- packages/shared/src/tests/crypto.test.ts



Summary
@email @otp-expiryscenario inpasswordless-authentication.featurethat drives the full OAuth login through to the OTP form, fast-forwards the OTP past its 10-minute expiry, asserts the helpfulOTP expirederror, then verifies the resend path completes the flow normally.POST /_internal/test/expire-otp(gated byEPDS_TEST_HOOKS=1, authenticated viaEPDS_INTERNAL_SECRET) that backdates the better-authverification.expiresAtrow so the test runs in seconds rather than minutes.EPDS_TEST_HOOKS=1is set together withNODE_ENV=production, so a misconfigured production deploy fails fast rather than silently exposing the endpoint.EPDS_TEST_HOOKS=1is enabled indocker-compose.yml. For Railway, enable it onpr-baseonly — production should leave it unset.Test plan
pnpm format:checkcleanpnpm lintcleanpnpm typecheckcleanpnpm test— 799 unit tests pass (8 new for the test-hooks router: prod block, missing/wrong secret, unknown type, missing email, happy path, no-match, lowercase normalisation)pnpm test:e2e:headless --tags @otp-expiryagainst the docker-compose stack🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Tests
Chores