
feat(refresh-tokens): rotating refresh tokens with family-based replay defense #14

Merged: wolpert merged 2 commits into main from feature/refresh-tokens on May 17, 2026
@wolpert commented May 17, 2026

Summary

The third 1.1.0 feature: a new pk-auth-refresh-tokens module that ships the rotating refresh-token primitive which motif (and most multi-client deployments) currently roll by hand. See ADR 0013.

Service surface. RefreshTokenService.issue / rotate / revokeFamily / revokeAllForUser / listForUser. rotate(presentedWireToken) returns a sealed RotateResult sum: Success(pair, claimsForAccessIssue) | Replayed(familyId, userHandle) | Expired | Unknown | Revoked(reason). The service does NOT call PkAuthJwtIssuer itself — it returns the data the consumer needs to mint an access JWT, keeping the two primitives composable.
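
To illustrate the sum-type surface, here is a minimal sketch of how a consumer might branch on the result. Only the variant names come from the description above; the field types (plain strings), the wrapper class, and the `nextStep` helper are hypothetical, and the sketch assumes Java 17+.

```java
// Hypothetical shapes: variant names mirror the PR text; field types are placeholders.
final class RotateResultSketch {
    sealed interface RotateResult permits Success, Replayed, Expired, Unknown, Revoked {}

    record Success(String pair, String claimsForAccessIssue) implements RotateResult {}
    record Replayed(String familyId, String userHandle) implements RotateResult {}
    record Expired() implements RotateResult {}
    record Unknown() implements RotateResult {}
    record Revoked(String reason) implements RotateResult {}

    // The consumer, not the refresh service, decides what to do with each outcome.
    static String nextStep(RotateResult result) {
        if (result instanceof Success s) {
            return "mint access JWT from " + s.claimsForAccessIssue();
        }
        if (result instanceof Replayed r) {
            return "family " + r.familyId() + " revoked; re-authenticate";
        }
        if (result instanceof Revoked v) {
            return "revoked (" + v.reason() + "); re-authenticate";
        }
        if (result instanceof Expired) {
            return "expired; re-authenticate";
        }
        return "unknown token; re-authenticate";
    }
}
```

Because the interface is sealed, adding a sixth variant later forces every exhaustive consumer to be revisited, which is presumably part of the appeal of returning a sum instead of throwing.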

Wire format. {refreshId}.{secret} — both halves base64url, no padding. refreshId is 16 random bytes (22 chars), secret is 32 random bytes (43 chars). Only SHA-256(secret) is persisted; the wire token never gets logged. Hash-before-mark-used invariant enforced: a presented refresh-id with the wrong secret returns Unknown, never burns the legitimate row's used_at.
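
The wire format can be sketched with JDK classes alone. This is an illustration, not the module's code: the class and method names are hypothetical, and it assumes the hash is taken over the raw 32 secret bytes rather than their base64url text.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.util.Base64;

// Hypothetical helper: {refreshId}.{secret}, both base64url without padding.
final class WireTokenSketch {
    private static final SecureRandom RANDOM = new SecureRandom();
    private static final Base64.Encoder ENC = Base64.getUrlEncoder().withoutPadding();
    private static final Base64.Decoder DEC = Base64.getUrlDecoder();

    // secretHash is what would be persisted; wireToken goes to the client only.
    record Issued(String wireToken, String refreshId, byte[] secretHash) {}

    static Issued issue() {
        byte[] id = new byte[16];      // encodes to 22 chars
        byte[] secret = new byte[32];  // encodes to 43 chars
        RANDOM.nextBytes(id);
        RANDOM.nextBytes(secret);
        String refreshId = ENC.encodeToString(id);
        String wireToken = refreshId + "." + ENC.encodeToString(secret);
        return new Issued(wireToken, refreshId, sha256(secret));
    }

    // Verifies the secret half against the stored hash before any row is marked used.
    static boolean matches(String wireToken, String storedRefreshId, byte[] storedHash) {
        if (wireToken.length() != 22 + 1 + 43 || wireToken.charAt(22) != '.') {
            return false;
        }
        if (!wireToken.substring(0, 22).equals(storedRefreshId)) {
            return false;
        }
        byte[] secret;
        try {
            secret = DEC.decode(wireToken.substring(23));
        } catch (IllegalArgumentException e) {
            return false; // not valid base64url
        }
        // MessageDigest.isEqual compares in constant time.
        return MessageDigest.isEqual(sha256(secret), storedHash);
    }

    private static byte[] sha256(byte[] input) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(input);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is required on every JVM", e);
        }
    }
}
```

A caller that gets `false` from `matches` would return Unknown without touching `used_at`, which is the hash-before-mark-used ordering described above.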

Load-bearing atomic primitive on the SPI. rotateAtomically(parentRefreshId, now, successor) marks the parent used AND inserts the successor as a single atomic operation:

  • JDBI: jdbi.inTransaction(...) wrapping a conditional UPDATE on the parent and an INSERT for the successor.
  • DynamoDB: TransactWriteItems with conditional update on the parent primary item + conditional puts for the successor's primary + user-index + family-index items.
  • In-memory: ConcurrentHashMap.compute(parentId, ...) block.
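
The in-memory variant can be approximated as follows. This is a sketch under stated assumptions, not the testkit's implementation: the `Row` record and class name are hypothetical. Because `ConcurrentHashMap.compute` must not update other mappings of the same map from inside the remapping function, the sketch decides the winner inside `compute` and lets only the winner insert the successor immediately afterwards.

```java
import java.time.Instant;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory store; field set trimmed to what the rotation needs.
final class InMemoryRotationSketch {
    record Row(String refreshId, String familyId, Instant usedAt, Instant revokedAt) {
        Row markUsed(Instant now) {
            return new Row(refreshId, familyId, now, revokedAt);
        }
    }

    private final ConcurrentHashMap<String, Row> rows = new ConcurrentHashMap<>();

    void insert(Row row) {
        rows.put(row.refreshId(), row);
    }

    Row find(String refreshId) {
        return rows.get(refreshId);
    }

    // Returns true only for the single caller that flips the parent to used.
    boolean rotateAtomically(String parentRefreshId, Instant now, Row successor) {
        boolean[] won = {false};
        rows.compute(parentRefreshId, (id, parent) -> {
            if (parent == null || parent.usedAt() != null || parent.revokedAt() != null) {
                return parent; // leave the row (or its absence) unchanged
            }
            won[0] = true;
            return parent.markUsed(now);
        });
        if (won[0]) {
            // Only the winner reaches this line, and the successor id is fresh.
            rows.put(successor.refreshId(), successor);
        }
        return won[0];
    }
}
```

The per-key exclusivity of `compute` is what serialises competing rotations of the same parent; a caller that loses the race sees `usedAt` already set and gets `false`.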

Family-based replay defense. When rotateAtomically returns false (race lost OR parent flipped used/revoked between read and write), the service calls revokeFamily(familyId, ROTATION_REPLAY) outside the failed-rotation scope so the scorch always commits. Both attacker and legitimate client lose the session — the legit user sees their next refresh fail and re-authenticates. A row's revokedReason = ROTATION_REPLAY maps to RotateResult.Replayed so race losers and replay-after-the-fact callers see a consistent outcome.
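
The family scorch can be illustrated with a deliberately simplified model. The sketch below is an assumption-laden stand-in: it replaces `rotateAtomically` with a single lock, and the class, the `Outcome` enum, and the `Result` record are hypothetical. It shows only the outcome mapping: a reused parent revokes the whole family, and every later presentation from that family reads as Replayed.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical single-lock model of the replay defense (not the real service).
final class FamilyReplaySketch {
    enum Outcome { SUCCESS, REPLAYED, REVOKED, UNKNOWN }

    record Result(Outcome outcome, String successorId) {}

    private static final class Row {
        final String familyId;
        boolean used;
        String revokedReason; // null while the row is live

        Row(String familyId) {
            this.familyId = familyId;
        }
    }

    private final Map<String, Row> rows = new HashMap<>();
    private int nextId = 0;

    synchronized String issue(String familyId) {
        String id = "rt-" + nextId++;
        rows.put(id, new Row(familyId));
        return id;
    }

    synchronized Result rotate(String presentedId) {
        Row row = rows.get(presentedId);
        if (row == null) {
            return new Result(Outcome.UNKNOWN, null);
        }
        if (row.revokedReason != null) {
            // A family scorched by a replay keeps reporting Replayed.
            boolean replay = "ROTATION_REPLAY".equals(row.revokedReason);
            return new Result(replay ? Outcome.REPLAYED : Outcome.REVOKED, null);
        }
        if (row.used) {
            // Second presentation of a used token: scorch the whole family.
            revokeFamily(row.familyId, "ROTATION_REPLAY");
            return new Result(Outcome.REPLAYED, null);
        }
        row.used = true;
        return new Result(Outcome.SUCCESS, issue(row.familyId));
    }

    synchronized void revokeFamily(String familyId, String reason) {
        for (Row r : rows.values()) {
            if (r.familyId.equals(familyId) && r.revokedReason == null) {
                r.revokedReason = reason;
            }
        }
    }
}
```

In this model the legitimate client holding the newest token also gets Replayed on its next refresh, matching the "both lose the session" behaviour described above.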

Non-negotiable concurrent rotation race test. concurrentRotationExactlyOneSucceedsFamilyRevoked launches 8 threads via CountDownLatch + ExecutorService. Exactly one returns Success; the other 7 return Replayed; the entire family (root + winner's successor) ends up revoked. Passes against:

  • In-memory testkit (drives compute path)
  • Postgres via Testcontainers (drives JDBI transaction path)
  • DynamoDB Local (drives TransactWriteItems path)
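
The shape of such a race test can be sketched generically. The harness below is hypothetical (it is not `concurrentRotationExactlyOneSucceedsFamilyRevoked` itself) and uses an `AtomicBoolean` compare-and-set as a stand-in for a real repository's `rotateAtomically`; it only demonstrates the CountDownLatch + ExecutorService start-gate pattern and the exactly-one-winner check.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

// Hypothetical harness: releases all threads at once and counts successful rotations.
final class RotationRaceSketch {
    static int countWinners(int threads, BooleanSupplier rotateOnce) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch ready = new CountDownLatch(threads);
        CountDownLatch go = new CountDownLatch(1);
        try {
            List<Future<Boolean>> futures = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                futures.add(pool.submit(() -> {
                    ready.countDown(); // signal that this worker is parked
                    go.await();        // wait for the shared start signal
                    return rotateOnce.getAsBoolean();
                }));
            }
            ready.await();  // every worker is at the gate
            go.countDown(); // open the gate for all of them together
            int winners = 0;
            for (Future<Boolean> f : futures) {
                if (f.get()) {
                    winners++;
                }
            }
            return winners;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("race harness failed", e);
        } finally {
            pool.shutdownNow();
        }
    }

    // Stand-in for a parent token that can be rotated exactly once.
    static BooleanSupplier singleUseToken() {
        AtomicBoolean used = new AtomicBoolean(false);
        return () -> used.compareAndSet(false, true);
    }
}
```

With 8 threads, `countWinners(8, singleUseToken())` should report one winner; the real test additionally asserts that the other seven see Replayed and that the family ends up revoked.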

Adapter wiring — all three adapters wire the service, the deletion listener, and a POST /auth/refresh HTTP endpoint:

  • Spring Boot: @ConditionalOnBean(RefreshTokenRepository.class) gates the service / handler / controller; PkAuthRefreshIntegrationTest asserts the rotated access JWT carries AuthMethod.REFRESH.
  • Dropwizard: Optional<RefreshHandler> threaded through the Dagger graph (both slim + full components); bundle.run() registers PkAuthRefreshResource only when the host wired a repository.
  • Micronaut: @Requires(beans = RefreshTokenRepository.class) on the service / handler / controller; new PkAuthRefreshControllerTest covers happy-path + replay→401.

Browser SDK. PkAuthClient.refresh(wireToken) returns a typed RefreshResult sum ({ kind: 'success', ... } | { kind: 'failure', reason: 'expired'|'unknown'|'replayed'|'revoked' }) — never throws on 401. Vitest covers all five outcomes plus revoke-reason surfacing.
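
On the server side, the failure reasons the SDK surfaces line up with the 401 detail strings the adapters return. A minimal sketch of that mapping follows; the enum and record are hypothetical, the "replayed" and "unknown" details come from the test plan, and the "expired" and "revoked" details plus the 200 success status are assumptions made by analogy with the SDK's reason list.

```java
// Hypothetical mapping from rotation outcome to HTTP status and detail.
final class RefreshHttpMappingSketch {
    enum Outcome { SUCCESS, EXPIRED, UNKNOWN, REPLAYED, REVOKED }

    record HttpReply(int status, String detail) {}

    static HttpReply toReply(Outcome outcome) {
        return switch (outcome) {
            case SUCCESS -> new HttpReply(200, null);        // assumed status
            case REPLAYED -> new HttpReply(401, "replayed");
            case UNKNOWN -> new HttpReply(401, "unknown");
            case EXPIRED -> new HttpReply(401, "expired");   // assumed detail
            case REVOKED -> new HttpReply(401, "revoked");   // assumed detail
        };
    }
}
```

Keeping every failure on 401 with a machine-readable detail is what lets the client return a typed failure instead of throwing.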

Cross-cutting. AuthMethod.REFRESH for access tokens minted from a rotation; JwtClaims.forRefresh(...) factory; RefreshTokenServiceDeletionListener auto-registered into UserDeletionService so user-delete revokes refresh families alongside access tokens / credentials / backup codes / OTPs.

Docs. ADR 0013 (full design rationale); operator-guide gains a Token-table cleanup section; threat-model gains a Refresh-token replay defense section; README gains a 1.1.0 features bullet pointing at the new module.

Test plan

  • ./gradlew :pk-auth-testkit:test --tests "InMemoryRefreshTokenRepositoryTest" — 9 scenarios pass (including the concurrent race).
  • ./gradlew :pk-auth-persistence-jdbi:test --tests "JdbiRefreshTokenRepositoryIntegrationTest" — 9 scenarios pass against real Postgres (Testcontainers); concurrent race serialised by Postgres row-locking.
  • ./gradlew :pk-auth-persistence-dynamodb:test --tests "DynamoDbRefreshTokenRepositoryIntegrationTest" — 9 scenarios pass against DynamoDB Local; concurrent race serialised by TransactWriteItems.
  • ./gradlew :pk-auth-spring-boot-starter:test --tests "PkAuthRefreshIntegrationTest" — happy path mints a valid access JWT, replay returns 401 detail="replayed", unknown wire token returns 401 detail="unknown".
  • ./gradlew :pk-auth-micronaut:test --tests "PkAuthRefreshControllerTest" — happy path + replay→401 against the Netty HTTP layer.
  • cd clients/passkeys-browser && pnpm test — 43 tests pass (8 new refresh tests).
  • ./gradlew check — full build green across all 15 modules + 3 example demos. JaCoCo coverage thresholds met (pk-auth-micronaut needed the new refresh integration test to recover from a 62% dip).

Schema migrations

  • Flyway V9 adds refresh_tokens (refresh_id, token_hash, user_handle, audience, device_id, family_id, parent_refresh_id, issued_at, expires_at, used_at, revoked_at, revoked_reason) plus indexes on user_handle, family_id, expires_at. PkAuthJdbiSchema.CURRENT_SCHEMA_VERSION → "9".
  • DynamoDB: three new item shapes on PkAuthCore (RT#, RTU#, RTF#) with native ttl set to expiresAt in epoch seconds.
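
For the DynamoDB side, the ttl value is just the expiry truncated to whole epoch seconds, which is the unit DynamoDB's TTL feature expects. The sketch below is illustrative only: the `pk` attribute name, the other attribute names, and the exact key layout after the `RT#` prefix are assumptions, not the module's schema.

```java
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical item builder; only the RT# prefix and the ttl rule come from the PR text.
final class RefreshTokenItemSketch {
    static Map<String, Object> primaryItem(
            String refreshId, String userHandle, String familyId, Instant expiresAt) {
        Map<String, Object> item = new LinkedHashMap<>();
        item.put("pk", "RT#" + refreshId);          // assumed key attribute name
        item.put("userHandle", userHandle);
        item.put("familyId", familyId);
        item.put("ttl", expiresAt.getEpochSecond()); // whole seconds, sub-second part dropped
        return item;
    }
}
```

DynamoDB deletes expired items lazily, so a row can remain readable for some time after its ttl; the service would still need to check expiry itself.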

🤖 Generated with Claude Code

wolpert and others added 2 commits May 16, 2026 21:43
…y defense

The third and largest 1.1.0 feature: a new pk-auth-refresh-tokens module
with RefreshTokenService (issue / rotate / revokeFamily /
revokeAllForUser). Wire format {refreshId}.{secret} (base64url),
SHA-256 hash-at-rest, hash-before-mark-used invariant, family-based
replay scorch.

The load-bearing primitive on the SPI is rotateAtomically(parent, now,
successor) — marks the parent used AND inserts the successor as a
single atomic operation. Implementations:

- JDBI: jdbi.inTransaction(...) wrapping a conditional UPDATE on the
  parent and an INSERT for the successor.
- DynamoDB: TransactWriteItems with a conditional update on the parent
  primary item and conditional puts for the successor's primary +
  user-index + family-index items.
- In-memory: ConcurrentHashMap.compute(parentId, ...) block.

Shared RefreshTokenScenarios drives nine parity scenarios across all
three backends, including the non-negotiable concurrent rotation race
(8 threads + CountDownLatch + ExecutorService) — exactly one wins, the
rest see Replayed, the entire family ends up revoked. Passes against
Postgres Testcontainers and DynamoDB Local on every CI run.

Adapter wiring:

- Spring Boot: refresh beans + PkAuthRefreshController behind
  @ConditionalOnBean(RefreshTokenRepository.class). End-to-end
  integration test asserts the rotated access JWT carries
  AuthMethod.REFRESH.
- Dropwizard: Optional<RefreshHandler> threaded through the Dagger
  graph (slim + full components both expose it); bundle.run()
  registers PkAuthRefreshResource only when the host wired a
  RefreshTokenRepository.
- Micronaut: @Requires(beans = RefreshTokenRepository.class) on the
  service / handler / controller; refresh integration test alongside
  the existing ceremony test.

Browser SDK: PkAuthClient.refresh(wireToken) returns a typed
RefreshResult sum (success | failure with typed reason), never throws
on 401.

Cross-cutting:

- AuthMethod.REFRESH for access tokens minted from a refresh rotation.
- RefreshTokenServiceDeletionListener auto-registered into
  UserDeletionService so user-delete revokes refresh families.
- ADR 0013 documents the design + invariants.
- Operator guide gains a Token-table cleanup section; threat-model
  gains a Refresh-token replay defense section.
- Flyway V9 ships refresh_tokens; PkAuthJdbiSchema.CURRENT_SCHEMA_VERSION
  → "9".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@wolpert merged commit d80dc0d into main May 17, 2026
2 checks passed
@wolpert deleted the feature/refresh-tokens branch May 17, 2026 12:05