feat(auth): JWT verifier with JWKS cache (Phase 1.d.1-PR2)#12
Conversation
Standalone JWT verification module — the request-path middleware
that consumes it lands in PR3. Each verify call decodes the token
header, looks up the `kid` in a TTL-bounded JWKS cache (refetched
on miss or expiry), verifies the signature with the cached key,
and validates `iss` / `aud` / `exp` / `nbf` / `sub` claims.
Module surface (`src/auth.rs`):
- `JwtVerifierConfig { jwks_url, issuer, audience }` — caller-owned
config so the binary, tests, and a future Better Auth shim build
the same way.
- `JwtVerifier::new(config)` for production (builds a default reqwest
client with a 5s timeout) and `JwtVerifier::with_client(config,
client)` for callers who want a custom client (the test harness
uses it to point at a local mock JWKS server; production callers
can use it for mTLS / retry policy / etc.).
- `VerifiedClaims { sub }` — the only field the middleware will
actually consume in PR3.
- `AuthError` enum discriminating each failure path so the
middleware can map them to distinct HTTP statuses (401 for
client errors, 503 for JWKS unreachable).
Cache strategy:
- TTL: 1h, hardcoded to start (configurable later if Better Auth
advertises a different Cache-Control).
- Keyed by `kid`. Cache miss → fetch JWKS once → retry lookup → if
still missing, `KeyNotFound`. No retry loop, so a flood of bad
`kid`s can't DoS the upstream.
- Signature failures on a cached key do NOT refetch — same DoS
rationale. Trade-off: in-place `kid` rotation fails closed;
Better Auth rotates by adding new `kid`s alongside old ones for
a grace window, so this is fine in practice.
Security hardening:
- Algorithm trust source = JWKS, not token header. A token whose
`alg` disagrees with the cached key's `alg` is rejected with
`AlgorithmMismatch` — blocks confused-deputy attempts to flip
RS256 → HS256.
- `sub` claim is required + trimmed; blank string → `MissingSub`.
Same boundary-validation rule as the rest of the codebase.
- `Bearer` parser is case-insensitive on the scheme, trims
surrounding whitespace, rejects empty tokens and non-Bearer
schemes.
- `DecodingKey` / `CachedKey` / `JwksCache` / `JwtVerifier`
deliberately don't derive `Debug` so a stray `{:?}` can't leak
raw key material into log sinks.
Tests (`tests/auth.rs`):
- Stand up a local axum mock JWKS server, generate an RSA-2048
keypair per-test, sign tokens with the matching private key.
- Happy paths: bare token + `Bearer <token>` header.
- Reject paths: expired (well past the default 60s clock-skew
leeway), wrong iss, wrong aud, unknown kid, missing sub, blank
sub, garbage token, missing kid header, JWKS unreachable
(bind+drop a port to get a guaranteed-dead URL).
- Cache hit: verify the same kid twice without round-tripping.
- 12 integration tests + 6 unit tests on `strip_bearer_prefix`.
All passing locally.
Deps added:
- jsonwebtoken 9 (runtime) for RS256/ES256 verification + JWKS
parsing.
- reqwest 0.12 promoted from dev-dep to runtime for the JWKS
fetcher (rustls-tls feature, no OpenSSL system dep).
- thiserror 2 (runtime) for `AuthError` — aligns with desktop
waveflow-core / waveflow's version.
- rsa 0.9, rand 0.8, base64 0.22 (dev-only) for the test keypair
generation + JWK n/e URL-safe base64 encoding.
What this is NOT (deferred):
- Middleware that runs the verifier per request — PR3.
- `AppState.jwt_verifier` wiring + Config env-var parsing — PR3.
- Better Auth deployment + retiring the X-User-Id shim — 1.d.2.
Signed-off-by: InstaZDLL <github.105mh@8shield.net>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Plus Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughAjout d'une couche complète de vérification JWT via JWKS distant avec cache synchronisé par ChangesVérification JWT avec JWKS distant
Sequence DiagramsequenceDiagram
participant Client
participant JwtVerifier
participant JwksCache
participant JwksServer as JWKS Server
participant Decoder as jsonwebtoken
Client->>JwtVerifier: verify_bearer("Bearer eyJ...")
JwtVerifier->>JwtVerifier: strip_bearer_prefix()
JwtVerifier->>JwtVerifier: decode header for kid
JwtVerifier->>JwksCache: resolve_kid(kid)
alt Cache hit and not expired
JwksCache-->>JwtVerifier: DecodingKey+alg
else Cache miss or expired
JwtVerifier->>JwksServer: GET /.well-known/jwks.json
JwksServer-->>JwtVerifier: JwkSet
JwtVerifier->>JwtVerifier: build_cached_key() for each JWK
JwtVerifier->>JwksCache: update cache + timestamp
JwksCache-->>JwtVerifier: DecodingKey+alg
end
JwtVerifier->>JwtVerifier: verify algorithm matches header.alg
JwtVerifier->>Decoder: decode_with_validation(token, key, Validation{iss,aud,exp})
Decoder-->>JwtVerifier: Claims (sub)
JwtVerifier->>JwtVerifier: trim & validate sub non-empty
JwtVerifier-->>Client: VerifiedClaims { sub }
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related issues
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/auth.rs`:
- Around line 273-291: Concurrent requests can cause redundant JWKS refreshes in
resolve_kid leading to a thundering herd; add a single-flight guard so only one
refresh_cache() runs at a time. Modify the auth struct to include a refresh
lock/flag (e.g., an AsyncMutex or a OnceCell/Notify used as a single-flight
gate) and change resolve_kid to check the cache, then attempt to acquire the
refresh guard before calling refresh_cache(); if another task is already
refreshing, await that guard and re-check the cache instead of triggering a
second refresh. Keep references to resolve_kid, refresh_cache, and self.cache in
your changes so the new guard logic wraps the current slow path without changing
cache semantics or error handling.
- Around line 258-267: La valeur "sub" extraite de token_data.claims.sub doit
être trimée avant la vérification et avant d'être renvoyée : récupérer la valeur
optionnelle (token_data.claims.sub), convertir en String/stripped variant en
supprimant les espaces en tête/fin, vérifier que la version trimée n'est pas
vide (sinon retourner AuthError::MissingSub) et construire VerifiedClaims { sub
} avec la valeur trimée; référencez les symboles token_data.claims.sub,
AuthError::MissingSub et VerifiedClaims pour localiser l'endroit à modifier.
- Around line 341-350: build_cached_key currently returns None when
jwk.common.key_algorithm is missing which causes refresh_cache to drop all such
keys and potentially trigger AuthError::EmptyJwks; change build_cached_key to
derive the jwt::Algorithm when jwk.common.key_algorithm is None by inspecting
the Jwk type (e.g., if common.key_type or rsa/modulus/exponent indicate an RSA
key, map to a safe allowlisted Algorithm like RS256 or consult a configured
allowlist), or otherwise apply a controlled default/allowlist rather than
returning None; update refresh_cache to include keys produced this way and add
an integration test that loads a JWKS containing an RSA JWK without an alg field
and asserts refresh_cache succeeds (no AuthError::EmptyJwks) and the key can
verify an RS256 token.
In `@tests/auth.rs`:
- Around line 366-392: The test cache_hit_skips_second_fetch currently only
asserts two successful verifies but doesn't prove no second JWKS fetch occurred;
instrument the mock JWKS server with an AtomicUsize counter and assert it equals
1 after both verify_token calls. Concretely: add an AtomicUsize in the test
harness/mock JWKS handler (tied to AuthHarness::spawn), increment it on each
JWKS request, expose a method or field on AuthHarness (e.g.
jwks_request_count()) and in cache_hit_skips_second_fetch call
verifier.verify_token(&token) twice then assert
harness.jwks_request_count().load(...) == 1 to prove the second verify used the
cache.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: 91e968e5-3869-4172-8147-37227cfea923
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock,!Cargo.lock,!*.lock
📒 Files selected for processing (4)
Cargo.tomlsrc/auth.rssrc/lib.rstests/auth.rs
1. Single-flight gate on JWKS refresh (thundering herd):
- Add `refresh_lock: tokio::sync::Mutex<()>` to JwtVerifier.
- resolve_kid now follows a 3-phase pattern: fast-path read,
acquire single-flight gate, double-checked read (in case
another waiter already refreshed), then refresh. Only the
first concurrent verify on a cold/expired cache actually fires
a JWKS fetch; subsequent waiters see the populated cache.
2. Trim sub before storing in VerifiedClaims:
- POST /api/v1/users trims its external_id before persist + the
CHECK constraint enforces non-blank, so the persisted column
is always trimmed. Returning an untrimmed sub from the JWT
verifier would cause the future middleware's
`WHERE external_id = $sub` lookup to miss on a token whose
sub round-trips with stray whitespace. Now the verifier
trims + returns the trimmed form, keeping both sides of the
lookup in lockstep.
3. RFC 7517 §4.4 fallback for alg-less JWKs:
- build_cached_key now defaults to RS256 when an RSA JWK omits
`alg` (spec-optional). EC keys without `alg` are still
dropped — P-256/384/521 each map to a distinct Algorithm
(ES256/384/512) and silently picking one would either reject
legitimate tokens or accept the wrong curve.
- The token-header alg cross-check stays in place so an
algorithm-confusion downgrade (RS256 → HS256) still fails
closed.
- New test rsa_jwk_without_alg_defaults_to_rs256 verifies the
fallback wires the right Algorithm for the signature check.
4. cache_hit_skips_second_fetch now proves no second fetch:
- Mock JWKS handler bumps an AtomicUsize on every request,
exposed via AuthHarness::jwks_request_count().
- The test asserts counter == 1 after 2 verifies, so a future
regression that drops the cache hit path would surface here.
Signed-off-by: InstaZDLL <github.105mh@8shield.net>
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/auth.rs (1)
257-269:⚠️ Potential issue | 🟠 Major | ⚡ Quick winÉviter que
jsonwebtokentransforme l’absence desubenAuthError::InvalidClaims.
Validation::set_required_spec_claims(&["sub"])fait échouerdecode()avant que votre code n’arrive auclaims.sub.ok_or(AuthError::MissingSub), donc un token sanssubremontera en erreur “missing required claim” (convertie ici enInvalidClaims). Le variantMissingSubne sera donc plus respecté.🔧 Correctif minimal
- validation.set_required_spec_claims(&["exp", "iss", "aud", "sub"]); + validation.set_required_spec_claims(&["exp", "iss", "aud"]);Alternative: garder
"sub"mais mapper explicitement l’erreurMissingRequiredClaim(poursub) versAuthError::MissingSub.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/auth.rs` around lines 257 - 269, The code currently calls Validation::set_required_spec_claims(&["exp", "iss", "aud", "sub"]) which causes decode::<RawClaims>(...) to fail with a MissingRequiredClaim for "sub" and be mapped to AuthError::InvalidClaims before your explicit claims.sub.ok_or(AuthError::MissingSub) can run; fix by either removing "sub" from Validation::set_required_spec_claims so decode succeeds and you can handle token_data.claims.sub with AuthError::MissingSub, or keep "sub" but change the map_err on decode to inspect the jwt::errors::Error (e.g. match ErrorKind::MissingRequiredClaim("sub")) and convert that specific case into AuthError::MissingSub while preserving other errors as AuthError::InvalidClaims.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@src/auth.rs`:
- Around line 257-269: The code currently calls
Validation::set_required_spec_claims(&["exp", "iss", "aud", "sub"]) which causes
decode::<RawClaims>(...) to fail with a MissingRequiredClaim for "sub" and be
mapped to AuthError::InvalidClaims before your explicit
claims.sub.ok_or(AuthError::MissingSub) can run; fix by either removing "sub"
from Validation::set_required_spec_claims so decode succeeds and you can handle
token_data.claims.sub with AuthError::MissingSub, or keep "sub" but change the
map_err on decode to inspect the jwt::errors::Error (e.g. match
ErrorKind::MissingRequiredClaim("sub")) and convert that specific case into
AuthError::MissingSub while preserving other errors as AuthError::InvalidClaims.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: d3a85a07-369d-46e4-8683-cc4b0373e562
📒 Files selected for processing (2)
src/auth.rstests/auth.rs
CR caught an observability hole: keeping `sub` in
Validation::set_required_spec_claims meant jsonwebtoken's structural
check would reject a sub-less token with `MissingRequiredClaim("sub")`,
which the outer `map_err` flattened into AuthError::InvalidClaims.
The discriminated MissingSub branch was unreachable for the
"missing field" case — only blank-string-after-trim could fire it.
The existing `rejects_missing_sub` test even acknowledged this by
asserting the disjunction `InvalidClaims | MissingSub`. With the
fix:
- Drop `"sub"` from set_required_spec_claims. `exp`/`iss`/`aud`
stay so signature trust check still rejects tokens that skip
them.
- Keep `sub: Option<String>` on RawClaims + explicit
`ok_or(MissingSub)` after decode. A token without `sub` now
reliably surfaces as MissingSub instead of being flattened.
- Tighten the test to assert specifically MissingSub (no more
disjunction) — a future regression that re-adds `sub` to the
required claims list trips here.
Doc-comment on the validation block now explains the rationale so
the next reader doesn't put `sub` back in "for completeness".
Signed-off-by: InstaZDLL <github.105mh@8shield.net>
|
@coderabbitai bien vu, observability hole valide. Appliqué dans 3b772a8 :
13/13 tests verts. |
|
Tip For best results, initiate chat on the files or code changes.
🐇 ✓ |
Summary
Second of 3 sub-PRs landing Phase 1.d.1 — the JWT verifier module + tests, NO wiring yet. The middleware that consumes it lands in PR3 alongside the
AppState.jwt_verifierfield and env-var parsing. This PR is a pure addition: no existing test or route changes, so a future revert is a clean back-out.Module surface
JwtVerifierConfig { jwks_url, issuer, audience }— caller-owned config.JwtVerifier::new(config)(default reqwest client, 5s timeout) andJwtVerifier::with_client(config, client)(test harness + future mTLS/retry callers).VerifiedClaims { sub }— what the middleware will consume in PR3.AuthError— discriminated failure types so the middleware can map each to its own HTTP status (401 for client errors, 503 for JWKS unreachable).Cache strategy
kid. Miss → fetch once → retry → if still missing,KeyNotFound(no retry loop, no DoS vector).kidrotation fails closed; Better Auth rotates by adding newkids alongside old ones for a grace window, so the common case works.Security hardening
algdisagrees with the cached key'salgis rejected withAlgorithmMismatch— blocks confused-deputy attempts to flip RS256 → HS256.subclaim is required + trimmed; blank string →MissingSub. Same boundary-validation rule as the rest of the codebase.Bearerparser is case-insensitive on the scheme, trims surrounding whitespace, rejects empty tokens and non-Bearer schemes.DecodingKey/CachedKey/JwksCache/JwtVerifierdeliberately don't deriveDebug— a stray{:?}can't leak raw key material into log sinks.Tests (
tests/auth.rs+ module unit tests)Mock JWKS server (local axum sub-app) serving the public half of an RSA-2048 keypair generated at test time. The private key signs every JWT.
12 integration tests :
verifies_a_valid_token,verifies_bearer_prefixrejects_expired_token(well past the default 60s leeway),rejects_wrong_issuer,rejects_wrong_audiencerejects_unknown_kid,rejects_missing_sub,rejects_blank_subrejects_garbage_token,rejects_token_without_kid_headerrejects_when_jwks_unreachable(bind + drop a port to get a guaranteed-dead URL)cache_hit_skips_second_fetch6 unit tests on
strip_bearer_prefix(canonical, case-insensitive, surrounding whitespace, wrong scheme, empty token, no space).All passing locally.
Deps added
jsonwebtoken = 9reqwest 0.12thiserror = 2AuthErrordiscriminated types — matches desktop's versionrsa = 0.9rand = 0.8base64 = 0.22n/ecomponentsWhat this is NOT
AppState.jwt_verifierfield + Config env-var parsing — PR3.Test plan
cargo check --all-targetscargo fmt --all --checkcargo clippy --all-targets -- -D warningscargo test --test auth(12 + 6 = 18 tests, all passing locally — no Postgres needed)cargo test --all(CI runs the full suite against the Postgres service container)Refs: RFC-001 §6.6.
Summary by CodeRabbit
New Features
Tests
Chores