fix(auth): log per-authenticator rejections instead of swallowing them#3
Merged
Conversation
Closes #2. Before: Chain.Authenticate dropped every error from OIDCValidator.ValidateBearer and APIKeyStore.Authenticate, falling through to the next method or to ErrNotAuthenticated. When OIDC rejected a bearer for any reason (audience mismatch, JWKS miss, expired, signature failure), the calling client got a 401 with no body, no log, no diagnostic surface. After: Chain owns a *slog.Logger (defaulting to slog.Default(), overridable via WithLogger). Every rejection from a configured authenticator is logged at WARN with method=oidc|apikey and the underlying error before the chain falls through. The token is never included in the log. internal/server/server.go now wires the application's logger into the chain via NewChain(...).WithLogger(logger), so log lines are attributed under the same JSON handler as the rest of the binary. Tests - pkg/auth/chain_test.go covers: empty token (no log), success path (no log), JWT-shaped token tries OIDC first, non-JWT tries apikey first, OIDC and API key rejections each emit one log line with the right method tag, full-chain miss emits both logs independently, token bytes never appear in log output, WithLogger honors nil → slog.Default() fallback. Out of scope (left for a follow-up): RFC 6750 §3 WWW-Authenticate error parameter on MCPAuthGateway. The issue called that optional and the diagnostic value is largely captured by the WARN logs the chain now emits.
Builds on the initial chain-logging change with the findings from the critical review of PR #3, plus the RFC 6750 §3 enhancement on the HTTP gateway that issue #2 called out as an optional follow-up. pkg/auth/chain.go - Token redaction: redactTokenLikes strips JWT-shape substrings (xxx.yyy.zzz of base64url segments >=8 chars) from the validator's error before logging. Defense in depth on top of the validators-don't-echo-tokens contract; the chain now actually enforces it instead of just trusting downstream. - Rate-limited WARN: first failure per remote_addr per 60s window logs at WARN; subsequent failures in the same window log at DEBUG. failureRateLimiter does opportunistic eviction so the ledger stays bounded under sustained pressure. Empty remote_addr (no correlation context) bypasses the bucket. - Correlation fields: log line now carries request_id and remote_addr from ctx alongside method and error, joinable against audit_events. - API: WithLogger renamed to SetLogger to be honest about the receiver mutation; WithLogger kept as a Deprecated alias for source-compat. - Removed the dead-code nil check on c.logger; NewChain always sets it. - Switched from c.logger.Warn(...) variadic to slog.LogAttrs with typed slog.String attrs so behavior is identical across stock and custom slog handlers. pkg/httpsrv/authgate.go - 401 response now follows RFC 6750 §3: WWW-Authenticate: Bearer realm="mcp-test", error="invalid_request", error_description="missing or unsupported credential...", resource_metadata="<url>" buildBearerChallenge composes the challenge with proper RFC 7235 §2.1 quote escaping (backslash and double-quote escaped per the auth-param ABNF). The JSON body mirrors error_description so non-RFC-aware clients also see the diagnostic. internal/server/server.go - Wires the application logger via NewChain(...).SetLogger(logger). Tests - pkg/auth/chain_test.go: real redaction test that constructs a leaky validator embedding the rejected token in its error and asserts the chain redacts it. redactTokenLikes table. failureRateLimiter behavior (first-warns-then-demotes, per-key independence, window elapse, empty-key bypass, nil-receiver safety, default-window fallback, stale-entry eviction). Both-stores-nil-with-token case. Anonymous-empty-token no-log case. WithLogger deprecated alias. - pkg/httpsrv/authgate_test.go: RFC 6750 §3 challenge field assertions (realm, error, error_description, resource_metadata). buildBearerChallenge blank-field omission. quoteAuthParam escape matrix. make verify is green at >=80% filtered coverage.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #2.
Summary
Chain.Authenticatewas dropping every error fromOIDCValidator.ValidateBearerandAPIKeyStore.Authenticate, falling through to the next method or toErrNotAuthenticated. When OIDC rejected a bearer for any reason (audience mismatch, JWKS miss, expired-with-clock-skew, signature failure), the calling client got a401with no body, no log, no diagnostic surface.This PR makes the chain log every per-authenticator rejection, attaches request correlation, redacts JWT-shape token bytes, rate-limits per remote_addr, and has
MCPAuthGatewayemit a proper RFC 6750 §3WWW-Authenticatechallenge so MCP clients see a useful diagnostic alongside the 401.Changes
pkg/auth/chain.goChainowns a*slog.Logger(defaults toslog.Default(), overridable viaSetLogger).WithLoggeris kept as a deprecated alias so existing callers don't break.method=oidc|apikeyplus the underlying error and request correlation fields (request_id,remote_addr) pulled from ctx.redactTokenLikesstrips JWT-shape substrings (xxx.yyy.zzzof base64url segments ≥8 chars) from the validator's error string before logging. Defense-in-depth on top of "validators don't echo tokens"; we now actually enforce it at the chain layer rather than just trusting downstream.remote_addrper 60s window logs at WARN; subsequent failures in the window log at DEBUG. Without this, a scanner hitting 401s at line speed would flood operator logs and crowd out real signal. Emptyremote_addr(no correlation context available) bypasses the bucket so single-shot failures still surface at WARN.NewChain(allowAnonymous, apiKeys, oidc)is unchanged.pkg/httpsrv/authgate.goMCPAuthGateway's 401 response now follows RFC 6750 §3:error/error_descriptionto the user. The JSON body now includeserror_descriptiontoo, for non-RFC-aware clients.buildBearerChallengecomposes the challenge with proper RFC 7235 §2.1 quote escaping ("and\are escaped per the auth-param ABNF).internal/server/server.goNewChain(...).SetLogger(logger)so log lines come out under the same JSON handler as the rest of the binary.New tests
pkg/auth/chain_test.gocovers:ErrNotAuthenticated).ErrNotAuthenticated).request_id+remote_addrcorrelation in the log line.redactTokenLikestable covering classic three-segment JWT, no-JWT, mid-sentence JWT, short non-JWT dotted identifiers.failureRateLimiter: first warns then demotes within window, different keys are independent, window elapse re-warns, empty key never bucketed, nil-receiver safe, default-window fallback, stale-entry eviction.SetLogger(nil)falls back toslog.Default();NewChainsets the logger and rate limiter.WithLoggerdeprecated alias still works.pkg/httpsrv/authgate_test.goadds:realm,error,error_description,resource_metadata).error_descriptionmirrored into the JSON body.buildBearerChallengeomits blank fields.quoteAuthParamescapes"and\per auth-param ABNF.Sample log output
For an OIDC token rejected on audience mismatch (first failure from
10.0.0.7):{"time":"2026-04-30T07:42:18Z","level":"WARN","msg":"auth: token rejected","method":"oidc","error":"audience mismatch: want \"prod\"","request_id":"req-abc-123","remote_addr":"10.0.0.7"}A second failure from the same source within 60s gets demoted to DEBUG; identical content otherwise.
For a leaky validator that included the rejected JWT in its error:
{"time":"...","level":"WARN","msg":"auth: token rejected","method":"oidc","error":"rejected token: [redacted-jwt]","request_id":"...","remote_addr":"..."}Test plan
make verifypasses (lint + race-tested test suite + 80% coverage gate).slog.LogAttrswith typed string attrs,SetLoggerrename with deprecated alias, both-stores-nil test added.method=oidc, the audience-mismatch error, and therequest_idmatching the audit_events row. Subsequent failures from the same source in the same minute appear at DEBUG.WWW-Authenticate: Bearer realm="mcp-test", error="invalid_request", error_description="...", resource_metadata="..."and the JSON body haserror_description.