Add authserver DCR credential store and resolver#5042
Conversation
There was a problem hiding this comment.
Large PR Detected
This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.
How to unblock this PR:
Add a section to your PR description with the following format:
## Large PR Justification
[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformationAlternative:
Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.
See our Contributing Guidelines for more details.
This review will be automatically dismissed once you add the justification section.
abbc15a to
9b66570
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #5042 +/- ##
==========================================
+ Coverage 67.29% 67.51% +0.21%
==========================================
Files 598 601 +3
Lines 60348 60969 +621
==========================================
+ Hits 40612 41163 +551
- Misses 16659 16693 +34
- Partials 3077 3113 +36 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
d578dfe to
000369f
Compare
9b66570 to
fc135b4
Compare
fc135b4 to
3478e8c
Compare
000369f to
9ea4a15
Compare
3478e8c to
cc43b14
Compare
e6588ea to
7dc73f8
Compare
cc43b14 to
5ad7473
Compare
Implements Phase 2 steps 2f/2c of the DCR story (#5038): - DCRCredentialStore interface with in-memory implementation keyed by (Issuer, RedirectURI, ScopesHash). No TTL — RFC 7591 registrations are long-lived and only purged by explicit deregistration. The ScopesHash tuple is designed to compose a Redis key segment in Phase 3 without redefining the canonical form. - resolveDCRCredentials performs cache lookup, authorization server metadata discovery (or direct registration endpoint), explicit endpoint override, auth-method intersection against the preference order private_key_jwt > client_secret_basic > client_secret_post > none, synthesis of {origin}/register when metadata omits registration_endpoint, and a single call to oauthproto.RegisterClientDynamically with the initial access token injected as Authorization: Bearer via a wrapping RoundTripper. - Resolution stored before returning, per the write-durable-first rule in .claude/rules/go-style.md. Address code review feedback on DCR resolver Fixed issues from code review: - HIGH: DCRUpstreamConfig.DiscoveryURL now honoured. Added oauthproto.FetchAuthorizationServerMetadataFromURL which fetches the operator-configured URL exactly (no well-known-path fallback) and enforces RFC 8414 §3.3 issuer equality. Updated field doc comments to match behaviour. - HIGH: TestResolveDCRCredentials_CacheHitShortCircuits rewritten to wire the count-everything server as the issuer and assert totalRequests == 0, so a regression that moved discovery before the cache lookup would actually fail the test. - MEDIUM: Added tests for discovered-scopes fallback and empty-scope omitted-field branches. - MEDIUM: Exposed oauthproto.NewDefaultDCRClient so newDCRHTTPClient inherits the canonical DCR timeout policy rather than duplicating it — future timeout changes propagate automatically. - MEDIUM: Replaced hand-rolled readAll / readAllResp helpers with io.ReadAll. - MEDIUM: Documented that direct RegistrationEndpoint disables server-capability negotiation. - MEDIUM: Documented that applyResolution always overwrites ClientID and explained the precondition enforcement. - MEDIUM: scopesHash now deduplicates (slices.Compact) so scope sets that differ only by duplicated values share a canonical digest. Test subtests updated accordingly. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
5ad7473 to
80d5cbe
Compare
Closes the 14 review threads from review #4199898531 that are scoped to this PR; #3 and #5 are intentionally deferred to PR #5044 which is the PR that consumes the ClientSecret plumbing and RedirectURI propagation. Behaviour fixes: - #2 synthesiseRegistrationEndpoint and resolveUpstreamRedirectURI now preserve the issuer's path component, so multi-tenant upstreams that ship DCR without advertising it (or this auth server when it itself has a tenant prefix in its issuer) keep their tenant prefix in the synthesised URL and the defaulted redirect URI. - #7 resolveDCRCredentials wraps registration in a package-level singleflight.Group keyed by DCRKey so concurrent callers for the same upstream coalesce into a single RFC 7591 registration. The leader's closure rechecks the cache before any network I/O so followers that arrive just after a Put returns observe the fresh entry. - #9 applyResolution writes ClientID only when the run-config copy leaves it empty, treating it symmetrically with the endpoint writes. This is defence-in-depth for callers that bypass the resolver's validateResolveInputs precondition check. - #10 the cfg.RegistrationEndpoint short-circuit branch in resolveDCREndpoints validates URL well-formedness and HTTPS-or- loopback locally, so a non-HTTPS endpoint fails before performRegistration constructs a bearer-token transport for it. - #13 endpointsFromMetadata validates discovered authorization_endpoint and token_endpoint before populating dcrEndpoints, so a self- consistent metadata document with http:// URLs cannot flow through to the auth-code or token-exchange path. Doc / scope: - #6 trust-assumption note added on DCRUpstreamConfig with a link to the follow-up SSRF-hardening issue (#5135) per the "Document Architectural Constraints" rule. - #8 NewInMemoryDCRCredentialStore doc now explicitly calls out what the in-memory store does and does NOT solve (cross-replica sharing, durability, cross-process write coordination). Test hygiene: - #4 deleted the literal-equals-literal dcrStaleAgeThreshold test that added no meaningful coverage. - #11 atomic.AddInt32 / LoadInt32 for wellKnownHits / customHits in TestFetchAuthorizationServerMetadataFromURL, matching the sibling pattern in dcr_test.go. - #12 new failingDCRStore test double + two cases asserting the cache.Get / cache.Put wrap messages ("dcr: cache lookup", "dcr: cache put") that operators see when a future Redis backend errors. - #14 renamed the TestNeedsDCR case from "client_id with dcr (invalid but returns false)" to "client_id wins over dcr_config (defensive AND semantic)" so the case label describes the intentional defensive behaviour rather than reading like a bug report. - #15 removed a //nolint:lll directive that was suppressing a non- existent violation (the line is 108 chars; the limit is 130). - #16 new TestInMemoryDCRCredentialStore_ConcurrentAccess fans out 16 goroutines doing alternating Put/Get against overlapping and disjoint keys, with a fail-fast deadline, so go test -race actually exercises the sync.RWMutex guard. New tests cover each behaviour change: - TestResolveDCRCredentials_SingleflightCoalescesConcurrentCallers pins exactly-one-registration under N concurrent callers. - TestSynthesiseRegistrationEndpoint_PreservesIssuerPath / TestResolveUpstreamRedirectURI_PreservesIssuerPath cover the path- preservation fix. - TestApplyResolution_DoesNotOverwritePreProvisionedClientID covers the symmetric ClientID write. - TestResolveDCREndpoints_DirectRegistrationEndpointValidated covers the local registration-endpoint guard. - TestEndpointsFromMetadata_RejectsInsecureDiscoveredEndpoints covers the metadata authz/token endpoint guard. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
556c957 to
6ec02c0
Compare
|
While reviewing I noticed we're about to ship two DCR-client implementations: this resolver and the CLI flow in A shared |
jhrozek
left a comment
There was a problem hiding this comment.
Inline review of the DCR resolver scaffolding — seven questions / suggestions across dcr.go, dcr_test.go, and dcr_store.go. Top-level architectural note posted separately.
RFC 7636 §4.2 / OAuth 2.1 require S256 PKCE for public clients. The resolver previously accepted the "none" token-endpoint auth method whenever an upstream advertised it, with no check against code_challenge_methods_supported. Registering as "none" against an upstream that only advertises "plain" — or omits the field entirely — would either break at runtime (our auth-code path always sends code_challenge_method=S256) or silently downgrade to a non-compliant plain-PKCE flow. - dcrEndpoints now carries codeChallengeMethodsSupported, populated from metadata.CodeChallengeMethodsSupported in endpointsFromMetadata. - selectTokenEndpointAuthMethod takes the slice as a second argument and skips "none" when S256 is not advertised, falling through to the next less-preferred method. When "none" is the only mutually supported method and S256 is missing, the function returns an explicit error citing the spec so operators see a clear failure at boot rather than a silent runtime downgrade. - TestResolveDCRCredentials_AuthMethodPreference's "falls back to none" case now also advertises S256 to reflect the compliant flow. - New TestResolveDCRCredentials_RefusesNoneWithoutS256 pins the rejection in both shapes the upstream can take it (field omitted, field lists only plain). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tgrunnagle
left a comment
There was a problem hiding this comment.
Multi-Agent Consensus Review
Agents consulted: security, concurrency, test-coverage, error-handling/api-design, architecture, oauth-protocol, code-quality (general). Codex cross-review: skipped (codex CLI not installed).
Consensus Summary
| # | Finding | Consensus | Severity | Action |
|---|---|---|---|---|
| F1 | resolveDCRCredentials doc misidentifies the issuer parameter as the upstream's |
9/10 | HIGH | Fix doc / consider rename |
| F2 | Direct RegistrationEndpoint path: nothing forces caller to set AuthorizationEndpoint/TokenEndpoint |
8/10 | MEDIUM | Tighten OAuth2UpstreamRunConfig.Validate |
| F3 | endpointsFromMetadata derefs metadata without nil guard |
7/10 | LOW | Add nil guard |
| F4 | singleflight returns shared *DCRResolution to followers |
7/10 | LOW | Document or copy |
| F5 | Synthesised registration endpoint not re-validated for HTTPS-or-loopback | 7/10 | LOW | Add validateUpstreamEndpointURL call |
| F6 | chooseRegistrationScopes warning logs the wrong issuer (auth-server vs. upstream) |
7/10 | LOW | Pass upstream issuer (related to F1) |
| F7 | validateResolveInputs nil-rc / empty-issuer / nil-cache branches uncovered |
7/10 | LOW | Add table rows |
| F8 | synthesiseRegistrationEndpoint error paths uncovered |
7/10 | LOW | Add error rows |
| F9 | deriveExpectedIssuerFromDiscoveryURL missing-host branch uncovered |
7/10 | LOW | Add 1 row |
| F10 | buildResolution fallback to sent auth method uncovered |
7/10 | LOW | Add 1 test |
| F11 | dcrStaleAgeThreshold is dead code in this PR (nolint:unused) |
9/10 | INFO | Acknowledge / track for #5039 |
Overall
The PR delivers Phase 2 of the DCR story with strong defensive engineering: HTTPS-or-loopback enforced at every URL boundary, RFC 8414 §3.3 issuer-equality, a registration redirect-block that prevents bearer-token leakage to foreign origins (pinned by _DoesNotForwardBearerOnRedirect with a foreign-origin tripwire), defensive copies on the cache, and singleflight-coalesced registration to prevent orphaned clients at the upstream. Test coverage is dense (~30 functions, 1216 lines on the resolver alone) with explicit security-property tests and a table-driven concurrent-store stress test bounded by a fail-fast deadline. Module boundaries are clean — protocol primitives stay in pkg/oauthproto, runtime resolution and the bearer-token transport stay in pkg/authserver/runner. The PR description and doc-comment density are exemplary.
The blocking concern is F1: the public doc on resolveDCRCredentials says issuer is the upstream's identifier used both for discovery and for keying the cache, but the implementation never uses issuer for discovery — deriveExpectedIssuerFromDiscoveryURL(cfg.DiscoveryURL) recovers the upstream issuer separately. The inline comment in registerAndCache and the test _UpstreamIssuerDerivedFromDiscoveryURL both confirm that the parameter actually names this auth server (used for redirect-URI defaulting and cache keying). The next caller (#5039) reading the public doc will pass the wrong identifier, producing a wrong-origin default redirect URI under the upstream and a wrong cache key. Either fix the doc or rename the parameter (localIssuer / authserverIssuer); the same ambiguity is on DCRKey.Issuer in dcr_store.go.
F2 is the next concern: when an operator chooses dcr_config.registration_endpoint (bypass discovery), the doc on RegistrationEndpoint warns them to also supply AuthorizationEndpoint/TokenEndpoint/Scopes, but OAuth2UpstreamRunConfig.Validate does not enforce this. A misconfigured run-config registers a client successfully and silently fails at first authorize/token-exchange call. Per .claude/rules/go-style.md Constructor Validation: Fail Loudly on Invalid Input, this should fail at startup.
The remaining findings (F3–F11) are individually small but mostly mechanical: a defensive nil-guard, a singleflight pointer-aliasing note, a missed re-validation of one synthesised URL, a logging-issuer mix-up that follows naturally from F1, and four test-coverage gaps each a row or two of additions. F11 is a tracked forward-prep nolint:unused constant that is fine as-is provided #5039 lands on schedule.
No security-property regression and no breakage of the pre-provisioned-client_id path. Recommendation: COMMENT — F1/F2 are worth addressing before merge but neither blocks correctness today (the wiring caller does not yet exist).
Documentation
- F1's recommendation should propagate to the doc on
DCRKey.Issuerinpkg/authserver/runner/dcr_store.go, which currently readsthe authorization server's issuer identifier— same auth-server-vs-upstream ambiguity.
Generated with Claude Code
oauthproto decodes RFC 7591 §3.2.1 client_id_issued_at and
client_secret_expires_at, but buildResolution dropped both. The
authoritative consequence was for client_secret_expires_at: a non-zero
upstream-asserted expiry was invisible to lookupCachedResolution, so
the cache served an expired secret indefinitely, every token-endpoint
call after the upstream's expiry would 401, and the resolver had no
signal to refetch.
The 90-day dcrStaleAgeThreshold added for Step 2g observability is a
separate, heuristic signal ("consider rotating") and does not address
this — the upstream-asserted expiry is authoritative.
- DCRResolution now carries ClientIDIssuedAt and ClientSecretExpiresAt
as time.Time. The wire convention "0 means absent / does not expire"
maps to the zero time.Time so callers can use IsZero() uniformly,
via a small epochSecondsToTime helper.
- buildResolution populates both fields by converting the int64 epoch
values from the registration response.
- lookupCachedResolution treats an entry with a non-zero
ClientSecretExpiresAt that has passed as a cache miss, so the
singleflight body re-runs registerAndCache and overwrites the stale
entry via the existing cache.Put. A Debug log attributes the
refetch to client_secret_expires_at so operators can correlate the
re-registration with the upstream's expiry rather than guessing.
Tests:
- TestBuildResolution_PopulatesRFC7591ExpiryFields covers all three
shapes (both populated, expires_at omitted, both omitted) and that
zero on the wire becomes the zero time.Time.
- TestResolveDCRCredentials_RefetchesOnExpiredCachedSecret pins the
cache-miss-on-expiry behaviour: an upstream that returns an already-
expired secret causes back-to-back resolver calls to register twice.
- TestResolveDCRCredentials_HonoursFutureExpiryAndZero pins the cache-
hit path: future expiry and the zero-means-does-not-expire wire
convention both serve from the cache without re-registering.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
singleflight.Group re-panics the leader's panic in every follower, so without recovery a panic in registerAndCache (or anything it calls — the registration HTTP exchange, the cache.Put, etc.) would crash the leader goroutine and every concurrent follower for the same DCRKey with the same value. Net effect: a single bad upstream response could take out N goroutines instead of one error path. - The dcrFlight.Do closure now installs a defer/recover that converts a panic into a normal error. The recovered value is logged at slog.Error with a full debug.Stack() so the panic is not silently swallowed; the returned error wraps the recovered value so callers can react to it as a normal failure mode. - Doc comment on the singleflight call expanded to explain why the recover is necessary (the fan-out semantics of singleflight). Tests: - panickingPutDCRStore: a new test double whose Put panics with a configured value while Get returns a normal cache miss, so callers reach the singleflight closure and trigger the panic via cache.Put inside registerAndCache. - TestResolveDCRCredentials_RecoversPanicInsideSingleflight: spawns N=6 concurrent callers for the same DCRKey, each with its own defer/recover so an un-recovered panic is captured as a test failure rather than crashing the suite. Asserts every caller received an error containing the panic-marker string and the recovered value, and that none of them observed an actual panic. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The existing assertion "registrationCalls == 1" is necessary but not sufficient: a goroutine that reached resolveDCRCredentials after the leader's cache.Put returned would short-circuit through lookupCachedResolution, take the cache hit, and still leave the count at 1. Under CI load the 50ms gate window was easy to miss, which would silently weaken what the test claims to cover. - Wrap the in-memory store in a countingStore decorator that records every Get that returned a hit. - Assert cache.hits.Load() == 0 after wg.Wait — any non-zero value means a late-arriver took the lookupCachedResolution short-circuit instead of joining the singleflight, and the coalescing path was not actually exercised. - Bump the gate window from 50ms to 250ms for headroom under CI load. The countingStore assertion turns any remaining timing slip into a loud failure rather than a silent regression. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous TestResolveDCRCredentials_RequiresClientIDEmpty and TestResolveDCRCredentials_RequiresDCRConfig each exercised one branch of validateResolveInputs, leaving three uncovered: nil run-config, empty issuer, and nil credential store. Per .claude/rules/testing.md "Consolidation", folding into one table- driven test covers every branch in one place and shrinks duplicated setup. - Replace the two single-branch tests with TestResolveDCRCredentials_RejectsInvalidInputs, which iterates over all five branches and asserts the wrapped error message contains the expected substring drawn from validateResolveInputs's literals. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reviewer asked us to confirm the framing and document it on the constructor. The unbounded-cache concern in .claude/rules/go-style.md "Resource Leaks" does not bite for this implementation because the key space is bounded by the operator-configured upstream count and because Phase 3 Redis is the production answer; the doc now states that explicitly so a future reader does not assume this is a production-grade store. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A nil resolution previously fell through as a silent no-op: the caller got a successful return, the next Get returned a miss, and there was no error trail to debug from. Per the constructor-validation rule in .claude/rules/go-style.md, fail loudly at the boundary. - inMemoryDCRCredentialStore.Put returns an explicit error when the resolution is nil, before taking the lock. - DCRCredentialStore.Put interface doc records the contract so future backends (e.g. the Phase 3 Redis store) behave consistently. - New TestInMemoryDCRCredentialStore_Put_RejectsNilResolution pins the rejection and asserts the rejected Put did not leave any partial entry behind. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The resolveDCRCredentials doc said "issuer is the upstream authorization server's issuer identifier used both for discovery and for keying the cache." That was wrong on both counts: the upstream's issuer is recovered separately from rc.DCRConfig.DiscoveryURL via deriveExpectedIssuerFromDiscoveryURL, and the parameter actually names *this* auth server. An inline comment in registerAndCache already contradicted the public doc, and a future caller reading only the public doc would have produced (a) a wrong-origin default redirect URI under the upstream IdP and (b) a cache key that does not identify the auth-server context. Renaming the parameter and documenting the contract closes the trap. - resolveDCRCredentials: parameter renamed issuer → localIssuer; doc rewritten to spell out that it is *this* auth server's issuer, used to key the cache and default the redirect URI, and that the upstream's issuer is recovered separately. Same rename applied to validateResolveInputs, registerAndCache, lookupCachedResolution, chooseRegistrationScopes, and resolveUpstreamRedirectURI for internal consistency. - endpointsFromMetadata, synthesiseRegistrationEndpoint: parameter renamed issuer → upstreamIssuer (it is in fact the upstream's, recovered from the DiscoveryURL) so the local/upstream split is symmetric. - DCRKey.Issuer: doc clarifies the field carries *this* auth server's local issuer; two different local issuers registering against the same upstream are distinct OAuth clients and must not share cache entries. - Structured-log keys "issuer" → "local_issuer" everywhere the value is the local issuer, so operators can tell the two apart in logs too. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When dcr_config.registration_endpoint is set, the resolver bypasses RFC 8414 / OIDC Discovery and therefore cannot populate authorization_endpoint or token_endpoint from server metadata. The operator-facing prose on DCRUpstreamConfig already says the run-config must supply both, but neither validator enforced it. The downstream effect was silent: dcrEndpoints came back with empty authorize/token URLs, applyExplicitEndpointOverrides only filled them from rc, and applyResolution only wrote when rc fields were empty — so a misconfigured run-config completed registration successfully and only broke at the first authorize or token-exchange call. - OAuth2UpstreamRunConfig.Validate now rejects a config that sets DCRConfig.RegistrationEndpoint without both AuthorizationEndpoint and TokenEndpoint, with an error message that names the field and explains the discovery-bypass rationale. - The discovery-flow path (DCRConfig.DiscoveryURL) is unaffected: metadata populates the endpoints there, so leaving them empty in the run-config remains valid. - Existing test case "DCRConfig with only registration_endpoint is valid" updated to also set authorize/token endpoints (the path it was claiming to cover); two new cases pin the rejection when each endpoint is missing; one new case pins that DiscoveryURL is still valid without explicit endpoints. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Large PR justification has been provided. Thank you!
|
✅ Large PR justification has been provided. The size review has been dismissed and this PR can now proceed with normal review. |
jhrozek
left a comment
There was a problem hiding this comment.
Seven inline comments addressed across dedicated commits — each fix matches or exceeds the suggestion. Two extra commits close adjacent footguns (local/upstream issuer disambiguation, explicit-endpoints validation when discovery is bypassed). LGTM.
Summary
Upstream Identity Providers that support RFC 7591 Dynamic Client Registration can now be configured without a pre-provisioned
client_id. At startup the authserver discovers the registration endpoint, registers itself once per(issuer, redirect_uri, scopes)tuple, and caches the resulting credentials for the process lifetime. This unblocks the remaining wiring + observability work in sub-issue C and removes the last manual provisioning step for DCR-capable upstreams (e.g., nanobot, Hydra).client_id/client_secretinto the run config. Phase 2 of the DCR story (Authserver DCR integration (Phase 2, Steps 2a-2g) #4978) lets the authserver do this at startup, keyed canonically so a future Redis backend (Phase 3) can share the cache across replicas without redefining the key shape.DCRCredentialStoreinterface + in-memory implementation keyed by(Issuer, RedirectURI, ScopesHash).ScopesHashis SHA-256 of the deduplicated + sorted scope list, so permuted or duplicated scope inputs share a digest. No TTL, no eviction — RFC 7591 registrations are long-lived and are only revoked by explicit RFC 7592 deregistration.resolveDCRCredentialsperforms (1) cache lookup (short-circuits before any network I/O on hit), (2) authorization-server metadata discovery viaFetchAuthorizationServerMetadatawhenDiscoveryURLis set, or the operator-suppliedRegistrationEndpointdirectly, (3) explicitrc.AuthorizationEndpoint/rc.TokenEndpointoverride of discovered values, (4) redirect-URI derivation from the issuer origin +/oauth/callbackwith HTTPS enforcement outside loopback, (5) auth-method intersection against the preference orderprivate_key_jwt > client_secret_basic > client_secret_post > none, (6) synthesis of{origin}/registerwhen metadata omitsregistration_endpoint(nanobot/Hydra convention), (7) a single call tooauthproto.RegisterClientDynamicallywith the initial access token injected asAuthorization: Bearervia a wrappingRoundTripper, and (8)DCRCredentialStore.Putbefore returning — per the write-durable-first rule in.claude/rules/go-style.md.DCRResolution(RegistrationAccessToken,RegistrationClientURI,TokenEndpointAuthMethod,CreatedAt) so Phase 3 has everything it needs for management operations.DCRUpstreamConfig.DiscoveryURLis honoured exactly (no well-known-path fallback) via the newoauthproto.FetchAuthorizationServerMetadataFromURL, with the RFC 8414 §3.3 issuer-equality check preserved.Closes #5038
Type of change
Test plan
task test)task lint-fix)New test coverage (
pkg/authserver/runner/dcr_store_test.go,pkg/authserver/runner/dcr_test.go):DCRKeyvalues don't collide;ScopesHashis stable across permuted and duplicated scope order.httptest.NewServermounting AS metadata + DCR endpoint.AuthorizationEndpointoverrides discovered value.Authorization: Bearer ….client_secret_basicchosen overnonewhen both are advertised.registration_endpointtriggers the synthesized/registerpath.grep -nE '(client_secret|registration_access_token|initial_access_token|refresh_token)' pkg/authserver/runner/dcr.goreturns only struct-field / JSON-tag hits — noslog.*leakage.API Compatibility
v1beta1API, OR theapi-break-allowedlabel is applied and the migration guidance is described above.No CRD changes. The new
DCRConfigfield onOAuth2UpstreamRunConfigis additive and gated by an exactly-one-of validator againstClientID.Changes
pkg/authserver/runner/dcr_store.goDCRCredentialStoreinterface,DCRKey, in-memory implementation,dcrStaleAgeThresholdconstant.pkg/authserver/runner/dcr_store_test.gopkg/authserver/runner/dcr.goDCRResolution,needsDCR,applyResolution,scopesHash, auth-method intersection,resolveDCRCredentials.pkg/authserver/runner/dcr_test.go/register.pkg/authserver/config.goDCRUpstreamConfig+DCRConfigfield onOAuth2UpstreamRunConfig;Validateenforces exactly-one-of forClientIDvsDCRConfigandDiscoveryURLvsRegistrationEndpoint.pkg/authserver/config_test.gopkg/oauthproto/discovery.goFetchAuthorizationServerMetadata(three-path RFC 8414 / OIDC Discovery fallback, bounded per-call timeout, issuer-equality check) andFetchAuthorizationServerMetadataFromURL(exact fetch, no fallback).ErrRegistrationEndpointMissingsentinel.pkg/oauthproto/discovery_test.gopkg/oauthproto/dcr.gopkg/auth/oauth/dynamic_registration.go;RegisterClientDynamicallynow takes*http.Clientdirectly (nil builds default);NewDefaultDCRClientexposed so callers inherit the canonical timeout policy.pkg/oauthproto/{constants,errors,redirect,locality}.gopkg/oauth;IsLoopbackHostrelocated here withpkg/networking.IsLocalhostkept as a thin wrapper.pkg/oauth/*pkg/oauthproto; 8 redundant aliases dropped.Does this introduce a user-facing change?
Yes. Operators can now omit
client_idon anOAuth2UpstreamRunConfigand instead setdcr_configwith either adiscovery_urlor a directregistration_endpoint. The authserver will register itself with the upstream at startup and cache the result in memory for the process lifetime. The existing pre-provisioned-client_idpath is unchanged.Implementation plan
Approved implementation plan
Phase 2 of the DCR story (#4978), delivered across four commits on this branch:
Phase 1 (
3cb8de58) — Renamepkg/oauth→pkg/oauthprotoand consolidate DCR primitives intopkg/oauthproto/dcr.go. RelocateIsLocalhostlogic intopkg/oauthproto/IsLoopbackHost, keeppkg/networking.IsLocalhostas a thin wrapper.RegisterClientDynamicallytakes*http.Clientdirectly. HostNewDynamicClientRegistrationRequestinpkg/auth/discovery/dcr_request.go(sole caller; loopback redirect-URI assumption stays out of the protocol package).Step 2a/2b/2e (
d578dfee) — AddFetchAuthorizationServerMetadatatopkg/oauthproto(three-path RFC 8414 / OIDC Discovery fallback, bounded per-call timeout, RFC 8414 §3.3 issuer equality check,ErrRegistrationEndpointMissingsentinel). AddDCRUpstreamConfigstruct andDCRConfigfield onOAuth2UpstreamRunConfig. Exactly-one-of validation forClientIDvsDCRConfigandDiscoveryURLvsRegistrationEndpoint. Purely additive: no behaviour change, no CRD changes.Step 2f/2c (
e965b86d) — AddDCRCredentialStoreinterface + in-memory implementation keyed by(Issuer, RedirectURI, ScopesHash). AddresolveDCRCredentials: cache lookup, metadata discovery or direct registration endpoint, explicit endpoint override, auth-method intersection, synthesis of{origin}/registerwhen metadata omits it, single call tooauthproto.RegisterClientDynamicallywith initial access token injected via a wrappingRoundTripper,Putbefore returning.Review iterations (
abbc15ab+ amendments in earlier commits) — Address HIGH / MEDIUM findings:DCRUpstreamConfig.DiscoveryURLexactly viaFetchAuthorizationServerMetadataFromURL(no well-known-path fallback on an operator-supplied URL).NewDefaultDCRClientsonewDCRHTTPClientinherits the canonical timeout.scopesHashdeduplicates viaslices.Compact.applyResolutiondoc comment clarifies it overwritesClientID.RegistrationEndpointpath documented as disabling server-capability negotiation.io.ReadAllreplaces hand-rolled body readers; response bodies drained before close in error paths.Final review state: 0 CRITICAL / 0 HIGH / 0 MEDIUM.
Large PR Justification