feat: refresh upstream credentials when external secrets expire#1689
feat: refresh upstream credentials when external secrets expire#1689
Conversation
|
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
c760fbb to
f52b485
Compare
c545e38 to
e3fde7c
Compare
e3fde7c to
04f89e4
Compare
04f89e4 to
709d2a2
Compare
🚀 Preview Environment (PR #1689)Preview URL: https://pr-1689.dev.getgram.ai
Gram Preview Bot |
709d2a2 to
0e7dddb
Compare
0e7dddb to
932b94e
Compare
| func (p *GramProvider) RefreshToken( | ||
| _ context.Context, | ||
| _ string, | ||
| _ repo.OauthProxyProvider, | ||
| _ *toolsets_repo.Toolset, | ||
| ) (*TokenExchangeResult, error) { | ||
| return nil, fmt.Errorf("refresh token not supported for gram provider") | ||
| } |
There was a problem hiding this comment.
Is there another way to go about this? Personally not keen on relying on a runtime error to surface an erroneous usage of the method.
There was a problem hiding this comment.
there might be but I think the responsibility should fall to the external oauth server config in this case (which I believe captures this case by not advertising the access_token grant) - but on some level Gram will surface a token endpoint and if we find a Gram provider we should probably error on it in a manner at least similar to this. Let me verify that behavior though
| var scopesSupported []string | ||
| providers, err := oauthRepo.ListOAuthProxyProvidersByServer(ctx, repo.ListOAuthProxyProvidersByServerParams{ | ||
| OauthProxyServerID: toolset.OauthProxyServerID.UUID, | ||
| ProjectID: toolset.ProjectID, | ||
| }) | ||
| if err == nil { | ||
| for _, p := range providers { | ||
| scopesSupported = append(scopesSupported, p.ScopesSupported...) | ||
| } | ||
| } | ||
|
|
There was a problem hiding this comment.
I'm having a hard time following what's happening in this chunk of code. Would you refactor it a bit?
Eg - the method ListOAuthProxyProvidersByServer insinuates that there will be multiple OAuthProxyProviders, but we only provide a single OauthProxyServerID.
Happy to discuss to clarify as well.
There was a problem hiding this comment.
Actually - it appears that our data design has a one-to-many relationship between oauth proxy servers and oauth proxy providers. This code block is accommodating for that.
In contrast, though, /oauth/impl.go is just grabbing the first provider on lines 286 and 535.
There was a problem hiding this comment.
I'm changing the behavior to error in this case. I think it makes the most sense to consider this particular state invalid
| // Always include offline_access — Gram issues refresh tokens for all provider types. | ||
| // The provider's own scopes describe upstream capabilities; offline_access describes | ||
| // Gram's capability as the authorization server. | ||
| scopes := []string{"offline_access"} | ||
| for _, s := range provider.ScopesSupported { | ||
| if s != "offline_access" { | ||
| scopes = append(scopes, s) | ||
| } |
There was a problem hiding this comment.
🔴 wellknown.go unconditionally adds "offline_access" scope, causing test to fail and incorrect metadata for gram providers
The new code in server/internal/oauth/wellknown/wellknown.go:108-112 unconditionally adds "offline_access" to the scopes slice for ALL proxy provider types, making ScopesSupported always non-empty. However, the test added in the same PR at server/internal/mcp/wellknown_test.go:326-394 creates a gram-type provider with empty ScopesSupported and asserts require.False(t, hasScopes, "scopes_supported should be omitted when empty"). Since scopes will always contain at least ["offline_access"], the omitempty JSON tag won't omit the field, and this test will fail.
Additionally, adding "offline_access" for gram providers is semantically incorrect — gram providers explicitly don't support refresh (server/internal/oauth/providers/gram.go:102-103: "refresh token not supported for gram provider"). The offline_access scope should only be added for provider types that support upstream token refresh (i.e., custom providers).
Prompt for agents
In server/internal/oauth/wellknown/wellknown.go, inside the ResolveOAuthServerMetadataFromToolset function, around lines 105-112, the offline_access scope is unconditionally added for all provider types. This should be conditional on the provider type supporting refresh tokens. For gram providers, offline_access should not be added since they do not support refresh.
Change the scopes logic to only include offline_access when the provider type is custom (or more generally, when it supports refresh). For gram providers, just pass through the provider's own ScopesSupported.
Also update the test in server/internal/mcp/wellknown_test.go around lines 326-394 (scopes_supported omitted when empty) to match the corrected behavior, or split it into two tests: one for gram providers (scopes omitted when empty) and one for custom providers (scopes always include offline_access).
Was this helpful? React with 👍 or 👎 to provide feedback.
There was a problem hiding this comment.
Devin, I dismiss this comment. Whether the upstream supports offline_access or NOT the client should request this particular scope so that Gram will grant it. It's not a big deal for it to request offline_access from the upstream even if unsupported. Even so I have made age-1509 to make the exchange more robust.
I did fix the tests tho
MCP clients can now refresh their proxy-issued tokens via grant_type=refresh_token, receiving a rotated access/refresh pair. Well-known metadata advertises the new grant type. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When an MCP client presents a valid proxy token whose underlying upstream credentials have expired, the server now attempts to refresh them using the upstream provider's token endpoint before returning an error. - Add RefreshToken to Provider interface and implement for CustomProvider - Store and encrypt upstream refresh tokens in ExternalSecrets - Add RefreshExternalSecrets to TokenService for in-place secret updates - Add RefreshProxyToken to OAuth Service and MCP OAuthService interface - Catch ErrExpiredExternalSecrets in MCP auth flow and attempt refresh Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add ScopesSupported to OAuthServerMetadata and populate it from the proxy provider's configured scopes. Also pass through scopes from external MCP OAuth discovery. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When the upstream provider doesn't issue a new refresh_token in the refresh response (which is valid per the OAuth spec), fall back to the original refresh token instead of storing an empty string. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Three issues prevented refresh_token grant from working: 1. Token cache TTL equaled access token lifetime, so Redis evicted the refresh token entry at the exact moment it was needed. Added 24h grace period to Token.TTL(). 2. ValidateAccessToken deleted the entire token (including refresh token cache key) when the access token expired. Removed the deletion so clients can still exchange the refresh token. 3. ExchangeRefreshToken rejected tokens past ExpiresAt, but the whole point of a refresh token is to work after access token expiry. Removed the expiry check. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
getToken had its own expiry check that prevented ValidateAccessToken and ExchangeRefreshToken from ever seeing expired tokens. This made refresh_token grants impossible since the token couldn't be looked up after the access token expired. Expiry is already checked by ValidateAccessToken (the sole caller of getToken), so the check here was redundant and harmful. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Per OIDC Core §11, when offline_access is requested the authorization request must include prompt=consent. Without it, spec-compliant providers silently drop offline_access and no refresh token is issued. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Cover all 5 test groups from the refresh token QA spec: - Group 1: downstream token refresh (ExchangeRefreshToken, token rotation, secret preservation) - Group 2: upstream refresh via RefreshProxyToken (success, preserve RT, no RT, errors) - Group 3: MCP ServePublic custom OAuth proxy (expired secrets refresh, failure 401, valid skip) - Group 4: well-known metadata (grant_types includes refresh_token, scopes_supported) - Group 5: CustomProvider.RefreshToken (auth methods, camelCase, rotation, errors) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace manual map[string]any indexing with two typed structs (snake_case and camelCase) and a shared parseTokenResponse helper. Adds a warning log when the non-compliant camelCase path is hit so we can track usage and eventually remove the fallback. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
f0d6e92 to
6ba7343
Compare
| urlParams.Set("scope", req.Scope) | ||
| } |
There was a problem hiding this comment.
🔴 Well-known metadata advertises offline_access scope but authorize endpoint doesn't forward it to upstream, preventing refresh tokens from being issued
The well-known metadata (wellknown.go:108-113) always includes offline_access in scopes_supported, and the grant_types_supported now includes refresh_token. However, the authorize endpoint (impl.go:366-370) builds the upstream scope from provider.ScopesSupported directly, which does NOT include offline_access (it's only added in the well-known metadata). This means the upstream OAuth provider typically won't return a refresh_token in its token response (per OIDC spec, offline_access must be requested to get a refresh token). Without an upstream refresh token, the new RefreshProxyToken feature (impl.go:748-749) returns ErrNoUpstreamRefreshToken, making the entire upstream auto-refresh feature inoperable for providers that require offline_access to issue refresh tokens.
Scope mismatch details
In wellknown.go:108:
scopes := []string{"offline_access"}
for _, s := range provider.ScopesSupported {But in impl.go:366:
if len(provider.ScopesSupported) > 0 {
urlParams.Set("scope", strings.Join(provider.ScopesSupported, " "))The prompt=consent guard at impl.go:373 only fires when offline_access is already in scope, but it never gets there because offline_access was never added to the upstream request.
(Refers to lines 366-370)
Prompt for agents
In server/internal/oauth/impl.go, in the handleAuthorize function around lines 365-370, the upstream scope is constructed from provider.ScopesSupported but never includes offline_access. To ensure the upstream provider returns a refresh token, offline_access should be included in the scope sent to the upstream authorize URL. This should mirror the logic in server/internal/oauth/wellknown/wellknown.go lines 108-113 where offline_access is prepended to the scopes. Specifically, after constructing the scope from provider.ScopesSupported (or req.Scope), add offline_access if it's not already present. For example:
scope := strings.Join(provider.ScopesSupported, " ")
if !strings.Contains(scope, "offline_access") {
scope = "offline_access " + scope
}
urlParams.Set("scope", scope)
This must be done BEFORE the prompt=consent check at line 373, which depends on offline_access being in the scope.
Was this helpful? React with 👍 or 👎 to provide feedback.
Summary
Fixes AGE-1402
Changes
RefreshTokentoTokenExchangeResult, addRefreshTokenmethod toProviderinterfacerefresh_tokenfrom upstream response, implementRefreshTokenmethod (postsgrant_type=refresh_tokento upstream)RefreshToken(not supported for Gram provider)RefreshTokenfield toExternalSecretErrExpiredExternalSecrets,ErrNoUpstreamRefreshTokensentinels;ValidateAccessTokenreturns token + error on expired external secrets (instead of deleting); addRefreshExternalSecretsmethod; encrypt/decryptExternalSecret.RefreshTokenRefreshProxyTokenmethodRefreshProxyTokentoOAuthServiceinterfaceErrExpiredExternalSecretsin custom provider auth path, attempt upstream refresh before failingTest plan
🤖 Generated with Claude Code