Summary
After an idle period of >30 minutes with no incoming requests, muster forwards expired JWT ID tokens to downstream forwardToken: true MCP servers, resulting in 401 Unauthorized errors. Token refresh from the client (Backstage) succeeds at the muster level, but the stale ID token continues to be forwarded.
Root Cause
storeIDTokenForSSO() in internal/aggregator/auth_resource.go:226 stores tokens with only the IDToken field populated — no AccessToken, no ExpiresIn, no ExpiresAt:
oh.StoreToken(familyID, userID, musterIssuer, &api.OAuthToken{IDToken: idToken})
When TokenStore.Store() calls SetExpiresAtFromExpiresIn(), it's a no-op (ExpiresIn is 0), leaving ExpiresAt as zero. IsExpiredWithMargin() in pkg/oauth/types.go:98 treats zero as "never expires":
if t.ExpiresAt.IsZero() {
    return false // Tokens without expiration don't expire
}
This means GetByIssuer() returns these stale tokens indefinitely, regardless of the embedded JWT's actual exp claim.
Mechanism
- User logs in → SessionCreationHandler fires → storeIDTokenForSSO stores ID-only token with zero ExpiresAt
- Proactive refresh (triggered by incoming requests) keeps the upstream token fresh and calls TokenRefreshHandler → storeIDTokenForSSO updates the proxy store
- After >30 min idle, the upstream Dex token expires and no proactive refresh runs (it only triggers on incoming requests)
- Next request arrives → client-side refresh succeeds (muster rotates its own tokens), but the upstream Dex token is NOT refreshed → TokenRefreshHandler never fires
- getIDTokenForForwarding() reads from the proxy store → gets the stale zero-ExpiresAt token with expired JWT inside
- Downstream MCP server validates the JWT exp claim → rejects with 401 invalid_token
Additional Impact
Background SSE listener connections (listen to server forever) enter an infinite 1-second retry loop with stale tokens, causing:
- Persistent log flooding: ERROR: failed to listen to server. retry in 1 second: 401 invalid_token
- Unnecessary load on downstream servers
Reproduction
Unit test (instant)
Branch test/stale-id-token-forwarding contains TestTokenStore_IDOnlyTokenWithExpiredJWT_NeverEvicted in internal/oauth/token_store_test.go that demonstrates the core store behavior.
go test ./internal/oauth/ -run TestTokenStore_IDOnlyTokenWithExpiredJWT_NeverEvicted -v
End-to-end (35 minutes)
- Log in from Backstage to muster
- Make a request targeting a forwardToken: true server (e.g., gazelle-mcp-kubernetes); it succeeds
- Wait >30 minutes without making any requests
- Make the same request; it fails with 401 Unauthorized
Verified on
- Local dev instance (token forwarding to graveler-mcp-kubernetes)
- Deployed instance on gazelle cluster (token forwarding to gazelle-mcp-kubernetes)
- Login at 06:58 UTC, successful call at 08:28 UTC, failed call at 09:51 UTC after ~80 min idle gap
- Log:
CallTool failed for x_gazelle-mcp-kubernetes_capi_list_clusters: failed to initialize on-demand client for gazelle-mcp-kubernetes: authentication required: server returned 401 Unauthorized
Key Files
| File | Relevance |
|---|---|
| internal/aggregator/auth_resource.go:226-237 | storeIDTokenForSSO stores ID-only token with no expiry |
| pkg/oauth/types.go:96-102 | IsExpiredWithMargin returns false for zero ExpiresAt |
| internal/oauth/token_store.go:130-146 | GetByIssuer returns stale zero-expiry tokens |
| internal/aggregator/connection_helper.go:242-262 | getIDTokenForForwarding reads from proxy store |
| internal/aggregator/connection_helper.go:890-946 | makeTokenForwardingHeaderFunc forwards stale token |
| internal/aggregator/server.go:1419-1433 | TokenRefreshHandler only fires on proactive refresh |
Possible Fixes
(a) Set ExpiresAt on ID-only tokens based on the JWT exp claim in storeIDTokenForSSO
(b) Trigger an upstream Dex refresh when the client refreshes and the stored upstream token is expired
(c) Check the JWT exp claim in getIDTokenForForwarding before forwarding, and trigger re-authentication if expired
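Options (a) and (c) both need the `exp` claim from the stored ID token. Here is a minimal sketch of that step using only the standard library. The names `jwtExpiry` and `fakeJWT` are hypothetical and not part of muster. The sketch decodes the payload without verifying the signature, on the assumption that the token was already validated at login and that the downstream server still performs full validation.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
	"time"
)

// jwtExpiry (hypothetical helper) returns the exp claim of a compact JWT.
// It does NOT verify the signature; it only reads the payload segment.
func jwtExpiry(rawJWT string) (time.Time, error) {
	parts := strings.Split(rawJWT, ".")
	if len(parts) != 3 {
		return time.Time{}, errors.New("not a compact JWT: expected 3 segments")
	}
	// JWT segments are base64url without padding; trim any stray padding.
	payload, err := base64.RawURLEncoding.DecodeString(strings.TrimRight(parts[1], "="))
	if err != nil {
		return time.Time{}, fmt.Errorf("decode payload: %w", err)
	}
	var claims struct {
		Exp *float64 `json:"exp"` // NumericDate: seconds since the Unix epoch
	}
	if err := json.Unmarshal(payload, &claims); err != nil {
		return time.Time{}, fmt.Errorf("parse claims: %w", err)
	}
	if claims.Exp == nil {
		return time.Time{}, errors.New("JWT has no exp claim")
	}
	return time.Unix(int64(*claims.Exp), 0), nil
}

// fakeJWT builds an unsigned token for demonstration purposes only.
func fakeJWT(exp int64) string {
	enc := base64.RawURLEncoding.EncodeToString
	header := enc([]byte(`{"alg":"none"}`))
	payload := enc([]byte(fmt.Sprintf(`{"exp":%d}`, exp)))
	return header + "." + payload + ".sig"
}

func main() {
	now := time.Now()
	stale := fakeJWT(now.Add(-time.Hour).Unix())

	exp, err := jwtExpiry(stale)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// For option (a), exp would be assigned to ExpiresAt before storing.
	// For option (c), this comparison would gate forwarding.
	fmt.Println("expired:", !now.Before(exp)) // expired: true
}
```

For option (a), the returned time could be written to `ExpiresAt` before `StoreToken` is called, so the existing `IsExpiredWithMargin()` check starts to apply to ID-only tokens. For option (c), the same value could be compared against the current time in `getIDTokenForForwarding` before the token is forwarded.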