SSO token forwarding fails after idle period: stale ID token with zero ExpiresAt never evicted from proxy store #549

@gusevda

Summary

After an idle period of more than 30 minutes with no incoming requests, muster forwards expired JWT ID tokens to downstream MCP servers configured with forwardToken: true, which reject them with 401 Unauthorized. A token refresh initiated by the client (Backstage) succeeds at the muster level, but the stale ID token is still forwarded.

Root Cause

storeIDTokenForSSO() in internal/aggregator/auth_resource.go:226 stores tokens with only the IDToken field populated — no AccessToken, no ExpiresIn, no ExpiresAt:

oh.StoreToken(familyID, userID, musterIssuer, &api.OAuthToken{IDToken: idToken})

When TokenStore.Store() calls SetExpiresAtFromExpiresIn(), it's a no-op (ExpiresIn is 0), leaving ExpiresAt as zero. IsExpiredWithMargin() in pkg/oauth/types.go:98 treats zero as "never expires":

if t.ExpiresAt.IsZero() {
    return false // Tokens without expiration don't expire
}

This means GetByIssuer() returns these stale tokens indefinitely, regardless of the embedded JWT's actual exp claim.
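The behavior described above can be reproduced in isolation. The following is a minimal sketch, not muster's actual code: the token type and both methods are hypothetical stand-ins that only mirror the logic quoted in this issue, and the 30-second margin is an arbitrary value chosen for the demo.

```go
package main

import (
	"fmt"
	"time"
)

// token is a hypothetical stand-in for the stored OAuth token; only the
// fields relevant to this issue are modelled.
type token struct {
	IDToken   string
	ExpiresIn int       // seconds; 0 when never populated
	ExpiresAt time.Time // zero value when never populated
}

// setExpiresAtFromExpiresIn mirrors the described behavior: it does nothing
// when ExpiresIn is 0, so ExpiresAt stays at its zero value.
func (t *token) setExpiresAtFromExpiresIn(now time.Time) {
	if t.ExpiresIn > 0 {
		t.ExpiresAt = now.Add(time.Duration(t.ExpiresIn) * time.Second)
	}
}

// isExpiredWithMargin mirrors the check quoted above: a zero ExpiresAt is
// treated as "never expires".
func (t *token) isExpiredWithMargin(now time.Time, margin time.Duration) bool {
	if t.ExpiresAt.IsZero() {
		return false
	}
	return now.Add(margin).After(t.ExpiresAt)
}

// expiredAfterIdle stores a token with the given ExpiresIn (seconds) and
// reports whether it counts as expired after idleMinutes of inactivity.
func expiredAfterIdle(expiresIn, idleMinutes int) bool {
	stored := time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)
	t := &token{IDToken: "header.payload.signature", ExpiresIn: expiresIn}
	t.setExpiresAtFromExpiresIn(stored)
	now := stored.Add(time.Duration(idleMinutes) * time.Minute)
	return t.isExpiredWithMargin(now, 30*time.Second)
}

func main() {
	// ID-only token (ExpiresIn == 0): never reported as expired.
	fmt.Println("ID-only token expired after 35 min:", expiredAfterIdle(0, 35))
	// Token with a 30-minute lifetime: reported as expired after 35 minutes.
	fmt.Println("30-min token expired after 35 min:", expiredAfterIdle(1800, 35))
}
```

The first call prints false and the second prints true, which is the asymmetry this issue is about: the store has no way to notice that the JWT inside an ID-only entry has expired.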

Mechanism

  1. User logs in → SessionCreationHandler fires → storeIDTokenForSSO stores ID-only token with zero ExpiresAt
  2. Proactive refresh (triggered by incoming requests) keeps the upstream token fresh and calls TokenRefreshHandler → storeIDTokenForSSO, which updates the proxy store
  3. After >30 min idle, the upstream Dex token expires and no proactive refresh runs (it only triggers on incoming requests)
  4. Next request arrives → client-side refresh succeeds (muster rotates its own tokens), but the upstream Dex token is NOT refreshed → TokenRefreshHandler never fires
  5. getIDTokenForForwarding() reads from the proxy store → gets the stale zero-ExpiresAt token with expired JWT inside
  6. Downstream MCP server validates the JWT exp claim → rejects with 401 invalid_token

Additional Impact

Background SSE listener connections (listen to server forever) enter an infinite 1-second retry loop with stale tokens, causing:

  • Persistent log flooding: ERROR: failed to listen to server. retry in 1 second: 401 invalid_token
  • Unnecessary load on downstream servers

Reproduction

Unit test (instant)

Branch test/stale-id-token-forwarding contains TestTokenStore_IDOnlyTokenWithExpiredJWT_NeverEvicted in internal/oauth/token_store_test.go that demonstrates the core store behavior.

go test ./internal/oauth/ -run TestTokenStore_IDOnlyTokenWithExpiredJWT_NeverEvicted -v

End-to-end (35 minutes)

  1. Login from Backstage to muster
  2. Make a request targeting a forwardToken: true server (e.g., gazelle-mcp-kubernetes) — succeeds
  3. Wait >30 minutes without making any requests
  4. Make the same request — fails with 401 Unauthorized

Verified on

  • Local dev instance (token forwarding to graveler-mcp-kubernetes)
  • Deployed instance on gazelle cluster (token forwarding to gazelle-mcp-kubernetes)
    • Login at 06:58 UTC, successful call at 08:28 UTC, failed call at 09:51 UTC after ~80 min idle gap
    • Log: CallTool failed for x_gazelle-mcp-kubernetes_capi_list_clusters: failed to initialize on-demand client for gazelle-mcp-kubernetes: authentication required: server returned 401 Unauthorized

Key Files

| File | Relevance |
| --- | --- |
| internal/aggregator/auth_resource.go:226-237 | storeIDTokenForSSO stores ID-only token with no expiry |
| pkg/oauth/types.go:96-102 | IsExpiredWithMargin returns false for zero ExpiresAt |
| internal/oauth/token_store.go:130-146 | GetByIssuer returns stale zero-expiry tokens |
| internal/aggregator/connection_helper.go:242-262 | getIDTokenForForwarding reads from proxy store |
| internal/aggregator/connection_helper.go:890-946 | makeTokenForwardingHeaderFunc forwards stale token |
| internal/aggregator/server.go:1419-1433 | TokenRefreshHandler only fires on proactive refresh |

Possible Fixes

(a) Set ExpiresAt on ID-only tokens based on the JWT exp claim in storeIDTokenForSSO
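One way option (a) could look, as a rough sketch using only the standard library: decode the JWT payload and copy its exp claim into ExpiresAt before storing. The names jwtExpiry, fakeJWT, and the token struct are hypothetical and not part of muster. The signature is deliberately not verified here, on the assumption that the ID token was already validated when it was issued; the claim is only read to pick a store expiry.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
	"time"
)

// jwtExpiry reads the exp claim from a compact JWT without verifying the
// signature. It is only meant for choosing a store expiry, not for trust.
func jwtExpiry(rawJWT string) (time.Time, error) {
	parts := strings.Split(rawJWT, ".")
	if len(parts) != 3 {
		return time.Time{}, errors.New("not a compact JWT")
	}
	payload, err := base64.RawURLEncoding.DecodeString(parts[1])
	if err != nil {
		return time.Time{}, fmt.Errorf("decode payload: %w", err)
	}
	var claims struct {
		Exp *float64 `json:"exp"` // NumericDate: seconds since the Unix epoch
	}
	if err := json.Unmarshal(payload, &claims); err != nil {
		return time.Time{}, fmt.Errorf("parse claims: %w", err)
	}
	if claims.Exp == nil {
		return time.Time{}, errors.New("JWT has no exp claim")
	}
	return time.Unix(int64(*claims.Exp), 0), nil
}

// fakeJWT builds an unsigned token for the demo only.
func fakeJWT(exp int64) string {
	enc := base64.RawURLEncoding.EncodeToString
	header := enc([]byte(`{"alg":"none"}`))
	payload := enc([]byte(fmt.Sprintf(`{"sub":"demo","exp":%d}`, exp)))
	return header + "." + payload + ".sig"
}

// token is a hypothetical stand-in for the stored OAuth token.
type token struct {
	IDToken   string
	ExpiresAt time.Time
}

func main() {
	idToken := fakeJWT(time.Now().Add(30 * time.Minute).Unix())
	t := &token{IDToken: idToken}
	if exp, err := jwtExpiry(idToken); err == nil {
		t.ExpiresAt = exp // the store can now evict the entry once the JWT expires
	}
	fmt.Println("ExpiresAt is zero:", t.ExpiresAt.IsZero())
}
```

With ExpiresAt populated this way, the existing expiry check would stop returning the entry once the embedded JWT is past its exp, without changing how tokens that genuinely have no expiry are treated.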

(b) Trigger an upstream Dex refresh when the client refreshes and the stored upstream token is expired

(c) Check the JWT exp claim in getIDTokenForForwarding before forwarding, and trigger re-authentication if expired
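Option (c) could be sketched as a small guard placed in front of the forwarding path. Everything below is hypothetical: forwardableIDToken, expiryFunc, errIDTokenExpired, and checkAt are illustrative names, and how the expiry is obtained (for example by decoding the JWT exp claim) is left to the injected function. The sketch assumes the caller reacts to the sentinel error by refreshing the upstream token or requesting re-authentication.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errIDTokenExpired is a hypothetical sentinel that tells the caller to
// refresh upstream or re-authenticate instead of forwarding the token.
var errIDTokenExpired = errors.New("stored ID token is expired")

// expiryFunc returns the expiry of a raw ID token, for example by decoding
// its JWT exp claim.
type expiryFunc func(rawIDToken string) (time.Time, error)

// forwardableIDToken returns raw only if it is still valid at now+margin.
// It fails closed: a token whose expiry cannot be determined is not forwarded.
func forwardableIDToken(raw string, expiry expiryFunc, now time.Time, margin time.Duration) (string, error) {
	exp, err := expiry(raw)
	if err != nil {
		return "", fmt.Errorf("cannot determine ID token expiry: %w", err)
	}
	if !now.Add(margin).Before(exp) {
		return "", errIDTokenExpired
	}
	return raw, nil
}

// checkAt is a demo helper that runs the guard with a fixed expiry and a
// fixed clock (both in Unix seconds) and a 30-second margin.
func checkAt(expUnix, nowUnix int64) error {
	fixed := func(string) (time.Time, error) { return time.Unix(expUnix, 0), nil }
	_, err := forwardableIDToken("header.payload.signature", fixed, time.Unix(nowUnix, 0), 30*time.Second)
	return err
}

func main() {
	fmt.Println("fresh token:", checkAt(2000, 1000))
	fmt.Println("stale token:", checkAt(1000, 2000))
}
```

Returning a distinct error rather than an empty string lets the caller tell "no token stored" apart from "token stored but expired", which matters if only the second case should trigger re-authentication.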
