fix(crm-agent): mcp_auth_proxy refresh based on token exp, not wall-clock#74

Merged
carvychen merged 1 commit into main from fix/proxy-token-expiry-detect
May 10, 2026

@carvychen
Owner

Summary

User caught a 401 in Claude Code through the proxy:

```
Your access token has expired (expired at 04:18:55 UTC, current time 04:26:01 UTC)
```

Token had expired ~7 min ago, but the proxy was still serving it.

Root cause

`mcp_auth_proxy.py` cached the token on first fetch and refreshed only once more than 45 min had passed since that fetch. Time since fetch is only a stand-in for token freshness, and it breaks when `az account get-access-token` returns its own cached, near-expiry token. Our 45-min countdown then starts on a token that is already 50+ min old: we serve it for the ~10 min it has left, then keep serving the expired token for the next ~35 min because our countdown hasn't elapsed.
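The arithmetic behind that window can be sketched as below. This is an illustration only: the 60-minute token lifetime is an assumption inferred from the "~10 min left" figure, and `stale_window_s` is a hypothetical helper, not code from the proxy.

```python
REFRESH_AFTER_S = 45 * 60   # old policy: re-fetch 45 min after the last fetch
TOKEN_LIFETIME_S = 60 * 60  # ASSUMED lifetime, chosen to match the numbers above


def stale_window_s(age_at_fetch_s: int,
                   lifetime_s: int = TOKEN_LIFETIME_S,
                   refresh_after_s: int = REFRESH_AFTER_S) -> int:
    """Seconds during which the old policy serves an already-expired token."""
    # Lifetime the token still had when az handed it to us.
    remaining_s = max(0, lifetime_s - age_at_fetch_s)
    # The old countdown keeps the token for refresh_after_s regardless;
    # anything beyond remaining_s is time spent serving an expired token.
    return max(0, refresh_after_s - remaining_s)


if __name__ == "__main__":
    # Token already 50 min old at fetch time: 35 min of serving an expired token.
    print(stale_window_s(50 * 60) // 60, "min stale")
    # Freshly minted token: the old policy happens to be safe.
    print(stale_window_s(0) // 60, "min stale")
```

Under these assumptions the old policy is only safe when the token is at most 15 min old at fetch time, which `az` does not guarantee.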

Fix

Refresh decision now based on the token's actual `exp` claim, not on when we fetched:

```python
# Reuse the cached token only while its exp claim is more than 60 s away.
if _token_cache and _token_exp(_token_cache) - time.time() > 60:
    return _token_cache
# Otherwise fall through and re-fetch from az.
```

Plus a safeguard: if `az` itself returns a near-expiry token (MSAL refresh wedged), fail loud with `az logout && az login` guidance instead of silently serving a bad token.
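One way the safeguard could be shaped is sketched below. `StaleTokenError`, `check_fresh`, and the reuse of the 60 s margin as the threshold are assumptions for illustration; the merged code may use different names and limits.

```python
import time
from typing import Optional

MIN_REMAINING_S = 60  # ASSUMED: same margin as the cache check


class StaleTokenError(RuntimeError):
    """Raised when az hands back a token that is already near expiry."""


def check_fresh(exp: float, now: Optional[float] = None) -> float:
    """Return the seconds left on a freshly fetched token, or fail loudly."""
    now = time.time() if now is None else now
    remaining = exp - now
    if remaining <= MIN_REMAINING_S:
        # Re-fetching would likely return the same cached token, so surface
        # the problem to the user instead of serving a token about to 401.
        raise StaleTokenError(
            f"az returned a token with only {remaining:.0f}s left; "
            "run `az logout && az login` and retry"
        )
    return remaining
```

Calling `check_fresh(_token_exp(token))` immediately after each fetch would keep a wedged `az` cache from being silently passed through.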

Test plan

  • Syntax clean
  • Parser correctly extracts exp from a real Foundry-audience JWT: `exp - now` matches `az account get-access-token --query expiresOn` to within a second
  • Manual: leave proxy running >1h, send a request after the original token's exp time, observe fresh fetch + 200 (instead of 401)

🤖 Generated with Claude Code

…lock

User caught a 401: token expired at 04:18:55 UTC but proxy was still serving
it at 04:26:01 UTC. Root cause: refresh decision was based on "we last
fetched 45 min ago" instead of "token's exp claim says <60s remaining".

The bug surfaces when `az account get-access-token` returns its OWN cached
near-expiry token. Our 45-min countdown starts fresh from the moment we
received it, but the token itself was already aged. It expires after its
remaining lifetime (which can be seconds), and we keep serving it until
the 45-min countdown elapses.

Fix:
- Decode the JWT `exp` claim each request.
- Refresh when `exp - now < 60s` (was: refresh when "fetched > 45 min ago").
- Catch the rarer case where `az` itself returns an already-near-expiry
  token (MSAL refresh wedged) — fail loud with `az logout && az login`
  guidance instead of silently serving a bad token.

`time.time() < refresh_after` was a sin: clock-since-last-fetch is a proxy
for token freshness, but `az` breaks the assumption when it serves a
cached token directly.

Smoke: parser correctly extracts exp from a real Foundry-audience JWT
(3857s remaining, matches `az account get-access-token --query expiresOn`).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@carvychen carvychen merged commit 202af1d into main May 10, 2026
3 checks passed
@carvychen carvychen deleted the fix/proxy-token-expiry-detect branch May 10, 2026 04:29