Describe the bug
When the access token for an OAuth-protected MCP server expires and multiple
tool calls are in flight (or fired concurrently), Copilot CLI's MCP client
fans out multiple parallel refresh-token requests, each presenting the same
parent refresh token. The first request rotates the chain successfully; the
remaining requests then present the now-rotated parent RT, which an OAuth
2.1-conformant server treats as token replay and responds to by revoking the
entire refresh chain (refresh token rotation with reuse detection, RFC 6749
§10.4 / OAuth 2.1 §6.4).
Net effect: every access-token expiry followed by concurrent tool calls
results in a dead chain. The user has to remove and re-add the MCP server to
recover.
Affected version
GitHub Copilot CLI 1.0.51
Steps to reproduce the behavior
- Add an OAuth-protected MCP server that implements strict refresh-token
reuse detection (an HTTP-transport server with DCR + PKCE + RFC 8707
resource indicators is sufficient).
- Authenticate via the browser flow.
- Make at least one tool call so the MCP client is fully connected.
- Wait for the access token to expire (≥ 5 minutes in our setup).
- Fire two or more tool calls (or trigger any agent action that produces
concurrent tool calls).
Expected behavior
The MCP client should coalesce the concurrent refresh attempts: a single
refresh request per (client_id, sub) is in flight at any time, and other
workers either wait for that refresh to complete or use a shared
promise/future to receive the rotated tokens. This is the standard pattern
used by widely-deployed OAuth client libraries (e.g.,
golang.org/x/oauth2's ReuseTokenSource, MSAL's confidential_client
token cache, AppAuth's AppAuthState).
Additional context
Actual behavior
Each worker independently fires its own POST /token with
grant_type=refresh_token and the cached refresh token. Three concurrent tool
calls produce three concurrent refresh requests:
- Worker A presents RT_0 → server rotates, returns RT_1 + new access
token. Worker A's subsequent backend call succeeds … briefly.
- Worker B presents RT_0 (its cache hasn't observed A's rotation yet) →
server detects reuse → revokes the chain → returns
{"error":"invalid_grant","error_description":"refresh chain has been revoked"}.
- Worker A's subsequent backend call now returns 401 because the chain was
revoked between its refresh and its tool-call HTTP request.
- Worker C presents RT_0 (or RT_1, depending on timing) → chain is
already dead → same invalid_grant.
All three tool calls fail. The MCP session is unrecoverable without a
/authorize redo, which Copilot does not initiate automatically — the user
sees a "reauthenticate" prompt or has to remove and re-add the server.
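The race described above can be reproduced in isolation. The following is a minimal sketch, assuming a hypothetical in-memory server (`RotatingTokenServer`) that rotates on every refresh and revokes the chain when a spent token is presented again. It is not the reporter's server code and not Copilot CLI's client code; all names are illustrative.

```typescript
// Hypothetical in-memory model of an authorization server with strict
// refresh-token reuse detection. Names and behavior are assumptions.
type TokenResponse = { accessToken: string; refreshToken: string };

class RotatingTokenServer {
  private current = "RT_0";
  private readonly spent = new Set<string>();
  private revoked = false;
  private rotations = 0;

  // Models POST /token with grant_type=refresh_token.
  async refresh(presented: string): Promise<TokenResponse> {
    // Simulated network hop, so concurrent callers overlap.
    await new Promise<void>((resolve) => setTimeout(resolve, 1));
    if (this.revoked) {
      throw new Error("invalid_grant: refresh chain has been revoked");
    }
    if (this.spent.has(presented)) {
      // An already-rotated token came back: treat as replay, kill the chain.
      this.revoked = true;
      throw new Error("invalid_grant: refresh chain has been revoked");
    }
    if (presented !== this.current) {
      throw new Error("invalid_grant: unknown refresh token");
    }
    this.spent.add(presented);
    this.rotations += 1;
    this.current = `RT_${this.rotations}`;
    return { accessToken: `AT_${this.rotations}`, refreshToken: this.current };
  }

  isRevoked(): boolean {
    return this.revoked;
  }
}

// Each worker refreshes on its own with the same cached RT_0 (no coalescing).
async function demoUncoalesced(workers: number) {
  const server = new RotatingTokenServer();
  const cachedRefreshToken = "RT_0";
  const outcomes = await Promise.allSettled(
    Array.from({ length: workers }, () => server.refresh(cachedRefreshToken)),
  );
  const succeeded = outcomes.filter((o) => o.status === "fulfilled").length;
  return { succeeded, failed: workers - succeeded, revoked: server.isRevoked() };
}
```

With three workers, one refresh succeeds, two fail with invalid_grant, and the chain ends up revoked, so the token pair returned to the first worker is already useless.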
Evidence
Three concurrent POST /token requests arrived at the server within ~1 ms
of each other. Timestamps (with millisecond precision) and event types from
the server's audit log:
10:18:27.215 POST /proxy/... 401 (access token expired) [x3, concurrent]
10:18:27.217–.220 GET /.well-known/... 200 (discovery) [x3 PRM + x3 AS]
10:18:27.222 POST /token 200 token_refreshed (rotation succeeded)
10:18:27.222 POST /token 400 invalid_grant (refresh_reuse_detected → chain killed)
10:18:27.223 POST /token 400 invalid_grant (refresh_chain_dead)
10:18:27.227 POST /proxy/... 401 refresh chain revoked (backend call after kill)
10:18:27.238 POST /token 400 invalid_grant (chain still dead)
Copilot CLI's own MCP-client errors landed within 19 ms of each other on
the client side:
16:18:27.228 ERROR MCP client for requestlist-test errored Error: Streamable HTTP error: Server returned 401 after successful authentication
16:18:27.239 ERROR MCP client for requestlist-test errored Gue: refresh chain has been revoked
16:18:27.247 ERROR MCP client for requestlist-test errored Gue: refresh chain has been revoked
(The same pattern reproduced on an earlier run, with two "Streamable HTTP
error … 401 after successful authentication" errors and one "refresh chain has
been revoked" error. The distribution depends on which worker's request
committed the rotation first.)
Why this is a conformance gap
The MCP authorization specification (revision 2025-11-25) defers to OAuth
2.1 for the refresh-token grant. OAuth 2.1 §6.4 makes refresh-token
rotation with reuse detection the recommended default for public clients
(which Copilot CLI is — it registers via DCR with
token_endpoint_auth_method=none and PKCE). Servers that follow the
recommendation will detect Copilot's stale-RT presentation as token replay
and revoke the chain — that's the documented and intended behavior of the
reuse-detection mechanism. The behavior we are reporting is on the client
side: a public OAuth client should not present the same parent RT from two
or more workers concurrently.
This is not a server-side bug we can patch around without weakening the
reuse-detection invariant in a way OAuth 2.1 §6.4 explicitly cautions
against. The two server-side accommodations we considered —
- A short "grace window" where presenting the just-rotated parent RT
re-mints an access token without advancing the chain, and
- Returning the cached child rotation to late presenters of the parent RT
for some window after rotation,
— both create a window in which a stolen refresh token can be successfully
exchanged in parallel with the legitimate client without the chain dying,
which is precisely the attack reuse-detection exists to defeat.
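To make the trade-off concrete, here is a small sketch of the first accommodation (the grace window), assuming a hypothetical `GraceWindowServer` with an explicit clock passed in as `nowMs`. It is a simplification for illustration: any token that is neither current nor inside the grace window is treated as reuse.

```typescript
// Hypothetical server model with a grace window; all names are assumptions.
class GraceWindowServer {
  private current = "RT_0";
  private parent: string | null = null;
  private rotatedAtMs = 0;
  private revoked = false;
  private rotations = 0;
  private readonly graceMs: number;

  constructor(graceMs: number) {
    this.graceMs = graceMs;
  }

  // Returns an access token, or null when the grant is refused.
  refresh(presented: string, nowMs: number): string | null {
    if (this.revoked) return null;
    if (presented === this.current) {
      // Normal rotation: remember the parent and advance the chain.
      this.parent = this.current;
      this.rotations += 1;
      this.current = `RT_${this.rotations}`;
      this.rotatedAtMs = nowMs;
      return `AT_${this.rotations}`;
    }
    if (presented === this.parent && nowMs - this.rotatedAtMs < this.graceMs) {
      // Grace window: re-mint an access token without advancing the chain.
      return "AT_grace";
    }
    // Simplification: anything else counts as reuse and kills the chain.
    this.revoked = true;
    return null;
  }

  isRevoked(): boolean {
    return this.revoked;
  }
}

// The legitimate client refreshes at t=0; a holder of a stolen copy of RT_0
// presents it at t=5.
function replayWithin(graceMs: number) {
  const server = new GraceWindowServer(graceMs);
  const legit = server.refresh("RT_0", 0);
  const attacker = server.refresh("RT_0", 5);
  return { legit, attacker, revoked: server.isRevoked() };
}
```

In this model, `replayWithin(10)` hands the second presenter an access token and leaves the chain alive, while `replayWithin(0)` (strict detection) refuses the replay and revokes the chain.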
Suggested client-side fix
Serialize the refresh-token grant within the MCP client's auth manager:
- Maintain a per-server (or per-(client_id, sub)) singleflight slot for
the refresh-token grant.
- When N workers concurrently detect 401 from the backend, exactly one
fires POST /token; the others wait on the same future and receive the
rotated tokens once that single call returns.
- Persist the new refresh token to the shared cache before any waiting
workers wake up, so that they retry the backend call with the rotated
access token, not their own follow-up refresh.
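The three points above could be sketched roughly as follows. This is a minimal illustration, not Copilot CLI's actual auth manager: `SingleFlightRefresher`, `TokenSet`, and the injected `doRefresh` function (standing in for the real POST /token call) are hypothetical names, and one instance is assumed per server or per (client_id, sub).

```typescript
// Hypothetical sketch of a singleflight refresh slot. All names are assumptions.
type TokenSet = { accessToken: string; refreshToken: string };
type RefreshFn = (refreshToken: string) => Promise<TokenSet>;

class SingleFlightRefresher {
  private tokens: TokenSet;
  private readonly doRefresh: RefreshFn;
  private inFlight: Promise<TokenSet> | null = null;

  constructor(initial: TokenSet, doRefresh: RefreshFn) {
    this.tokens = initial;
    this.doRefresh = doRefresh;
  }

  current(): TokenSet {
    return this.tokens;
  }

  // staleAccessToken is the access token the caller just received a 401 for.
  refresh(staleAccessToken: string): Promise<TokenSet> {
    // Another worker already rotated: hand back the cached tokens, no new grant.
    if (this.tokens.accessToken !== staleAccessToken) {
      return Promise.resolve(this.tokens);
    }
    if (this.inFlight === null) {
      // First worker in: fire the one and only refresh-token grant.
      this.inFlight = this.doRefresh(this.tokens.refreshToken)
        .then((rotated) => {
          // Store the rotated tokens before any waiter resumes.
          this.tokens = rotated;
          return rotated;
        })
        .finally(() => {
          this.inFlight = null;
        });
    }
    // Every concurrent caller shares the same promise.
    return this.inFlight;
  }
}
```

A worker that sees a 401 would call `await refresher.refresh(tokenThatFailed)` and then retry its backend call with the returned access token. Comparing against the stale access token is one way to stop a late-arriving worker from firing its own follow-up refresh; a real implementation would also need to write the rotated tokens to whatever persistent store the client uses.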