Skip to content

fix(mcp): cache result of discoverServerTools to prevent post-OAuth refetch storm#4701

Merged
waleedlatif1 merged 2 commits into
stagingfrom
fix/mcp-discover-server-tools-cache
May 21, 2026
Merged

fix(mcp): cache result of discoverServerTools to prevent post-OAuth refetch storm#4701
waleedlatif1 merged 2 commits into
stagingfrom
fix/mcp-discover-server-tools-cache

Conversation

@waleedlatif1
Copy link
Copy Markdown
Collaborator

Summary

  • OAuth callback was calling discoverServerTools to populate tools immediately after auth, but the result was never written to the cache — so the UI's immediate refetch missed cache and tried to live-fetch again
  • On servers with slow upstreams (PlanetScale taking 60s to respond to tools/list from prod ECS), this left the UI in permanent "Loading…" forever, refetching every few seconds and timing out each time
  • Fix: have discoverServerTools write its successful result to the per-server cache key, matching what discoverTools already does in its fetched branch
  • Removed the now-redundant clearCache(workspaceId) from the OAuth callback — it was wiping cached tools for OTHER servers in the workspace unnecessarily; the new cache write replaces just the one server we care about

Type of Change

  • Bug fix

Testing

Tested manually by reproducing prod symptom on staging — confirmed OAuth flow leaves UI in "Loading…" indefinitely before the fix, and renders the tools list instantly after.
Added unit test: discoverServerTools primes per-server cache so follow-up discoverTools hits cache (no live re-fetch).
All 286 MCP tests pass.

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

@vercel
Copy link
Copy Markdown

vercel Bot commented May 21, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
docs Skipped Skipped May 21, 2026 5:55pm

Request Review

@cursor
Copy link
Copy Markdown

cursor Bot commented May 21, 2026

PR Summary

Medium Risk
Moderate risk: changes tool discovery side effects (cache writes and DB status updates) and introduces a new tools/list timeout, which could affect perceived server availability if mis-tuned.

Overview
Prevents the post-OAuth UI refetch loop by having discoverServerTools write successful tools/list results into the per-server cache (and update server status/tool count) so the next discoverTools call hits cache instead of re-fetching live.

Removes the OAuth callback’s workspace-wide cache clear, avoiding unintended invalidation of other servers’ cached tools, and caps McpClient.listTools with a dedicated LIST_TOOLS_TIMEOUT_MS to keep slow upstream metadata calls from hanging the UI. Adds unit coverage ensuring discoverServerTools primes cache, plus minor test-mock updates to support .limit() and the new timeout constant.

Reviewed by Cursor Bugbot for commit 4e6210c. Configure here.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 21, 2026

Greptile Summary

This PR fixes a post-OAuth "Loading…" loop on MCP servers: discoverServerTools was live-fetching tools after auth but never writing to the cache, causing the UI's immediate follow-up discoverTools call to miss the cache and re-fetch — timing out repeatedly on slow upstreams. It also adds a 30 s hard timeout to listTools to prevent indefinite hangs.

  • Cache priming: discoverServerTools now writes its result to the per-server cache key and calls updateServerStatus (marking the row connected with fresh toolCount/lastToolsRefresh), matching the behaviour already present in discoverTools.
  • Targeted cache invalidation: The broad clearCache(workspaceId) in the OAuth callback (which wiped every server's cached tools) is replaced by the single-server cache write inside discoverServerTools.
  • listTools timeout: A new LIST_TOOLS_TIMEOUT_MS: 30_000 constant caps the MCP SDK listTools call so a slow upstream cannot stall the entire discovery loop indefinitely.

Confidence Score: 4/5

Safe to merge; the core OAuth cache-miss fix is correct and well-tested, with one minor gap in error observability.

The fix is well-scoped and the new test validates end-to-end cache priming. The one remaining gap — no warning logged when updateServerStatus rejects inside allSettled — means a transient DB failure would leave the server row stale and produce no log signal, which could complicate future incident diagnosis.

apps/sim/lib/mcp/service.ts — the updateServerStatus arm in the new Promise.allSettled block has no .catch() logger, unlike the cache-write arm beside it.

Important Files Changed

Filename Overview
apps/sim/lib/mcp/service.ts Core fix: discoverServerTools now primes the per-server cache and calls updateServerStatus after a successful live fetch, resolving the post-OAuth "Loading…" loop and the previously-flagged missing DB status update.
apps/sim/lib/mcp/client.ts Adds a per-request LIST_TOOLS_TIMEOUT_MS cap (30 s) to listTools so a slow upstream can no longer block the UI indefinitely.
apps/sim/lib/mcp/utils.ts Adds LIST_TOOLS_TIMEOUT_MS: 30_000 to MCP_CLIENT_CONSTANTS for use in the new per-call timeout on listTools.
apps/sim/app/api/mcp/oauth/callback/route.ts Removes the broad clearCache(workspaceId) call (which wiped tools for all servers) and relies on discoverServerTools to write only the authenticated server's cache entry.
apps/sim/lib/mcp/service.test.ts Adds a regression test verifying that discoverServerTools primes the per-server cache so a follow-up discoverTools doesn't trigger a live re-fetch; also fixes the mock to support .limit() chains.
apps/sim/lib/mcp/client.test.ts Adds DEFAULT_EXECUTION_TIMEOUT_MS to the execution-limits mock so the new constant import in client.ts doesn't break tests.

Sequence Diagram

sequenceDiagram
    participant UI
    participant OAuthCallback as OAuth Callback
    participant Service as McpService
    participant Cache as Per-Server Cache
    participant DB

    UI->>OAuthCallback: "GET /api/mcp/oauth/callback?code=..."
    OAuthCallback->>DB: Exchange code to store token
    OAuthCallback->>Service: discoverServerTools(userId, serverId, workspaceId)
    Service->>DB: getServerConfig(serverId)
    Service->>Service: resolveConfigEnvVars + createClient
    Service->>Service: client.listTools() timeout 30s
    Service->>Cache: set(serverCacheKey, tools, 5min)
    Service->>DB: updateServerStatus(connected, toolCount)
    Service-->>OAuthCallback: tools[]
    OAuthCallback-->>UI: htmlClose Connected

    UI->>Service: discoverTools(userId, workspaceId)
    Service->>Cache: get(serverCacheKey)
    Cache-->>Service: tools[] cache HIT
    Service-->>UI: tools[] no live re-fetch
Loading

Reviews (2): Last reviewed commit: "improvement(mcp): update server status f..." | Re-trigger Greptile

Comment thread apps/sim/lib/mcp/service.ts Outdated
@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@greptile

@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@cursor review

Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit 4e6210c. Configure here.

@waleedlatif1 waleedlatif1 merged commit 89695de into staging May 21, 2026
14 checks passed
@waleedlatif1 waleedlatif1 deleted the fix/mcp-discover-server-tools-cache branch May 21, 2026 18:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant