[Feature][Java] Move MCP resource discovery to runtime #639
Thanks for your contribution, @yanand0909. Due to the upcoming traditional Chinese holiday, my review may be slightly delayed.
weiqingy left a comment
Thanks for taking this on — moving discovery out of plan construction is a real cleanup. Two cross-cutting questions before I leave inline notes:
1. MCPServer lifecycle when discovery yields nothing. discoverJavaMCPResources instantiates the server and calls listTools()/supportsPrompts(), both of which lazily initialize an McpSyncClient over HTTP (MCPServer.getClient() at MCPServer.java:304-309, lazy create at MCPServer.java:316-343). The only things put in cache are the discovered tools and prompts; the server itself isn't cached. Shutdown then relies on MCPTool.close()/MCPPrompt.close() calling back into mcpServer.close(), which works when at least one tool or prompt was discovered. If listTools() returns empty AND prompts are unsupported (or also empty), nothing in the cache references the server, so mcpServer.close() never runs and the initialized McpSyncClient (with its HTTP transport) leaks. An empty-tools server is unusual but possible: a prompts-only server, a misconfigured server, a server going through gradual rollout. Two shapes that would close this gap, in case either helps: (a) put the server itself in the cache (cache.put(serverName, MCP_SERVER, mcpServer) after provide()), or (b) track discovered servers in a list owned by ActionExecutionOperator and close them in the operator's close(). How do you want to handle it?
2. Shutdown contract isn't pinned by a test. The 6 new unit tests cover discovery happy/sad paths, but none verify that cache.close() releases the server. Combined with #1, that means the lifecycle invariant (whichever form it takes) lives entirely in reviewer memory. The FakeMCPServer stub already has the right shape: adding a closeCalled flag plus a test that closes the cache and asserts it flipped is a few lines. Worth doing?
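As a rough illustration of option (b) together with the closeCalled test idea, here is a minimal sketch. FakeServerSketch and ServerRegistrySketch are hypothetical stand-ins invented for this example; they are not the project's FakeMCPServer, ResourceCache, or ActionExecutionOperator, and the real wiring may look different.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a test stub with a closeCalled flag.
final class FakeServerSketch implements AutoCloseable {
    boolean closeCalled = false;

    // Simulates a server that exposes no tools, the case where nothing
    // discovered would otherwise hold a reference back to the server.
    List<String> listTools() {
        return new ArrayList<>();
    }

    @Override
    public void close() {
        closeCalled = true;
    }
}

// Hypothetical owner of discovered servers: it closes every tracked server,
// whether or not discovery produced any tools or prompts.
final class ServerRegistrySketch implements AutoCloseable {
    private final List<AutoCloseable> servers = new ArrayList<>();

    void track(AutoCloseable server) {
        servers.add(server);
    }

    @Override
    public void close() {
        RuntimeException failure = null;
        for (AutoCloseable server : servers) {
            try {
                server.close();
            } catch (Exception e) {
                // Keep closing the remaining servers; report failures at the end.
                if (failure == null) {
                    failure = new RuntimeException("failed to close a tracked server", e);
                } else {
                    failure.addSuppressed(e);
                }
            }
        }
        servers.clear();
        if (failure != null) {
            throw failure;
        }
    }
}
```

A test along these lines would track a FakeServerSketch whose listTools() is empty, close the registry, and assert that closeCalled is true. Option (a) would pin the same invariant through the cache instead of a separate list.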
shortTermMemState = getRuntimeContext().getMapState(shortTermMemStateDescriptor);

resourceCache = new ResourceCache(agentPlan.getResourceProviders());
JavaMCPResourceDiscovery.discoverJavaMCPResources(
JavaMCPResourceDiscovery.discoverJavaMCPResources runs synchronously here, which means a slow or briefly-unreachable MCP server stalls (or fails) the entire operator's startup. Discussion #543 anticipates this with failFastOnStartup(true) as the default and a graceful-degradation opt-in — this PR ships fail-fast only. Is the intent to defer the opt-in to a follow-up, and does the current shape leave room for it (e.g., a per-server try/catch boundary around the discovery loop)?
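To make the "per-server try/catch boundary" concrete, a minimal sketch follows. Only the failFastOnStartup name comes from the discussion referenced above; DiscoveryLoopSketch, its Result type, and the use of a Supplier to stand in for a server's listTools() call are assumptions made for illustration and do not reflect the PR's actual code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

final class DiscoveryLoopSketch {

    // Tools discovered per server, plus the names of servers that were skipped.
    static final class Result {
        final Map<String, List<String>> toolsByServer = new LinkedHashMap<>();
        final List<String> skippedServers = new ArrayList<>();
    }

    // Each map value stands in for a (possibly failing) remote listTools() call.
    static Result discover(
            Map<String, Supplier<List<String>>> servers, boolean failFastOnStartup) {
        Result result = new Result();
        for (Map.Entry<String, Supplier<List<String>>> entry : servers.entrySet()) {
            String serverName = entry.getKey();
            try {
                result.toolsByServer.put(serverName, entry.getValue().get());
            } catch (RuntimeException e) {
                if (failFastOnStartup) {
                    // Fail-fast: one unreachable server aborts startup.
                    throw new IllegalStateException(
                            "MCP discovery failed for server '" + serverName + "'", e);
                }
                // Graceful degradation: record the failure and keep going.
                result.skippedServers.add(serverName);
            }
        }
        return result;
    }
}
```

With the boundary inside the loop, the fail-fast default and a later opt-in differ only in the catch branch, so shipping fail-fast first would not block the follow-up.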
continue;
}

Object mcpServer = rp.provide(null);
rp.provide(null) passes null as ResourceContext. PythonMCPResourceDiscovery at PythonMCPResourceDiscovery.java:73 passes cache.getResourceContext() for the same role. Today MCPServer's constructor just stores the field without dereferencing it, so the null is benign — but if a future Java MCP server resolves a dependent resource through getResourceContext(), it would NPE here in a way that doesn't manifest on the Python side. Any reason to keep the null rather than match the Python path?
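The latent failure described here can be sketched in isolation. ContextSketch, StoringServerSketch, and ResolvingServerSketch are hypothetical types invented for this example; the project's ResourceContext and MCPServer may differ, and the "auth" resource name is made up.

```java
// Hypothetical context interface; not the project's ResourceContext.
interface ContextSketch {
    Object getResource(String name);
}

// Mirrors the shape described above: the constructor only stores the
// context, so passing null is harmless.
final class StoringServerSketch {
    private final ContextSketch context;

    StoringServerSketch(ContextSketch context) {
        this.context = context;
    }

    boolean hasContext() {
        return context != null;
    }
}

// A possible future shape: the constructor resolves a dependent resource
// through the context, so a null context fails immediately.
final class ResolvingServerSketch {
    final Object dependency;

    ResolvingServerSketch(ContextSketch context) {
        // Throws NullPointerException when context is null.
        this.dependency = context.getResource("auth");
    }
}
```

Constructing StoringServerSketch with null succeeds, while ResolvingServerSketch with null throws at construction time. Passing a real context on the Java path would keep both shapes working.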
// Tools and prompts are NOT serialized into the plan (they are runtime-discovered)
assertFalse(
json.contains("\"add\"") && json.contains("java_serializable"),
assertFalse(
        json.contains("\"add\"") && json.contains("java_serializable"),
        "JSON should not contain a serialized 'add' tool provider");

This passes when either substring is missing, so a regression that drops "add" from the JSON for an unrelated reason would silently make this test green, which is the very class of bug the test is meant to catch. Since the design contract you're locking down is "no java_serializable providers for MCP-discovered resources at all," one alternative in case it helps:

assertFalse(json.contains("java_serializable"),
        "JSON should not contain any java_serializable provider entries (MCP discovery is deferred)");
Linked issue: 608
Purpose of change
MCP tools and prompts were previously discovered at compile time inside AgentPlan.extractJavaMCPServer(): an MCPServer connection was opened, listTools()/listPrompts() were called over the network, results were serialized into the plan, and the connection was immediately closed. This had three problems:
This PR implements Part 1: Runtime Discovery from discussion #543. MCP tool and prompt discovery is moved from AgentPlan construction to ActionExecutionOperator.open():
Tests
API
No public API changes
Documentation
doc-needed / doc-not-needed / doc-included