Add a way to validate that MCP tool descriptions #1403

geoand · 2025-04-02T13:20:51Z

This is done by utilizing an LLM to detect whether the tool description is malicious and could lead to
a Tool Poisoning Attack (TPA)

TODO:

Add caching
Add tests
Add docs

geoand · 2025-04-02T13:21:03Z

@jmartisk WDYT?

This is done by utilizing an LLM to detect whether the tool description is malicious and could lead to a Tool Poisoning Attack (TPA)

jmartisk · 2025-04-02T13:43:42Z

Should we also pass the tool's name and argument list rather than just description so that the LLM can detect a discrepancy between the declaration and description?
Shouldn't we rather implement it as a functionality of the DefaultMcpClient in upstream lc4j rather than extend it? This doesn't allow for much extensibility, and users of pure lc4j wouldn't be able to use it.

geoand · 2025-04-02T13:45:10Z

Shouldn't we rather implement it as a functionality of the DefaultMcpClient in upstream lc4j rather than extend it?

Sure, but I don't want to wait a month in order to get this in :)

Should we also pass the tool's name and argument list rather than just description so that the LLM can detect a discrepancy between the declaration and description?

Can you elaborate a little more?

jmartisk · 2025-04-03T04:53:39Z

Shouldn't we rather implement it as a functionality of the DefaultMcpClient in upstream lc4j rather than extend it?

Sure, but I don't want to wait a month in order to get this in :)

Kinda understandable, but it will, in the long term, turn against us, if we keep adding stuff here that should be in the upstream project :( especially since I reckon this may be treated as a CVE in the future, it should be fixed across the board.

Should we also pass the tool's name and argument list rather than just description so that the LLM can detect a discrepancy between the declaration and description?

Can you elaborate a little more?

I mean you currently only detect maliciousness from the description, but maybe using the tool's name and argument list may help detect maliciousness too? Like, if an argument requests data that should be completely irrelevant to the tool, or the name says something completely different from the description. But it's just a thought, maybe this isn't really a problem.

jmartisk · 2025-04-03T05:12:15Z

Doing it here would also be a bit troublesome for the caching of the tool list (which I hopefully will implement very soon, I'll try drafting it today) because to avoid validating the same tool multiple times, you'll have to somehow keep track of what tool you already validated. In the upstream, we can simply perform the validation once when we retrieve the initial list of tools, and when we receive a notification about a new tool. Quarkus won't have a hook into the processing of tool list change notifications (unless we implement some SPI) so we would have to keep some sort of map to keep track of what we already validated (and that kinda opens up a pathway to a DDoS attack by flooding the client with tool change notifications and making this map grow indefinitely).

I will prioritize implementing the caching very soon if that helps, and then you could build on top of it

jmartisk · 2025-04-03T05:14:40Z

... or just merge this now, if it's high priority, and then I will redo it when we have the proper solution in place (that should be possible in a backward-compatible way). I guess that would be the best course of action.

geoand · 2025-04-03T05:57:54Z

It's not high priority, so I'll wait, no problem.

jmartisk · 2025-04-03T10:39:25Z

I'll prioritize the tool list caching and change notifications implementation so we can get this going

geoand · 2025-04-03T10:41:52Z

Thanks a lot!

maxandersen · 2025-04-04T07:58:29Z

...time/src/main/java/io/quarkiverse/langchain4j/mcp/runtime/config/McpClientRuntimeConfig.java

+     * The named model to use in order to judge whether the descriptions of the tools provided by the MCP server
+     * are malicious. If they are, a warning will be printed and the tool will never be used.
+     */
+    Optional<String> toolValidationModelName();


this assumes to use a model within the default provider or how does actually end up working?

i.e. can I be using openai to validate but call via ollama sometihng for validation?

If you need the default model, you would just set this to default

i.e. can I be using openai to validate but call via ollama sometihng for validation?

yes

maxandersen

+1 on the idea!

jmartisk · 2025-04-07T05:51:05Z

FYI langchain4j/langchain4j#2817 is for the tool list caching (draft for now, I need a new release of quarkus-mcp-server to get the tests passing)

geoand · 2025-04-09T11:11:42Z

@jmartisk for #1416 I'm again going to use a delegate as I did here - just letting you know

geoand requested a review from a team as a code owner April 2, 2025 13:20

Add a way to validate that MCP tool descriptions

41aae44

This is done by utilizing an LLM to detect whether the tool description is malicious and could lead to a Tool Poisoning Attack (TPA)

geoand force-pushed the mcp-guardrail branch from 882fc50 to 41aae44 Compare April 2, 2025 13:25

geoand marked this pull request as draft April 2, 2025 13:27

maxandersen reviewed Apr 4, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add a way to validate that MCP tool descriptions #1403

Add a way to validate that MCP tool descriptions #1403

Uh oh!

geoand commented Apr 2, 2025 •

edited

Loading

Uh oh!

geoand commented Apr 2, 2025

Uh oh!

jmartisk commented Apr 2, 2025

Uh oh!

geoand commented Apr 2, 2025 •

edited

Loading

Uh oh!

jmartisk commented Apr 3, 2025 •

edited

Loading

Uh oh!

jmartisk commented Apr 3, 2025

Uh oh!

jmartisk commented Apr 3, 2025 •

edited

Loading

Uh oh!

geoand commented Apr 3, 2025

Uh oh!

jmartisk commented Apr 3, 2025

Uh oh!

geoand commented Apr 3, 2025

Uh oh!

maxandersen Apr 4, 2025

Uh oh!

geoand Apr 4, 2025

Uh oh!

maxandersen left a comment

Uh oh!

jmartisk commented Apr 7, 2025

Uh oh!

geoand commented Apr 9, 2025

Uh oh!

Uh oh!

Add a way to validate that MCP tool descriptions #1403

Are you sure you want to change the base?

Add a way to validate that MCP tool descriptions #1403

Uh oh!

Conversation

geoand commented Apr 2, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

geoand commented Apr 2, 2025

Uh oh!

jmartisk commented Apr 2, 2025

Uh oh!

geoand commented Apr 2, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jmartisk commented Apr 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jmartisk commented Apr 3, 2025

Uh oh!

jmartisk commented Apr 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

geoand commented Apr 3, 2025

Uh oh!

jmartisk commented Apr 3, 2025

Uh oh!

geoand commented Apr 3, 2025

Uh oh!

maxandersen Apr 4, 2025

Choose a reason for hiding this comment

Uh oh!

geoand Apr 4, 2025

Choose a reason for hiding this comment

Uh oh!

maxandersen left a comment

Choose a reason for hiding this comment

Uh oh!

jmartisk commented Apr 7, 2025

Uh oh!

geoand commented Apr 9, 2025

Uh oh!

Uh oh!

geoand commented Apr 2, 2025 •

edited

Loading

geoand commented Apr 2, 2025 •

edited

Loading

jmartisk commented Apr 3, 2025 •

edited

Loading

jmartisk commented Apr 3, 2025 •

edited

Loading