Skip to content

Add a way to validate that MCP tool descriptions #1403

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from
Draft

Conversation

geoand
Copy link
Collaborator

@geoand geoand commented Apr 2, 2025

This is done by utilizing an LLM to detect whether the tool description is malicious and could lead to
a Tool Poisoning Attack (TPA)

TODO:

  • Add caching
  • Add tests
  • Add docs

@geoand geoand requested a review from a team as a code owner April 2, 2025 13:20
@geoand
Copy link
Collaborator Author

geoand commented Apr 2, 2025

@jmartisk WDYT?

This is done by utilizing an LLM to detect whether the
tool description is malicious and could lead to
a Tool Poisoning Attack (TPA)
@geoand geoand marked this pull request as draft April 2, 2025 13:27
@jmartisk
Copy link
Collaborator

jmartisk commented Apr 2, 2025

  • Should we also pass the tool's name and argument list rather than just description so that the LLM can detect a discrepancy between the declaration and description?
  • Shouldn't we rather implement it as a functionality of the DefaultMcpClient in upstream lc4j rather than extend it? This doesn't allow for much extensibility, and users of pure lc4j wouldn't be able to use it.

@geoand
Copy link
Collaborator Author

geoand commented Apr 2, 2025

Shouldn't we rather implement it as a functionality of the DefaultMcpClient in upstream lc4j rather than extend it?

Sure, but I don't want to wait a month in order to get this in :)

Should we also pass the tool's name and argument list rather than just description so that the LLM can detect a discrepancy between the declaration and description?

Can you elaborate a little more?

@jmartisk
Copy link
Collaborator

jmartisk commented Apr 3, 2025

Shouldn't we rather implement it as a functionality of the DefaultMcpClient in upstream lc4j rather than extend it?

Sure, but I don't want to wait a month in order to get this in :)

Kinda understandable, but it will, in the long term, turn against us, if we keep adding stuff here that should be in the upstream project :( especially since I reckon this may be treated as a CVE in the future, it should be fixed across the board.

Should we also pass the tool's name and argument list rather than just description so that the LLM can detect a discrepancy between the declaration and description?

Can you elaborate a little more?

I mean you currently only detect maliciousness from the description, but maybe using the tool's name and argument list may help detect maliciousness too? Like, if an argument requests data that should be completely irrelevant to the tool, or the name says something completely different from the description. But it's just a thought, maybe this isn't really a problem.

@jmartisk
Copy link
Collaborator

jmartisk commented Apr 3, 2025

Doing it here would also be a bit troublesome for the caching of the tool list (which I hopefully will implement very soon, I'll try drafting it today) because to avoid validating the same tool multiple times, you'll have to somehow keep track of what tool you already validated. In the upstream, we can simply perform the validation once when we retrieve the initial list of tools, and when we receive a notification about a new tool. Quarkus won't have a hook into the processing of tool list change notifications (unless we implement some SPI) so we would have to keep some sort of map to keep track of what we already validated (and that kinda opens up a pathway to a DDoS attack by flooding the client with tool change notifications and making this map grow indefinitely).

I will prioritize implementing the caching very soon if that helps, and then you could build on top of it

@jmartisk
Copy link
Collaborator

jmartisk commented Apr 3, 2025

... or just merge this now, if it's high priority, and then I will redo it when we have the proper solution in place (that should be possible in a backward-compatible way). I guess that would be the best course of action.

@geoand
Copy link
Collaborator Author

geoand commented Apr 3, 2025

It's not high priority, so I'll wait, no problem.

@jmartisk
Copy link
Collaborator

jmartisk commented Apr 3, 2025

I'll prioritize the tool list caching and change notifications implementation so we can get this going

@geoand
Copy link
Collaborator Author

geoand commented Apr 3, 2025

Thanks a lot!

* The named model to use in order to judge whether the descriptions of the tools provided by the MCP server
* are malicious. If they are, a warning will be printed and the tool will never be used.
*/
Optional<String> toolValidationModelName();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this assumes to use a model within the default provider or how does actually end up working?

i.e. can I be using openai to validate but call via ollama sometihng for validation?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you need the default model, you would just set this to default

i.e. can I be using openai to validate but call via ollama sometihng for validation?

yes

Copy link
Member

@maxandersen maxandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on the idea!

@jmartisk
Copy link
Collaborator

jmartisk commented Apr 7, 2025

FYI langchain4j/langchain4j#2817 is for the tool list caching (draft for now, I need a new release of quarkus-mcp-server to get the tests passing)

@geoand
Copy link
Collaborator Author

geoand commented Apr 9, 2025

@jmartisk for #1416 I'm again going to use a delegate as I did here - just letting you know

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants