Skip to content

MCP client should validate tool descriptions for prompt injections and freeze them. #247968

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
pelikhan opened this issue May 1, 2025 · 4 comments
Assignees
Labels
chat-mcp under-discussion Issue is under discussion for relevance, priority, approach

Comments

@pelikhan
Copy link
Member

pelikhan commented May 1, 2025

The tool description returned by a MCP server may contain prompt injection strings in the tool description. The tool descriptions are inserted in the final prompt that gets processed by the client LLM.

Note that the rug-pull attack is a variant where the MCP server injects the prompt injection strings are a few uses or some other environment trigger. Thus, it is not detected on first load when the user reviews the tools.

Recommendation

  • Freeze the tool list and repeat any kind of validation when the tool list changes.
  • Run prompt injection detection services on the tool list to detect attempts at injecting prompts through the description.

This technique is implemented in genaiscript. https://microsoft.github.io/genaiscript/blog/mcp-tool-validation/

@connor4312
Copy link
Member

connor4312 commented May 1, 2025

We don't validate tool descripts aside from ensuring they're safe from the model's schema (e.g. not too long for 4o).

I'm also not sure how useful validating tool descriptions are when tool responses are non-deterministic and a much better place to mount any kind of prompt-injection attack.

Additionally, the tools exposed by an MCP server may be non-deterministic and change over time. In fact that is ideal behavior for say a browser tool, where it might only expose an "open browser" tool initially and then not expose tools to interact with that browser until it's open. (No sense eating up the context window unnecessarily)

@pelikhan
Copy link
Member Author

pelikhan commented May 1, 2025

Once the tool description have been validated, you would want to store a hash and redo the validation whenever a change is done. Otherwise, it is possible for a malicious mcp server to dynamically change their tool description to shadow or mutate their intents.

@connor4312
Copy link
Member

That could only reasonably done in an automated way, we would not want to bug the users with prompts whenever tools change.

@connor4312 connor4312 added the under-discussion Issue is under discussion for relevance, priority, approach label May 1, 2025
@pelikhan
Copy link
Member Author

pelikhan commented May 2, 2025

You could think of having a "lock" icon that allows the user to convey the intent to freeze the tool. At which point, it makes sense to notify that things changed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
chat-mcp under-discussion Issue is under discussion for relevance, priority, approach
Projects
None yet
Development

No branches or pull requests

3 participants