Feature: Trust verification for MCP server tool calls in AutoGen agents #7732
Replies: 5 comments 1 reply
-
This is a needed addition. One implementation consideration: trust verification for MCP servers should include security-specific checks alongside operational metrics. The proposed For AutoGen specifically,
The security provider would answer: "Has this server been tested for tool poisoning?", "Does it pass capability verification?", "Has it attempted data exfiltration in adversarial conditions?"
This maps naturally to AutoGen's multi-agent use case: when Agent A delegates to an MCP server on behalf of Agent B, the trust check should surface not just "is this server reliable?" but "is this server safe for this delegation context?"
We maintain an open-source MCP security testing harness with 138 adversarial profiles across 30 categories that could feed this security provider layer. Happy to collaborate on a PR or share test corpus data.
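To make the "security provider alongside operational metrics" idea concrete, here is a rough sketch of how two providers could be composed per delegation. Every name in it (DelegationContext, Verdict, OperationalProvider, SecurityProvider, SecurityReport, check_trust) and the 0.7 threshold are assumptions invented for illustration, not part of AutoGen or of any existing trust service.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol, Tuple


@dataclass(frozen=True)
class DelegationContext:
    """Hypothetical description of one delegation (Agent A acting for Agent B)."""
    server_id: str
    delegating_agent: str
    on_behalf_of: str
    handles_sensitive_data: bool = False


@dataclass(frozen=True)
class Verdict:
    provider: str
    allowed: bool
    reason: str


class TrustProvider(Protocol):
    """Hypothetical provider interface: one verdict per delegation context."""
    def evaluate(self, ctx: DelegationContext) -> Verdict: ...


@dataclass
class OperationalProvider:
    """Answers "is this server reliable?" from a reliability score."""
    scores: Dict[str, float]
    min_score: float = 0.7  # assumed threshold

    def evaluate(self, ctx: DelegationContext) -> Verdict:
        score = self.scores.get(ctx.server_id, 0.0)
        return Verdict("operational", score >= self.min_score, f"reliability score {score:.2f}")


@dataclass(frozen=True)
class SecurityReport:
    """The three questions from the comment above, as recorded test results."""
    tested_for_tool_poisoning: bool
    passes_capability_verification: bool
    attempted_exfiltration: bool


@dataclass
class SecurityProvider:
    """Answers "is this server safe for this delegation context?"."""
    reports: Dict[str, SecurityReport]

    def evaluate(self, ctx: DelegationContext) -> Verdict:
        report = self.reports.get(ctx.server_id)
        if report is None:
            return Verdict("security", False, "no security test results")
        if report.attempted_exfiltration:
            return Verdict("security", False, "attempted data exfiltration under test")
        if not report.passes_capability_verification:
            return Verdict("security", False, "failed capability verification")
        if ctx.handles_sensitive_data and not report.tested_for_tool_poisoning:
            return Verdict("security", False, "untested for tool poisoning in a sensitive context")
        return Verdict("security", True, "passed security checks")


def check_trust(providers: List[TrustProvider], ctx: DelegationContext) -> Tuple[bool, List[Verdict]]:
    """Allow the delegation only if every provider allows it."""
    verdicts = [p.evaluate(ctx) for p in providers]
    return all(v.allowed for v in verdicts), verdicts
```

With this shape, a server with a high reliability score can still be refused when the delegation touches sensitive data and the server has no tool-poisoning test on record, which is the distinction between "reliable" and "safe for this context".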
-
Domain update: Dominion Observatory has moved to a custom domain. All endpoints are now at
Now tracking 14,824 servers with 96,000+ interactions observed. The old |
-
This is close to the problem I’m testing, especially the pre-flight check and audit trail angle. Before an agent calls an external service or follows an external task, I think there needs to be a small verification step:
I’m collecting a few quick external agent runs around this: https://the-agents-of-nations.vercel.app/llms.txt No full review needed. Even one line is useful, for example: |
-
@msaleme Interesting approach with pluggable trust providers + adversarial testing. We've been thinking similarly: Observatory's trust score is just one signal, and the architecture should support composing multiple trust providers. Your 138 adversarial profiles would be a great complement to Observatory's behavioral scoring, which is based on real-world interaction data from 14,820+ servers and 96,000+ interactions. Would be interested in exploring how your adversarial testing framework could feed into Observatory's scoring pipeline. If you want to connect, reach out via our GitHub: https://github.com/sgdata-io/dominion-observatory
@productmakerjason Valid concerns on pre-flight verification. Observatory addresses the "is this server trustworthy?" question, but you're right that validating what agents fetch, schema compliance, and safe claim-making are separate layers. The ideal setup is Observatory for server-level trust gating + runtime validation for response-level checks.
For anyone building on AutoGen: the integration point is straightforward: wrap your
-
AgentGate includes an MCP proxy that intercepts tool responses before they reach the agent, scanning them for INSTRUCTION_TAG and IMPERATIVE_INJECT injection patterns. On the request side, every tool call is authorized before execution. The proxy sits transparently between the agent and the MCP server.
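As a rough illustration of response-side scanning, a proxy could run each tool response through a set of named patterns before handing it to the agent. The sketch below is not AgentGate's implementation: the two regular expressions are simplified guesses at what patterns labeled INSTRUCTION_TAG and IMPERATIVE_INJECT might look for, and Finding, BlockedResponse, scan_tool_response, and filter_response are hypothetical names.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Pattern

# Assumed, deliberately simple patterns; real detectors would be broader.
PATTERNS: Dict[str, Pattern[str]] = {
    # Markup that imitates a system or instruction block.
    "INSTRUCTION_TAG": re.compile(
        r"<\s*/?\s*(system|instructions?|assistant)\b[^>]*>", re.IGNORECASE
    ),
    # Imperative phrasing that tries to override earlier instructions.
    "IMPERATIVE_INJECT": re.compile(
        r"\b(ignore|disregard|forget)\b.{0,40}\b(previous|prior|above)\b"
        r".{0,40}\b(instructions?|rules|prompts?)\b",
        re.IGNORECASE | re.DOTALL,
    ),
}


@dataclass(frozen=True)
class Finding:
    pattern: str  # name of the pattern that matched
    excerpt: str  # the matched text, for the audit trail


class BlockedResponse(RuntimeError):
    """Raised when a tool response matches at least one injection pattern."""


def scan_tool_response(text: str) -> List[Finding]:
    """Return one finding per pattern that matches the response text."""
    findings = []
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match:
            findings.append(Finding(name, match.group(0)))
    return findings


def filter_response(text: str) -> str:
    """Pass clean responses through unchanged; block suspicious ones."""
    findings = scan_tool_response(text)
    if findings:
        names = ", ".join(f.pattern for f in findings)
        raise BlockedResponse(f"tool response blocked: {names}")
    return text
```

Pattern matching like this only catches known phrasings, so it works best as one layer next to request-side authorization rather than as the sole defense.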
-
Problem
When AutoGen agents use MCP servers via McpWorkbench, there's no mechanism to verify whether an MCP server is trustworthy before executing tool calls. With 14,820+ MCP servers in the ecosystem and varying reliability levels, agents need a way to assess server health and behavioral history before delegating work.
Proposed Solution
Add an optional trust check layer to McpWorkbench that queries behavioral trust scores before tool execution.
Example usage:
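For illustration, here is a minimal sketch of how such a gate could wrap tool calls. It does not use the real McpWorkbench API: ToolCaller, TrustGatedWorkbench, UntrustedServerError, the lookup callback, and the 0.7 threshold are all assumptions made up for this example, and EchoWorkbench stands in for a real MCP connection.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Awaitable, Callable, Dict, Mapping, Optional, Protocol


class ToolCaller(Protocol):
    """Hypothetical stand-in for anything exposing an async call_tool method."""

    async def call_tool(self, name: str, arguments: Optional[Mapping[str, Any]] = None) -> Any: ...


class UntrustedServerError(RuntimeError):
    """Raised when a server's trust score is below the configured threshold."""


# Hypothetical lookup: maps a server identifier to a score in [0.0, 1.0].
TrustLookup = Callable[[str], Awaitable[float]]


@dataclass
class TrustGatedWorkbench:
    """Checks the trust score before forwarding each tool call."""
    inner: ToolCaller
    server_id: str
    lookup: TrustLookup
    min_score: float = 0.7  # assumed threshold

    async def call_tool(self, name: str, arguments: Optional[Mapping[str, Any]] = None) -> Any:
        score = await self.lookup(self.server_id)
        if score < self.min_score:
            raise UntrustedServerError(
                f"{self.server_id}: trust score {score:.2f} is below {self.min_score:.2f}"
            )
        return await self.inner.call_tool(name, arguments)


class EchoWorkbench:
    """Fake workbench used only for the demo; it echoes the call back."""

    async def call_tool(self, name: str, arguments: Optional[Mapping[str, Any]] = None) -> Any:
        return {"tool": name, "arguments": dict(arguments or {})}


def static_lookup(scores: Dict[str, float]) -> TrustLookup:
    """Build a lookup from a fixed table; unknown servers score 0.0."""
    async def lookup(server_id: str) -> float:
        return scores.get(server_id, 0.0)
    return lookup


async def main() -> None:
    lookup = static_lookup({"good-server": 0.92, "flaky-server": 0.31})
    trusted = TrustGatedWorkbench(EchoWorkbench(), "good-server", lookup)
    print(await trusted.call_tool("search", {"q": "mcp"}))

    flaky = TrustGatedWorkbench(EchoWorkbench(), "flaky-server", lookup)
    try:
        await flaky.call_tool("search", {"q": "mcp"})
    except UntrustedServerError as exc:
        print("blocked:", exc)


if __name__ == "__main__":
    asyncio.run(main())
```

Keeping the gate as a wrapper leaves the check optional: callers that do not want it keep using the inner workbench directly, and the lookup can be swapped for any score source.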
What it checks:
Why this matters for multi-agent systems:
AutoGen's strength is multi-agent orchestration. When agents delegate to external MCP servers, a single unreliable server can cascade failures across an entire agent team. Trust verification provides:
Reference Implementation
TrustGateInterceptor in langchain-mcp-adapters implementing this pattern
Happy to help with implementation or open a PR if there's interest.