Feature: Trust verification for MCP server tool calls in AutoGen agents #7732

vdineshk · 2026-05-22T03:59:17Z

Problem

When AutoGen agents use MCP servers via McpWorkbench, there's no mechanism to verify whether an MCP server is trustworthy before executing tool calls. With 14,820+ MCP servers in the ecosystem and varying reliability levels, agents need a way to assess server health and behavioral history before delegating work.

Proposed Solution

Add an optional trust check layer to McpWorkbench that queries behavioral trust scores before tool execution.

Example usage:

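from autogen_ext.tools.mcp import McpWorkbench, StreamableHttpServerParams

workbench = McpWorkbench(
    # Existing parameter: McpWorkbench expects a server-params object, not a plain dict.
    server_params=StreamableHttpServerParams(url="https://some-mcp-server.example.com/mcp"),
    trust_check=True,  # Proposed parameter: not part of McpWorkbench today
    min_trust_score=60,  # Proposed parameter: block servers scoring below 60/100
)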

What it checks:

Trust score (0-100) — composite behavioral score based on observed interactions
Latency (avg, p95) — real-world response times
Success rate — percentage of successful tool calls
Uptime — server availability over rolling windows
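
The four signals above could travel together in one record and be checked in a single pre-flight step. The following is a minimal sketch, not part of the proposal: the `TrustReport` fields, the `passes_preflight` helper, and every threshold other than the 60/100 score are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrustReport:
    """Hypothetical container for the four signals listed above."""

    trust_score: float     # composite behavioral score, 0-100
    latency_avg_ms: float  # average response time in milliseconds
    latency_p95_ms: float  # 95th-percentile response time in milliseconds
    success_rate: float    # fraction of successful tool calls, 0.0-1.0
    uptime: float          # fraction of the rolling window the server was up, 0.0-1.0


def passes_preflight(
    report: TrustReport,
    min_trust_score: float = 60.0,    # mirrors min_trust_score=60 in the example usage
    max_p95_ms: float = 2000.0,       # assumed threshold
    min_success_rate: float = 0.95,   # assumed threshold
    min_uptime: float = 0.99,         # assumed threshold
) -> bool:
    """Return True only if every signal meets its threshold."""
    return (
        report.trust_score >= min_trust_score
        and report.latency_p95_ms <= max_p95_ms
        and report.success_rate >= min_success_rate
        and report.uptime >= min_uptime
    )
```

Requiring every signal to pass is one possible policy; a weighted combination would be another.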

Why this matters for multi-agent systems:

AutoGen's strength is multi-agent orchestration. When agents delegate to external MCP servers, a single unreliable server can cascade failures across an entire agent team. Trust verification provides:

Pre-flight checks — block unreliable servers before they cause failures
Dynamic routing — choose the most reliable server when multiple options exist
Audit trails — compliance for regulated industries (MiCA Article 12, EU enforcement July 2026)
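
To make the routing and audit points concrete, here is a small sketch of how a caller might pick among candidate servers and record the decision. The function names, the JSON record layout, and the example URLs are hypothetical; the scores are assumed to have been fetched already.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def pick_server(scores: dict[str, float], min_trust_score: float = 60.0) -> Optional[str]:
    """Return the highest-scoring server at or above the threshold, or None."""
    eligible = {url: score for url, score in scores.items() if score >= min_trust_score}
    if not eligible:
        return None  # pre-flight check failed for every candidate
    return max(eligible, key=lambda url: eligible[url])


def audit_record(scores: dict[str, float], chosen: Optional[str], min_trust_score: float) -> str:
    """Serialize one routing decision as a JSON line suitable for an append-only log."""
    return json.dumps(
        {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "min_trust_score": min_trust_score,
            "candidates": scores,
            "chosen": chosen,
        },
        sort_keys=True,
    )
```

Logging the rejected candidates alongside the chosen one keeps the record useful when someone later asks why a server was skipped.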

Reference Implementation

Dominion Observatory — behavioral trust scoring API for 14,820+ MCP servers
CTEF v0.3.2 — Cross-Ecosystem Trust Evidence Framework spec
LangChain already has a TrustGateInterceptor in langchain-mcp-adapters implementing this pattern
MCP spec discussion: Proposal: Cross-Ecosystem Trust Evidence Framework (CTEF) for MCP server behavioral trust scoring modelcontextprotocol/modelcontextprotocol#2767

Happy to help with implementation or open a PR if there's interest.

msaleme · 2026-05-22T14:17:23Z

This is a needed addition. One implementation consideration: trust verification for MCP servers should include security-specific checks alongside operational metrics.

The proposed trust_check=True with min_trust_score=60 is a good start for latency/uptime/success rate. But in practice, we've found that servers can score well operationally while failing security tests — e.g., a server that responds quickly but attempts to exfiltrate in-context credentials when probed with adversarial prompts.

For AutoGen specifically, McpWorkbench could support pluggable trust providers:

Operational provider (latency, uptime) — what CTEF/Dominion Observatory already covers
Security provider — adversarial test results from a security harness

The security provider would answer: "Has this server been tested for tool poisoning?", "Does it pass capability verification?", "Has it attempted data exfiltration in adversarial conditions?"
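
One way the pluggable-provider idea could look is sketched below. Nothing here exists in AutoGen: `TrustProvider`, `Verdict`, both provider classes, and the in-memory data they read from are assumptions used only to show how operational and security results might be combined.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Verdict:
    provider: str
    allowed: bool
    reason: str


class TrustProvider(Protocol):
    """Hypothetical interface that each provider would implement."""

    def evaluate(self, server_url: str) -> Verdict: ...


class OperationalProvider:
    """Gates on a precomputed operational score (0-100)."""

    def __init__(self, scores: dict[str, float], min_score: float = 60.0) -> None:
        self._scores = scores
        self._min_score = min_score

    def evaluate(self, server_url: str) -> Verdict:
        score = self._scores.get(server_url)
        if score is None:
            return Verdict("operational", False, "no score on record")
        return Verdict("operational", score >= self._min_score, f"score={score}")


class SecurityProvider:
    """Gates on adversarial test results: server URL -> names of failed checks."""

    def __init__(self, failed_checks: dict[str, list[str]]) -> None:
        self._failed = failed_checks

    def evaluate(self, server_url: str) -> Verdict:
        if server_url not in self._failed:
            return Verdict("security", False, "not tested")
        failed = self._failed[server_url]
        if failed:
            return Verdict("security", False, "failed: " + ", ".join(failed))
        return Verdict("security", True, "all checks passed")


def check_all(providers: list[TrustProvider], server_url: str) -> list[Verdict]:
    """Collect a verdict from every provider; the caller decides how to combine them."""
    return [provider.evaluate(server_url) for provider in providers]


def is_allowed(verdicts: list[Verdict]) -> bool:
    """Allow the call only if every provider allows it."""
    return all(v.allowed for v in verdicts)
```

With this shape, a server that scores well operationally but fails an exfiltration test is still blocked, which is the case described above.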

This maps naturally to AutoGen's multi-agent use case: when Agent A delegates to an MCP server on behalf of Agent B, the trust check should surface not just "is this server reliable?" but "is this server safe for this delegation context?"

We maintain an open-source MCP security testing harness with 138 adversarial profiles across 30 categories that could feed this security provider layer. Happy to collaborate on a PR or share test corpus data.

https://github.com/msaleme/red-team-blue-team-agent-fabric

vdineshk · 2026-05-23T05:09:26Z

Domain update: Dominion Observatory has moved to a custom domain.

All endpoints are now at https://dominionobservatory.com:

Live: https://dominionobservatory.com
AutoGen adapter: pip install dominion-observatory-autogen
Trust score: GET https://dominionobservatory.com/api/trust?server=<url>
Gateway proxy: POST https://dominionobservatory.com/gateway/{target-url}
Get started: https://dominionobservatory.com/get-started

Now tracking 14,824 servers with 96,000+ interactions observed. The old dominion-observatory.sgdata.workers.dev URLs continue to work as redirects. Contact: info@dominionobservatory.com
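
For readers who want to try the trust-score endpoint listed above, the request could be built roughly as follows. This is a sketch under assumptions: the endpoint URL and the `server` query parameter come from this comment, but the response format is not documented in this thread, so the JSON body and the `trust_score` field name are guesses to be checked against the real API.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

TRUST_ENDPOINT = "https://dominionobservatory.com/api/trust"


def build_trust_url(server_url: str) -> str:
    """Build the GET URL, percent-encoding the target server URL."""
    return f"{TRUST_ENDPOINT}?{urlencode({'server': server_url})}"


def fetch_trust_payload(server_url: str, timeout: float = 5.0) -> dict:
    """Perform the GET request. Assumes the API answers with a JSON object."""
    with urlopen(build_trust_url(server_url), timeout=timeout) as response:
        return json.load(response)


def extract_score(payload: dict, field: str = "trust_score") -> Optional[float]:
    """Read a numeric score from the payload. The field name is an assumption."""
    value = payload.get(field)
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return None
    return float(value)
```

Encoding the server URL matters because it contains `:` and `/`, which would otherwise be ambiguous inside a query string.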

productmakerjason · 2026-05-23T20:37:35Z

This is close to the problem I’m testing, especially the pre-flight check and audit trail angle.

Before an agent calls an external service or follows an external task, I think there needs to be a small verification step:

what did it actually fetch?
did it select a real task or invent one?
did it follow the expected schema?
did it stop safely before claiming anything unverified?

I’m collecting a few quick external agent runs around this:

https://the-agents-of-nations.vercel.app/llms.txt

No full review needed. Even one line is useful, for example:
“fetched llms.txt, failed at tasks.json”
or
“selected a task_id but skipped schema.”

vdineshk · 2026-05-25T01:38:52Z

@msaleme Interesting approach with pluggable trust providers + adversarial testing. We've been thinking similarly — Observatory's trust score is just one signal, and the architecture should support composing multiple trust providers. Your 138 adversarial profiles would be a great complement to Observatory's behavioral scoring, which is based on real-world interaction data from 14,820+ servers and 96,000+ interactions.

Would be interested in exploring how your adversarial testing framework could feed into Observatory's scoring pipeline. If you want to connect, reach out via our GitHub: https://github.com/sgdata-io/dominion-observatory

@productmakerjason Valid concerns on pre-flight verification. Observatory addresses the "is this server trustworthy?" question, but you're right that validating what agents fetch, schema compliance, and safe claim-making are separate layers. The ideal setup is Observatory for server-level trust gating + runtime validation for response-level checks.

For anyone building on AutoGen: the integration point is straightforward — wrap your McpWorkbench or tool execution with a trust check before connecting to any external MCP server. The API is a single GET call, no SDK needed. Free tier at 50 queries/day for development.
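
A wrapper along those lines might look like the sketch below. It is an illustration, not Observatory's or AutoGen's code: `CachedTrustGate` and `UntrustedServerError` are hypothetical names, the score lookup is injected as a plain callable so any client can be plugged in, and the one-hour cache is an assumed way to stay within a 50 queries/day quota.

```python
import time
from typing import Callable, Optional


class UntrustedServerError(RuntimeError):
    """Raised when a server has no score or scores below the threshold."""


class CachedTrustGate:
    def __init__(
        self,
        lookup: Callable[[str], Optional[float]],  # returns a 0-100 score, or None if unknown
        min_trust_score: float = 60.0,
        ttl_seconds: float = 3600.0,               # assumed cache lifetime
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._lookup = lookup
        self._min_trust_score = min_trust_score
        self._ttl_seconds = ttl_seconds
        self._clock = clock
        self._cache: dict[str, tuple[float, Optional[float]]] = {}

    def score(self, server_url: str) -> Optional[float]:
        """Return the cached score if still fresh, otherwise query the lookup once."""
        now = self._clock()
        cached = self._cache.get(server_url)
        if cached is not None and now - cached[0] < self._ttl_seconds:
            return cached[1]
        value = self._lookup(server_url)
        self._cache[server_url] = (now, value)
        return value

    def require(self, server_url: str) -> float:
        """Return the score, or raise before any connection is attempted."""
        value = self.score(server_url)
        if value is None:
            raise UntrustedServerError(f"no trust score available for {server_url}")
        if value < self._min_trust_score:
            raise UntrustedServerError(
                f"{server_url} scored {value}, below the minimum {self._min_trust_score}"
            )
        return value
```

The intended use is to call `gate.require(url)` immediately before constructing or entering the workbench for that URL, so a rejected server is never connected to.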

1 reply

productmakerjason May 25, 2026

@msaleme Thanks :) This distinction is really helpful.

That’s exactly the layer separation I’m trying to understand:

Observatory answers: “is this external server/tool trustworthy?”
The test I’m working on asks: “did the agent actually follow the evidence chain before claiming completion?”

So the response-level checks are things like fetched files, real task_id, schema read, payload validity, and no submission claim without a receipt.

I agree these should probably be separate layers.

Curious if you see this kind of tiny external task-flow test as a useful runtime validation case alongside server-level trust gating.
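
The response-level checks listed above could be expressed as a small evidence-chain validator. This is only a sketch of the idea: the `RunEvidence` record, its field names, and the gap messages are hypothetical, and the file names are borrowed from the example one-liners earlier in the thread.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RunEvidence:
    """Hypothetical record of what an agent run can actually prove."""

    fetched_files: set[str] = field(default_factory=set)
    task_id: Optional[str] = None
    schema_read: bool = False
    payload_valid: bool = False
    claimed_submission: bool = False
    receipt: Optional[str] = None


def evidence_gaps(
    run: RunEvidence,
    required_files: set[str],
    known_task_ids: set[str],
) -> list[str]:
    """Return a list of broken links in the evidence chain (empty means complete)."""
    gaps: list[str] = []
    missing = sorted(required_files - run.fetched_files)
    if missing:
        gaps.append("missing files: " + ", ".join(missing))
    if run.task_id is None or run.task_id not in known_task_ids:
        gaps.append("task_id is not a real task")
    if not run.schema_read:
        gaps.append("schema was not read")
    if not run.payload_valid:
        gaps.append("payload failed validation")
    if run.claimed_submission and not run.receipt:
        gaps.append("submission claimed without a receipt")
    return gaps
```

Returning every gap, rather than stopping at the first, matches the one-line reports requested earlier, such as a run that fetched `llms.txt` but failed at `tasks.json`.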

ElamOlame31 · 2026-05-28T01:12:56Z

AgentGate includes an MCP proxy that intercepts tool responses before they reach the agent — scanning for INSTRUCTION_TAG and IMPERATIVE_INJECT injection patterns. On the request side, every tool call is authorized before execution. The proxy sits transparently between the agent and the MCP server.

https://www.tryagentgate.com/

https://github.com/ElamOlame31/agentgate-public
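
As a rough illustration of response-side scanning, the sketch below flags tool output that matches two pattern families. Only the names INSTRUCTION_TAG and IMPERATIVE_INJECT come from the comment above; the regular expressions are assumptions written for this example and are not AgentGate's actual rules.

```python
import re

# Illustrative patterns only; real detection rules would be broader and tuned.
PATTERNS = {
    # Markup that imitates a system or instruction block.
    "INSTRUCTION_TAG": re.compile(
        r"<\s*/?\s*(system|instructions?|assistant)\b[^>]*>", re.IGNORECASE
    ),
    # Imperative phrasing that tries to override earlier instructions.
    "IMPERATIVE_INJECT": re.compile(
        r"\b(ignore|disregard|forget)\s+(all\s+)?(previous|prior|above)\s+instructions\b",
        re.IGNORECASE,
    ),
}


def scan_tool_response(text: str) -> list[str]:
    """Return the names of every pattern family found in a tool response."""
    return [name for name, pattern in PATTERNS.items() if pattern.search(text)]
```

A proxy could drop or quarantine any response for which this returns a non-empty list before the text reaches the agent.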
