MCP server: in-conversation approval (surface 'ask the human' through the chat) #4

@nmrtn

Problem

Through the MCP transport, blacktea can currently only auto-approve or reject. The two human-approval channels both break in an MCP context:

  • approval: "console" writes the prompt to stdout, which corrupts the JSON-RPC stdio stream the MCP transport owns.
  • approval: "callback" needs onApprovalNeeded, which the MCP server doesn't wire and can't sensibly wire: a spawned subprocess has no human-facing function to call.
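
To make the first failure mode concrete, here is a minimal sketch of what a client sees when a console prompt lands on stdout. It assumes the stdio transport carries one JSON-RPC message per line; `parseStdoutLines` and the prompt text are hypothetical and are not taken from blacktea or from any MCP SDK.

```typescript
// Hypothetical illustration: a client reading newline-delimited JSON-RPC
// from a server's stdout, with a stray console prompt in the middle.
type ParsedLine =
  | { ok: true; message: unknown }
  | { ok: false; raw: string };

function parseStdoutLines(stdout: string): ParsedLine[] {
  return stdout
    .split("\n")
    .filter((line) => line.length > 0)
    .map((line): ParsedLine => {
      try {
        return { ok: true, message: JSON.parse(line) };
      } catch {
        // Anything that is not valid JSON breaks the message stream.
        return { ok: false, raw: line };
      }
    });
}

const stream =
  JSON.stringify({ jsonrpc: "2.0", id: 1, result: { ok: true } }) + "\n" +
  "Approve payment of 0.01 USDC? [y/N] \n" + // what approval: "console" would print
  JSON.stringify({ jsonrpc: "2.0", id: 2, result: { ok: true } }) + "\n";

const parsed = parseStdoutLines(stream);
```

The middle entry of `parsed` fails to parse, which is why any human-facing prompt has to travel inside a tool result rather than on stdout.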

So the "ask the human first" flow — the entire middle tier of the policy model — is unavailable exactly where it's most natural: a chat interface where the human is sitting right there.

Evidence (live test, 2026-05-28)

Tested blacktea-mcp inside Hermes Agent on a real server and asked the agent to fetch a paid endpoint. The policy auto-approved only amounts under $0.001; the endpoint charged $0.01, so the policy correctly rejected the payment. The agent then said, verbatim:

"Approve manually — the policy should have prompted you to approve"

The agent expected an in-chat approval prompt and there was no mechanism to produce one. The model's own behavior surfaced the gap. For personal-agent platforms (Hermes, OpenClaw) this is the missing piece between "auto-approve everything small" and "reject everything big."
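
The gap can be sketched as a three-tier decision that collapses to two tiers when there is no way to ask. This is an illustration only: `SpendPolicy`, `decide`, `effectiveOutcome`, and the `rejectAbove` ceiling are hypothetical names and values, not blacktea's actual policy API. Only the $0.001 limit and the $0.01 charge come from the test above.

```typescript
// Hypothetical policy model: auto-approve small, ask for the middle, reject large.
type PolicyDecision = "auto_approve" | "approval_needed" | "reject";

interface SpendPolicy {
  autoApproveBelow: number; // from the live test: 0.001
  rejectAbove: number;      // assumed ceiling, not from the issue
}

function decide(amount: number, policy: SpendPolicy): PolicyDecision {
  if (amount < policy.autoApproveBelow) return "auto_approve";
  if (amount > policy.rejectAbove) return "reject";
  return "approval_needed";
}

// Without an approval channel, the middle tier can only turn into a rejection.
function effectiveOutcome(
  decision: PolicyDecision,
  canAskHuman: boolean,
): "settle" | "ask_human" | "reject" {
  if (decision === "auto_approve") return "settle";
  if (decision === "approval_needed" && canAskHuman) return "ask_human";
  return "reject";
}

const testPolicy: SpendPolicy = { autoApproveBelow: 0.001, rejectAbove: 1 };
const overMcpToday = effectiveOutcome(decide(0.01, testPolicy), false);
const withChatApproval = effectiveOutcome(decide(0.01, testPolicy), true);
```

Under these assumptions, the $0.01 charge is rejected over MCP today and would become a question to the human once a chat approval channel exists.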

Design options

Option A — two-tool pattern (recommended)

When the policy returns approval-needed, pay does NOT throw. It returns a structured pending result:

{
  "ok": false,
  "status": "approval_required",
  "intent_id": "intent_abc123",
  "amount": 0.01,
  "currency": "USDC",
  "recipient": "https://...",
  "reason": "above auto-approve limit",
  "expires_at": "2026-05-28T10:15:00Z"
}

The agent relays this to the human in chat ("This costs 0.01 USDC, approve?"). A new approve_payment tool takes the intent_id. When the human says yes, the agent calls approve_payment, and blacktea completes the settle using the staged intent.

Works on any MCP host that supports tool calls, because it relies on nothing else. Maps cleanly to the existing PaymentIntent model. Explicit, with no protocol-version dependency.
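
A minimal sketch of the two-tool flow, assuming an in-memory staging map, a 15-minute approval window, and single-use intent IDs. The function names `pay` and `approvePayment`, the `Settled` and `Failed` shapes, and the constants are hypothetical; only the pending result's fields come from the JSON above. Real settlement is replaced by a comment.

```typescript
interface PendingApproval {
  ok: false;
  status: "approval_required";
  intent_id: string;
  amount: number;
  currency: string;
  recipient: string;
  reason: string;
  expires_at: string; // ISO 8601 timestamp
}
interface Settled { ok: true; status: "settled"; intent_id: string }
interface Failed { ok: false; status: "error"; reason: string }

const AUTO_APPROVE_BELOW = 0.001;        // limit from the live test
const APPROVAL_TTL_MS = 15 * 60 * 1000;  // assumed approval window

const staged = new Map<string, PendingApproval>();
let nextIntent = 0;

// Tool 1: returns a structured pending result instead of throwing.
function pay(
  amount: number,
  currency: string,
  recipient: string,
  now: Date = new Date(),
): PendingApproval | Settled {
  const intent_id = `intent_${++nextIntent}`;
  if (amount < AUTO_APPROVE_BELOW) {
    // Real settlement would happen here.
    return { ok: true, status: "settled", intent_id };
  }
  const pending: PendingApproval = {
    ok: false,
    status: "approval_required",
    intent_id,
    amount,
    currency,
    recipient,
    reason: "above auto-approve limit",
    expires_at: new Date(now.getTime() + APPROVAL_TTL_MS).toISOString(),
  };
  staged.set(intent_id, pending);
  return pending;
}

// Tool 2: completes the staged intent once the human has said yes in chat.
function approvePayment(intentId: string, now: Date = new Date()): Settled | Failed {
  const pending = staged.get(intentId);
  if (!pending) {
    return { ok: false, status: "error", reason: "unknown or already used intent_id" };
  }
  staged.delete(intentId); // single use: a replayed approval cannot settle twice
  if (now.getTime() > Date.parse(pending.expires_at)) {
    return { ok: false, status: "error", reason: "approval window expired" };
  }
  // Real settlement of the staged intent would happen here.
  return { ok: true, status: "settled", intent_id: intentId };
}
```

Binding the approval to a staged intent means the agent can approve only the exact amount and recipient the human was shown, and the expiry keeps stale approvals from settling later.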

Option B — MCP elicitation

Use the MCP elicitation capability (server requests input from the user through the client). Cleaner single-call UX, but host support varies and Hermes/OpenClaw elicitation support is unverified. Could be a later enhancement on top of Option A.
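
A rough sketch of how Option B could sit on top of Option A. The request and result shapes below mirror the elicitation request as I understand it from the MCP specification (a message plus a flat requested schema, answered with accept, decline, or cancel), but the `Elicit` function type and `askViaElicitation` are hypothetical stand-ins and not an SDK API. How the server learns whether the host supports elicitation is left out.

```typescript
interface ElicitRequest {
  message: string;
  requestedSchema: {
    type: "object";
    properties: Record<string, { type: "boolean" | "string" | "number"; title?: string }>;
    required?: string[];
  };
}
interface ElicitResult {
  action: "accept" | "decline" | "cancel";
  content?: Record<string, unknown>;
}
// Hypothetical: whatever the server uses to send an elicitation request to the client.
type Elicit = (request: ElicitRequest) => Promise<ElicitResult>;

async function askViaElicitation(
  elicit: Elicit | undefined,
  amount: number,
  currency: string,
): Promise<"approved" | "denied" | "unsupported"> {
  // No elicitation on this host: the caller falls back to the two-tool pattern.
  if (!elicit) return "unsupported";
  const result = await elicit({
    message: `This costs ${amount} ${currency}. Approve?`,
    requestedSchema: {
      type: "object",
      properties: { approve: { type: "boolean", title: "Approve payment" } },
      required: ["approve"],
    },
  });
  // Only an explicit accept with approve === true counts as approval.
  return result.action === "accept" && result.content?.approve === true
    ? "approved"
    : "denied";
}
```

Treating decline, cancel, and any malformed answer as a denial keeps the failure mode conservative: money moves only on an explicit yes.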

Option C — notification + poll

pay returns pending and the agent polls a payment_status tool until the intent resolves. Clunkier; only worth it if A and B both prove infeasible.
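
For comparison, a small sketch of the polling variant. Everything here is hypothetical: the status values, the in-memory map, and the `stageIntent`, `resolveIntent`, and `pollStatus` helpers exist only for illustration, and a real agent would wait between polls instead of calling a callback.

```typescript
type IntentStatus = "pending" | "approved" | "rejected" | "unknown";

const statuses = new Map<string, IntentStatus>();

function stageIntent(intentId: string): void {
  statuses.set(intentId, "pending");
}

// Called by whatever channel records the human's decision.
function resolveIntent(intentId: string, approved: boolean): void {
  if (statuses.get(intentId) === "pending") {
    statuses.set(intentId, approved ? "approved" : "rejected");
  }
}

// The payment_status tool body.
function paymentStatus(intentId: string): IntentStatus {
  return statuses.get(intentId) ?? "unknown";
}

// Agent side: bounded polling. betweenPolls stands in for a delay.
function pollStatus(
  intentId: string,
  maxPolls: number,
  betweenPolls: () => void,
): IntentStatus {
  for (let i = 0; i < maxPolls; i++) {
    const status = paymentStatus(intentId);
    if (status !== "pending") return status;
    betweenPolls();
  }
  return "pending"; // gave up; the caller decides what to do next
}
```

The extra tool calls and the need for a give-up rule are the clunkiness the option description refers to.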

Recommendation

Ship Option A. It is the lowest common denominator: it works in Hermes, OpenClaw, Claude Desktop, and Cursor today.

Relationship to #3

#3 is out-of-band approval via a separate CLI process (blacktea approve <id> from a different shell) — the tightest trust boundary, for high-stakes wallets where the agent must never be in the approval path.

This issue is in-conversation approval through the same agent chat — the best UX for personal agents where the human and the agent share one interface.

They're complementary, not competing. Different trust models for different users. A mature blacktea supports both and lets the policy choose: approval: "out_of_band" vs approval: "chat".
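
One way to picture the policy choosing a channel, as a sketch only. The four mode strings come from this issue; the `ApprovalMode` type and `usableOverMcp` helper are hypothetical and do not describe blacktea's current configuration surface.

```typescript
// The two existing modes plus the two proposed ones.
type ApprovalMode = "console" | "callback" | "out_of_band" | "chat";

// Which modes can reach a human when blacktea runs behind the MCP stdio transport.
function usableOverMcp(mode: ApprovalMode): boolean {
  switch (mode) {
    case "console":
      return false; // would write the prompt into the JSON-RPC stream
    case "callback":
      return false; // no onApprovalNeeded wired in the MCP server
    case "out_of_band":
      return true;  // #3: approval from a separate CLI process
    case "chat":
      return true;  // this issue: approval relayed through the agent
  }
}

const highStakesWallet: { approval: ApprovalMode } = { approval: "out_of_band" };
const personalAgent: { approval: ApprovalMode } = { approval: "chat" };
```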

Priority

Not blocking v0.1.0, but this is THE feature for the Hermes/OpenClaw positioning. The personal-agent launch story is "your agent asks before it spends" — and right now, through MCP, it can't ask. High priority for the post-launch v0.1.x line.

Labels: enhancement
