Add Agent Host Protocol threat model #88

connor4312 · 2026-04-28T16:01:46Z

I would actually say the opposite. The protocol is just RPC and it doesn't have an "execute locally" command. A vanilla implementation will be 'safe' by default unless a client gives it e.g. local tools it could use to escape.

Maybe it's the wording "the current protocol" -- there are no protocol changes we would make that would make it safer than it is now.

We might add things like sandboxing in the future but this is an optional implementation, it doesn't make the protocol inherently more or less safe.

connor4312 · 2026-04-28T16:12:36Z

There is the converse as well -- resource commands are bidirectional. Clients should be intentional about the URIs it allows the server to interop with, such as by restricting them to only paths/directories the clients has already announced in a ContentRef

connor4312 · 2026-04-28T16:13:37Z

"Origin checks for browser-reachable endpoints, and rate-limit" is not sufficient to protect loopbacks. Loopbacks should require a randomized/secure token in their URIs to make them unguessable

connor4312 · 2026-04-28T16:14:41Z

Scope approvals, terminal claims, and tool completion to the owning client

Multi-client support is the goal of AHP in the first place. I would just drop this row.

connor4312 · 2026-04-28T16:16:41Z

Require provenance, signatures or allowlists, sandboxing, restricted environment secrets, and egress monitoring

That's quite a high bar. I can't imagine many agent hosts will do this. We don't do any of this for plugins in VS Code today.

connor4312 · 2026-04-28T16:19:12Z

A peer is already going to be authenticated. Standard best practice for resource management is good but I don't think this is a big deal. A client isn't authoritative on the state so I'm unsure what "Enforce message, resource, history, subscription, and terminal scrollback limits" would mean in that context

connor4312 · 2026-04-28T16:20:58Z

We don't do this. We let people stand up agent hosts and connect to them via websocket -- I have an agent host like this on my home server that I connect to from my devices. I don't tihnk we'd gain anything by removing this capability

connor4312 · 2026-04-28T16:21:46Z

Servers must sandbox filesystem access and terminal operations

This requirement currently cannot be fulfilled on Windows, and even on other OS' I think is too heavy to have as a "minimum requirement". Sandboxing is complex.

connor4312 · 2026-04-28T16:23:22Z

Similar to above https://github.com/microsoft/agent-host-protocol/pull/88/changes#r3155649734

-Original file line number
+Diff line change
@@ -0,0 +1,90 @@
+    # Threat Model
+    This document captures the core security assumptions, risks, and minimum controls for Agent Host Protocol (AHP) implementations. AHP is a bidirectional JSON-RPC protocol that lets clients connect to an agent host, subscribe to session state, dispatch actions, access resources, create terminals, authenticate to protected resources, and optionally contribute client-side tools and customizations.
+    ## Current safety status
+    The current protocol **is not safe by itself for untrusted remote use**. AHP defines the message shapes and state flows, but it does not itself establish peer identity, authorize capabilities, constrain token forwarding, sandbox resource access, sanitize rendered content, or make client-contributed tools safe to invoke. Implementations must add those controls before treating a remote host or client as safe.
+    ## Goal: untrusted mode
+    The desired end state is an **untrusted mode**: AHP and client implementations should let a user connect to an agent host and still get useful work products without risking local compromise or long-lived secrets. In this mode, the client treats the host as hostile by default, disables or tightly scopes client tools, avoids silent token forwarding, and only shares explicit, bounded inputs.
+    This mode is still valuable. A user might only want the agent to produce artifacts such as a `PLAN.md`, or might give the host a short-lived, least-privilege token that can push a PR as an external, non-collaborator identity. The threat model below describes current implementation risks so the protocol and clients can work toward this goal while preserving a hard boundary around the user's local client, local files, credentials, and durable account authority.
+    ## Security model
+    - **AHP messages are untrusted data, not authority.** A peer can spoof agent names, safety text, auth prompts, tool confirmations, URLs, terminal output, file diffs, and session state.
+    - **Transport identity is outside the wire protocol.** WebSocket, SSH, tunnel, MessagePort, or custom transports must authenticate peers before `initialize`. AHP `clientId` is not an identity unless it is bound to the authenticated transport.
+    - **Remote hosts are high-risk by default.** In SSH/tunnel/remote WebSocket scenarios, assume the remote agent host, agent SDK, model output, workspace, terminal environment, package installs, and all server messages may be compromised.
+    - **A compromised remote host is not automatically a compromised local client.** The local client becomes exposed when it renders hostile content, forwards tokens, serves local resources, or executes client-contributed tools.
+    - **Autonomous or "YOLO" agents expand the blast radius.** If an agent can run shell commands or install packages without review, treat the remote workspace, remote environment variables, package manager credentials, and tokens already sent to the host as compromised.
+    ## SSH / tunnel placement
+    ```mermaid
+    flowchart LR
+        subgraph Local["Client machine"]
+            Client["AHP client"]
+            Tools["Client tools"]
+            Secret["Client token / secret"]
+            Relay["SSH / tunnel / WebSocket client"]
+        end
+        subgraph Link["Transport"]
+            Pipe["Authenticated bidirectional stream"]
+        end
+        subgraph Remote["Remote machine"]
+            Host["AHP server / agent host"]
+            Agent["Agent SDK"]
+            Workspace["Workspace and environment"]
+            Packages["npm / pip packages"]
+        end
+        Auth["Protected resource provider"]
+        Client <--> Tools
+        Client <--> Relay
+        Relay <--> Pipe
+        Pipe <--> Host
+        Host <--> Agent
+        Agent <--> Workspace
+        Agent --> Packages
+        Packages --> Workspace
+        Client --> Auth
+        Auth --> Secret
+        Secret --> Host
+        Host -.-> Tools
+        Tools -.-> Host
+        style Remote stroke:#cc0000,stroke-width:2px,stroke-dasharray: 5 5,fill:transparent
+    ```
+    The local client and its tools remain on the client machine. The agent host and agent SDK run where the remote server runs. Dotted arrows show logical client-tool calls and results over AHP; they must still cross the authenticated transport and pass local client policy.
+    ## Primary threats and controls
+    | Threat | Example compromise flow | Required posture |
+    | --- | --- | --- |
+    | **Hostile server content compromises the client** | A remote host sends malicious Markdown, HTML/SVG, terminal escapes, links, schema text, diffs, or content references that exploit or trick the client. | Render all peer-provided content as untrusted. Sanitize Markdown, block command links and unsafe URI schemes, constrain SVG/images, neutralize dangerous terminal sequences, and enforce size limits. |
+    | **Token or secret exfiltration** | The host advertises `protectedResources`; the client obtains a GitHub/Azure/etc. token and sends it via `authenticate` to a compromised host. | Require authenticated transport, server identity, and explicit per-host/per-resource consent before token delivery. Use scoped, short-lived, audience-bound tokens. Never send refresh tokens or log bearer tokens. |
+    | **Client-contributed tool abuse** | The active client contributes tools; the server starts a tool call with `toolClientId` and attacker-controlled input that causes the client to run tasks, inspect local context, or return sensitive output. | Client tools must be explicit, allowlisted, capability-scoped, and locally authorized. Treat server `confirmed: "not-needed"` as advisory, not as client approval. Confirm tools that read/write local data, run commands, open URLs, or use credentials. |
+    | **Server-side workspace compromise by malicious clients** | An unauthorized client invokes `resourceWrite`, `resourceDelete`, terminal creation/input, tool confirmations, or config/customization actions against the host. | Authenticate and authorize each client independently. Bind `clientId` to the transport identity. Canonicalize resource URIs, sandbox filesystem access, gate terminal operations, and audit privileged actions. |
+    | **WebSocket or tunnel exposure** | An agent host listens on a reachable interface; a browser, local malware, or network attacker connects and sends AHP commands. | Bind loopback by default, require explicit opt-in for network exposure, authenticate during WebSocket upgrade, use `wss` for remote connections, enforce Origin checks for browser-reachable endpoints, and rate-limit. |
+    | **Multi-client confusion** | One client races another to claim active-client status, approve a tool call, complete a client tool, or write terminal input. | Authorize every action against current server state and authenticated connection identity. Scope approvals, terminal claims, and tool completion to the owning client. Reject stale or replayed decisions. |
+    | **Plugin, customization, and package supply-chain execution** | A host loads a remote customization or a YOLO agent runs `npm install` / `pip install`; install scripts or plugin code read secrets and exfiltrate them. | Treat plugin loading and package installation as code execution. Require provenance, signatures or allowlists, sandboxing, restricted environment secrets, and egress monitoring. |
+    | **Denial of service and privacy leakage** | A peer sends huge JSON frames, deep state snapshots, large resources, unbounded terminal output, or logs prompts/tokens/file contents. | Enforce message, resource, history, subscription, and terminal scrollback limits. Apply backpressure and rate limits. Redact tokens, prompts, file contents, terminal output, and secrets from logs/telemetry. |
+    ## Minimum requirements for implementations
+. **Authenticate before protocol use.** Remote transports must authenticate peers before `initialize`; `clientId` must be bound to the authenticated connection.
+. **Encrypt non-local traffic.** Use `wss`, SSH, dev tunnel security, or equivalent protection for non-loopback connections.
+. **Make trust local.** Clients and servers must make authorization decisions in their own policy layer, not from peer-provided text, labels, or confirmation flags.
+. **Gate token delivery.** Sending OAuth/Bearer tokens to an agent host requires explicit consent and must be scoped to the intended resource and host.
+. **Constrain client tools.** Do not expose powerful client tools to untrusted hosts by default. When enabled, require local allowlists, argument validation, and user confirmation for sensitive operations.
+. **Constrain resource and terminal APIs.** Servers must sandbox filesystem access and terminal operations; clients that serve local resources must enforce their own scheme/path policy.
+. **Handle multi-client ownership.** Active-client state, tool calls, terminal claims, input requests, and approvals must be tied to the owning authenticated connection.
+. **Treat plugins and packages as executable.** Customizations, MCP servers, package installs, hooks, and skills require supply-chain policy and sandboxing.
+. **Validate and bound the protocol.** Use schema validation, fail closed on invalid messages, and enforce limits on frame size, JSON depth, resource size, subscriptions, replay history, and terminal output.
+. **Redact and minimize sensitive data.** Protocol payloads may contain code, prompts, file paths, tokens, terminal output, and secrets. Do not log or persist them unless necessary and protected.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add Agent Host Protocol threat model #88

Diff view

Diff view

There are no files selected for viewing

connor4312 Apr 28, 2026 •

edited

Loading

Uh oh!

connor4312 Apr 28, 2026

Uh oh!

connor4312 Apr 28, 2026

Uh oh!

connor4312 Apr 28, 2026

Uh oh!

connor4312 Apr 28, 2026

Uh oh!

connor4312 Apr 28, 2026

Uh oh!

connor4312 Apr 28, 2026

Uh oh!

connor4312 Apr 28, 2026 •

edited

Loading

Uh oh!

connor4312 Apr 28, 2026

Uh oh!

Uh oh!

Add Agent Host Protocol threat model #88

Are you sure you want to change the base?

Add Agent Host Protocol threat model #88

Uh oh!

Uh oh!

Diff view

Diff view

There are no files selected for viewing

connor4312 Apr 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

connor4312 Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

connor4312 Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

connor4312 Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

connor4312 Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

connor4312 Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

connor4312 Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

connor4312 Apr 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

connor4312 Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

connor4312 Apr 28, 2026 •

edited

Loading

connor4312 Apr 28, 2026 •

edited

Loading