Agent Poisoning Defense

Agent and Prompt Poisoning Defense

Prompt injection and agent poisoning are control-plane failures. Public issues, pull requests, comments, documents, web pages, email, MCP tool output, external agent results, and uploads are untrusted data, not trusted instructions.

Required Defenses

Quote or wrap untrusted input as data.
Ignore instructions embedded inside source content.
Verify actors and triggers for write-capable workflows.
Split read and write workflows.
Keep secrets out of untrusted-input jobs.
Use least-privilege tokens.
Restrict network egress and tool arguments.
Snapshot inputs at trigger time.
Redact output sinks such as logs, artifacts, comments, and memory writes.
Require approval for mutation.
Preserve provenance.

Case study: https://flatt.tech/research/posts/poisoning-claude-code-one-github-issue-to-break-the-supply-chain/

Source doc: https://github.com/Protocol-Wealth/pwcli-core/blob/main/docs/agent-poisoning-defense.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Agent Poisoning Defense

Agent and Prompt Poisoning Defense

Required Defenses

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Clone this wiki locally