Batesian

Active adversarial security testing for AI agent protocols.

Batesian is a red-team CLI that sends crafted adversarial payloads against A2A and MCP protocol implementations to surface vulnerabilities in OAuth flows, push-notification callbacks, JWS handling, cross-session isolation, tool and metadata trust, and related behavior.

Authorized use only. Only run Batesian against systems you own or have explicit written permission to test. Unauthorized use is illegal and unethical.

Secrets and TLS. Prefer BATESIAN_TOKEN (or your secret manager) over pasting long-lived credentials into shared terminals or CI logs. Use --skip-tls only when you must talk to a target with intentionally broken TLS (for example a local lab with self-signed certificates).
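As a minimal sketch of the advice above, the following Python wrapper launches a scan with the credential passed through the process environment rather than on the command line. It assumes the CLI picks up BATESIAN_TOKEN from the environment, as the paragraph above implies; the helper names `build_scan` and `run_scan` are hypothetical, and only flags shown elsewhere in this README are used.

```python
import os
import subprocess


def build_scan(target: str, token: str) -> tuple[list[str], dict[str, str]]:
    """Return (argv, env) for a scan; the token travels in env, never in argv."""
    argv = ["batesian", "scan", "--target", target, "--output", "sarif"]
    env = dict(os.environ)          # copy so the parent environment is untouched
    env["BATESIAN_TOKEN"] = token   # assumption: the CLI reads this variable
    return argv, env


def run_scan(target: str, token: str, sarif_path: str = "results.sarif") -> int:
    """Run the scan, write stdout to sarif_path, and return the exit code."""
    argv, env = build_scan(target, token)
    with open(sarif_path, "wb") as out:
        return subprocess.run(argv, env=env, stdout=out, check=False).returncode
```

Keeping the token out of argv means it does not show up in process listings or in CI logs that echo the command line.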


Why Batesian exists

A2A and MCP servers sit in sensitive workflows: OAuth and dynamic registration, outbound callbacks, signed agent metadata, long-lived tasks, and tool execution. Many failure modes only show up when the implementation processes attacker-shaped protocol traffic, such as a crafted registration or redirect, a push URL pointed at an unexpected host, or a malformed JWS that should be rejected outright.

Batesian automates that style of check: each rule drives concrete requests (and, where relevant, out-of-band signals) and records whether the target behaved safely. The goal is reproducible evidence and actionable remediation, not a one-time manual poke at the endpoint.
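To make one of these failure modes concrete, here is a sketch of the kind of server-side check a target would be expected to pass: rejecting a compact JWS whose header is malformed or names an algorithm outside an allowlist (for example "none"). This is illustrative only and is not Batesian's own code; the function name and the allowlist contents are assumptions, and a header check like this does not replace actual signature verification.

```python
import base64
import json

# Assumption: the algorithms your deployment actually issues and verifies.
ALLOWED_ALGS = {"RS256", "ES256"}


def jws_header_is_acceptable(compact_jws: str) -> bool:
    """Return True only if the JWS has three parts and an allowlisted alg."""
    parts = compact_jws.split(".")
    if len(parts) != 3:
        return False
    header_b64 = parts[0]
    try:
        # base64url segments in a JWS omit padding, so restore it first.
        padded = header_b64 + "=" * (-len(header_b64) % 4)
        header = json.loads(base64.urlsafe_b64decode(padded))
    except ValueError:  # covers bad base64, bad UTF-8, and bad JSON
        return False
    if not isinstance(header, dict):
        return False
    alg = header.get("alg")
    return isinstance(alg, str) and alg in ALLOWED_ALGS
```

Using an explicit allowlist rather than a denylist means that unknown or unexpected algorithm values fail closed.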


What Batesian tests

Batesian ships 18 A2A rules and 17 MCP rules, covering SSRF, OAuth abuse, JWS algorithm confusion, prompt injection, protocol downgrade, TLS enforcement, and more.

Each finding is classified as confirmed (exploit succeeded) or indicator (behavioral signal warranting manual review). All rules ship with CWE references and remediation guidance.


Quickstart

# Install (no API keys, no Python, no setup)
go install github.com/calbebop/batesian/cmd/batesian@latest

# Probe an A2A endpoint and map the attack surface
batesian probe --target https://agent.example.com --protocol a2a

# Full scan with SARIF output for GitHub Code Scanning
batesian scan --target https://agent.example.com --output sarif > results.sarif

# Run specific rules only
batesian scan --target https://agent.example.com --rule-ids a2a-push-ssrf-001,mcp-tool-poison-001

# Scan an authenticated MCP endpoint (static token)
batesian scan --target https://mcp.example.com --token "$TOKEN"

# Scan with automatic OAuth 2.0 client credentials token acquisition
batesian scan --target https://mcp.example.com \
  --token-url https://auth.example.com/oauth/token \
  --client-id my-client \
  --client-secret "$CLIENT_SECRET" \
  --oauth-scopes mcp:read,mcp:write

# Scan with OAuth 2.0 authorization code + PKCE (interactive; opens browser)
batesian scan --target https://mcp.example.com \
  --auth-url https://auth.example.com/authorize \
  --token-url https://auth.example.com/oauth/token \
  --client-id my-client \
  --oauth-scopes mcp:read

# Generate an annotated batesian.yaml config file
batesian init

Use scan when you need SARIF output, for example for GitHub Code Scanning uploads. The probe command does not support --output sarif; it is meant for quick reconnaissance and emits table or JSON output only.
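If you want a quick local summary of a SARIF file before uploading it, a short Python sketch like the one below can count results per rule and level. It assumes the file follows the standard SARIF 2.1.0 layout (`runs[].results[]` with `ruleId` and an optional `level`); the exact fields Batesian populates are not documented here, and the function name is hypothetical.

```python
from collections import Counter


def summarize_sarif(sarif: dict) -> Counter:
    """Count results per (ruleId, level) across all runs in a SARIF log."""
    counts: Counter = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            rule_id = result.get("ruleId", "<unknown>")
            # A result without an explicit level is treated as "warning" here,
            # which is the SARIF fallback when no rule configuration overrides it.
            level = result.get("level", "warning")
            counts[(rule_id, level)] += 1
    return counts
```

To use it, load the file with `json.load` and pass the resulting dict to `summarize_sarif`; a CI step could then fail the build when any key has the level "error".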

More options for scan (filters, config file, custom rules, OAuth, and more): run batesian scan --help.


Rule packs

Attack rules are YAML files, so anyone can write new attack patterns without touching Go. Rules load at runtime, so no recompilation is needed. See CONTRIBUTING.md for the full authoring guide, including the rule schema, validation checklist, and testing requirements.


Python SDK

from batesian import Scanner

scanner = Scanner(target="https://agent.example.com")
results = scanner.run(rules=["a2a-push-ssrf-001", "mcp-tool-poison-001"])

for finding in results.findings:
    print(f"[{finding.severity}] {finding.rule_id}: {finding.title}")

assert results.critical_count == 0

See sdk/python/ for installation, full API reference, and CI integration examples.


Status

The rule engine and all bundled rules are production-ready. New rules and protocol coverage are in active development. Star or watch to follow progress.


Contributing

Contributions welcome, especially new attack rules. No engine knowledge required to write a rule.

See CONTRIBUTING.md. Contributions are accepted under the same Apache License 2.0 as the rest of the project.

For vulnerable Python test servers, ports, and how rules map to each server, see testdata/README.md.


License

Apache 2.0. See LICENSE.
