Agent tool verification. Pre-execution gating. Eval-driven optimization.
Agents lie about tool calls. ToolProof catches them before and after execution.
AI agents claim they searched a database, read a file, or called an API. Under adversarial conditions, they hallucinate the results 91.1% of the time. They report data that was never returned. They reference tool calls that never happened. They execute destructive commands without authorization.
No tool on the market detects this.
ToolProof does two things:
- Pre-execution gating -- AEGIS-style policy enforcement that blocks dangerous tool calls before they run.
- Post-execution verification -- signed receipts that prove what actually happened, cross-referenced against what the agent claims.
```bash
pip install toolproof
```

```bash
# Run your agent through ToolProof. All tool calls recorded automatically.
toolproof wrap -- python my_agent.py

# Proxy mode: sit between agent and tool server
toolproof proxy --target http://localhost:3000

# Import from Claude Code sessions
toolproof import-claude

# Import from Hermes agent
toolproof import-hermes --profile nashir

# Import from OpenClaw
toolproof import-openclaw

# Import from everything
toolproof import-all
```
```python
import toolproof

toolproof.patch_openai()     # Patches globally, zero config
toolproof.patch_anthropic()  # Same for Anthropic

# Now every API call with tools generates signed receipts
import openai

client = openai.OpenAI()
resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Search the database"}],
    tools=[...],
)
# Receipts already recorded at ~/.toolproof/receipts.jsonl
```
```python
from toolproof import ToolProxy, ReceiptStore, Verifier, AgentClaim

store = ReceiptStore()
proxy = ToolProxy(store, secret="your-key")

# Wrap your tools
safe_search = proxy.wrap(search_database)
result = safe_search(query="SELECT * FROM users")

# Verify claims
verifier = Verifier(store, secret="your-key")
claim = AgentClaim(tool_name="search_database", arguments={"query": "SELECT * FROM users"})
result = verifier.verify_claim(claim)
print(result.verdict)  # VERIFIED, UNVERIFIED, or TAMPERED
```
```python
from toolproof.gate import Gate, Policy

policy = Policy.load()  # from ~/.toolproof/policy.json
gate = Gate(policy)

decision = gate.check("Bash", {"command": "rm -rf /"})
# Decision(action="block", reason="Destructive shell command")

decision = gate.check("Read", {"file_path": "/src/main.py"})
# Decision(action="allow")
```

Block, allow, or hold tool calls for human review. Policy-driven. No code changes in your agent.
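A minimal sketch of wiring the gate into a dispatcher, so held calls actually wait for a reviewer. `execute_tool` and `request_human_approval` are hypothetical placeholders, not ToolProof APIs:

```python
from toolproof.gate import Gate, Policy

gate = Gate(Policy.load())

def guarded_call(tool_name: str, arguments: dict):
    # Check the call against policy before anything executes.
    decision = gate.check(tool_name, arguments)
    if decision.action == "block":
        raise PermissionError(f"Blocked by policy: {decision.reason}")
    if decision.action == "hold":
        # Hypothetical hook: park the call until a human approves it.
        if not request_human_approval(tool_name, arguments):
            raise PermissionError("Held call rejected by reviewer")
    return execute_tool(tool_name, arguments)  # hypothetical dispatcher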
17 CLI commands.
| Command | What it does |
|---|---|
| `toolproof analyze` | Run trust analytics on receipt history |
| `toolproof ci` | One-shot CI trust check |
| `toolproof clear` | Clear all receipts |
| `toolproof config` | Configure settings |
| `toolproof feedback` | Generate actionable feedback for agent frameworks |
| `toolproof github-action` | Print GitHub Action template |
| `toolproof import-all` | Import from all sources |
| `toolproof import-claude` | Import from Claude Code sessions |
| `toolproof import-hermes` | Import from Hermes agent logs |
| `toolproof import-openclaw` | Import from OpenClaw skill logs |
| `toolproof inspect <id>` | Inspect a specific receipt |
| `toolproof proxy --target <url>` | Start HTTP proxy that records tool calls |
| `toolproof report` | Show all recorded receipts |
| `toolproof status` | Show receipt store status |
| `toolproof verify <file>` | Verify agent claims against receipts |
| `toolproof watch` | Live monitoring dashboard |
| `toolproof wrap -- <command>` | Run command with automatic recording |
```
toolproof/
    __init__.py       # Package entry, SDK patch exports
    __main__.py       # python -m toolproof
    analytics.py      # Trust analytics, pattern detection, cost hotspots
    claude_reader.py  # Claude Code session log parser
    cli.py            # Click CLI, 17 commands
    display.py        # Rich terminal output
    feedback.py       # Actionable feedback generator for agent frameworks
    gate.py           # Pre-execution gating (AEGIS-style policy enforcement)
    html_report.py    # Standalone dark-theme HTML trust reports
    http_proxy.py     # HTTP proxy, protocol auto-detection
    interceptors.py   # Tool call interception layer
    proxy.py          # Function-level tool wrapping
    receipt.py        # Signed execution receipts, JSONL store
    safepath.py       # Path validation, traversal prevention
    sdk_patch.py      # OpenAI/Anthropic SDK monkey-patching
    trust.py          # Trust scoring, grading, risk assessment
    verifier.py       # Claim verification engine
    watch.py          # Live monitoring dashboard
```
```
Agent --> ToolProof Proxy --> Actual Tool
                 |                 |
                 |  signed receipt |
                 |<----------------|
                 v
        Receipt Store (signed JSONL)
```
ToolProof intercepts tool calls through one of these methods:
- HTTP proxy -- sits between agent and tool server, records everything
- SDK patch -- monkey-patches OpenAI/Anthropic SDKs to auto-record
- Function wrapper -- wraps individual functions
- Log import -- reads existing logs from Claude Code, Hermes, OpenClaw
```
Agent wants to call tool X with args Y
                  |
                  v
      Gate checks against policy
                  |
         allow / block / hold
                  |
                  v
      Tool executes (or doesn't)
```
AEGIS-style policy enforcement. Define rules in ~/.toolproof/policy.json. Block destructive commands, restrict file access, require human approval for sensitive operations. Aligned with Microsoft Agent Governance Toolkit patterns and W3C Agentic Integrity Verification draft.
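A sketch of what `~/.toolproof/policy.json` might contain. The schema here is an illustrative assumption based on the `Decision(action, reason)` shape above, not the documented format:

```json
{
  "default_action": "allow",
  "rules": [
    {"tool": "Bash", "match": "rm -rf", "action": "block", "reason": "Destructive shell command"},
    {"tool": "Bash", "match": "git push --force", "action": "hold", "reason": "Needs human approval"},
    {"tool": "Read", "match": "/etc/shadow", "action": "block", "reason": "Restricted file"}
  ]
}
```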
```
Agent claims "I called X with Y and got Z"
                    |
                    v
Verifier cross-references against receipts
                    |
                    v
Trust Score: VERIFIED / UNVERIFIED / TAMPERED
```
```
Run agent --> Record receipts --> Analyze patterns --> Generate feedback --> Improve agent
    ^                                                                              |
    |______________________________________________________________________________|
```
The analytics module finds which tools get hallucinated most, which models produce the lowest trust scores, and where token costs concentrate. The feedback module turns those findings into actionable config changes for Hermes profiles, OpenClaw config, and system prompts.
This is eval-driven development applied to tool calling. Measure. Find patterns. Improve systematically. Close the loop.
| Verdict | Meaning |
|---|---|
| VERIFIED | Claim matches a receipt |
| UNVERIFIED | No matching receipt (possible hallucination) |
| TAMPERED | Receipt exists but claim doesn't match (definite hallucination) |
```
trust_score = verified / (verified + unverified + tampered)
```
| Grade | Score | Risk |
|---|---|---|
| A | 95%+ | LOW |
| B | 85-94% | LOW |
| C | 70-84% | MEDIUM |
| D | 50-69% | MEDIUM |
| F | <50% | HIGH |
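The grading thresholds translate directly into code. A sketch, with the thresholds taken from the table and the function name purely illustrative:

```python
def grade(trust_score: float) -> tuple[str, str]:
    """Map a 0.0-1.0 trust score to the letter grade and risk level above."""
    if trust_score >= 0.95:
        return "A", "LOW"
    if trust_score >= 0.85:
        return "B", "LOW"
    if trust_score >= 0.70:
        return "C", "MEDIUM"
    if trust_score >= 0.50:
        return "D", "MEDIUM"
    return "F", "HIGH"

# 42 verified, 5 unverified, 3 tampered claims
score = 42 / (42 + 5 + 3)  # 0.84
print(grade(score))        # ('C', 'MEDIUM')
```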
Every receipt records execution duration. Analytics aggregates cost by tool, model, session, and time window. Find expensive calls. Find broken caching. Find the 20% of tool calls eating 80% of your budget.
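As a sketch of the kind of aggregation the analytics module performs, here is a standalone pass over the receipt store. The JSONL path and the `tool_name`/`duration_ms` fields come from the receipt format documented below; everything else is illustrative:

```python
import json
from collections import defaultdict
from pathlib import Path

# Sum execution time per tool from the local receipt store.
totals: dict[str, float] = defaultdict(float)
with open(Path.home() / ".toolproof" / "receipts.jsonl") as f:
    for line in f:
        receipt = json.loads(line)
        totals[receipt["tool_name"]] += receipt.get("duration_ms", 0.0)

# The slowest tools are usually where cost concentrates.
for tool, ms in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{tool:30s} {ms:10.1f} ms")
```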
The proxy sits between your agent and its tools. It forwards every request, records a signed receipt, and the agent never knows it is being watched.
```bash
# Proxy to a local tool server
toolproof proxy --port 8080 --target http://localhost:3000

# Proxy OpenAI API calls
toolproof proxy --port 9090 --target https://api.openai.com

# Proxy Hermes tool server
toolproof proxy --target http://localhost:5001

# Proxy OpenClaw
toolproof proxy --target http://localhost:8000
```

The proxy auto-detects and parses:
- OpenAI chat completions (tool_calls)
- Anthropic messages (tool_use blocks)
- MCP JSON-RPC (tools/call) -- see the example payload after this list
- Hermes skill executions
- OpenClaw commands
- Generic REST endpoints
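For instance, an MCP `tools/call` request uses standard JSON-RPC 2.0 framing; the tool name and arguments below are illustrative:

```json
{
  "jsonrpc": "2.0",
  "id": 7,
  "method": "tools/call",
  "params": {
    "name": "search_database",
    "arguments": {"query": "SELECT * FROM users"}
  }
}
```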
Run any agent command with automatic interception:
```bash
# Wrap sets up a proxy and forwards env vars automatically
toolproof wrap -- python agent.py
toolproof wrap -- node bot.js
toolproof wrap --target http://localhost:5001 -- hermes run --profile nashir
```

The child process gets `OPENAI_BASE_URL`, `ANTHROPIC_BASE_URL`, and `HTTP_PROXY` pointed at the ToolProof proxy.
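A quick way to confirm the redirection, using only the documented env vars:

```bash
# Should print the ToolProof proxy address, not the real API endpoint.
toolproof wrap -- python -c "import os; print(os.environ.get('OPENAI_BASE_URL'))"
```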
```bash
# Import recent sessions
toolproof import-claude

# Import specific session
toolproof import-claude --session abc123

# Import more sessions
toolproof import-claude --limit 20
```

Reads `~/.claude/projects/` JSONL files. Extracts every `tool_use`/`tool_result` pair as a signed receipt.
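To illustrate the pairing step, a rough sketch of matching `tool_use` blocks to the `tool_result` blocks that answer them. The session-line layout assumed here is a guess for illustration; `claude_reader.py` is the authoritative parser:

```python
import json

def pair_tool_events(jsonl_path: str) -> dict[str, dict]:
    """Pair each tool_use block with the tool_result that references its id."""
    calls: dict[str, dict] = {}
    with open(jsonl_path) as f:
        for line in f:
            entry = json.loads(line)
            # Assumed shape: entries carry a message with a list of content blocks.
            content = entry.get("message", {}).get("content")
            if not isinstance(content, list):
                continue
            for block in content:
                if block.get("type") == "tool_use":
                    calls[block["id"]] = {"call": block, "result": None}
                elif block.get("type") == "tool_result" and block.get("tool_use_id") in calls:
                    calls[block["tool_use_id"]]["result"] = block
    return calls
```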
```bash
# Import all profiles
toolproof import-hermes

# Import specific profile
toolproof import-hermes --profile nashir
toolproof import-hermes --profile mkhlab
```

```bash
toolproof import-openclaw
```

```bash
toolproof import-all
```

```bash
# Run analytics on your receipt history
toolproof analyze

# Generate actionable feedback for your agent framework
toolproof feedback
```

Analytics finds:
- Which tools get hallucinated most
- Which models produce the lowest trust
- Cost hotspots (broken caching, expensive repeated calls)
- Failure patterns by time of day, session, source
Feedback generates specific config changes for:
- Hermes profiles (skill weights, model selection)
- OpenClaw config (tool permissions, routing)
- System prompts (add verification instructions)
- Generic JSON for any framework (sketched below)
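One plausible shape for the generic JSON output; every field below is illustrative, not a documented schema:

```json
{
  "finding": "search_database claims were UNVERIFIED in 12% of sessions",
  "target": "system_prompt",
  "suggestion": "Instruct the agent to report tool results only after the call returns",
  "confidence": 0.82
}
```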
```bash
# Print to stdout
toolproof report --html > trust-report.html

# Write to file
toolproof report --html --output report.html
```

Generates a standalone dark-theme HTML page with:
- Trust score card (grade, risk level)
- Verification results table
- Tool execution summary
- Duration and error stats
```bash
# Pass if trust >= 80% and at least 1 receipt
toolproof ci --min-trust 0.8

# Strict: 90% trust, minimum 10 receipts
toolproof ci --min-trust 0.9 --min-receipts 10

# JSON for scripts
toolproof ci --json-output
```

Exit codes: 0 = pass, 1 = fail.
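Because the exit code carries the verdict, gating a pipeline needs nothing beyond the shell:

```bash
# Fail the build when trust drops below 80%.
if ! toolproof ci --min-trust 0.8; then
    echo "Tool-call trust check failed" >&2
    exit 1
fi
```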
```bash
# Watch in real-time
toolproof watch

# Watch with threshold (exits 1 if trust drops below)
toolproof watch --min-trust 0.8 --timeout 60
```

```bash
# Print the template
toolproof github-action
```

```yaml
- name: Verify tool calls
  run: toolproof ci --min-trust 0.8 --min-receipts 5
```

ToolProof ships as a native OpenClaw plugin. ClawHub publishable.
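Embedded in a full workflow, it might look like this. Only the `toolproof ci` step comes from the template; the job layout, install step, and agent command are illustrative:

```yaml
name: trust-check
on: [push]
jobs:
  verify:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pip install toolproof
      - run: toolproof wrap -- python my_agent.py   # records receipts
      - name: Verify tool calls
        run: toolproof ci --min-trust 0.8 --min-receipts 5
```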
```
openclaw/
    clawhub.json    # ClawHub package manifest
    extensions/     # OpenClaw extension points
    hooks/          # Pre/post execution hooks
    skills/         # ToolProof skills for OpenClaw agents
```
Install into OpenClaw and every skill execution gets a signed receipt. Pre-execution gating applies to OpenClaw commands. Feedback writes directly to OpenClaw config.
```json
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "timestamp": 1712400000.0,
  "tool_name": "search_database",
  "arguments": {"query": "SELECT * FROM users"},
  "response": [{"id": 1, "name": "Alice"}],
  "error": null,
  "duration_ms": 142.3,
  "hash": "a1b2c3d4e5f6...",
  "hmac_sig": "9a8b7c6d5e4f..."
}
```

Receipts are signed with SHA-256. Optional HMAC-SHA256 with a secret key for tamper-proof verification.
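A sketch of checking a receipt's signature independently. The canonicalization (signed fields, sorted keys) is an assumption; `receipt.py` defines the real scheme:

```python
import hashlib
import hmac
import json

def check_receipt(receipt: dict, secret: bytes) -> bool:
    """Recompute the HMAC over the receipt payload and compare signatures."""
    # Assumption: the signed payload excludes hash/hmac_sig and uses sorted keys.
    payload = {k: v for k, v in receipt.items() if k not in ("hash", "hmac_sig")}
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    expected = hmac.new(secret, canonical, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, receipt["hmac_sig"])
```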
Security hardened through two rounds of adversarial pen-testing; every bypass found was fixed.
- Receipts are cryptographically signed (SHA-256 + optional HMAC)
- Pre-execution gating blocks dangerous tool calls before they run
- Path traversal prevention on all file operations
- Config stored at `~/.toolproof/config.json` with `0600` permissions
- Secret keys are never printed in full
- Proxy does not modify request or response content
- All receipt data is stored locally
- No external telemetry, no phone-home
This project stands on the shoulders of specific people and projects:
- @anthropic / Claude Code -- where the problem was first discovered. Watching Claude claim it ran tools that never executed is what started this.
- @OpenAI -- the `tool_calls` function-calling format that every agent framework now follows. ToolProof parses it natively.
- @steipete (Peter Steinberger) / @OpenClaw -- native plugin platform. ToolProof ships as a first-class OpenClaw plugin with ClawHub publishing.
- @karpathy (Andrej Karpathy) -- eval-driven development philosophy. The analytics and feedback loop in ToolProof is directly inspired by his approach: measure everything, find patterns, improve systematically. No vibes.
- @LangChainAI -- tool scoping patterns that informed how ToolProof intercepts and classifies tool calls.
- Microsoft Agent Governance Toolkit -- patterns for policy-driven agent control that shaped the gating module.
- AEGIS Research -- pre-execution firewall concept. ToolProof's gating is an implementation of this idea.
- W3C Agentic Integrity Verification -- draft specification for agent transparency and verifiability that ToolProof aligns with.
- Saudi AI Community -- testing, feedback, and the push to ship it.
Built with input from the Saudi AI Community.
MIT -- Musa the Carpenter