A semantic authorization runtime for AI agents. Define what your agent can do, enforce it at the boundary, and keep a human in the loop for what matters.
Clawthority is a policy engine plugin for OpenClaw. It sits between your AI agent and every tool it calls, evaluates rules before execution, and - if policy says no - the call is never placed.
Agent tool call ─► Clawthority ─► Allow │ Deny │ Ask human ─► Audit log
Note
v1.x - stable API. The plugin API and policy bundle schema follow semantic versioning; breaking changes ship in a future major release. The version badge above reflects the current release (1.3.1).
- Why Clawthority
- What it is - and isn't
- Quickstart
- How it works
- Action registry
- Human-in-the-loop
- Configuration
- Budget control
- Documentation
- Development
- License
AI agents call tools. When a tool can delete files, send money, or talk to strangers, who decides whether that call goes through - the model, or you?
Skill-based safety (instruct the model to "please check first") fails the moment the model is wrong, distracted, or prompt-injected. Clawthority moves the decision outside the model's loop - into the code path between the agent and the tool.
| | The Skill (model-enforced) | The Plugin (code-enforced) |
|---|---|---|
| Lives in | Context window - model sees it | Execution path - between agent and tools |
| Enforces via | Model reasoning; asks it to comply | Code boundary; intercepts the call |
| Can be bypassed? | Yes - prompt injection, loop misfire | No - runs outside the model's loop |
| Gives you | Observability + soft stop | Hard enforcement + append-only audit log |
A skill asks the model to enforce. A plugin enforces regardless of what the model decides.
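To make the distinction concrete, here is a minimal sketch of code-path enforcement. All names (`guard`, `Authorizer`, `Decision`) are hypothetical and are not part of the OpenClaw or Clawthority APIs; the only point is that the authorization check runs in code before the tool body, so a denied call never reaches the tool regardless of what the model decided.

```typescript
// Hypothetical types for illustration only; not the OpenClaw or Clawthority API.
type Decision = { effect: "allow" } | { effect: "deny"; reason: string };
type Tool = (params: Record<string, unknown>) => string;
type Authorizer = (toolName: string, params: Record<string, unknown>) => Decision;

// Wrap a tool so the authorizer runs in code before the tool body is invoked.
function guard(toolName: string, tool: Tool, authorize: Authorizer): Tool {
  return (params) => {
    const decision = authorize(toolName, params);
    if (decision.effect === "deny") {
      // The underlying tool is never called; the caller only sees the refusal.
      throw new Error(`blocked: ${decision.reason}`);
    }
    return tool(params);
  };
}

// Usage: a fake delete tool behind an authorizer that denies deletions.
let deletions = 0;
const deleteFile: Tool = (params) => {
  deletions += 1;
  return `deleted ${String(params["path"])}`;
};
const authorize: Authorizer = (toolName) =>
  toolName === "delete_file"
    ? { effect: "deny", reason: "filesystem.delete requires approval" }
    : { effect: "allow" };

const guardedDelete = guard("delete_file", deleteFile, authorize);
```

Calling `guardedDelete({ path: "/tmp/x" })` throws, and the `deletions` counter stays at zero because the wrapped function never ran.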
Clawthority is:
- A policy decision + enforcement layer for tool calls, installed as an OpenClaw plugin.
- A semantic authorizer - rules are written against canonical action classes (`filesystem.delete`, `payment.initiate`), not brittle tool-name regexes.
- A cryptographic capability system - HITL approvals are SHA-256-bound to `(action_class, target, payload_hash)` at approval time. Param tampering after approval = auto-deny.
- Two install modes - `open` (default, implicit permit + critical forbids) for zero-friction installs, and `closed` (implicit deny, explicit permits required) for locked-down production. Stage 1 capability/HITL gates and pipeline-level error handling fail closed in both modes.
Clawthority isn't:
- A model-safety or alignment layer. It does not enforce semantic constraints on prompt content, and it does not inspect tool outputs for sensitive data.
- A runtime for agents. OpenClaw still decides which tools the agent sees; Clawthority decides whether those calls run.
- A substitute for good action-class registration. Misregistered tools fall through to `unknown_sensitive_action`, which is forbidden at priority 100 in `closed` mode but implicitly permitted in `open` mode unless you add an explicit forbid rule for it in `data/rules.json` - a signal you need to register the alias.
- A full process sandbox. Clawthority enforces in-band tool calls only - calls routed through OpenClaw's tool dispatcher. Skills or workspace helpers that call `fs`, `child_process`, or network APIs directly bypass the dispatcher and are out of scope. See docs/threat-model.md for the full boundary definition and recommended mitigations.
Install from npm into the OpenClaw plugins directory:
mkdir -p ~/.openclaw/plugins/clawthority
cd ~/.openclaw/plugins/clawthority
npm init -y >/dev/null
npm install @clawthority/clawthority
Installing the package runs scripts/post-install.mjs, which writes a data/.installed marker under the plugin root. The marker gates policy activation; see isInstalled() in src/index.ts. No other files outside the plugin directory are touched.
Register in ~/.openclaw/config.json:
{ "plugins": ["clawthority"] }
Note
The three companion soft-enforcement skills (human-approval, token-budget, whatdidyoudo) live in a separate clawthority-skills repository. They are optional and independent of the plugin.
By default Clawthority runs in open mode - implicit permit, with a critical-forbid safety net (shell.exec, code.execute, payment.initiate, credential.read, credential.write, credential.rotate). Note: unknown_sensitive_action is not in this safety net — unregistered tool names are implicitly permitted in open mode unless you add an explicit forbid rule for unknown_sensitive_action in data/rules.json. To fail closed on unknown tools out of the box, run in closed mode (implicit deny, explicit permits required) by setting the env var before launching your agent:
export CLAWTHORITY_MODE=closed
Mode is read once at activation - restart the agent to change it.
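The difference between the two modes when no explicit rule matches can be sketched as follows. This is an illustration under stated assumptions, not the plugin's code: `resolveMode` and `implicitDecision` are hypothetical names, and treating any value other than `closed` as `open` is an assumption about how unrecognised values are handled.

```typescript
type Mode = "open" | "closed";
type Effect = "permit" | "forbid";

// The critical-forbid safety net listed above for open mode.
const CRITICAL_FORBIDS = new Set<string>([
  "shell.exec",
  "code.execute",
  "payment.initiate",
  "credential.read",
  "credential.write",
  "credential.rotate",
]);

// Resolve the mode once from an environment-like object (e.g. process.env).
// Assumption: anything other than "closed" falls back to the default "open".
function resolveMode(env: Record<string, string | undefined>): Mode {
  return env["CLAWTHORITY_MODE"] === "closed" ? "closed" : "open";
}

// Decision applied when no explicit rule in data/rules.json matches.
function implicitDecision(mode: Mode, actionClass: string): Effect {
  if (CRITICAL_FORBIDS.has(actionClass)) return "forbid";
  return mode === "open" ? "permit" : "forbid";
}
```

Under this sketch, `implicitDecision("open", "unknown_sensitive_action")` yields `permit` while the same call in `closed` mode yields `forbid`, which is the gap the paragraph above warns about.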
Customise the baseline by dropping hot-reloadable rules into data/rules.json:
[
{ "effect": "permit", "action_class": "filesystem.read" },
{ "effect": "forbid", "action_class": "payment.initiate", "priority": 90 },
{ "effect": "forbid", "resource": "tool", "match": "my_custom_tool" }
]
Run your agent. A shell.exec call now terminates at the boundary:
[clawthority] │ DECISION: BLOCKED (cedar/forbid priority=100 rule=action:shell.exec) - Shell execution is unconditionally forbidden
Every block - plus every HITL outcome - is appended to data/audit.jsonl as structured JSONL with stage, rule, priority, and mode fields. See docs/troubleshooting.md for the recovery runbook.
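As a rough illustration of how rules shaped like the `data/rules.json` example might be matched against a call, consider the sketch below. The precedence used here (highest `priority` wins, `forbid` beats `permit` on a tie, missing priority treated as 0) is an assumption made for the example; the actual engine's semantics are described in docs/configuration.md. The `Call` type and the `evaluate` function are hypothetical.

```typescript
type Effect = "permit" | "forbid";

interface Rule {
  effect: Effect;
  action_class?: string;
  resource?: string;
  match?: string;
  priority?: number;
}

// Hypothetical view of a normalized tool call.
interface Call {
  toolName: string;
  actionClass: string;
}

function matches(rule: Rule, call: Call): boolean {
  if (rule.action_class !== undefined) return rule.action_class === call.actionClass;
  if (rule.resource === "tool") return rule.match === call.toolName;
  return false;
}

// Assumed precedence: highest priority wins; forbid beats permit on a tie.
function evaluate(rules: Rule[], call: Call): Rule | undefined {
  let winner: Rule | undefined;
  for (const rule of rules) {
    if (!matches(rule, call)) continue;
    const p = rule.priority ?? 0;
    const wp = winner?.priority ?? 0;
    if (winner === undefined || p > wp || (p === wp && rule.effect === "forbid")) {
      winner = rule;
    }
  }
  return winner; // undefined means no rule matched; the install mode decides
}

// The three rules from the example above.
const rules: Rule[] = [
  { effect: "permit", action_class: "filesystem.read" },
  { effect: "forbid", action_class: "payment.initiate", priority: 90 },
  { effect: "forbid", resource: "tool", match: "my_custom_tool" },
];
```

A call that matches no rule returns `undefined` here, leaving the outcome to the install mode's implicit permit or implicit deny.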
Tip
data/rules.json and hitl-policy.yaml hot-reload within ~300ms. Code changes under src/ require a gateway restart.
Agent picks a tool → Clawthority intercepts
│
│ normalize_action(toolName, params) → { action_class, target, payload_hash }
│ buildEnvelope(...) → ExecutionEnvelope
▼
┌──────────────────────── Pipeline ────────────────────────┐
│ Stage 1: Capability Gate │
│ • low-risk bypass │
│ • approval_required / TTL / payload binding │
│ • one-time consumption, session scope │
│ • untrusted source + high risk → deny │
│ │
│ Stage 2: Constraint Enforcement Engine │
│ • protected path check (~/.ssh, /etc/, .env, ...) │
│ • trusted domain check (communication.external.send) │
│ • PolicyEngine.evaluateByActionClass(...) │
│ │
│ HITL: if required and no valid capability │
│ → issue approval via Telegram / Slack │
│ → deny 'pending_hitl_approval' │
└──────────────────────────────────────────────────────────┘
│
├── allow → tool executes
└── deny → tool call never placed; ExecutionEvent logged
Every decision emits an ExecutionEvent to the append-only JSONL audit log with action_class, target, decision, deny_reason, latency_ms, and context_hash.
Full walk-through: docs/architecture.md.
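For post-mortem analysis, the audit log can be read line by line. The sketch below assumes an `ExecutionEvent` shape built from the field names listed above; the field types, the `allow`/`deny` decision values, and the sample values are assumptions, and `toJsonl`, `parseJsonl`, and `denyCounts` are hypothetical helpers rather than plugin exports.

```typescript
// Assumed shape: field names come from the text above, types are guesses.
interface ExecutionEvent {
  action_class: string;
  target: string;
  decision: "allow" | "deny";
  deny_reason?: string;
  latency_ms: number;
  context_hash: string;
}

// One event per line, newline-terminated, so the file can be appended to safely.
function toJsonl(ev: ExecutionEvent): string {
  return JSON.stringify(ev) + "\n";
}

// Parse a JSONL document back into events, skipping blank lines.
function parseJsonl(text: string): ExecutionEvent[] {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as ExecutionEvent);
}

// Count denials per action class.
function denyCounts(events: ExecutionEvent[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const ev of events) {
    if (ev.decision !== "deny") continue;
    counts.set(ev.action_class, (counts.get(ev.action_class) ?? 0) + 1);
  }
  return counts;
}

// Hypothetical sample events.
const sampleLog =
  toJsonl({ action_class: "filesystem.read", target: "README.md", decision: "allow", latency_ms: 2, context_hash: "abc" }) +
  toJsonl({ action_class: "shell.exec", target: "bash", decision: "deny", deny_reason: "forbid", latency_ms: 1, context_hash: "def" });
```

Because each record is a self-contained line, appending never requires rewriting earlier entries, which fits the append-only property described above.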
Tool calls are normalized to a canonical action class before policy evaluation. You write rules against the class, not the tool name - so aliases, renames, and misspelled parameters all route to the same decision.
| action_class | risk | default HITL | sample aliases |
|---|---|---|---|
| `filesystem.read` | low | none | read_file, cat_file, view_file |
| `filesystem.write` | medium | per_request | write_file, edit_file, patch_file |
| `filesystem.delete` | high | per_request | rm, delete_file, unlink, shred |
| `shell.exec` | high | per_request | exec, bash, run_command |
| `communication.email` | high | per_request | send_email, gmail, mail |
| `web.post` | medium | per_request | http_post, axios.post, fetch |
| `payment.initiate` | critical | per_request | wire_transfer, stripe_charge |
| `credential.write` | critical | per_request | set_secret, keychain_set |
| `unknown_sensitive_action` | critical | per_request | fallback for unknown tools |
Parameter reclassification - a filesystem.write with a URL target is reclassified to web.post; shell metacharacters in shell.exec / filesystem.write parameters escalate risk to critical.
Full table: docs/action-registry.md.
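The alias lookup and the two reclassification rules can be sketched roughly as below. This is not the plugin's `normalize_action`: the function name `normalizeSketch`, the use of a `path` parameter as the target, the URL test, and the exact set of shell metacharacters are all assumptions made for illustration, and only a few registry rows are included.

```typescript
type Risk = "low" | "medium" | "high" | "critical";

interface Normalized {
  action_class: string;
  risk: Risk;
}

// A subset of the registry table above.
const REGISTRY: Record<string, { risk: Risk; aliases: string[] }> = {
  "filesystem.read": { risk: "low", aliases: ["read_file", "cat_file", "view_file"] },
  "filesystem.write": { risk: "medium", aliases: ["write_file", "edit_file", "patch_file"] },
  "filesystem.delete": { risk: "high", aliases: ["rm", "delete_file", "unlink", "shred"] },
  "shell.exec": { risk: "high", aliases: ["exec", "bash", "run_command"] },
  "web.post": { risk: "medium", aliases: ["http_post", "axios.post", "fetch"] },
  "unknown_sensitive_action": { risk: "critical", aliases: [] },
};

// Invert the registry so a tool name resolves to its canonical class.
const ALIAS_TO_CLASS = new Map<string, string>();
for (const [actionClass, entry] of Object.entries(REGISTRY)) {
  for (const alias of entry.aliases) ALIAS_TO_CLASS.set(alias, actionClass);
}

// Assumed metacharacter set; the real list may differ.
const SHELL_META = /[;&|`$<>]/;

function normalizeSketch(toolName: string, params: Record<string, unknown>): Normalized {
  // Unregistered tool names fall through to the catch-all class.
  let actionClass = ALIAS_TO_CLASS.get(toolName) ?? "unknown_sensitive_action";

  // Reclassification: a write whose target is a URL is treated as web.post.
  const target = typeof params["path"] === "string" ? params["path"] : "";
  if (actionClass === "filesystem.write" && /^https?:\/\//i.test(target)) {
    actionClass = "web.post";
  }

  // Escalation: shell metacharacters in any string parameter raise risk to critical.
  let risk: Risk = REGISTRY[actionClass]?.risk ?? "critical";
  const hasMeta = Object.values(params).some(
    (value) => typeof value === "string" && SHELL_META.test(value),
  );
  if ((actionClass === "shell.exec" || actionClass === "filesystem.write") && hasMeta) {
    risk = "critical";
  }
  return { action_class: actionClass, risk };
}
```

Rules written against `filesystem.delete` then apply equally to `rm`, `unlink`, and `shred` without any tool-name matching in the policy itself.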
High-risk action classes route to a human for approval via Telegram, Slack, or a console fallback. Approvals are:
- Payload-bound - SHA-256 of `(action_class | target | payload_hash)` is stored with the approval and re-verified at consumption. Any parameter change invalidates the token.
- One-time (Approve Once), session-scoped (Approve Always), or denied (Deny) - operators choose with a button tap, no command typing.
- TTL-limited - default 120 seconds, configurable per HITL policy.
- UUID v7 IDs - time-sortable, safe to log.
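A minimal sketch of payload binding with a TTL and one-time consumption, using Node's built-in `node:crypto`, might look like this. The `Approval` shape, the helper names, and the choice to join the three fields with `|` before hashing are assumptions for illustration; the plugin's actual serialization may differ.

```typescript
import { createHash } from "node:crypto";

interface Binding {
  action_class: string;
  target: string;
  payload_hash: string;
}

// Hypothetical approval record.
interface Approval {
  binding_hash: string;
  expires_at_ms: number;
  consumed: boolean;
}

const sha256 = (input: string): string =>
  createHash("sha256").update(input).digest("hex");

// Assumed serialization: the three fields joined with "|".
function bindingHash(b: Binding): string {
  return sha256([b.action_class, b.target, b.payload_hash].join("|"));
}

// Record an approval for exactly this binding, valid for ttlSeconds.
function approve(b: Binding, nowMs: number, ttlSeconds = 120): Approval {
  return {
    binding_hash: bindingHash(b),
    expires_at_ms: nowMs + ttlSeconds * 1000,
    consumed: false,
  };
}

// Re-verify at consumption: expired, already used, or altered calls are denied.
function consume(a: Approval, b: Binding, nowMs: number): "allow" | "deny" {
  if (a.consumed || nowMs > a.expires_at_ms) return "deny";
  if (bindingHash(b) !== a.binding_hash) return "deny"; // parameters changed after approval
  a.consumed = true;
  return "allow";
}

// Usage: the payload hash covers the call parameters, so changing them changes the binding.
const request: Binding = {
  action_class: "filesystem.delete",
  target: "/tmp/build",
  payload_hash: sha256(JSON.stringify({ path: "/tmp/build", recursive: true })),
};
```

Because the stored hash covers the payload hash, an approval granted for one set of parameters cannot be replayed for another, and marking it consumed prevents reuse of the same approval.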
Approval messages are rendered in MarkdownV2 (Telegram) / Block Kit (Slack) / a colored console block. The body includes the raw command, an explainer summary, structured effects and warnings, an optional agent intent hint, and three inline buttons:
ACTION REQUIRES APPROVAL
Agent: main Tool: exec Risk: HIGH
Expires in: 120s
What will run:
docker run -it --rm -v /:/host ubuntu bash -c "setup.sh"
What this does:
• Starts a container
• Mounts your full filesystem (/) into the container
• Executes a shell script inside the container
Warnings:
• Full disk access (host filesystem mounted at /host)
• Potential system modification
Why this is happening:
The agent is trying to install project dependencies.
[ Approve once ] [ Approve always ] [ Deny ]
Approve Always derives a permit pattern from the command, persists it to data/auto-permits.json, and the next matching call bypasses HITL entirely. Manage stored permits with npm run list-auto-permits / revoke-auto-permit.
Channel setup, retry/backoff, fallback behaviour, and the legacy /approve <token> text-command path (kept for one release): docs/human-in-the-loop.md.
Runtime behaviour is configured through three surfaces:
| Surface | Path | Reload |
|---|---|---|
| Install mode, feature flags, HITL transport secrets | env vars - CLAWTHORITY_MODE, TELEGRAM_BOT_TOKEN, SLACK_BOT_TOKEN, ... | read at activation; restart to change |
| Policy rules (action-class or resource/match) | `data/rules.json` | hot-reload via watcher (~300ms) |
| Human-in-the-loop approval routing | `hitl-policy.yaml` | hot-reload via watcher (~300ms) |
Structured decisions land in data/audit.jsonl - each block carries stage, rule, priority, and mode fields for post-mortem analysis.
Full schema and environment-variable overrides: docs/configuration.md. Recovery runbook for lockouts: docs/troubleshooting.md.
| Variable | Default | Effect |
|---|---|---|
| `CLAWTHORITY_DISABLE_APPROVE_ALWAYS` | (unset) | Set to 1 to hide the Approve Always button in Slack approval messages and prevent creation of new session auto-permits. Existing in-process auto-permits are still honoured. Requires restart to change. |
Clawthority tracks token usage and estimated cost for every tool call, written to data/budget.jsonl. By default this is tracking only — no calls are blocked.
To enforce hard limits:
export OPENAUTH_BUDGET_HARD_LIMIT=1 # enable enforcement
export OPENAUTH_BUDGET_DAILY_LIMIT=50000 # block after 50k tokens/day
export OPENAUTH_BUDGET_DAILY_COST_LIMIT=2.50 # or after $2.50/day
openclaw gateway restart
When a limit is exceeded, every subsequent tool call is blocked with daily_budget_exceeded until the next day (midnight UTC) or a gateway restart.
Full reference: docs/configuration.md#budget-tracking-and-enforcement.
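The enforcement behaviour described above can be approximated with a small in-memory counter. The sketch below is illustrative only: the `DailyBudget` class and its method names are hypothetical, counters live in memory (which is consistent with a restart clearing them), and treating "exceeded" as strictly greater than the limit is an assumption.

```typescript
interface BudgetLimits {
  dailyTokens?: number;
  dailyCostUsd?: number;
}

class DailyBudget {
  private readonly limits: BudgetLimits;
  private day = "";
  private tokens = 0;
  private costUsd = 0;

  constructor(limits: BudgetLimits) {
    this.limits = limits;
  }

  // Record usage for a call; counters reset when the UTC date changes.
  record(tokens: number, costUsd: number, now: Date): void {
    this.rollover(now);
    this.tokens += tokens;
    this.costUsd += costUsd;
  }

  // Assumption: a limit counts as exceeded only when usage is strictly above it.
  check(now: Date): "ok" | "daily_budget_exceeded" {
    this.rollover(now);
    const { dailyTokens, dailyCostUsd } = this.limits;
    const overTokens = dailyTokens !== undefined && this.tokens > dailyTokens;
    const overCost = dailyCostUsd !== undefined && this.costUsd > dailyCostUsd;
    return overTokens || overCost ? "daily_budget_exceeded" : "ok";
  }

  private rollover(now: Date): void {
    const today = now.toISOString().slice(0, 10); // YYYY-MM-DD in UTC
    if (today !== this.day) {
      this.day = today;
      this.tokens = 0;
      this.costUsd = 0;
    }
  }
}

// Usage: limits matching the environment variables shown above.
const budget = new DailyBudget({ dailyTokens: 50000, dailyCostUsd: 2.5 });
```

Keying the reset on the UTC date string keeps the midnight rollover independent of the host's local time zone.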
Getting started
- Installation - prerequisites, plugin registration, first run
- Configuration - full schema, env-var overrides, secrets handling
- Usage - common rule patterns and day-to-day operation
Architecture & reference
- Architecture - `ExecutionEnvelope`, two-stage pipeline, adapter layer
- Action Registry - all canonical action classes, risk tiers, HITL modes
- Human-in-the-Loop - payload binding, session vs approve-once, channel adapters
- API Reference - target spec for the dashboard / control-plane REST + SSE surface
Operations
- Threat Model — enforcement boundary, what’s in scope, closed mode, OS-level sandboxing
- Rule Deletion — safe rule removal via the impact-preview modal
- Troubleshooting — common errors, log prefixes, fail-closed diagnostics
- Roadmap — what’s shipped, what’s next
- Security Review - enforcement gate findings and pre-implementation requirements for `unsafe_legacy` and CS-11
Contributing
- Contributing guide - dev setup, test layout, commit conventions
npm install
npm run dev # watch mode
npm run build # production build
npm test # unit tests
npm run test:e2e # end-to-end tests