Bridle

A runtime control plane for production AI agents. Bridle sits between the agent and its consequential actions — the LLM call, the tool call, the refund — and enforces policy before anything happens that costs money, leaks data, or violates a rule.

One sentence: a central place to define what an AI agent is allowed to do, distributed as signed policy to the gateway and tool middleware that actually run the agent.

Project site → · Hosted demo → (login: demo / demo) · Releases →

Quickstart (local) — 60 seconds

docker compose up -d postgres
make pilot-demo

The pilot demo walks YAML → bundle → gateway → traffic → mode flip → audit → fail-mode proofs and prints what it proves at each phase.

Open the operator console locally:

uvicorn bridle.cp_server.app:app --host 127.0.0.1 --port 8200
open http://127.0.0.1:8200/console/fleet

Hosted demo

URL	What	Auth
`https://www.bridle.cloud`	Project site	public
`https://demo.bridle.cloud`	Operator console	basic-auth — `demo` / `demo`
`https://gateway.bridle.cloud`	LiteLLM gateway (OpenRouter)	`Authorization: Bearer <key>` (request via contact@bridle.cloud)

Architecture

flowchart TB
    subgraph CP["Bridle Control Plane (bridle/cp_server)"]
        direction TB
        Endpoints["FastAPI endpoints<br/>POST /v1/bundles<br/>POST /v1/policies/{id}/mode<br/>POST /v1/policies/{id}/canary<br/>POST /v1/audit<br/>GET /v1/reports/{shadow,fleet,pilot,pilot-decision}"]
        Signer["ed25519 KeyManager"]
        Validator["bundle_validator"]
        Console["/console/* (Jinja2 + HTMX)"]
        Endpoints --> Signer
        Endpoints --> Validator
        Console -. reads .-> Endpoints
    end

    subgraph PG["Postgres"]
        T1[(audit_rows)]
        T2[(sessions)]
        T3[(tool_intents)]
        T4[(policy_bundles)]
        T5[(gateway_registry)]
        T6[(agents)]
    end

    subgraph GW["Gateway process"]
        direction TB
        LiteLLM["LiteLLM Proxy<br/>+ BridleLogger #[0]"]
        Loader["HTTPBundleLoader<br/>(poll, verify sig,<br/>refuse expired,<br/>swap engine cfg)"]
        Interceptor["GatewayInterceptor<br/>(LLM + tool surfaces)"]
        Engine["LocalDemoPolicyEngine<br/>(6 policy types,<br/>per-rule targeting,<br/>canary overrides)"]
        ToolDecorator["@bridle.tool<br/>+ session_context"]
        LiteLLM --> Interceptor
        ToolDecorator --> Interceptor
        Interceptor --> Engine
        Loader --> Engine
    end

    Provider["LLM upstream<br/>(OpenRouter / OpenAI / Anthropic / mock)"]
    Operator["Operator"]
    YAML["policy.yaml"]

    Operator -- "edit" --> YAML
    YAML -- "bridle policy publish" --> Endpoints
    Operator -- "console" --> Console
    Endpoints -- "store signed" --> T4
    Loader -- "GET /v1/bundles/active" --> Endpoints
    Endpoints -- "lookup" --> T4
    Endpoints -- "lookup" --> T5
    Endpoints -- "lookup" --> T6

    Interceptor -- "writes audit" --> T1
    Interceptor -- "reads/updates" --> T2
    Interceptor -- "reads/updates" --> T3
    Interceptor -- "ships audit" --> Endpoints
    LiteLLM -- "calls" --> Provider

The flow in eight lines:

Operator writes a YAML policy and runs bridle policy publish (or uses the console).
CP server validates the bundle, signs it with the CP's ed25519 key, persists it in Postgres.
Gateway (LiteLLM Proxy + BridleLogger) polls the CP, verifies the signature, swaps the active bundle on the running policy engine.
Agent makes an LLM call. BridleLogger.async_pre_call_hook fires; the interceptor builds an observation, evaluates policy, decides allow / mutate / block.
The same GatewayInterceptor instance is used by @bridle.tool, so tool calls are governed by the same engine, same state, same audit ledger.
Every decision is written to Postgres audit_rows with a hash chain.
Operator runs bridle report fleet, bridle report pilot-decision, bridle report trace … to query that audit — or browses the same data in the console.
Operator flips a policy via bridle policy mode (or canary) — the gateway activates the new bundle within one polling interval, no restart.

A policy, end to end

id: session-budget
version: v1
type: session_budget
mode: shadow
severity: medium

target:
  environments: [production]
  agent_groups: [pilot]
  risk_tiers:   [low, medium]

# Per-agent enforce canary; rest of the fleet stays shadow.
canaries:
  - agents: [support_summarizer]
    mode: enforce

config:
  session_budget_usd: 0.50
  downgrade_at_ratio: 0.8
  downgrade_to_model: mock-model-cheap

fail_modes:
  on_engine_error:      fail_open
  on_bundle_expired:    use_cached_policy
  on_state_unavailable: fail_open

bridle policy publish examples/policies/fleet/session-budget.yaml \
  --tenant my-tenant --bundle-id b-2026-05 --gateway-id gw-prod

Six supported policies

YAML `type:`	Surface	Effect
`model_allowlist`	LLM	Block invocations not in the list
`session_budget`	LLM	Downgrade at threshold, block over budget
`pii_outbound`	LLM	Block or redact PII
`tool_allowlist`	Tool	Deny tool calls outside an allowlist
`refund_threshold_approval`	Tool	Require approval for refund tool calls above an amount
`tool_loop_detector`	Tool	Block on repeated same-tool/same-args bursts

Each policy supports per-rule targeting (agents, agent_groups, risk_tiers, environments) and per-agent canary overrides. See examples/policies/.

Performance

Two numbers matter, and they say two different things:

Plugin overhead — measured against a mock upstream, 2 000 calls. This is the cost Bridle itself adds on top of LiteLLM:

Config	p50	p95	p99
LiteLLM Proxy + BridleLogger (real)	5.10 ms	7.29 ms	9.48 ms
LiteLLM Proxy, no callbacks	5.34 ms	7.74 ms	9.20 ms

The BridleLogger callback adds zero ms within measurement noise. What you pay for is the LiteLLM proxy itself (~5 ms p50, ~7 ms p95). The 25 ms p95 policy-decision budget has ~17 ms headroom for the actual policy work (allowlist check, session budget, PII regex, state lookup).

End-to-end through OpenRouter — measured live against the hosted gateway, n = 10 each, gpt-4o-mini. This is what a client sees:

Path	median	min	max
Direct OpenRouter	680 ms	502 ms	2149 ms
Through Bridle (in-container)	838 ms	392 ms	1055 ms
Through Bridle (public TLS edge)	791 ms	420 ms	1353 ms

The honest read: the median wall-clock difference (~100–160 ms) is dominated by OpenRouter's own response-time variance (500–2100 ms), not by Bridle. Tail latency is actually better through Bridle (1.05–1.35 s vs 2.15 s direct) because LLM-side tail spikes are the bigger factor.

We're not satisfied with the median yet — we're working on shrinking the LiteLLM proxy step (which is ~75 % of the added time) and on moving the audit shipper off the request path.

Two safety valves

BRIDLE_FORCE_SHADOW=true — every enforce decision is demoted to shadow at the gateway. Audit rows record both the configured mode and the effective mode. The console renders an amber banner when this is on.

BRIDLE_BYPASS=true — the plugin short-circuits every hook. No observation, no decision, no audit. Tactical break-glass only. The console renders a red banner when this is on.

CLI

bridle policy compile  FILES…  --tenant T --bundle-id B
bridle policy publish  FILES…  --tenant T --bundle-id B --gateway-id G
bridle policy mode     --policy P --to {shadow|enforce} --tenant T
bridle policy canary   --policy P --agent A --mode {shadow|enforce} --tenant T

bridle agents register --agent-id A --tenant T --owner O --risk-level L --groups G
bridle agents list     --tenant T
bridle agents show     A

bridle report shadow         --tenant T
bridle report fleet          --tenant T --since-days 7
bridle report pilot          --tenant T --since-days 7
bridle report pilot-decision --tenant T --since-days 7
bridle report trace          --tenant T --trace-id X

bridle gateway register --gateway-id G --tenant T --models a,b
bridle gateway status   --gateway-id G

bridle health

Full reference: bridle <subcommand> --help.

Run the tests

make pg-up           # Postgres in docker for the durability tests
make test            # 120 product tests
make test-postgres   # just the Postgres-backed durability tests

# Optional: the LiteLLM 1.86.0 spike regression suite
bash tests/spikes/litellm_enforcement/run_spike.sh

Docs

Where	What
`docs/architecture/overview.md`	~5 min overview of the components
`docs/architecture/diagram.md`	Mermaid diagrams
`docs/adr/`	ADR-001 … ADR-009 — decision records per milestone
`docs/releases/`	Release notes per tag
`docs/pilot/`	Pilot operator docs (rollout, rollback, console)

License

MIT.

Status

Current tag: v0.11-public-demo. Past tags v0.5 through v0.10 are checkpoint references for the path.

Name		Name	Last commit message	Last commit date
Latest commit History 13 Commits
bridle		bridle
docs		docs
examples/policies		examples/policies
site		site
tests		tests
.gitignore		.gitignore
Makefile		Makefile
README.md		README.md
docker-compose.yml		docker-compose.yml
pyproject.toml		pyproject.toml
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Bridle

Quickstart (local) — 60 seconds

Hosted demo

Architecture

A policy, end to end

Six supported policies

Performance

Two safety valves

CLI

Run the tests

Docs

License

Status

About

Uh oh!

Releases

Packages

Uh oh!

Contributors 1

Languages

Folders and files

Latest commit

History

Repository files navigation

Bridle

Quickstart (local) — 60 seconds

Hosted demo

Architecture

A policy, end to end

Six supported policies

Performance

Two safety valves

CLI

Run the tests

Docs

License

Status

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 1

Languages

Packages