Safestop

Make agents you can actually turn off.

Safestop is a governance protocol for agentic workflows. It forces you to answer six questions before deploying: What can it do? When is it uncertain? How do you challenge it? How do you stop it? What evidence does it leave? Who owns the outcome?

If you can't answer these, your agent isn't ready for production.

Why Safestop?

Most AI governance is aspirational principles or post-incident audits. Safestop is pre-deployment gates and runtime enforcement.

Before deploying an agentic workflow, Safestop requires six artifacts:

Authority Boundary – quantified limits ($ amounts, record counts, blast radius)
Ambiguity Handling – where uncertain cases route (human / defer / tighten)
Contest Path – real appeals that can change outcomes
Reversibility Plan – commit points, undo strategy, kill switches
Decision Receipt – plain language + machine-readable records
Ownership + Escalation – who's accountable, how to escalate

These aren't documents. They're JSON schemas validated at CI/CD time and enforced at runtime.

Quick start

1. Install

npm install @safestop/core
# or CLI
npm install @safestop/cli
# or Python
pip install safestop-langchain

2. Define your workflow governance

Create a JSON file (or use safestop init governance.json to scaffold):

{
  "version": "1.2",
  "mode": "standard",
  "deployable": true,
  "authority_boundary": {
    "scope": "Invoice approval for vendor payments",
    "allowed_actions": ["approve_invoice", "request_clarification"],
    "forbidden_actions": ["modify_vendor_records", "initiate_wire_transfer"],
    "maximum_impact": "Single invoice approval up to 5000 USD",
    "escalation_trigger": "Invoice above 5000 USD or disputed vendor",
    "quantified_limits": {
      "record_limit": 1,
      "data_sensitivity": "confidential",
      "blast_radius_metric": "single_vendor_payment"
    }
  }
  // ... other five artifacts (see schema)
}

3. Validate in CI/CD

npx safestop validate governance.json --require-deployable

4. Enforce at runtime (LangChain JS)

import { withSafestop } from "@safestop/langchain";

const agent = withSafestop(invoiceAgent, {
  artifact: "./governance/invoice-approval.json",
  enforceQuantifiedLimits: true,
  logReceiptsTo: "postgres://receipts-db",
});

What you get

Pre-deployment validation – blocks merges if governance is incomplete
Runtime enforcement – authority boundaries checked automatically
Contestation API – /contests/{decision_id} endpoints generated
Audit receipts – every decision logged in plain language + structured JSON
Kill switches – human stop authority built in
Progressive formalization – start simple, add structure as you scale

Integrations

Framework	Package	Status
OpenClaw Skills	`@safestop/openclaw`	Ready
LangChain (JS)	`@safestop/langchain`	Ready
LangChain (Py)	`safestop-langchain`	Ready
CrewAI	`@safestop/crewai`	Planned
AutoGen	`@safestop/autogen`	Planned
LlamaIndex	`@safestop/llamaindex`	Planned

Mono-repo structure

safestop/
├── packages/
│   ├── core/           # Schema, validator, types
│   ├── cli/            # safestop validate, init, check
│   ├── openclaw/       # OpenClaw skill
│   ├── langchain-js/   # LangChain TypeScript
│   └── langchain-python/ # LangChain Python (safestop-langchain)
├── examples/
│   ├── invoice-approval/
│   ├── content-moderation/
│   └── deployment-agent/
├── docs/
│   ├── schema-reference.md
│   ├── integration-guides/
│   └── compliance-mapping/
└── schema in packages/core/schema/v1.2.json

Philosophy

Governance as architecture, not aspiration.

Contestability must be by-design, not bolted on
Reversibility is more practical than perfect alignment
Ambiguity should route explicitly, not hide in code
Receipts should be readable by non-developers
Authority boundaries should be machine-checkable

Safestop makes these beliefs enforceable in code.

Theoretical foundation: The Long Arc of Trust (DOI: 10.5281/zenodo.18663463)

The design of Safestop is grounded in a legitimacy standard for automated authority developed in The Long Arc of Trust. That work treats trust as a coordination capability: the practical ability of people and institutions to commit resources and act under uncertainty without intolerable exposure to betrayal, error, or opportunism. It traces the historical technologies that made trust scalable—oath and witness, then record and archive, then bureaucracy, metrics, and platforms—and shows how each step increases reach while weakening the feedback loops that keep authority answerable to the people it affects.

Automation as governance. The central claim is that automation has crossed a threshold from assistance to governance. When computational systems deny, prioritize, rank, gate, route, or allocate at scale, they become synthetic authority—authority that binds without a clearly legible author. In that regime, “correctness” is insufficient. The dominant failure mode is not spectacular error but quiet error: outcomes that are mostly right, wrong in ways that are hard to notice, hard to attribute, and too costly to contest—allowing harm to accumulate without alarms.

Distance and “too late.” The work introduces two vectors of agency: infrastructural agents (receding into operations, exerting power through pipelines) and intimate agents (entering cognitive space, shaping attention, memory, and interpretation). Both are governed by distance—how far authority acts from human context, and how weak the causal and moral return path is from decision to consequence. As distance increases and action becomes faster, the interval in which humans can meaningfully intervene vanishes, producing an architecture of “too late.”

Legitimacy standard. From this diagnosis, the paper defines legitimacy as the right of affected parties and supervising institutions to disagree with their own systems in time and at a cost that makes disagreement viable under scale—operationally affordable disagreement. Legitimacy requires building disagreement into the system itself through three hard requirements:

Boundedness – Explicit limits on what can be decided, where, and at what stakes, with a scope declaration that is queryable and enforced before execution.
Contestability – A real challenge pathway with deadlines, escalation, and an immutable record of dispute; reasons are not enough if revision is not feasible.
Identifiable responsibility – A legible legal entity and accountable roles tied to decisions, including override authority and remediation obligations, so responsibility has a place to land.

A second pillar is reversibility: the engineered capacity to roll back or remediate after a decision propagates, including restoration of status, access, funds, and—where possible—narrative standing. Audit trails are treated as governance artifacts only when they are written for disagreement (reconstruction, challenge, change) rather than for compliance theater (post-hoc justification).

Safestop encodes this vocabulary and design spine in a protocol: governance as a system property, not policy layered on after the fact—mechanisms that keep institutions able to contest their own automation, preserve optionality, and prevent authority from becoming final simply because it is fast.

Use cases

Financial operations – invoice approval, expense routing, transaction monitoring
Content moderation – automated takedowns with appeal paths
Infrastructure – deployment agents with rollback plans
HR / hiring – resume screening with bias checks and contestable decisions
Customer support – auto-escalation with clear ownership

Protocol at a glance

The six artifacts (shown in detail)

Validation + runtime enforcement

Executive view

--

Compliance mapping

Safestop artifacts map to:

EU AI Act Article 14 (human oversight)
NIST AI RMF (Govern, Map, Measure, Manage)
ISO/IEC 42001 (AI management systems)
SOC 2 (change management, access controls)

See docs/compliance-mapping for details.

Docs

Design & schema
Schema reference
Integration guides (OpenClaw, LangChain JS, LangChain Python)

License

Apache 2.0 – see LICENSE.

Citation

If you use Safestop in research, please cite:

@software{safestop2026,
  title={Safestop: A Protocol for Contestable Agentic Systems},
  author={Aaron Vick},
  year={2026},
  url={https://github.com/AaronVick/Safestop}
}

Related work. The design of Safestop is informed by the legitimacy standard for automated authority in:

@misc{longarcoftrust2026,
  title={The Long Arc of Trust: A history of belief systems—and the machinery that replaced them},
  author={Aaron Vick},
  year={2026},
  publisher={Zenodo},
  doi={10.5281/zenodo.18663463},
  url={https://zenodo.org/records/18663463}
}

Read the full foundation text

Safestop is grounded in The Long Arc of Trust—a long-form argument about why AI systems become a kind of institutional authority, and why legitimacy depends on keeping decisions challengeable, stoppable, and owned.

If you’re an AI agent (or a human) trying to understand the “why” behind the protocol, start here:

The Long Arc of Trust (PDF): The Long Arc of Trust

Built with the belief that the best AI governance is the kind you can't skip.

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
.github/workflows		.github/workflows
examples		examples
packages		packages
public		public
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
package-lock.json		package-lock.json
package.json		package.json
turbo.json		turbo.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Safestop

Why Safestop?

Quick start

1. Install

2. Define your workflow governance

3. Validate in CI/CD

4. Enforce at runtime (LangChain JS)

What you get

Integrations

Mono-repo structure

Philosophy

Theoretical foundation: The Long Arc of Trust (DOI: 10.5281/zenodo.18663463)

Use cases

Protocol at a glance

The six artifacts (shown in detail)

Validation + runtime enforcement

Executive view

Compliance mapping

Docs

License

Citation

Read the full foundation text

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Safestop

Why Safestop?

Quick start

1. Install

2. Define your workflow governance

3. Validate in CI/CD

4. Enforce at runtime (LangChain JS)

What you get

Integrations

Mono-repo structure

Philosophy

Theoretical foundation: The Long Arc of Trust (DOI: 10.5281/zenodo.18663463)

Use cases

Protocol at a glance

The six artifacts (shown in detail)

Validation + runtime enforcement

Executive view

Compliance mapping

Docs

License

Citation

Read the full foundation text

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages