Manifesto: Agents Need FreeBSD
Toward an Agent Operating System
If you squint, the agent ecosystem in 2026 looks exactly like Linux in 1998.
Dozens of frameworks, each with its own incompatible state format, its own orchestration semantics, its own idea of what a "tool" or a "task" is. Every one of them ships a kernel, a package manager, an init system, and a desktop environment — fused into a single opinionated blob. Migrating between them means rewriting everything. Debugging means reading someone else's abstraction. And when an agent run goes wrong at 2 a.m., the answer to "what actually happened?" is a scroll through chat logs and a prayer.
We have framework sprawl. What we don't have is a base system.
This essay argues that the agent world needs what the Unix world got from FreeBSD: a small, complete, coherent, ruthlessly documented base system — built around explicit state, enforced isolation, and verified commits — on top of which everything else is userland.
I'm building one. It's called Cool Workflow (CW). But the argument matters more than the implementation, so let's start there.
The dominant mental model today treats an agent task as one long prompt: stuff context in, let the model loop, hope the result is right. Frameworks decorate this loop with graphs and callbacks, but the fundamental posture is the same — the work lives in ephemeral conversation state, success is inferred from vibes, and the only audit trail is whatever the framework's hosted dashboard decided to log.
This is how we ran programs before operating systems: load it, run it, look at the lights.
An OS made computation trustworthy by making it durable, inspectable, and isolated. Processes have IDs, state, and a parent. You can ps them, kill them, trace them. Files have permissions. Crashes leave cores. None of this made programs smarter — it made them governable.
Agent work needs the same treatment, and it needs it more urgently than ordinary programs did, because agents are non-deterministic and increasingly act with real-world authority. The questions an operator must be able to answer are OS questions:
- What ran, when, under whose authority?
- What was it allowed to touch, and was that enforced?
- What evidence supports the result, and what alternative did it beat?
- What state was committed, and who verified it?
No prompt-loop framework answers these structurally. A runtime can.
Linux won on ubiquity. But FreeBSD's engineering philosophy — not its market share — is the right blueprint for an agent OS, for four reasons.
1. The base system is one artifact. FreeBSD's kernel, userland, and documentation are developed, versioned, and released together. There is no "which distro" question; there is one coherent system with one source of truth. An agent base system should make the same promise: the run-state format, the scheduler semantics, the isolation contract, and the docs evolve in lockstep, and a release is a verified snapshot of all of it.
2. Jails. FreeBSD invented OS-level containment in 1999 — a quarter century before "sandbox your agents" became a panicked blog-post genre. A jail is not a suggestion; a jailed process cannot see outside its boundary, by construction. Agents need exactly this: an agent jail is a declared profile — filesystem view, network policy, command allowlist, environment — that the runtime enforces, not merely records. The current norm, where "sandboxing" means a paragraph in the system prompt, would be laughed out of any operating-systems venue. Policy that isn't enforced is decoration.
3. POLA and ABI discipline. FreeBSD's Principle of Least Astonishment, and its compatibility guarantees across -STABLE branches, are why people run it for decades. An agent runtime should treat its on-disk state as an ABI: versioned schemas, explicit deprecation windows, replay compatibility across releases. Your run records from last year should still parse, still replay, still audit. If a "framework update" can silently orphan your audit history, you never had an audit history.
4. Release engineering as a feature. FreeBSD ships when the release checklist passes, not when the marketing calendar says so. An agent base system should gate every release on a deterministic harness: build, type-check, test, golden-path replay, fixture compatibility, docs/version sync — and the gate should run on the system itself. Dogfooding isn't a virtue here; it's the minimum bar for asking anyone to trust you with their agents.
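To make point 3 concrete, here is a minimal sketch of what treating on-disk state as an ABI could look like: every record carries a schema version, old versions are upgraded in memory, and unknown versions are refused rather than guessed at. The record shapes, field names, and the `parseRunRecord` and `upgradeToLatest` helpers are hypothetical illustrations, not CW's actual format.

```typescript
// Hypothetical run-record shapes; field names are illustrative only.
interface RunRecordV1 {
  schemaVersion: 1;
  id: string;
  status: string;
}

interface RunRecordV2 {
  schemaVersion: 2;
  id: string;
  status: string;
  authority: string; // added in v2; v1 records predate it
}

type AnyRunRecord = RunRecordV1 | RunRecordV2;

// Upgrade an old record in memory; the file on disk stays as it was written.
function upgradeToLatest(record: AnyRunRecord): RunRecordV2 {
  switch (record.schemaVersion) {
    case 1:
      return { schemaVersion: 2, id: record.id, status: record.status, authority: "unknown" };
    case 2:
      return record;
  }
}

// Parse a stored record, refusing versions this release does not understand.
function parseRunRecord(json: string): RunRecordV2 {
  const raw = JSON.parse(json) as { schemaVersion?: unknown };
  if (raw.schemaVersion !== 1 && raw.schemaVersion !== 2) {
    throw new Error(`unsupported schemaVersion: ${String(raw.schemaVersion)}`);
  }
  return upgradeToLatest(raw as AnyRunRecord);
}
```

A release gate could then replay a directory of old fixtures through a parser like this, so a schema change that orphans last year's records fails the build instead of surprising an auditor.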
One idea, repeated at every layer:
plan -> dispatch -> record evidence -> verify -> verifier-gated commit -> report
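Read as a state machine, the pipeline above has one hard rule: the commit phase is unreachable without a passing verification. A minimal sketch follows, assuming a strictly linear run. The phase names mirror the pipeline, while `RunState` and `advance` are hypothetical names used only for illustration.

```typescript
// Phase names follow the pipeline above; the transition rule is an assumption.
const PHASES = ["plan", "dispatch", "record-evidence", "verify", "commit", "report"] as const;
type Phase = (typeof PHASES)[number];

interface RunState {
  phase: Phase;
  verified: boolean; // set only when leaving the verify phase, never inferred
}

// Advance one phase. The verifier's verdict is consulted only when leaving "verify".
function advance(state: RunState, verifierPassed?: boolean): RunState {
  const next = PHASES[PHASES.indexOf(state.phase) + 1];
  if (next === undefined) {
    throw new Error("run already reported");
  }
  const verified = state.phase === "verify" ? verifierPassed === true : state.verified;
  if (next === "commit" && !verified) {
    throw new Error("verification did not pass: refusing to commit");
  }
  return { phase: next, verified };
}
```

Note that a missing verdict is treated the same as a failing one, which matches the fail-closed posture described throughout this essay.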
Kernel: explicit state, no magic. Every run is plain data on disk — readable, diffable, resumable, replayable. No hidden dashboard database. The runtime never infers success; ambiguity is a first-class, visible state that fails closed to "unexplained" rather than fabricating a reason. If you cannot cat your agent's state, you do not own your agent.
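The "never infer success" rule is easy to state in code. The sketch below is one possible reading, with a hypothetical `StepRecord` shape: a step counts as succeeded only on an explicit passing verdict plus a clean exit, and anything ambiguous is recorded as "unexplained".

```typescript
// Hypothetical record shape; the real on-disk format may differ.
type Outcome = "succeeded" | "failed" | "unexplained";

interface StepRecord {
  id: string;
  exitCode?: number; // absent if the worker never reported one
  verifierVerdict?: "pass" | "fail";
}

// Fail closed: success needs explicit evidence, and ambiguity stays visible.
function classify(step: StepRecord): Outcome {
  if (step.verifierVerdict === "pass" && step.exitCode === 0) {
    return "succeeded";
  }
  if (step.verifierVerdict === "fail" || (step.exitCode !== undefined && step.exitCode !== 0)) {
    return "failed";
  }
  return "unexplained";
}

// Plain JSON with stable indentation keeps records easy to read and diff.
function serialize(step: StepRecord): string {
  return JSON.stringify({ ...step, outcome: classify(step) }, null, 2);
}
```

In this sketch a clean exit code with no verifier verdict is still "unexplained": the process finishing is not evidence that the work is right.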
Processes: workers with manifests. Subtasks are dispatched as explicit manifests — inputs, sandbox profile, expected artifacts — and return result envelopes with provenance. Multi-agent coordination is a process table, not a metaphor: roles, memberships, a shared blackboard, and reusable topologies (map-reduce, debate, judge panels) recorded as ordinary state with policy and audit attached.
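As an illustration of the manifest-and-envelope idea, the sketch below pairs a dispatched manifest with the envelope a worker returns and lists every way the two disagree. All type and field names (`WorkerManifest`, `ResultEnvelope`, `checkEnvelope`) are assumptions made for this example, not CW's schema.

```typescript
// All field names here are illustrative assumptions.
interface WorkerManifest {
  taskId: string;
  inputs: Record<string, string>;
  sandboxProfile: string;
  expectedArtifacts: string[];
}

interface ResultEnvelope {
  taskId: string;
  artifacts: Record<string, string>; // artifact name -> path or content hash
  provenance: { worker: string; sandboxProfile: string };
}

// Return a list of problems; an empty list means the envelope matches its manifest.
function checkEnvelope(manifest: WorkerManifest, envelope: ResultEnvelope): string[] {
  const problems: string[] = [];
  if (envelope.taskId !== manifest.taskId) {
    problems.push(`task mismatch: ${envelope.taskId}`);
  }
  if (envelope.provenance.sandboxProfile !== manifest.sandboxProfile) {
    problems.push("ran under a different sandbox profile");
  }
  for (const name of manifest.expectedArtifacts) {
    if (!Object.prototype.hasOwnProperty.call(envelope.artifacts, name)) {
      problems.push(`missing artifact: ${name}`);
    }
  }
  return problems;
}
```

Returning every problem rather than stopping at the first keeps the check useful as evidence: the full list can be written into the run record as-is.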
Jails: enforced isolation. Sandbox profiles are contracts the runtime enforces. A read-only worker that attempts a write is killed, and the violation is itself evidence. Containment composes with everything else: jailed workers produce jailed evidence, and the verifier knows the difference.
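The policy half of a sandbox profile can be sketched as a pure decision function: given a profile and an attempted action, it returns either nothing or a violation record that can be stored as evidence. This is only the decision logic, under the assumption that paths arrive absolute and normalized; enforcement by construction needs an OS-level mechanism underneath it. The `SandboxProfile`, `Action`, and `check` names are hypothetical.

```typescript
interface SandboxProfile {
  name: string;
  readPaths: string[];  // directory prefixes the worker may read
  writePaths: string[]; // empty means read-only
  commands: string[];   // command allowlist
}

type Action =
  | { kind: "read" | "write"; path: string }
  | { kind: "exec"; command: string };

interface Violation {
  profile: string;
  action: Action;
  reason: string;
}

// True if path equals a prefix or sits beneath it ("/repo" must not match "/repository").
function underAny(path: string, prefixes: string[]): boolean {
  return prefixes.some((prefix) => {
    const dir = prefix.charAt(prefix.length - 1) === "/" ? prefix : prefix + "/";
    return path === prefix || path.indexOf(dir) === 0;
  });
}

// Decide one action. A non-null result is the evidence record for the violation.
function check(profile: SandboxProfile, action: Action): Violation | null {
  const deny = (reason: string): Violation => ({ profile: profile.name, action, reason });
  if (action.kind === "exec") {
    return profile.commands.indexOf(action.command) !== -1 ? null : deny("command not in allowlist");
  }
  // Fail closed on unnormalized paths instead of trying to resolve them.
  if (action.path.split("/").indexOf("..") !== -1) {
    return deny("path is not normalized");
  }
  const allowed = action.kind === "read" ? profile.readPaths : profile.writePaths;
  return underAny(action.path, allowed) ? null : deny(`${action.kind} outside profile`);
}
```

A host that receives a non-null result would terminate the worker and append the violation to the run record, which is the "violation is itself evidence" behavior described above.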
Evidence over vibes. Results carry a reasoning chain: what was adopted, on what basis, under whose authority, and the counterfactual it beat. This is the part that sounds academic until the first time an agent-produced change ships to production and someone asks why. In regulated industries, this isn't a feature — it's the precondition for agents existing at all.
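One way to make the reasoning chain concrete is to record the runner-up alongside the winner whenever candidates are scored. The sketch below assumes candidates already carry a numeric score and a stated basis; `Candidate`, `Decision`, and `decide` are hypothetical names, and ties are broken on id so that a replay reaches the same decision.

```typescript
interface Candidate {
  id: string;
  score: number;
  basis: string; // why this candidate earned its score
}

interface Decision {
  adopted: string;
  basis: string;
  authority: string;
  counterfactual: string | null; // the best candidate that lost, if any
}

function decide(candidates: Candidate[], authority: string): Decision {
  // Sort a copy, highest score first; ties break on id so replays agree.
  const ranked = candidates.slice().sort((a, b) => {
    if (b.score !== a.score) return b.score - a.score;
    return a.id < b.id ? -1 : a.id > b.id ? 1 : 0;
  });
  const winner = ranked[0];
  if (winner === undefined) {
    throw new Error("no candidates to choose from");
  }
  const runnerUp = ranked[1];
  return {
    adopted: winner.id,
    basis: winner.basis,
    authority,
    counterfactual: runnerUp === undefined ? null : runnerUp.id,
  };
}
```

The resulting record answers the four questions in one place: what was adopted, on what basis, under whose authority, and what it beat.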
Verifier-gated commits. Unverified state never becomes committed state. The verifier is deterministic and replayable; "the model said it was done" is not verification. Checkpoints are named, diffs are inspectable, and rollback is a read of the run log, not an archaeology project.
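A small sketch of the commit gate, assuming an append-only run log: state becomes a named checkpoint only if a deterministic verifier accepts it, rejected attempts are logged rather than dropped, and rollback is a lookup in the log. The `LogEntry`, `Verifier`, `commit`, and `stateAt` names are illustrative assumptions.

```typescript
type LogEntry =
  | { kind: "checkpoint"; name: string; state: string } // state is serialized JSON
  | { kind: "rejected"; name: string; reason: string };

// A deterministic verifier: same input, same verdict, no model calls.
type Verifier = (state: string) => { ok: boolean; reason: string };

// Append either a checkpoint or a rejection; earlier entries are never rewritten.
function commit(log: LogEntry[], name: string, state: string, verify: Verifier): LogEntry[] {
  const verdict = verify(state);
  const entry: LogEntry = verdict.ok
    ? { kind: "checkpoint", name, state }
    : { kind: "rejected", name, reason: verdict.reason };
  return log.concat([entry]);
}

// Rollback is a read: return the most recent committed state with this name.
function stateAt(log: LogEntry[], name: string): string | null {
  for (let i = log.length - 1; i >= 0; i--) {
    const entry = log[i];
    if (entry !== undefined && entry.kind === "checkpoint" && entry.name === name) {
      return entry.state;
    }
  }
  return null;
}
```

Because `commit` returns a new array instead of mutating its input, two versions of the log can be diffed directly, and a rejected attempt leaves the previous checkpoint untouched.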
Ports: workflow apps as userland. On top of the base, applications — an architecture review, a PR-review-and-fix pipeline, a release cut — are versioned manifests with validated inputs, phases, and artifacts. They are installed, validated, and audited through one mechanism, the way ports flow through one tree. The base system stays small; the ecosystem grows in userland, where it belongs.
Man pages. Every contract is documented like a syscall, because for an agent host, it is one. Documentation that drifts from behavior fails the release gate.
FreeBSD's base famously contains no X11. Knowing what you will never ship is half the architecture. An agent base system should commit, in writing, to a Never list:
- No model calls in base. The runtime schedules, records, and verifies; the agent host runs the workers and owns inference. The kernel does not have opinions about which LLM you use, any more than FreeBSD has opinions about which compiler optimized your binary.
- No prompt management, no routing, no "AI features." Userland problems.
- No hosted dashboard as the source of truth. UIs are read-only views over the on-disk state. If the company dies, your audit trail doesn't.
- No inferred success, ever. The runtime would rather report "unexplained" than guess.
A base system earns trust by being boring in exactly the right places.
"Frameworks already do observability." They do logging. Observability without enforced isolation and gated commits is a flight recorder bolted to a plane with no hydraulics — you get a beautiful record of the crash.
"This adds friction; agents are supposed to be autonomous." Operating systems added "friction" too: memory protection, permissions, process boundaries. That friction is why you can run untrusted code at all. Autonomy without containment isn't autonomy; it's exposure.
"BSD lost. Why copy the loser?" FreeBSD lost the desktop and quietly won the parts of the internet where correctness pays: Netflix's CDN, the network stacks inside products you use daily. The agent equivalent — finance, healthcare, infrastructure, anywhere "what did the agent do and can you prove it" is a legal question — is precisely the market that cannot run on vibes. Small, complete, and trusted beats large and incoherent in every domain where failure is expensive. Agents are becoming such a domain at speed.
Cool Workflow is my working implementation of this argument: a small TypeScript/Node base system — CLI and MCP front doors over one shared kernel — with durable plain-JSON runs, evidence chains, candidate scoring, verifier-gated commits, multi-agent topologies, a deterministic eval/replay harness, and a release gate that dogfoods on its own repository. Sandbox profiles exist today as enforced-by-contract; making them enforced-by-construction — real agent jails — is the current line of work, because it's the claim everything else leans on.
It is BSD-licensed, single-maintainer, and deliberately small, with docs written as man pages. It will stay small. That's the point.
The design philosophy fits on an index card:
Small kernel. Explicit state. Composable pipes.
Isolated workers. Verifier-gated commits. Docs as man pages.
If you've ever filed a FreeBSD PR, maintained a port, or just believe that the answer to "what did the agent do?" should be a file you can read rather than a dashboard you must trust — the tree is here, and the Handbook is being written.
The frameworks are fighting the distro wars. Somebody has to build the base system.
Discussion welcome. Especially disagreement.
Organized from local Obsidian notes and reconciled with the current coo1white/cool-workflow repository state.