Learn how modern AI agents are built around the LLM.
The model reasons. The harness gives it action, state, and limits: it runs tools, keeps state across calls, gates side effects, and coordinates loops, none of which a model call does by itself.
This repo explains the harness section by section: loop, tools, memory, permissions, context, tasks, and interfaces. Learn it once and you can read many agents, since a coding tool, chat assistant, and autonomous runner mostly differ in harness choices.
Contents: Systems · Loop · Method · Sections · Structure · Running
Each system is a worked example for the sections below.
| System | Maintainer | License | Models | Surface | Read it for |
|---|---|---|---|---|---|
| Claude Code | Anthropic | Proprietary | Claude only | CLI, IDE, web | Permissions, subagents, skills |
| (more soon) |
More systems can be added later, including Hermes Agent, OpenClaw, aider, and mini-swe-agent.
Most agents share the same control flow: call the model, run requested tools, append results, and call the model again.
flowchart LR
U([User intent]) --> M["messages[]"]
M --> L{{LLM}}
L -->|stop_reason: tool_use| T[Tool runtime]
T --> P{Permitted?}
P -->|deny / ask| M
P -->|allow| X[Execute tool]
X --> R[Tool result] --> M
L -->|stop_reason: end_turn| D([Reply to user])
The loop is small. Most engineering is around it: dispatch tools, gate side effects, manage context, persist state, and coordinate other loops.
Every section is self-contained and uses the same four-part lens:
- Opening. What problem this layer solves.
- Mechanism. The general design and control flow.
- Per system. How real systems implement it.
- Failure modes. What breaks and how to mitigate it.
To learn from this repo:
- Read the sections in order. Each builds on the layer before it.
- For a runnable section, read
src/loop.py, then run itstest.pyanddemo.py. - Diff a section's
src/against the section before it. The diff is the one mechanism that section adds.
Seven layers, from the basic loop to a multi-agent harness. Each row links to one self-contained writeup.
| # | Section | Question | Key mechanisms |
|---|---|---|---|
| Layer 0 · Foundations | |||
| 0 | Harness thesis | Where does agency come from? | Model vs harness, actions, observations, permissions |
| Layer 1 · Core Loop | |||
| 1 | Agent loop | How does an agent keep going? | messages[], loop, stop_reason |
| 2 | Tool runtime | How are tools called and routed? | Registry, schemas, dispatch, deferred search |
| 3 | Permission & sandbox | How are side effects gated? | Permission modes, approvals, sandboxing |
| 4 | Hooks | How do extensions attach to the loop? | PreToolUse, PostToolUse, lifecycle events |
| Layer 2 · Complex Work | |||
| 5 | Planning & todos | How is big work decomposed? | Plan mode, todo list, approval before edits |
| 6 | Subagents | How is a subproblem isolated? | Fresh messages[], delegation, child loop |
| 7 | Skills | How are capabilities loaded on demand? | SKILL.md, catalog, progressive disclosure |
| 8 | Context management | How do long sessions fit the window? | Budgeting, stubs, compaction, summaries |
| Layer 3 · Knowledge & Resilience | |||
| 9 | Memory | How does it remember across runs? | Selection, recall, extraction, consolidation |
| 10 | System prompt assembly | How is the prompt built each turn? | Prompt sections, live state, cache boundaries |
| 11 | Error recovery | How does a long task survive failure? | Retries, overflow recovery, fallback model |
| Layer 4 · Long Running & Async | |||
| 12 | Task system | How does work persist beyond a turn? | Task records, dependencies, locks |
| 13 | Background execution | How does work run off the main loop? | Handles, task state, notification queue |
| 14 | Scheduling | How does an agent run later? | Cron, sleep, remote triggers, queues |
| 15 | Worktree isolation | How does parallel work avoid collisions? | Git worktrees, cwd binding, safe cleanup |
| Layer 5 · Multi Agent | |||
| 16 | Coordination | How do many agents talk? | Inboxes, broadcasts, permission bubbling |
| 17 | Protocols | How do agents agree and stop cleanly? | Plan approval, shutdown handshakes |
| 18 | Autonomy | How do agents organize themselves? | Idle cycle, task claiming, self organization |
| Layer 6 · Extension & Integration | |||
| 19 | MCP / plugins / channels | How does the harness reach the world? | Transports, channels, tool pool assembly |
| 20 | Observability & evaluation | How do we know it works? | Tracing, metrics, evals, failure analysis |
All 21 section writeups are present, from 00-harness-thesis/ through 20-observability/.
awesome-agent-architecture/
├── README.md # top-level map
├── sections/ # one folder per section
│ ├── 00-harness-thesis/ # README.md per section
│ ├── 01-agent-loop/src/ # runnable chain starts here
│ └── 20-observability/
├── systems/ # per-system deep dives
├── patterns/ # shared patterns and failure modes
└── references/ # primary sources and prior art
Each section folder is NN-name/ and contains a README.md.
Sections 1 to 20 also carry a runnable src/. The code accumulates section by section.
Each section adds one mechanism and evolves loop.py, so a diff between adjacent sections shows what changed.
Sections 1 to 20 ship runnable demos. Set up once from the repo root:
uv venv
uv pip install -r requirements.txt
cp .env.example .env # then add your ANTHROPIC_API_KEYPinned dependencies are in requirements.txt. .env is gitignored and holds:
ANTHROPIC_API_KEY- optional
ANTHROPIC_MODEL - optional
ANTHROPIC_BASE_URL
Each runnable section has:
test.py: offline checks, no key needed.demo.py: live demo against the API.
python sections/01-agent-loop/src/test.py # offline
uv run python sections/01-agent-loop/src/demo.py # live- Add a system. Slot a new agent into the same section structure.
- Deepen a section. Add a mechanism, clearer diagram, or sharper failure mode.
- Correct the record. These are reconstructions from source, docs, and behavior. Sourced corrections are welcome.
Favor named, verifiable mechanisms over speculation. Cite sources.
| Source | What it offers |
|---|---|
| claude-code | Claude Code source backup used for mechanism names and implementation paths. |
| learn-claude-code | Code-first harness reconstruction and section framing. |
