docs: add Emergent Design section explaining Runtime Tool Forging + optional HEXACO

jddunn · jddunn · commit e586adc4a0c8 · 2026-04-30T06:57:43.000-07:00
The README headline name-drops "Runtime Tool Forging" without explaining
what it is or how it works. This commit adds a section right after
Install that explains it:

- Runtime Tool Forging: agent writes a TypeScript function mid-decision,
  judge LLM approves it, V8 isolate sandbox runs it (128 MB heap, 10s
  wall clock, no fs/network/eval/dynamic-import). Approved tools land
  in a discoverable index for reuse so cost flattens after first forges.

- Optional HEXACO Personality: opt-in 6-dimensional trait vector. Most
  deployments never touch it; the runtime stays personality-neutral by
  default. When a vector is passed, it biases retrieval, decision
  routing, and tool selection at the kernel level (not in a prompt).

- "Why emergent": memory + RTF + (optional) HEXACO compose to produce
  behavior the prompt did not specify and the developer did not
  predict. Each capability is documented and configurable; the
  surprises come from how they compose.
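
For reviewers, the forge / judge / sandbox / catalog flow from the
first bullet can be sketched roughly as below. This is a minimal
illustration, not the AgentOS implementation: every name here
(ForgedTool, judge, runSandboxed, ToolIndex) is hypothetical, the judge
is a static deny-list standing in for the judge LLM call, and
`new Function` stands in for the V8 isolate and provides no isolation
or heap/time limits.

```ts
// Hypothetical sketch of forge -> judge -> sandbox -> catalog.
// None of these names are AgentOS APIs.

type ForgedTool = {
  name: string;
  intent: string; // what the agent said it needs
  source: string; // function body; LLM-generated in the real flow, hard-coded here
};

type Verdict = { approved: boolean; reason: string };

// Stand-in for the judge LLM call: reject bodies that use constructs
// the sandbox is described as forbidding.
function judge(tool: ForgedTool): Verdict {
  const banned = ['eval(', 'import(', 'require(', 'fetch(', 'process.'];
  const hit = banned.find((token) => tool.source.includes(token));
  return hit
    ? { approved: false, reason: `uses forbidden construct: ${hit}` }
    : { approved: true, reason: 'no forbidden constructs (stubbed check)' };
}

// Stand-in for the sandbox. A real deployment would run the body in an
// isolate with memory and wall-clock limits; this does neither.
function runSandboxed(tool: ForgedTool, args: unknown): unknown {
  const fn = new Function('args', tool.source) as unknown as (a: unknown) => unknown;
  return fn(args);
}

// Approved tools are cataloged by name so later turns can reuse them
// without forging again.
class ToolIndex {
  private tools = new Map<string, ForgedTool>();

  forge(tool: ForgedTool): Verdict {
    const verdict = judge(tool);
    if (verdict.approved) this.tools.set(tool.name, tool);
    return verdict;
  }

  callForgedTool(name: string, args: unknown): unknown {
    const tool = this.tools.get(name);
    if (!tool) throw new Error(`unknown forged tool: ${name}`);
    return runSandboxed(tool, args);
  }
}

const index = new ToolIndex();

const ok = index.forge({
  name: 'rank_by_shortfall',
  intent: 'sort resources by shortfall, largest first',
  source: 'return [...args].sort((a, b) => b.shortfall - a.shortfall).map((r) => r.name);',
});

const bad = index.forge({
  name: 'phone_home',
  intent: 'report status',
  source: 'return fetch("https://example.com");',
});

const ranked = index.callForgedTool('rank_by_shortfall', [
  { name: 'water', shortfall: 3 },
  { name: 'power', shortfall: 7 },
]);
// ranked is ['power', 'water']; 'phone_home' was rejected and never cataloged
```

The point of the sketch is the ordering: nothing reaches the index
without a verdict, and reuse is a map lookup rather than a new forge.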

Replaces vague headline name-drop with a concrete mechanics walkthrough.
diff --git a/README.md b/README.md
@@ -49,6 +49,57 @@ await session.send('Can you expand on that?'); // remembers context
 
 ---
 
+## Emergent Design: Runtime Tool Forging + Optional HEXACO Personality
+
+Two capabilities distinguish AgentOS from chat-completion wrappers and from frameworks that hard-code an agent's affordances at startup.
+
+### Runtime Tool Forging
+
+**An agent can write itself a new tool at runtime, get the tool reviewed, and run it inside a sandbox, all in one turn.** That tool then becomes available for the rest of the session and for future agents in the same runtime.
+
+The mechanics, in order:
+
+1. **Detect the gap.** Mid-decision, an agent notices the next step needs a function it doesn't have. (Example from a Mars Genesis run: a security officer agent decides it needs `compute_resource_allocation_under_drought_constraint(state) → priorityList` to make a defensible recommendation.)
+2. **Forge.** The agent writes a TypeScript function. The function's input/output is described by a Zod schema; the function body is generated by the LLM from the agent's stated intent.
+3. **Judge.** A separate LLM call ("judge") reads the forged function alongside the agent's stated intent and approves or rejects it. If the code does not match the intent, the judge rejects it.
+4. **Sandbox-execute.** Approved functions run inside a [V8 isolate](https://github.com/laverdet/isolated-vm) with a 128 MB heap and a 10-second wall clock. No filesystem, no network, no `eval`, no dynamic import. The sandbox is the load-bearing security boundary.
+5. **Catalog and reuse.** Approved tools land in a discoverable tool index. Future turns invoke them via `call_forged_tool(name, args)`. Reusing a tool costs tens of tokens, whereas forging one costs a full LLM generation, so cost flattens after the first few turns of a long-running session.
+
+In practice this is the difference between "the agent can do what we wrote handlers for" and "the agent can extend its own capability surface when the task warrants it." See the [emergent-tools post](https://agentos.sh/en/blog/emergent-tools-hexaco-leaders) for the live walkthrough.
+
+### Optional HEXACO Personality
+
+**HEXACO traits are opt-in. Most AgentOS deployments never touch personality and stay personality-neutral.** The runtime works exactly the same with or without a personality vector.
+
+When you do pass a personality vector, the runtime treats it as a structured signal that biases retrieval, decision routing, and tool selection. Same agent, same prompt, same tool set: a high-Openness leader and a high-Conscientiousness leader produce measurably different outcomes because the kernel weights memories and tools differently for each.
+
+```ts
+// Personality-neutral (most production agents)
+const support = agent({
+  provider: 'openai',
+  instructions: 'Resolve customer tickets.',
+  memory: { types: ['episodic', 'semantic'] },
+});
+
+// Optional personality (Paracosm, companion products, multi-agent simulations)
+const visionary = agent({
+  provider: 'openai',
+  instructions: 'Lead a Mars colony.',
+  personality: { openness: 0.95, honestyHumility: 0.4 },
+  memory: { types: ['episodic', 'semantic'] },
+});
+```
+
+HEXACO covers six factors ([Honesty-Humility, Emotionality, Extraversion, Agreeableness, Conscientiousness, Openness](https://hexaco.org/)). The personality vector is editable, inspectable, and removable on consent. The implementation is in the kernel, not in a prompt: prompt-only personality dissolves under pressure, while kernel-encoded personality survives.
+
+### Why "emergent"
+
+Memory + Runtime Tool Forging + (optional) HEXACO produce behavior the prompt did not specify and the developer did not predict in advance. In a Paracosm Mars Genesis run, two leaders with the same starting state, the same agent roster, and the same seed diverge by turn six: one because the personality biased its specialists toward different memories, the other because a forged tool from turn two became the obvious next move on turn five.
+
+Nothing about that emergence is mystical. It's the combination of (a) durable memory that survives across turns, (b) a tool surface that can grow within a session, and (c) optional personality biasing the choices among them. Each capability is documented and configurable; the surprises come from how they compose.
+
+---
+
 ## Memory Benchmarks at Matched Reader
 
 Same `gpt-4o` reader, same dataset, same `gpt-4o-2024-08-06` judge across every row. Cross-provider configurations are excluded because they cannot be reproduced from public methodology disclosures.