Commit e586adc

docs: add Emergent Design section explaining Runtime Tool Forging + optional HEXACO
The README headline name-drops "Runtime Tool Forging" without explaining what it is or how it works. New section right after Install does the explainer:

- Runtime Tool Forging: agent writes a TypeScript function mid-decision, judge LLM approves it, V8 isolate sandbox runs it (128 MB heap, 10s wall clock, no fs/network/eval/dynamic-import). Approved tools land in a discoverable index for reuse so cost flattens after first forges.
- Optional HEXACO Personality: opt-in 6-dimensional trait vector. Most deployments never touch it; the runtime stays personality-neutral by default. When passed, biases retrieval / decision routing / tool selection at the kernel level (not in a prompt).
- "Why emergent": memory + RTF + (optional) HEXACO compose to produce behavior the prompt did not specify and the developer did not predict. Each capability is documented and configurable; the surprises come from how they compose.

Replaces vague headline name-drop with a concrete mechanics walkthrough.
1 parent b44ccb6 commit e586adc

1 file changed

Lines changed: 51 additions & 0 deletions

README.md

@@ -49,6 +49,57 @@ await session.send('Can you expand on that?'); // remembers context

---

## Emergent Design: Runtime Tool Forging + Optional HEXACO Personality

Two capabilities distinguish AgentOS from chat-completion wrappers and from frameworks that hard-code an agent's affordances at startup.

### Runtime Tool Forging

**An agent can write itself a new tool at runtime, get the tool reviewed, and run it inside a sandbox, all in one turn.** That tool then becomes available for the rest of the session and for future agents in the same runtime.

The mechanics, in order:

1. **Detect the gap.** Mid-decision, an agent notices that the next step needs a function it doesn't have. (Example from a Mars Genesis run: a security officer agent decides it needs `compute_resource_allocation_under_drought_constraint(state) → priorityList` to make a defensible recommendation.)
2. **Forge.** The agent writes a TypeScript function. A Zod schema describes the function's input and output; the LLM generates the function body from the agent's stated intent.
3. **Judge.** A separate LLM call (the "judge") reads the forged function alongside the agent's stated intent and approves or rejects it. A function that does not match the stated intent is rejected.
4. **Sandbox-execute.** Approved functions run inside a [V8 isolate](https://github.com/laverdet/isolated-vm) with a 128 MB heap limit and a 10-second wall-clock limit. No filesystem, no network, no `eval`, no dynamic import. The sandbox is the load-bearing security boundary.
5. **Catalog and reuse.** Approved tools land in a discoverable tool index. Future turns invoke them via `call_forged_tool(name, args)`. Reusing a tool costs tens of tokens, whereas forging one costs the full token price of an LLM generation, so per-turn cost flattens after the first few turns of a long-running session.

In practice this is the difference between "the agent can do what we wrote handlers for" and "the agent can extend its own capability surface when the task warrants it." See the [emergent-tools post](https://agentos.sh/en/blog/emergent-tools-hexaco-leaders) for the live walkthrough.
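
The five steps above can be sketched as a small catalog that only stores tools a judge has approved and only executes tools it has stored. This is a minimal illustration, not the AgentOS implementation: `ForgedTool`, `Verdict`, `createToolCatalog`, `toyJudge`, and `toyRunner` are all hypothetical names, the judge is a string check standing in for an LLM call, and the runner ignores the forged source instead of compiling it inside an isolate.

```ts
// Hypothetical shapes for illustration; none of these names are AgentOS APIs.
interface ForgedTool {
  name: string;
  intent: string; // what the agent said it needed
  source: string; // function source the LLM generated
}

interface Verdict {
  approved: boolean;
  reason: string;
}

// Stand-ins for the judge LLM call and the sandboxed executor.
type Judge = (tool: ForgedTool) => Promise<Verdict>;
type SandboxRunner = (tool: ForgedTool, args: unknown) => Promise<unknown>;

function createToolCatalog(judge: Judge, runInSandbox: SandboxRunner) {
  const approved = new Map<string, ForgedTool>();

  return {
    // Forge + judge + catalog: rejected tools are never stored.
    async register(tool: ForgedTool): Promise<Verdict> {
      const verdict = await judge(tool);
      if (verdict.approved) approved.set(tool.name, tool);
      return verdict;
    },

    // Reuse: only cataloged tools ever reach the sandbox runner.
    async call(name: string, args: unknown): Promise<unknown> {
      const tool = approved.get(name);
      if (tool === undefined) {
        throw new Error(`No approved tool named "${name}"`);
      }
      return runInSandbox(tool, args);
    },

    // Discoverable index of approved tool names.
    list(): string[] {
      return Array.from(approved.keys());
    },
  };
}

// Toy judge: a real one would be an LLM call comparing intent and source.
const toyJudge: Judge = async (tool) =>
  tool.source.includes("eval(")
    ? { approved: false, reason: "uses eval" }
    : { approved: true, reason: "matches stated intent" };

// Toy runner: a real one would compile tool.source inside the isolate.
// This stub ignores the source and simply sums a numeric array.
const toyRunner: SandboxRunner = async (_tool, args) =>
  (args as number[]).reduce((sum, n) => sum + n, 0);
```

Keeping the judge and the runner as injected functions is a design choice of this sketch: it makes the approval gate testable without an LLM or an isolate, and it keeps the "rejected tools never execute" rule in one place.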

### Optional HEXACO Personality

**HEXACO traits are opt-in. Most AgentOS deployments never touch personality and stay personality-neutral.** The runtime works the same with or without a personality vector.

When you do pass a personality vector, the runtime treats it as a structured signal that biases retrieval, decision routing, and tool selection. Same agent, same prompt, same tool set: a high-Openness leader and a high-Conscientiousness leader produce measurably different outcomes because the kernel weights memories and tools differently for each.

```ts
// Personality-neutral (most production agents)
const support = agent({
  provider: 'openai',
  instructions: 'Resolve customer tickets.',
  memory: { types: ['episodic', 'semantic'] },
});

// Optional personality (Paracosm, companion products, multi-agent simulations)
const visionary = agent({
  provider: 'openai',
  instructions: 'Lead a Mars colony.',
  personality: { openness: 0.95, honestyHumility: 0.4 },
  memory: { types: ['episodic', 'semantic'] },
});
```

HEXACO covers six factors ([Honesty-Humility, Emotionality, Extraversion, Agreeableness, Conscientiousness, Openness](https://hexaco.org/)). The personality vector is editable, inspectable, and removable on consent. The implementation lives in the kernel, not in a prompt: prompt-only personality dissolves under pressure, while kernel-encoded personality survives.
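
To make "biases tool selection" concrete, here is one way a trait vector could tilt a ranking. It is a sketch under stated assumptions, not the kernel's actual formula: the additive scoring, the 0.5 neutral midpoint, the `affinity` field, and the names `scoreCandidate` and `rankCandidates` are all hypothetical. Only the `openness` and `honestyHumility` keys appear in the example above; the other four trait keys are assumed to follow the same camelCase convention.

```ts
// Trait keys: openness and honestyHumility come from the README example;
// the other four are assumed to use the same naming convention.
type Trait =
  | "honestyHumility"
  | "emotionality"
  | "extraversion"
  | "agreeableness"
  | "conscientiousness"
  | "openness";

type Personality = Partial<Record<Trait, number>>; // each value in [0, 1]

interface Candidate {
  name: string;
  relevance: number; // task relevance, independent of personality
  affinity: Personality; // hypothetical: how much each trait favors this tool
}

const NEUTRAL = 0.5; // assumed midpoint: a trait at 0.5 adds no bias

// Without a personality vector, the score is plain relevance.
function scoreCandidate(c: Candidate, p?: Personality): number {
  if (p === undefined) return c.relevance;
  let bias = 0;
  for (const trait of Object.keys(c.affinity) as Trait[]) {
    const level = p[trait] ?? NEUTRAL;
    bias += (level - NEUTRAL) * (c.affinity[trait] ?? 0);
  }
  return c.relevance + bias;
}

// Returns candidate names, highest score first, without mutating the input.
function rankCandidates(cands: Candidate[], p?: Personality): string[] {
  return cands
    .slice()
    .sort((a, b) => scoreCandidate(b, p) - scoreCandidate(a, p))
    .map((c) => c.name);
}

const demoTools: Candidate[] = [
  { name: "explore_new_site", relevance: 0.5, affinity: { openness: 0.4 } },
  { name: "audit_supplies", relevance: 0.55, affinity: { conscientiousness: 0.4 } },
];

const neutralOrder = rankCandidates(demoTools);
const visionaryOrder = rankCandidates(demoTools, { openness: 0.95, honestyHumility: 0.4 });
```

With these made-up numbers, `neutralOrder` puts `audit_supplies` first on relevance alone, while `visionaryOrder` puts `explore_new_site` first because high openness adds 0.45 × 0.4 = 0.18 to its score. The prompt and tool set are identical in both calls; only the vector changes.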

### Why "emergent"

Memory + Runtime Tool Forging + (optional) HEXACO produce behavior the prompt did not specify and the developer did not predict in advance. In a Paracosm Mars Genesis run, two leaders with the same starting state, same agent roster, and same seed diverge by turn six: one because its personality biased its specialists toward different memories, the other because a tool forged on turn two became the obvious next move on turn five.

Nothing about that emergence is mystical. It's the combination of (a) durable memory that survives across turns, (b) a tool surface that can grow within a session, and (c) optional personality biasing the choices among them. Each capability is documented and configurable; the surprises come from how they compose.

---

## Memory Benchmarks at Matched Reader

Same `gpt-4o` reader, same dataset, same `gpt-4o-2024-08-06` judge across every row. Cross-provider configurations are excluded because they cannot be reproduced from public methodology disclosures.
