feat(examples): self-improving-loop — 4-package composition demo (the 100x post artifact) by drewstone · Pull Request #58 · tangle-network/agent-runtime

drewstone · 2026-05-25T10:48:21Z

Summary

Single runnable file that wires all four substrate packages into one self-improving loop:

@tangle-network/sandbox — AgentProfile substrate type (flows unwrapped through everything)
@tangle-network/agent-eval/multishot — runMultishot + runJudge
@tangle-network/agent-runtime/analyst-loop — pattern referenced in the analyst phase
@tangle-network/agent-knowledge — referenced in the README cross-walk for the researcher path

The staff audit (/.evolve/audits/2026-05-25-claude-staff-audit.md) flagged the missing composition demo as the 100x-post-worthy artifact. This is it.

What it shows

Six phases in ~180 LOC:

baseline AgentProfile v0
runMultishot across 3 personas + 1 conversation judge
analyst reads transcripts + scores, proposes a systemPrompt mutation
applyMutation → AgentProfile v1
re-run multishot with v1
gate compares v0 vs v1 means → ship / hold
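
A minimal sketch of how the six phases fit together, for readers who want the shape without opening the example. Every name below (Profile, scoreProfile, proposeMutation, gate, runLoop) is a hypothetical stand-in, not the real API of the sandbox, agent-eval, or agent-runtime packages. The scoring is scripted, so the numbers differ from the smoke output in this PR.

```typescript
// Hypothetical stand-ins only: none of these are the real package APIs.
type Profile = { version: number; systemPrompt: string };
type Mutation = { systemPrompt: string };
type Decision = "ship" | "hold";

// Stand-in for a multishot run + judge: one scripted score per persona.
function scoreProfile(profile: Profile, personas: string[]): number[] {
  // Scripted rule: prompts that ask for clarifying questions score higher.
  const base = profile.systemPrompt.includes("clarifying") ? 8 : 3;
  return personas.map((_, i) => base + i * 0.5);
}

function mean(xs: number[]): number {
  if (xs.length === 0) throw new Error("cannot average zero scores");
  return xs.reduce((a, b) => a + b, 0) / xs.length;
}

// Stand-in for the analyst phase: propose a prompt change only when scores are low.
function proposeMutation(profile: Profile, scores: number[]): Mutation {
  if (mean(scores) >= 7) return { systemPrompt: profile.systemPrompt };
  return { systemPrompt: profile.systemPrompt + " Ask clarifying questions first." };
}

// Produce the next profile version without modifying the previous one.
function applyMutation(profile: Profile, m: Mutation): Profile {
  return { version: profile.version + 1, systemPrompt: m.systemPrompt };
}

// Ship only if the candidate beats the baseline by more than minDelta.
function gate(v0Mean: number, v1Mean: number, minDelta = 0): Decision {
  return v1Mean - v0Mean > minDelta ? "ship" : "hold";
}

function runLoop() {
  const personas = ["novice", "expert", "skeptic"]; // assumed persona names
  const v0: Profile = { version: 0, systemPrompt: "You are a helpful agent." }; // phase 1
  const v0Scores = scoreProfile(v0, personas);                 // phase 2
  const mutation = proposeMutation(v0, v0Scores);              // phase 3
  const v1 = applyMutation(v0, mutation);                      // phase 4
  const v1Scores = scoreProfile(v1, personas);                 // phase 5
  const v0Mean = mean(v0Scores);
  const v1Mean = mean(v1Scores);
  const decision = gate(v0Mean, v1Mean);                       // phase 6
  return { v0Mean, v1Mean, delta: v1Mean - v0Mean, decision, v1 };
}
```

Keeping the gate as a pure function of the two means makes the ship / hold decision easy to test separately from the evaluation run.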

Default = offline + reproducible

By default, scripted LLM responses keep the demo offline and deterministic. Setting TANGLE_API_KEY=... MOCK=0 runs it live against the real router.
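
One way the mock/live switch could look, as a sketch under assumptions rather than the code in this PR. It assumes that any MOCK value other than "0" means mock mode and that live mode fails fast without TANGLE_API_KEY. The example itself may behave differently. resolveMode and scriptedLlm are hypothetical helper names.

```typescript
// Hypothetical helpers: a sketch of the mock/live switch, not the example's code.
type Env = Record<string, string | undefined>;
type Mode = { kind: "mock" } | { kind: "live"; apiKey: string };

// Assumption: mock is the default; only MOCK=0 opts in to live mode.
function resolveMode(env: Env): Mode {
  if (env["MOCK"] !== "0") return { kind: "mock" };
  const apiKey = env["TANGLE_API_KEY"];
  if (!apiKey) throw new Error("MOCK=0 requires TANGLE_API_KEY to be set");
  return { kind: "live", apiKey };
}

type Llm = (prompt: string) => Promise<string>;

// Returns canned responses in order, so every run sees the same outputs.
function scriptedLlm(responses: string[]): Llm {
  let cursor = 0;
  return async (_prompt: string) => {
    const next = responses[cursor];
    if (next === undefined) throw new Error("scripted responses exhausted");
    cursor += 1;
    return next;
  };
}
```

In a Node script this would be called as resolveMode(process.env). Taking the environment as a parameter keeps the function testable without touching real variables.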

Smoke output

v0 mean: 3.17 → v1 mean: 8.50 (delta +5.33) → gate ships v1

Verified

pnpm typecheck ✓
pnpm test ✓ — 284 tests
pnpm tsx examples/self-improving-loop/self-improving-loop.ts ✓

Other changes

Bumped dev dep @tangle-network/agent-eval ^0.33.1 → ^0.38.0 so the example resolves /multishot.

Single runnable file that wires @tangle-network/agent-runtime + @tangle-network/agent-eval + @tangle-network/agent-knowledge + @tangle-network/sandbox into one self-improving loop.

What it shows:

1. baseline AgentProfile v0 (sandbox substrate type)
2. runMultishot across 3 personas + 1 judge (agent-eval/multishot)
3. analyst phase reads transcripts → proposes systemPrompt mutation
4. applyMutation → AgentProfile v1
5. re-run multishot with v1
6. gate compares v0 vs v1 means → ship / hold

Default mode runs offline with scripted LLM responses (reproducible demo); TANGLE_API_KEY=... MOCK=0 runs against the real router.

Verified live:

- pnpm typecheck clean (after dev-dep bump agent-eval ^0.33.1 → ^0.38.0)
- pnpm test: 284 tests pass
- pnpm tsx examples/self-improving-loop/ produces: v0 mean: 3.17 → v1 mean: 8.50 (delta +5.33) → gate ships v1

README diagrams the substrate composition + maps each phase to its substrate primitive. Cross-links to agent-stack-adoption skill for the end-to-end 10-phase production runbook.

drewstone merged commit ada4074 into main May 25, 2026
1 check failed

drewstone merged 1 commit into main from feat/example-self-improving-loop