feat(examples): self-improving-loop - 4-package composition demo (the 100x post artifact) #58
Merged
Single runnable file that wires @tangle-network/agent-runtime +
@tangle-network/agent-eval + @tangle-network/agent-knowledge +
@tangle-network/sandbox into one self-improving loop.
What it shows:
1. baseline AgentProfile v0 (sandbox substrate type)
2. runMultishot across 3 personas + 1 judge (agent-eval/multishot)
3. analyst phase reads transcripts → proposes systemPrompt mutation
4. applyMutation → AgentProfile v1
5. re-run multishot with v1
6. gate compares v0 vs v1 means → ship / hold
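The six phases above can be sketched in self-contained TypeScript. This is a minimal sketch under stated assumptions, not the example file itself. The AgentProfile shape, the Mutation type, the persona names, the scriptedScore stub, and every function signature here are hypothetical stand-ins, because the real package APIs are not shown in this PR. The scores are placeholders, not the demo's scripted responses, and the judge is folded into the scoring stub.

```typescript
// Hypothetical stand-in for the AgentProfile substrate type. The real type
// lives in @tangle-network/sandbox and may carry more fields.
interface AgentProfile {
  version: number;
  systemPrompt: string;
}

// Hypothetical mutation shape: the analyst proposes a new system prompt.
interface Mutation {
  systemPrompt: string;
}

type Persona = "novice" | "expert" | "skeptic"; // illustrative names only

// Stub for phases 2 and 5. The real demo calls runMultishot with a judge;
// this stub returns placeholder scores that rise when the prompt asks for sources.
function scriptedScore(profile: AgentProfile, persona: Persona): number {
  const base: Record<Persona, number> = { novice: 4, expert: 3, skeptic: 2 };
  const bonus = profile.systemPrompt.includes("cite sources") ? 5 : 0;
  return base[persona] + bonus;
}

function mean(xs: number[]): number {
  return xs.reduce((a, b) => a + b, 0) / xs.length;
}

function evaluate(profile: AgentProfile, personas: Persona[]): number {
  return mean(personas.map((p) => scriptedScore(profile, p)));
}

// Phase 3 stub: a real analyst would read transcripts before proposing this.
function proposeMutation(profile: AgentProfile): Mutation {
  return { systemPrompt: `${profile.systemPrompt} Always cite sources.` };
}

// Phase 4: local stub that borrows the applyMutation name. It returns a new
// profile instead of mutating v0, so both versions survive for the gate.
function applyMutation(profile: AgentProfile, m: Mutation): AgentProfile {
  return { ...profile, version: profile.version + 1, systemPrompt: m.systemPrompt };
}

// Phase 6: assumed rule, ship only on a strict improvement in the mean.
function gate(v0Mean: number, v1Mean: number): "ship" | "hold" {
  return v1Mean > v0Mean ? "ship" : "hold";
}

const personas: Persona[] = ["novice", "expert", "skeptic"];
const v0: AgentProfile = { version: 0, systemPrompt: "You are a helpful agent." }; // phase 1
const v0Mean = evaluate(v0, personas); // phase 2
const v1 = applyMutation(v0, proposeMutation(v0)); // phases 3 and 4
const v1Mean = evaluate(v1, personas); // phase 5
const decision = gate(v0Mean, v1Mean); // phase 6
console.log({ v0Mean, v1Mean, decision });
```

Keeping v0 immutable is the design choice that matters here: the gate can only compare the two profiles if the mutation step does not overwrite the baseline.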
Default mode runs offline with scripted LLM responses (reproducible demo);
TANGLE_API_KEY=... MOCK=0 runs against the real router.
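One way the offline/live switch described above might be resolved from the environment is sketched below. The resolveMode function and the Mode type are hypothetical. Treating any MOCK value other than "0" as offline, and throwing when the key is missing, are assumptions rather than documented behaviour of the example.

```typescript
type Mode = "mock" | "live";

// Takes the environment as a plain record so the function stays pure and
// testable. In a Node script the caller would pass process.env.
function resolveMode(env: Record<string, string | undefined>): Mode {
  // Assumption: offline is the default, so only an explicit MOCK=0 goes live.
  if (env["MOCK"] !== "0") return "mock";
  // Assumption: a live run without credentials should fail fast.
  if (!env["TANGLE_API_KEY"]) {
    throw new Error("MOCK=0 requires TANGLE_API_KEY to be set");
  }
  return "live";
}

console.log(resolveMode({})); // no variables set, so the scripted path is used
```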
Verified live:
- pnpm typecheck clean (after dev-dep bump agent-eval ^0.33.1 → ^0.38.0)
- pnpm test — 284 tests pass
- pnpm tsx examples/self-improving-loop/ produces:
v0 mean: 3.17 → v1 mean: 8.50 (delta +5.33) → gate ships v1
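The reported summary line follows directly from the two means: 8.50 - 3.17 = 5.33. A minimal sketch of how such a line could be formatted is below. The formatGateLine helper is hypothetical, and both the strict-improvement rule and the "gate holds v0" wording for the failing case are assumptions, since the PR only shows the shipping case.

```typescript
// Formats the gate summary from two mean scores, rounded to two decimals.
function formatGateLine(v0Mean: number, v1Mean: number): string {
  const delta = v1Mean - v0Mean;
  // Negative numbers already carry their own sign from toFixed.
  const sign = delta >= 0 ? "+" : "";
  // Assumption: v1 ships only when its mean is strictly higher.
  const verdict = delta > 0 ? "gate ships v1" : "gate holds v0";
  return (
    `v0 mean: ${v0Mean.toFixed(2)} → v1 mean: ${v1Mean.toFixed(2)} ` +
    `(delta ${sign}${delta.toFixed(2)}) → ${verdict}`
  );
}

console.log(formatGateLine(3.17, 8.5));
```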
README diagrams the substrate composition + maps each phase to its
substrate primitive. Cross-links to agent-stack-adoption skill for the
end-to-end 10-phase production runbook.
Summary
Single runnable file that wires all four substrate packages into one self-improving loop:
- @tangle-network/sandbox: AgentProfile substrate type (flows unwrapped through everything)
- @tangle-network/agent-eval/multishot: runMultishot + runJudge
- @tangle-network/agent-runtime/analyst-loop: pattern referenced in the analyst phase
- @tangle-network/agent-knowledge: referenced in the README cross-walk for the researcher path
The staff audit (/.evolve/audits/2026-05-25-claude-staff-audit.md) flagged the missing composition demo as the 100x-post-worthy artifact. This is it.
What it shows
Six phases in ~180 LOC:
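1. baseline AgentProfile v0
2. runMultishot across 3 personas + 1 conversation judge
3. analyst phase reads transcripts and proposes a systemPrompt mutation
4. applyMutation → AgentProfile v1
5. re-run multishot with v1
6. gate compares v0 vs v1 means → ship / hold
Default = offline + reproducible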
Scripted LLM responses keep the demo deterministic.
TANGLE_API_KEY=... MOCK=0 runs it live.
Smoke output
Verified
- pnpm typecheck ✓
- pnpm test ✓ (284 tests)
- pnpm tsx examples/self-improving-loop/self-improving-loop.ts ✓
Other changes
- Bumped dev-dep @tangle-network/agent-eval from ^0.33.1 to ^0.38.0 so the example can resolve the /multishot subpath.
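The range had to be edited by hand because, under standard npm caret semantics, a caret range on a 0.x version only admits patch updates within the same minor: ^0.33.1 means >=0.33.1 and <0.34.0, so it never picks up 0.38.0. The sketch below illustrates that rule for plain major.minor.patch versions. The helper names are hypothetical, and prerelease tags and build metadata are deliberately ignored.

```typescript
type Version = [number, number, number];

function parse(v: string): Version {
  const parts = v.split(".").map(Number);
  return [parts[0] ?? 0, parts[1] ?? 0, parts[2] ?? 0];
}

// Negative if a < b, zero if equal, positive if a > b.
function compare(a: Version, b: Version): number {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return 0;
}

// Exclusive upper bound of a caret range: bump the left-most non-zero part.
function caretUpperBound([major, minor, patch]: Version): Version {
  if (major > 0) return [major + 1, 0, 0];
  if (minor > 0) return [0, minor + 1, 0];
  return [0, 0, patch + 1];
}

// True when `version` falls inside the caret range `^base`.
function satisfiesCaret(version: string, base: string): boolean {
  const v = parse(version);
  const lower = parse(base);
  return compare(v, lower) >= 0 && compare(v, caretUpperBound(lower)) < 0;
}

console.log(satisfiesCaret("0.38.0", "0.33.1")); // old range vs new release
console.log(satisfiesCaret("0.38.0", "0.38.0")); // bumped range vs new release
```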