docs: sync optimization docs to selfImprove + bump agent-eval floor to 0.83 by drewstone · Pull Request #175 · tangle-network/agent-runtime

drewstone · 2026-06-06T13:30:10Z

Hygiene follow-up to #172, which deleted optimizePrompt + report-eval-runs. The code migrated to selfImprove, but the docs/skills/pins still documented the removed APIs. This PR syncs every surface so the docs match the code.

Fixed

README + the shipped adoption SKILL: the optimization story now points at agent-eval's selfImprove (@tangle-network/agent-eval/contract) — agent-runtime contributes only the code-surface improvementDriver. reportOptimizationRun → analyzeRuns; the /improvement export table corrected to its actual exports.
CLAUDE.md + bench/HARNESS.md: agent-eval pin ^0.76.0 → ^0.83.0; optimizePrompt → selfImprove.
package.json peerDependency floor >=0.76.0 → >=0.83.0 — a real correctness fix (selfImprove needs analyzeGeneration, added in 0.83; a consumer on 0.76 would break).
Dropped a stale 0.76 comment label in improve-prompt.ts (heldoutSignificance is unchanged).
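
The peerDependency floor change above can be illustrated with a minimal sketch of what a `>=0.83.0` floor accepts and rejects. The helper names `parseVersion` and `satisfiesFloor` are hypothetical and are not part of agent-runtime or agent-eval; the sketch handles only plain `x.y.z` versions and ignores prerelease tags, which a real semver resolver would handle.

```typescript
// Hypothetical helpers for illustration only; not an agent-runtime or agent-eval API.
type Version = [number, number, number];

// Parse a plain "x.y.z" string. Prerelease/build suffixes are deliberately unsupported.
function parseVersion(v: string): Version {
  const m = /^(\d+)\.(\d+)\.(\d+)$/.exec(v.trim());
  if (!m) throw new Error(`not a plain x.y.z version: ${v}`);
  return [Number(m[1]), Number(m[2]), Number(m[3])];
}

// True when `installed` is greater than or equal to `floor`.
function satisfiesFloor(installed: string, floor: string): boolean {
  const [aMajor, aMinor, aPatch] = parseVersion(installed);
  const [bMajor, bMinor, bPatch] = parseVersion(floor);
  if (aMajor !== bMajor) return aMajor > bMajor;
  if (aMinor !== bMinor) return aMinor > bMinor;
  return aPatch >= bPatch;
}

// The floor this PR sets for the agent-eval peer dependency.
const AGENT_EVAL_FLOOR = "0.83.0";

const onOldPin = satisfiesFloor("0.76.0", AGENT_EVAL_FLOOR); // old pin, below the floor
const onNewPin = satisfiesFloor("0.83.0", AGENT_EVAL_FLOOR); // exactly at the floor
console.log({ onOldPin, onNewPin });
```

With the old floor of `>=0.76.0`, a consumer on 0.76 would have installed without a peer-dependency warning and then failed at runtime, because selfImprove needs analyzeGeneration, which was added in 0.83.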

Verified

0 remaining optimizePrompt / reportOptimizationRun / ^0.76 refs in tracked source/docs.
examples typecheck clean; root typecheck / lint / build green.
agent-eval is on the latest published (0.83.0).
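
The "0 remaining refs" check above could be reproduced with something like the following sketch. It is an assumption about how such a scan might look, not the command actually run for this PR; `findStaleRefs`, the `Hit` shape, and the sample file contents are all hypothetical. It scans in-memory text so the example stays self-contained.

```typescript
// Hypothetical scan for illustration; the PR does not state which tool was used.
interface Hit {
  file: string;    // file name the match was found in
  line: number;    // 1-based line number
  pattern: string; // which stale identifier matched
}

// Identifiers and pins the PR reports as fully removed.
const STALE_PATTERNS = ["optimizePrompt", "reportOptimizationRun", "^0.76"];

// Return every line in `files` that still contains one of `patterns` (plain substring match).
function findStaleRefs(files: Record<string, string>, patterns: string[]): Hit[] {
  const hits: Hit[] = [];
  for (const [file, text] of Object.entries(files)) {
    text.split("\n").forEach((content, idx) => {
      for (const pattern of patterns) {
        if (content.includes(pattern)) {
          hits.push({ file, line: idx + 1, pattern });
        }
      }
    });
  }
  return hits;
}

// Made-up file contents: one clean file, one with a leftover reference.
const sampleFiles: Record<string, string> = {
  "clean.md": "Use selfImprove from agent-eval.\nPin agent-eval at ^0.83.0.",
  "stale.md": "Intro line.\nCall optimizePrompt to tune the prompt.",
};

const staleHits = findStaleRefs(sampleFiles, STALE_PATTERNS);
console.log(staleHits);
```

An empty result over the tracked source and docs corresponds to the "0 remaining" claim.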

…mp agent-eval floor

PR #172 deleted optimizePrompt + report-eval-runs (selfImprove is the one entry point), but the docs/skills/pins still documented the removed APIs. Synced every surface so the docs match the code:

- README + the SHIPPED adoption SKILL: the optimization story now points at agent-eval's selfImprove (@tangle-network/agent-eval/contract); agent-runtime contributes only the code-surface improvementDriver. reportOptimizationRun → analyzeRuns; /improvement export table corrected to its real exports.
- CLAUDE.md + bench/HARNESS.md: agent-eval pin ^0.76.0 → ^0.83.0; optimizePrompt → selfImprove.
- package.json peerDependency floor >=0.76.0 → >=0.83.0 (selfImprove needs analyzeGeneration, added in 0.83). A real correctness fix: a consumer on 0.76 would break.
- drop a stale "0.76" comment label in improve-prompt.ts (heldoutSignificance is unchanged).

Verified: 0 remaining optimizePrompt/reportOptimizationRun/^0.76 refs in tracked source/docs; examples typecheck clean; root typecheck/lint/build green. agent-eval is on the latest published (0.83.0).

Cuts the 58-commit backlog on main into a published release. Headline surface:

- runToolLoop / streamToolLoop: bounded turn-level tool-dispatch loop (#137)
- RSI agent tree: recursive Agent.act, Supervisor keystone, runProgram, the adaptive-driver channel (#139/#151/#165)
- optimization API collapsed onto agent-eval selfImprove; the runtime keeps the CODE-surface ImprovementDriver you pass as driver (#172)
- deployable benchmark adapters: AppWorld, commit0, aec-bench, EnterpriseOps-Gym; runBenchmarks over one ADAPTERS registry (#153/#156/#157)
- agent-eval floor raised to >=0.83.0 (#175)

drewstone merged commit 0f18693 into main Jun 6, 2026
1 check passed

drewstone mentioned this pull request Jun 6, 2026

chore(release): agent-runtime 0.45.0 #176

Merged


drewstone merged 1 commit into main from chore/docs-sync-selfimprove
