Three meta-agents built on Shepherd, each operating on another agent's execution trace: live runtime intervention, counterfactual optimization, and tree-RL training.
Experiment code for the Shepherd paper, a runtime substrate for meta-agents over LLM agent execution. This repository holds the four meta-agent applications evaluated in the paper (the three above plus trajectory pruning, which appears in the appendix) and the framework-performance microbenchmarks, together with the frozen substrate snapshot they were run against.
The actively maintained substrate library lives at shepherd-agents/shepherd.
The copy under code/ here is pinned to the exact version used for these
experiments, so the numbers stay reproducible as the upstream library moves on.
Links: Paper · Homepage · Docs · Shepherd library
Simon Yu*¹, Derek Chong*², Ananjan Nandi*², Dilara Soylu², Jiuding Sun²,
Christopher D. Manning², Weiyan Shi¹
¹ Northeastern University · ² Stanford University · * equal contribution
exp/
├── framework-perf/ # microbenchmarks: fork/revert latency, KV-cache reuse
├── live-intervention/ # live supervisor over parallel agents on CooperBench
├── cbo/ # Counterfactual Replay Optimization (CRO)
├── mcts-rl/ # meta-agent-guided Tree-GRPO RL training
└── trajprune/ # trajectory pruning (appendix)
code/ # frozen substrate snapshot (the `agentic` uv workspace)
Each subproject under exp/ has its own README.md with setup and entry-point
details. The substrate packages under code/ are documented in
code/README.md.
The repository root is a uv workspace. From the root:
uv sync
This resolves the substrate packages in code/ and the experiment subprojects in
exp/ against them, with no external package index needed for the substrate
itself. Each experiment then runs from its own directory; see its README.
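As a rough illustration of how a uv workspace can resolve experiment subprojects against local substrate packages, the fragment below sketches two `pyproject.toml` files. The member globs and the names `example-experiment` and `example-substrate` are placeholders assumed for illustration, not taken from this repository; the actual `pyproject.toml` files are the authority on the real values.

```toml
# --- pyproject.toml at the repository root (illustrative sketch) ---
[tool.uv.workspace]
# Assumed globs; the real member list may differ.
members = ["code/*", "exp/*"]

# --- exp/<experiment>/pyproject.toml (illustrative sketch) ---
[project]
name = "example-experiment"           # hypothetical project name
version = "0.0.0"
dependencies = ["example-substrate"]  # hypothetical substrate package name

[tool.uv.sources]
# Resolve this dependency from the workspace rather than a package index.
example-substrate = { workspace = true }
```

With a layout like this, a single `uv sync` at the root produces one lockfile covering every member, which is what lets the experiments run against the pinned substrate copy.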
If you use this code, please cite the paper (the BibTeX entry is on the paper page).
Released under the MIT License. See LICENSE.
