A self-improving experiment lab for AI agents. Point leanlab at a metric and a team of Claude agents — a Worker, a Director, and a Critic — evolves ML / optimization experiments against a frozen evaluator, while you watch on a live dashboard.
pipx install leanlab # or: pip install leanlab · uvx leanlab

📦 On PyPI: pypi.org/project/leanlab
Requires Python 3.11+ and the claude CLI (the agents run on Claude Code).
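
A quick preflight check for these two requirements could look like the sketch below. It is an illustrative script, not part of leanlab: the function name `preflight` is hypothetical, and it assumes the CLI is installed under the executable name `claude`.

```python
import shutil
import sys


def preflight(min_version=(3, 11), cli_name="claude"):
    """Return a list of human-readable problems; an empty list means ready.

    Hypothetical helper for illustration only; leanlab does not ship it.
    """
    problems = []
    # sys.version_info compares element-wise against a plain tuple.
    if sys.version_info < min_version:
        found = ".".join(map(str, sys.version_info[:3]))
        needed = ".".join(map(str, min_version))
        problems.append(f"Python {needed}+ required, found {found}")
    # shutil.which returns None when the executable is not on PATH.
    if shutil.which(cli_name) is None:
        problems.append(f"'{cli_name}' not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("missing:", problem)
```

Running the script prints one line per unmet requirement and nothing when both are satisfied.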
leanlab runs inside your own project — each lab lives in a .leanlab/<name>/
folder; the engine stays in the installed tool.
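
Because every lab is a folder under `.leanlab/`, the labs in a project can be discovered with a plain directory listing. The sketch below assumes only that layout; `list_labs` is a hypothetical helper, not a leanlab command or API, and it makes no assumptions about what the folders contain.

```python
from pathlib import Path


def list_labs(project_root="."):
    """Return sorted names of the sub-folders under <project_root>/.leanlab/.

    Hypothetical helper: it only inspects the folder layout described above.
    """
    lab_dir = Path(project_root) / ".leanlab"
    if not lab_dir.is_dir():
        return []  # no labs have been created in this project yet
    return sorted(p.name for p in lab_dir.iterdir() if p.is_dir())


if __name__ == "__main__":
    for name in list_labs():
        print(name)
```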
Evolve a number (ML, optimization, anything that prints a score):
cd ~/my-project
leanlab init iris # describe the task; Claude drafts the lab + scorer
leanlab check iris # verify it's wired correctly (free)
leanlab lock iris # freeze the scorer
leanlab run iris --n 5 # the agents evolve experiments (uses Claude)
leanlab serve iris # watch the live dashboard

The Worker invents an experiment each round, the Critic red-teams it, and the Director steers the next round, all scored by the frozen evaluator you locked.
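
The shape of that loop can be illustrated with a toy sketch. This is not leanlab's engine: in leanlab the three roles are Claude agents, whereas here they are plain Python functions, and every name, the one-number search space, and the scoring rule are assumptions made for illustration. The one property it mirrors is that `evaluator` is fixed before the loop starts and is the only thing that assigns scores.

```python
import random


def evaluator(x):
    """Frozen scorer (toy): higher is better, peak at x = 3."""
    return -(x - 3.0) ** 2


def worker(best_x, step, rng):
    """Propose an experiment: a random perturbation of the current best."""
    return best_x + rng.uniform(-step, step)


def critic(candidate, low=-10.0, high=10.0):
    """Red-team the proposal: reject candidates outside the allowed range."""
    return low <= candidate <= high


def director(step, improved):
    """Steer the next round: widen the search after a win, narrow it otherwise."""
    return step * 1.5 if improved else step * 0.7


def run(rounds=5, seed=0):
    """Run the toy loop and return the best candidate and its score."""
    rng = random.Random(seed)  # seeded so a run is reproducible
    best_x, step = 0.0, 2.0
    best_score = evaluator(best_x)
    for _ in range(rounds):
        candidate = worker(best_x, step, rng)
        improved = False
        # Only proposals that survive the critic are scored.
        if critic(candidate):
            score = evaluator(candidate)
            if score > best_score:
                best_x, best_score, improved = candidate, score, True
        step = director(step, improved)
    return best_x, best_score


if __name__ == "__main__":
    print(run(rounds=5))
```

Since the best score is replaced only when a candidate beats it, the result of `run` is never worse than the starting point, whatever the agents propose.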
- docs/USAGE.md — every command, in order, with examples.
- docs/OVERVIEW.md — how it works: the loop, the agents, and the project structure.
- CONTRIBUTING.md — local development (uv, tests).
MIT licensed — see LICENSE.