Releases · dl1683/irys-stateful-swarms

Full Harvey LAB Benchmark Outputs

Complete outputs from the full 1,251-task Harvey Legal Agent Benchmark run.

Download any practice-area archive, extract it, and score it with the Harvey LAB scorer:

python -m src.cli score <extracted_dir> --bench-root /path/to/harvey-labs
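The download, extract, and score steps can be scripted. The sketch below is a minimal illustration, not part of the release: `extract_archive` and `build_score_command` are hypothetical helper names, and the archive format is an assumption (the code relies on `shutil.unpack_archive` to infer it from the file extension). Only the command line itself comes from the notes above.

```python
import shutil
import sys
from pathlib import Path


def extract_archive(archive: Path, dest: Path) -> Path:
    """Unpack one practice-area archive into dest and return dest."""
    dest.mkdir(parents=True, exist_ok=True)
    # unpack_archive infers the format (zip, tar, tar.gz, ...) from the extension.
    shutil.unpack_archive(str(archive), str(dest))
    return dest


def build_score_command(extracted_dir: Path, bench_root: Path) -> list[str]:
    """Return the scorer invocation from the release notes as an argument list."""
    return [
        sys.executable, "-m", "src.cli", "score",
        str(extracted_dir),
        "--bench-root", str(bench_root),
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True, cwd=...)`, where `cwd` should be whichever checkout provides the `src.cli` module; that location is not stated here, so treat it as something to confirm locally.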

Each task directory contains:

output/: the generated deliverables (docx, xlsx)
swarm/: the full blackboard state, showing how the system reasoned (see the README for a walkthrough)
scores.json: per-criterion scoring results
metrics.json: token usage and cost breakdown
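Before scoring, it can help to confirm that an extracted archive matches the layout listed above. The following is a rough sketch under stated assumptions: it treats every immediate subdirectory of the extracted folder as one task (the actual nesting is not documented here), and it only checks that the two JSON files parse, since their schemas are not described in these notes. `inventory_task` and `inventory_archive` are hypothetical names.

```python
import json
from pathlib import Path

EXPECTED_DIRS = ("output", "swarm")
EXPECTED_FILES = ("scores.json", "metrics.json")


def inventory_task(task_dir: Path) -> dict:
    """Report which documented entries exist in one task directory."""
    report = {"task": task_dir.name, "problems": [], "deliverables": []}
    for name in EXPECTED_DIRS:
        if not (task_dir / name).is_dir():
            report["problems"].append(name + "/")
    for name in EXPECTED_FILES:
        path = task_dir / name
        if not path.is_file():
            report["problems"].append(name)
            continue
        # Only check that the file parses; its schema is not assumed here.
        try:
            json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            report["problems"].append(name + " (invalid JSON)")
    output_dir = task_dir / "output"
    if output_dir.is_dir():
        # Deliverables are described as docx and xlsx files.
        report["deliverables"] = sorted(
            p.name for p in output_dir.iterdir()
            if p.suffix.lower() in {".docx", ".xlsx"}
        )
    return report


def inventory_archive(extracted_dir: Path) -> list[dict]:
    """Inventory every immediate subdirectory (assumed: one per task)."""
    return [
        inventory_task(d) for d in sorted(extracted_dir.iterdir()) if d.is_dir()
    ]
```

A task whose report has an empty `problems` list contains all four documented entries; anything else is worth inspecting by hand before running the scorer.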

Archives will be uploaded as they are prepared. Each archive contains all tasks for one practice area.