LoCoMo Benchmark 93.05% — full stage artifacts

@Kendrick-Song Kendrick-Song released this 02 Jun 08:39
· 29 commits to main since this release

LoCoMo Benchmark 93.05% (majority-vote) / 92.92% (mean-of-runs)

Date: 2026-06-02 | Branch: main (15ae954)
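
The two headline numbers aggregate repeated runs differently. The release does not spell out the exact procedure, so the following is only a minimal sketch under the usual definitions: mean-of-runs averages each run's accuracy, while majority-vote counts a question as correct when more than half of the runs judged it correct. The function names and the toy data are hypothetical and are not taken from the benchmark code.

```python
from statistics import mean


def mean_of_runs(runs: list[list[bool]]) -> float:
    """Average of the per-run accuracies."""
    return mean(sum(run) / len(run) for run in runs)


def majority_vote(runs: list[list[bool]]) -> float:
    """Accuracy after scoring each question by a strict majority of runs."""
    n_runs = len(runs)
    # zip(*runs) regroups the judgments per question instead of per run.
    votes = [sum(judgments) * 2 > n_runs for judgments in zip(*runs)]
    return sum(votes) / len(votes)


# Toy data (assumption): 3 runs over 4 questions, True = judged correct.
runs = [
    [True, True, False, True],
    [True, False, False, True],
    [True, True, True, False],
]

print(f"mean-of-runs:  {mean_of_runs(runs):.4f}")   # 0.6667
print(f"majority-vote: {majority_vote(runs):.4f}")  # 0.7500
```

Majority voting can score higher than the mean because a question that fails in only a minority of runs still counts as correct.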

Download & reproduce

cd benchmarks/results && mkdir -p locomo-93.05 && cd locomo-93.05

# Download the archive from the Assets section below, then extract:
tar xzf locomo-93.05-artifacts.tar.gz

# Re-run stages 3-5 from existing stage 1-2 artifacts:
cd ../../..
uv run python -m benchmarks --dataset locomo --run-name 93.05 --stages 3 4 5

Archive

| File | Size | Contents |
| --- | --- | --- |
| locomo-93.05-artifacts.tar.gz | 204 MB | All 5 stages: stage1_extract/ + stage2_index/ + stage3-5 outputs + profile.json + run.log |
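
Before re-running stages 3-5, it can help to confirm that the download contains the stage 1-2 artifacts those stages depend on. The sketch below uses only the standard library and checks for the entry names listed above; where those entries sit inside the archive is an assumption, so it matches them against any path component. The helper names are hypothetical.

```python
import tarfile
from pathlib import Path, PurePosixPath

# Entry names taken from the release notes; their nesting inside the
# archive is an assumption, so any path component may match.
EXPECTED = ("stage1_extract", "stage2_index", "profile.json", "run.log")


def missing_entries(names, expected=EXPECTED):
    """Return the expected entries that appear in none of the member paths."""
    parts = {part for name in names for part in PurePosixPath(name).parts}
    return [entry for entry in expected if entry not in parts]


def check_archive(path):
    """List the member names of a gzip-compressed tar and report gaps."""
    with tarfile.open(path, "r:gz") as tar:
        return missing_entries(tar.getnames())


if __name__ == "__main__":
    archive = Path("locomo-93.05-artifacts.tar.gz")
    if archive.exists():
        print("missing:", check_archive(archive) or "none")
    else:
        print(f"{archive} not found; download it from the Assets section first")
```

Run it from the directory that holds the downloaded archive; an empty result means every listed entry was found.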

See benchmarks/results/locomo-93.05/REPRODUCTION.md in the repo for full details.