WUBING2023/ReproRun

English · 简体中文 · Español · Français · Deutsch · 日本語 · 한국어 · Português · Русский

ReproRun

Give it a paper. It tells you whether the results actually reproduce.

ReproRun is a portable AI-agent skill, usable by any compatible AI coding assistant, that automates the painful path from "a paper claims X" to "X actually runs on my machine." Papers ship beautiful numbers; reproducing them usually dies in broken code, unbuildable environments, and version drift. ReproRun handles the whole pipeline end to end:

read paper → find code & data → build environment → smoke test → full run → compare measured numbers against the paper's claims.


✨ Built to adapt, across every platform

ReproRun is designed to adapt itself to whatever it's handed, instead of assuming one fixed setup:

  • Cross-OS: runs on Windows, macOS, and Linux
  • Cross-language: full pipeline for Python; minimal path for R / MATLAB / Julia
  • Cross-domain: single-cell biology, image ML, and more, with no per-domain rewiring
  • Cross-mode: tool mode (pip install + write a calling script) or experiment mode (clone the repo & run its scripts), auto-selected per paper

🚀 What it does

  • Find a paper from just its title: automatic web search for the PDF & code repo
  • Auto-diagnose environment bit-rot: numpy ABI clashes, torchvision API changes, deprecated pandas methods, and similar breakages are detected and fixed automatically
  • No guessing parameters: inspects function signatures after install
  • 5-round dependency self-healing loop: classify error → targeted fix → re-verify, up to 5 rounds
  • Paper isolation: every paper gets its own output namespace
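
The self-healing loop in the list above can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not ReproRun's actual implementation: the error classes, regex patterns, and the `classify`, `heal`, `verify`, and `install_missing` names are all hypothetical. Only the classify → fix → re-verify cycle and the 5-round cap come from the description.

```python
import re
from typing import Callable, Optional

MAX_ROUNDS = 5  # round cap taken from the feature list above

# Hypothetical error taxonomy; the patterns match common Python error text.
PATTERNS = {
    "missing_module": re.compile(r"No module named '([\w.]+)'"),
    "numpy_abi": re.compile(r"numpy\.dtype size changed|compiled using NumPy 1\.x"),
    "removed_api": re.compile(r"has no attribute '(\w+)'"),
}


def classify(stderr: str) -> Optional[str]:
    """Return the first error class whose pattern matches, else None."""
    for label, pattern in PATTERNS.items():
        if pattern.search(stderr):
            return label
    return None


def heal(verify: Callable[[], str],
         fixes: dict[str, Callable[[str], None]]) -> tuple[bool, list[str]]:
    """Classify the error, apply a targeted fix, re-verify; stop after MAX_ROUNDS."""
    history: list[str] = []
    stderr = verify()  # an empty string means the environment is healthy
    for _ in range(MAX_ROUNDS):
        if not stderr:
            break
        label = classify(stderr)
        if label is None or label not in fixes:
            history.append("unclassified")  # no targeted fix available: give up
            break
        fixes[label](stderr)
        history.append(label)
        stderr = verify()
    return (not stderr), history


# Simulated environment: one missing package, "installed" by the fix callback.
installed: set[str] = set()


def verify() -> str:
    if "pooch" not in installed:
        return "ModuleNotFoundError: No module named 'pooch'"
    return ""


def install_missing(stderr: str) -> None:
    match = PATTERNS["missing_module"].search(stderr)
    installed.add(match.group(1))  # a real fix would call the package manager


ok, history = heal(verify, {"missing_module": install_missing})
print(ok, history)
```

Returning the fix history alongside the success flag keeps a record of which repairs were applied, which a later report could cite as the root cause.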

🏗️ Architecture

One orchestrator (SKILL.md) drives 6 specialized agents:

| Agent | Role |
| --- | --- |
| A · paper-reader | Extract the numerical claims to reproduce |
| B · resource-finder | Locate code repo & datasets |
| C · environment-builder | Build & repair the runtime (most complex) |
| D · smoke-tester | Quick smoke test: confirm it runs |
| E · full-runner | Full reproduction run |
| F · result-comparator | Compare measured vs. claimed, item by item |
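
To make the hand-off between the six agents concrete, here is a small illustrative sketch of a sequential pipeline that stops at the first failing stage. It assumes the agents run strictly in the order A to F and share one state object; the `RunState`, `run_pipeline`, and `make_stub` names are hypothetical, and the real orchestration lives in SKILL.md rather than in Python code like this.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class RunState:
    """Shared state passed from one stage to the next (hypothetical structure)."""
    paper: str
    artifacts: dict = field(default_factory=dict)
    completed: list = field(default_factory=list)
    stopped_at: Optional[str] = None


Stage = Callable[[RunState], bool]  # a stage returns True on success


def run_pipeline(state: RunState, stages: list[tuple[str, Stage]]) -> RunState:
    """Run stages in order; record where the run stopped if one fails."""
    for name, stage in stages:
        if not stage(state):
            state.stopped_at = name
            break
        state.completed.append(name)
    return state


# Agent names from the table above; the stage bodies are stand-in stubs.
STAGE_NAMES = [
    "paper-reader", "resource-finder", "environment-builder",
    "smoke-tester", "full-runner", "result-comparator",
]


def make_stub(name: str, succeeds: bool = True) -> Stage:
    def stage(state: RunState) -> bool:
        state.artifacts[name] = "done" if succeeds else "failed"
        return succeeds
    return stage


# Simulate a run in which the smoke test fails.
stages = [(n, make_stub(n, succeeds=(n != "smoke-tester"))) for n in STAGE_NAMES]
result = run_pipeline(RunState(paper="example paper"), stages)
print(result.completed, result.stopped_at)
```

Stopping at the first failure means the expensive full run is never attempted on an environment that cannot pass the smoke test.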

✅ Validation

ReproRun has been run end to end on real papers across domains. It doesn't just rubber-stamp "success": for every paper it returns an honest verdict (numbers reproduced, pipeline reproduced, or can't reproduce as-is), always with the root cause.

| Paper | Domain | Result |
| --- | --- | --- |
| UMAP (McInnes 2018, JOSS) | dimensionality reduction | ✅ Numbers reproduced: 11/14 k-NN accuracy metrics match within ±0.01; MNIST & Fashion-MNIST confirmed to 3 decimals |
| scVelo (Bergen 2020, Nat Biotech) | single-cell | ✅ Pipeline reproduced: caught a numpy 2.x ABI bug causing 100% NaN, fixed by downgrading to 1.26.4 |
| Robust Stitching (Ruiz 2023, ICML) | image ML | ✅ Pipeline reproduced: repaired 5 bit-rot breakages (torchvision API, pandas append, missing deps) |
| Annotatability (Nitzan 2024, Nat Comp Sci) | single-cell | ✅ Pipeline reproduced: 6 API-debug rounds surfaced a missing pooch dependency |
| ScType (Ianevski 2022, Nat Comms) | single-cell (R) | ✅ Pipeline reproduced: R path validated after a version downgrade |
| Cropformer (Wang 2025, Plant Communications) | crop genomics | ⚠️ Partial: code, model & training verified, but the repo ships only 10 demo samples, so the paper's PCC=0.92 can't be reproduced as-is |

6 papers · 1 full-metric reproduction · 4 pipeline reproductions · 1 honest partial · 18 verified skill improvements

Honest by design. UMAP landed at a 78.6% metric match (11 of 14), just below our 80% "clean reproduce" bar, and ReproRun reports it as a data contradiction rather than rounding up. For Cropformer, the framework runs end to end, but the published numbers need real crop-genome data that the repo never ships, so it's flagged Partial, not Pass.
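
The arithmetic behind that 78.6% figure can be sketched as below. This is an illustrative example, not the result-comparator's actual code: the `compare` function, the verdict labels, and the metric values are hypothetical and synthetic (they are not the UMAP numbers). Only the ±0.01 tolerance, the 80% bar, and the 11-of-14 split come from the text.

```python
TOLERANCE = 0.01  # the ±0.01 band quoted for the UMAP metrics
CLEAN_BAR = 0.80  # the 80% "clean reproduce" bar


def compare(claimed: dict, measured: dict, tol: float = TOLERANCE):
    """Return (match rate, names of metrics whose measured value is within tol)."""
    matched = [
        name for name, value in claimed.items()
        if name in measured and abs(measured[name] - value) <= tol
    ]
    return len(matched) / len(claimed), matched


# Synthetic data: 14 metrics, 11 well inside the tolerance and 3 well outside.
claimed = {f"metric_{i}": 0.90 for i in range(14)}
measured = {
    name: value + (0.002 if i < 11 else 0.05)
    for i, (name, value) in enumerate(claimed.items())
}

rate, matched = compare(claimed, measured)
# Hypothetical verdict labels; no rounding up to the bar.
verdict = "clean reproduce" if rate >= CLEAN_BAR else "below clean-reproduce bar"
print(f"{len(matched)}/{len(claimed)} = {rate:.1%} -> {verdict}")
# prints: 11/14 = 78.6% -> below clean-reproduce bar
```

Comparing the raw ratio against the bar, rather than a rounded percentage, is what keeps a 78.6% result from being reported as a pass.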

Case study: Cropformer (⚠️ Partial reproduction)

Verified ✅

  • Repo found & cloned (jiekesen/Cropformer; the paper's URL was wrong)
  • Environment built: Python 3.10 + PyTorch 2.5.1 + CUDA 12.1
  • Model architecture: Conv1d + 8-head self-attention (2.6M params)
  • GPU inference on RTX 4090; pretrained weights load & run
  • Training loop converges: loss 89,540 → 23,771

Could not reproduce ❌

  • Paper metrics (PCC=0.92, …): repo ships only 10 random demo samples, not real crop data
  • Classification task: model_class.py is missing key functions
  • Nested cross-validation, MIC feature selection, 0–9 SNP encoding: not implemented in the repo

Root cause: the public repo is demo-only; the full pipeline (MIC selection, nested CV, Optuna tuning) and the real datasets are not included. A faithful reproduction would need the real crop-genome data → PLINK processing → reimplementing the described pipeline (~1–2 weeks of data + compute work).


📦 Getting started

ReproRun is an agent skill that works with any compatible AI coding assistant. To use it:

  1. Use an AI coding agent that can load skills.
  2. Place the paper-reproduction/ folder where your agent loads skills.
  3. In a session, just ask, e.g. "reproduce scVelo" or "reproduce Table 2 from this paper." The skill triggers automatically.

👥 Team

| Role | Member |
| --- | --- |
| Chief Architect | @WUBING2023 |
| Development Engineer | @TXZ-star |
| Test Engineer | @qaqcrane |
| Operations | @wanzi5872-oss |

📄 License

Non-commercial use only. You are free to use and modify ReproRun for non-commercial purposes (research, study, personal projects). Commercial use is not permitted without prior permission.

License: PolyForm Noncommercial License 1.0.0 · Chinese version: view the Chinese license text →


📌 Status

v1.0.0 · stable (maintenance mode)
