jl-python/Reproducibility-Lab

Reproducibility-Lab

This is an in-class lab focused on the reproducibility of GitHub repositories.
This submission reproduces Maia-2 (NeurIPS 2024).

This repository contains my reproducibility work for the NeurIPS 2024 paper:

Maia-2: A Unified Model for Human-AI Alignment in Chess
Zhenwei Tang, Difan Jiao, Reid McIlroy-Young, Jon Kleinberg, Siddhartha Sen, Ashton Anderson

Repository Contents

reproducibility_lab.ipynb — Executed Colab notebook with all baseline + variation experiments, plus reflections.

runs/ — Experiment outputs: JSON files for baseline/variation, both batch and position-wise inference.

logs/ — Environment metadata (env_meta.json) and package lockfile (requirements.lock.txt).
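As an illustration of what an environment snapshot like env_meta.json might hold, here is a minimal sketch that records the interpreter version, platform, and selected package versions. The field names and the helper functions are hypothetical; the actual file in logs/ may use a different schema.

```python
import json
import platform
import sys
from importlib import metadata
from pathlib import Path


def collect_env_meta(packages):
    """Return a dict describing the interpreter, OS, and selected package versions."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed in this environment
    return {
        # Field names below are assumptions, not the schema used in this repo.
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "machine": platform.machine(),
        "packages": versions,
    }


def write_env_meta(path, packages):
    """Write the metadata as pretty-printed JSON and return the dict."""
    meta = collect_env_meta(packages)
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(meta, indent=2, sort_keys=True))
    return meta
```

For example, `write_env_meta("logs/env_meta.json", ["torch", "numpy", "chess"])` would produce a small JSON file that can be committed next to the run outputs.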

reproduce/ —

* Reproducibility.md — Full reproducibility report (environment, commands, results, reflection).

* run_all.sh — Script to automatically reproduce baseline + variation runs and save outputs/logs.

Environment Setup

Clone the upstream Maia-2 repo (not included here):

git clone https://github.com/CSSLab/maia2.git
cd maia2

Create a fresh environment (Python 3.10+ recommended):

python -m venv .venv
source .venv/bin/activate
pip install -U pip

Install dependencies:

pip install chess==1.10.0 einops==0.8.0 gdown==5.2.0 numpy==2.1.3 pandas==2.2.3 \
    pyzstd==0.15.9 requests==2.32.3 torch==2.4.0 tqdm==4.65.0
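After installing, it can help to confirm that the active environment matches the pins. The sketch below is one possible check, not part of this repo: it parses the same `name==version` list and compares each pin with what `importlib.metadata` reports. The helper names are hypothetical. Local build suffixes such as `+cu121`, which GPU builds of torch often carry, are ignored in the comparison.

```python
from importlib import metadata

# The pinned versions from the install command above.
PINS = """
chess==1.10.0 einops==0.8.0 gdown==5.2.0 numpy==2.1.3 pandas==2.2.3
pyzstd==0.15.9 requests==2.32.3 torch==2.4.0 tqdm==4.65.0
"""


def parse_pins(text):
    """Turn whitespace-separated 'name==version' tokens into a {name: version} dict."""
    pins = {}
    for token in text.split():
        name, _, version = token.partition("==")
        pins[name] = version
    return pins


def installed_version(name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Return {name: (expected, found)} for every package that differs from its pin."""
    mismatches = {}
    for name, expected in pins.items():
        found = lookup(name)
        # Drop a local version suffix such as '+cu121' before comparing.
        base = found.split("+", 1)[0] if found else None
        if base != expected:
            mismatches[name] = (expected, found)
    return mismatches
```

Calling `find_mismatches(parse_pins(PINS))` returns an empty dict when every pinned package is installed at the expected version.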

How to Reproduce Results

From the root of this repo, run the shell script:

bash reproduce/run_all.sh

This will:

  • Run batch inference (baseline + variation).

  • Run position-wise inference (baseline + variation).

  • Save all outputs under runs/.

  • Save logs and environment snapshots under logs/.

Alternatively, open reproducibility_lab.ipynb and re-execute the cells manually in Colab or Jupyter.

Expected Results

Batch Inference: Accuracy is comparable between the baseline and variation runs, with minor differences depending on batch size and model type.

Position-wise Inference: Predicted moves and win probabilities are logged and change when the Elo parameters are varied.
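To make "comparable" concrete, a reader could load one baseline file and one variation file from runs/ and compare a single metric. This is a rough sketch under assumptions: the `accuracy` key and the 0.02 tolerance are hypothetical placeholders, so check the actual JSON files in runs/ for their real names and structure.

```python
import json
from pathlib import Path


def load_metric(path, key="accuracy"):
    """Read one run file and return the numeric value stored under `key`.

    The key name is an assumption; adjust it to match the actual JSON schema.
    """
    data = json.loads(Path(path).read_text())
    return float(data[key])


def compare_runs(baseline_path, variation_path, key="accuracy", tolerance=0.02):
    """Report both values, their absolute difference, and whether it is within tolerance."""
    baseline = load_metric(baseline_path, key)
    variation = load_metric(variation_path, key)
    diff = abs(variation - baseline)
    return {
        "baseline": baseline,
        "variation": variation,
        "abs_diff": diff,
        "comparable": diff <= tolerance,  # tolerance is an arbitrary illustrative threshold
    }
```

The tolerance is a judgment call. A tighter value makes sense when comparing two runs on the same hardware with the same seed, a looser one when comparing CPU and GPU runs.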

Exact numbers may differ slightly depending on hardware (CPU vs. GPU) and random seeds. Our experiments used a random seed of 42 on an A100 GPU in Google Colab.
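For reference, fixing a seed in a Python and PyTorch workflow typically looks like the sketch below. This is an assumption about common practice, not a description of how the notebook or the upstream Maia-2 code sets its seed. The `seed_everything` helper is hypothetical, and NumPy and PyTorch are seeded only if they are installed.

```python
import random


def seed_everything(seed=42):
    """Seed Python's RNG, plus NumPy and PyTorch when they are installed.

    Returns the list of libraries that were seeded.
    """
    random.seed(seed)
    seeded = ["random"]
    try:
        import numpy as np

        np.random.seed(seed)  # seeds NumPy's legacy global generator
        seeded.append("numpy")
    except ImportError:
        pass
    try:
        import torch

        torch.manual_seed(seed)  # seeds PyTorch's random number generators
        seeded.append("torch")
    except ImportError:
        pass
    return seeded
```

Seeding alone does not guarantee bit-identical results across different GPUs or library versions, which is consistent with the small differences described above.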

Notes

The upstream repo (CSSLab/maia2) is not included here — only reproducibility artifacts.

Logs and outputs in this repo were generated using Google Colab (GPU runtime).

Non-determinism may arise from GPU differences and API randomness.
