paper2code is a collection of AI/ML research papers rebuilt in Python — stripped of the abstractions that hide what's actually happening.
Every paper gets replicated twice (or close to it):
- 🧱 From scratch: raw NumPy / pure PyTorch tensors. No nn.Module magic, no torch.optim, no black boxes. You read the math, you write the math.
- 📦 With libraries: the canonical high-level implementation, so you can diff the two and see exactly what the library is doing for you.
The goal isn't benchmark-chasing. It's understanding — by the time you've written backprop by hand once, loss.backward() stops feeling like magic.
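To give a feel for what "backprop by hand" means here, the following is a minimal sketch, not code from this repo. It assumes a one-layer linear model with a mean-squared-error loss; the function name `forward_and_grads` and all shapes are hypothetical. The hand-derived gradient is checked against a central finite difference rather than against autograd, so the example needs only NumPy.

```python
import numpy as np

def forward_and_grads(W, b, x, y):
    """MSE loss of y_hat = x @ W + b, plus gradients derived by hand."""
    y_hat = x @ W + b                  # forward pass, shape (n, out)
    err = y_hat - y
    loss = np.mean(err ** 2)           # mean over all n * out elements
    # Backward pass: d(loss)/d(y_hat) = 2 * err / (number of elements)
    d_yhat = 2.0 * err / err.size
    dW = x.T @ d_yhat                  # chain rule through the matmul
    db = d_yhat.sum(axis=0)            # bias is broadcast over the batch
    return loss, dW, db

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 3))
y = rng.normal(size=(8, 2))
W = rng.normal(size=(3, 2))
b = np.zeros(2)

loss, dW, db = forward_and_grads(W, b, x, y)

# Central finite difference on a single weight as a sanity check.
eps = 1e-6
W_plus, W_minus = W.copy(), W.copy()
W_plus[0, 0] += eps
W_minus[0, 0] -= eps
numeric = (forward_and_grads(W_plus, b, x, y)[0]
           - forward_and_grads(W_minus, b, x, y)[0]) / (2 * eps)

print("analytic:", dW[0, 0], "numeric:", numeric)
```

The same pattern (derive the gradient, then check it against an independent estimate) scales up to the larger models in the table, where the independent estimate is the library implementation.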
| Paper | From scratch | With library | Tests | Folder |
|---|---|---|---|---|
| Attention Is All You Need (Vaswani 2017) | 🟢 | 🟢 | 10 | attention/ |
| An Image is Worth 16x16 Words — ViT (Dosovitskiy 2020) | 🟢 | 🟢 | 7 | vit/ |
| Adam: A Method for Stochastic Optimization (Kingma 2014) | 🟢 | 🟢 | 6 | adam/ |
| Long Short-Term Memory (Hochreiter 1997) | 🟢 | 🟢 | 12 | lstm/ |
| RNN Encoder–Decoder / GRU (Cho 2014) | 🟢 | 🟢 | 8 | gru/ |
| Faster R-CNN (Ren 2015) — key atoms | 🟡 | 🟢 | 14 | faster-rcnn/ |
| SSD: Single Shot MultiBox Detector (Liu 2016) — key atoms | 🟡 | 🟢 | 11 | ssd/ |
| Bits & pieces from scratch | 🟢 | — | — | things-from-scratch/ |
🟢 done · 🟡 partial (detection papers — RPN/default-boxes/NMS/HNM from scratch; full pipeline via torchvision) · 🔴 todo
Total: 68 tests, ~1.5s to run the whole repo suite.
git clone https://github.com/hegdeadithyak/PaperReplica.git
cd PaperReplica
pip install -r requirements.txt
# run every test across every paper
python3 -m pytest -v
Each paper folder is self-contained: {name}_scratch.py + {name}_library.py + test_{name}.py + a README with the math, a diagram, and a first-principles explanation. cd in and read the README.
Every paper lives in a folder with the same four files:
<paper-dir>/
<name>_scratch.py # raw PyTorch tensor ops — no nn.Module, no autograd for RNNs
<name>_library.py # torch.nn / torchvision / torch.optim equivalent
test_<name>.py # shared-weight parity tests — scratch == library to ~1e-5
README.md # math, first-principles walkthrough, diagram from Wikimedia
Scratch implementations expose their weights in the same layout as the library module, so tests can copy weights across with load_from_torch_*() and verify that the outputs match to within ~1e-5. No hand-waving, no "similar behavior": numerical equivalence or the test fails.
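A shared-weight parity check of this kind might look roughly like the sketch below. It is an illustration under assumptions, not one of the repo's tests: `linear_scratch` is a hypothetical stand-in for a scratch implementation, and the manual weight copy stands in for the repo's load_from_torch_*() helpers, whose signatures are not shown here.

```python
import torch

def linear_scratch(x, weight, bias):
    """Hypothetical scratch layer: plain tensor ops, no nn.Module.

    `weight` uses the (out_features, in_features) layout of torch.nn.Linear,
    so library weights can be copied across without reshaping.
    """
    return x @ weight.T + bias

torch.manual_seed(0)
lib = torch.nn.Linear(4, 3)            # library reference

# Copy the library weights into plain tensors (stand-in for load_from_torch_*()).
weight = lib.weight.detach().clone()
bias = lib.bias.detach().clone()

x = torch.randn(5, 4)
with torch.no_grad():
    expected = lib(x)
actual = linear_scratch(x, weight, bias)

# Parity: same weights, same input, outputs equal within tolerance.
assert torch.allclose(actual, expected, atol=1e-5)
```

Sharing weights is what makes the comparison meaningful: with independent random initialisation the two outputs would differ even if both implementations were correct.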
- Vaswani et al. — Attention Is All You Need
- Dosovitskiy et al. — An Image is Worth 16x16 Words
- Kingma & Ba — Adam: A Method for Stochastic Optimization
- Hochreiter & Schmidhuber — Long Short-Term Memory
- Cho et al. — Learning Phrase Representations using RNN Encoder-Decoder
- Ren et al. — Faster R-CNN
- Liu et al. — SSD: Single Shot MultiBox Detector
MIT — see LICENSE.
built for the people who'd rather read the paper than the docs
