Replicating 'Reinforcement Learning from Language Feedback' (Klissarov et al., 2026) — Gemma 3 12B, GRPO, multi-turn teacher-student training on Omni-MATH
Updated Apr 5, 2026 - Python
Unofficial, from-scratch PyTorch replication of the Conformer paper (Gulati et al., 2020): encoder, RNN-T decoder, training loop, and NeMo weight validation. Built block by block with documented maths.
Replication of "Neural Collaborative Filtering" (He et al., WWW 2017) in PyTorch — GMF, MLP & NeuMF on MovieLens 1M + Pinterest