This repository contains the archival release of the code for the paper:

**ABEL: Sample Efficient Online Reinforcement Learning for Neural Theorem Proving**
Fabian Gloeckle, Jannis Limperg, Gabriel Synnaeve, Amaury Hayat
The 4th Workshop on Mathematical Reasoning and AI at NeurIPS'24
```bibtex
@inproceedings{
gloeckle2024abel,
title={{ABEL}: Sample Efficient Online Reinforcement Learning for Neural Theorem Proving},
author={Fabian Gloeckle and Jannis Limperg and Gabriel Synnaeve and Amaury Hayat},
booktitle={The 4th Workshop on Mathematical Reasoning and AI at NeurIPS'24},
year={2024},
url={https://openreview.net/forum?id=kk3mSjVCUO}
}
```

Path configuration: see xlean/config.py for setting up data, checkpoint, and tokenizer paths.
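As a rough illustration of the kind of path configuration xlean/config.py expects, here is a minimal sketch. The variable names (`DATA_DIR`, `CHECKPOINT_DIR`, `TOKENIZER_PATH`) and the `resolve` helper are hypothetical, not the repo's actual identifiers:

```python
from pathlib import Path

# Hypothetical layout; the real names in xlean/config.py may differ.
DATA_DIR = Path("/path/to/data")                    # proof/tactic training data
CHECKPOINT_DIR = Path("/path/to/checkpoints")       # checkpoints written during RL
TOKENIZER_PATH = Path("/path/to/tokenizer.model")   # tokenizer file for the base model

def resolve(path: Path) -> Path:
    """Expand ~ and return an absolute form of a configured path."""
    return path.expanduser().resolve()
```

Pointing these at local directories before launching any script avoids hard-coded internal paths scattered through the code.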
This codebase was developed with internal Meta infrastructure and requires some adaptation to run with open-source alternatives. In particular, you will need to replace the following internal dependencies:
The code expects a fast LLM inference library. You can use:
- fastgen - Facebook Research's fast generation library
- vLLM - High-throughput LLM serving
- SGLang - Fast serving framework
The code expects an LLM training framework. You can use:
- Lingua - Facebook Research's LLM training codebase
- TRL - Hugging Face's Transformer Reinforcement Learning library
- torchtune - PyTorch native fine-tuning library
Some tinkering will be required to adapt the import paths and APIs to your chosen libraries. The main integration points are:
- Model loading and checkpoint management
- Tokenizer interfaces
- Distributed training utilities (SLURM integration)
- Online generation during RL training
This project is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
See LICENSE for details.