This repository contains the archival release of the code for the paper:

**ABEL: Sample Efficient Online Reinforcement Learning for Neural Theorem Proving**
Fabian Gloeckle, Jannis Limperg, Gabriel Synnaeve, Amaury Hayat
The 4th Workshop on Mathematical Reasoning and AI at NeurIPS'24
```bibtex
@inproceedings{
gloeckle2024abel,
title={{ABEL}: Sample Efficient Online Reinforcement Learning for Neural Theorem Proving},
author={Fabian Gloeckle and Jannis Limperg and Gabriel Synnaeve and Amaury Hayat},
booktitle={The 4th Workshop on Mathematical Reasoning and AI at NeurIPS'24},
year={2024},
url={https://openreview.net/forum?id=kk3mSjVCUO}
}
```

Path configuration: see xlean/config.py for setting up data, checkpoint, and tokenizer paths.
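As a rough illustration of the kind of path configuration xlean/config.py expects, here is a minimal sketch. The variable names (`DATA_DIR`, `CHECKPOINT_DIR`, `TOKENIZER_PATH`) and the `resolve` helper are hypothetical, not the repo's actual identifiers:

```python
from pathlib import Path

# Hypothetical layout; the real names in xlean/config.py may differ.
DATA_DIR = Path("/path/to/data")                    # proof/tactic training data
CHECKPOINT_DIR = Path("/path/to/checkpoints")       # checkpoints written during RL
TOKENIZER_PATH = Path("/path/to/tokenizer.model")   # tokenizer file for the base model

def resolve(path: Path) -> Path:
    """Expand ~ and return an absolute form of a configured path."""
    return path.expanduser().resolve()
```

Pointing these at local directories before launching any script avoids hard-coded internal paths scattered through the code.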
This codebase was developed with internal Meta infrastructure and requires some adaptation to run with open-source alternatives. In particular, you will need to replace the following internal dependencies:
The code expects a fast LLM inference library. You can use:
- fastgen - Facebook Research's fast generation library
- vLLM - High-throughput LLM serving
- SGLang - Fast serving framework
The code expects an LLM training framework. You can use:
- Lingua - Facebook Research's LLM training codebase
- TRL - Hugging Face's Transformer Reinforcement Learning library
- torchtune - PyTorch native fine-tuning library
Some tinkering will be required to adapt the import paths and APIs to your chosen libraries. The main integration points are:
- Model loading and checkpoint management
- Tokenizer interfaces
- Distributed training utilities (SLURM integration)
- Online generation during RL training
This project is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
See LICENSE for details.