GitHub - Zyq-scut/RLTF: Accepted by Transactions on Machine Learning Research (TMLR)

RLTF: Reinforcement Learning from Unit Test Feedback

This is the official code for the paper RLTF: Reinforcement Learning from Unit Test Feedback.

Installation

The code requires the dependencies listed in requirements.txt. Install them with:

pip install -r requirements.txt

Datasets

APPS: Please follow the downloading and preprocessing instructions provided here.
MBPP: The dataset is available here.

Download and unzip all files into the data folder.

Models

https://huggingface.co/Harvey6/RLTF_codet5

Processes

Supervised Finetune

CodeT5: sh script/train_actor_deepspeed.sh
CodeGEN: sh script/train_actor_codegen_deepspeed.sh

Generating Programs Online

CodeT5: python script/generate_online_parallel.py
CodeGEN: python script/generate_codegen_online_parallel.py

Online RL Finetune

Start this step after online generation has been running for a short period and has accumulated a certain number of samples:

CodeT5: sh script/train_actor_rl_online_v1_deepspeed.sh
CodeGEN: sh script/train_actor_rl_codegen_online_v1_deepspeed.sh
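
The hand-off between online generation and RL fine-tuning can be pictured as a bounded buffer that the generator fills and the trainer reads once enough samples exist. The sketch below is only an illustration of that idea: the class `OnlineBuffer`, its `capacity` and `min_samples` parameters, and the sample fields are hypothetical names, and the repository's scripts may exchange samples in a different way (for example through files on disk).

```python
import random
from collections import deque


class OnlineBuffer:
    """Hypothetical bounded buffer of generated programs (not the repo's code)."""

    def __init__(self, capacity: int, min_samples: int):
        # deque(maxlen=...) silently drops the oldest sample when full,
        # so the trainer always sees the most recent generations.
        self._items = deque(maxlen=capacity)
        self.min_samples = min_samples

    def add(self, sample: dict) -> None:
        """Called by the generation process for every new program."""
        self._items.append(sample)

    def ready(self) -> bool:
        """True once enough samples have accumulated to start training."""
        return len(self._items) >= self.min_samples

    def sample_batch(self, batch_size: int) -> list:
        """Draw a random batch without replacement from the buffer."""
        size = min(batch_size, len(self._items))
        return random.sample(list(self._items), size)

    def __len__(self) -> int:
        return len(self._items)
```

In this picture the trainer polls `ready()` and only then starts drawing batches with `sample_batch`, which matches the instruction to let generation run for a while before launching the RL script.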

Generate Program, Run Unit Test, Compute pass@k

Generate Program:

CodeT5: python script/generate_parallel.py
CodeGEN: python script/generate_parallel_codegen.py

Run Unit Test:

sh script/run_unit_tests.sh
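
For readers who want to see what a single unit-test run amounts to, here is a minimal sketch, assuming stdin/stdout style test cases like those in APPS. The function `run_test_case` and its four outcome labels are assumptions made for this example, not the logic of the repository's scripts, which may classify results differently.

```python
import subprocess
import sys


def run_test_case(source: str, stdin_text: str, expected: str,
                  timeout: float = 4.0) -> str:
    """Run one program on one test case.

    Returns 'pass', 'failure', 'error', or 'timeout' (hypothetical labels).
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", source],  # execute the candidate program
            input=stdin_text,                # feed the test input on stdin
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    if proc.returncode != 0:
        # Syntax errors and uncaught exceptions both exit non-zero.
        return "error"
    # Compare whitespace-normalized tokens so trailing newlines do not matter.
    return "pass" if proc.stdout.split() == expected.split() else "failure"
```

Generated programs are untrusted code, so a real evaluation should run them in an isolated environment rather than directly on a development machine.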

Compute pass@k:

python compute_pass_at_k_metric.py
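
As a reference for what the metric measures, the sketch below computes pass@k with the commonly used unbiased estimator 1 - C(n-c, k) / C(n, k), where n is the number of programs sampled for a problem and c is the number that pass all unit tests. It is an assumption that compute_pass_at_k_metric.py uses this exact estimator; the function names and the `results` layout are hypothetical.

```python
import math


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for a single problem.

    n: programs sampled, c: programs passing all tests, k: sample budget.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing programs: any k-subset contains a passing one.
        return 1.0
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)


def mean_pass_at_k(results: dict, k: int) -> float:
    """Average pass@k over problems.

    results maps a problem id to a list of booleans, one per sampled program.
    """
    scores = [pass_at_k(len(flags), sum(flags), k) for flags in results.values()]
    return sum(scores) / len(scores)
```

For example, with 10 samples of which 3 pass, `pass_at_k(10, 3, 1)` is 1 - 7/10, about 0.3.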

Citation

If you find the paper or the source code useful for your projects, please cite it with the following BibTeX entry:

@article{liu2023rltf,
      title={{RLTF}: Reinforcement Learning from Unit Test Feedback},
      author={Jiate Liu and Yiqin Zhu and Kaiwen Xiao and QIANG FU and Xiao Han and Yang Wei and Deheng Ye},
      journal={Transactions on Machine Learning Research},
      issn={2835-8856},
      year={2023},
      url={https://openreview.net/forum?id=hjYmsV6nXZ},
      note={}
}

License

The code is released under BSD 3-Clause - see LICENSE.txt for details.

This code builds on other open-source projects, including CodeRL, APPS, and transformers. We thank the original contributors of these works for open-sourcing their code.
