MaxwellJryao/Process-Reward-Learning

Process Reward Learning

This repository contains the code implementation of Process Reward Learning (PRL).

Data Preparation

python examples/data_preprocess/numina_math.py
python examples/data_preprocess/math500.py

Training

bash runs/run_grpo.sh
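
The steps above can be chained in one wrapper script. The following is a minimal sketch, not part of the repository: the names `run`, `prl_pipeline`, and `DRY_RUN` are hypothetical. It assumes the commands are run from the repository root and that both data preparation scripts should finish before training starts, as the ordering of the sections suggests. By default it only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/usr/bin/env bash
# Sketch of a wrapper around the README commands (hypothetical helper names).
set -euo pipefail

# Print the command when DRY_RUN is 1 (the default here), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Run data preparation first, then training. With `set -e`, a failing
# step stops the script before the next one starts.
prl_pipeline() {
  run python examples/data_preprocess/numina_math.py
  run python examples/data_preprocess/math500.py
  run bash runs/run_grpo.sh
}

# Invoke the pipeline only when the file is executed directly, not sourced.
if [ "${BASH_SOURCE[0]}" = "$0" ]; then
  prl_pipeline
fi
```

Saved under a name of your choice, the script prints the three commands when run with `bash`, and executes them when run with `DRY_RUN=0` set in the environment.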
