MSARL

Directory Structure

framework/
  ├── train.py   # Training script
  └── eval.py    # Evaluation script

Requirements

Python >= 3.8
CUDA-enabled GPU (the training example below places the two models on separate devices, cuda:0 and cuda:1)
Other dependencies listed in requirements.txt

Example installation:

pip install -r requirements.txt
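
Before installing, it can help to confirm that the interpreter meets the Python >= 3.8 requirement listed above. The following is a minimal sketch using only the standard library; the helper name python_ok is hypothetical and not part of this repository.

```python
import sys

# Minimum version taken from the Requirements list in this README.
MIN_PYTHON = (3, 8)


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version is at least `minimum`.

    Hypothetical helper for illustration; not part of the MSARL code.
    """
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if not python_ok():
        found = sys.version.split()[0]
        sys.exit(f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, found {found}")
    print("Python version OK")
```

Running the script in the target environment exits with an error message if the interpreter is too old.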

Training

Run the following command to start model training:

python framework/train.py \
  --reasoning_model_path Qwen2.5-Math-1.5B-Instruct/ \
  --code_model_path Qwen2.5-1.5B-Instruct/ \
  --dataset_path math_train.jsonl \
  --checkpoint_dir checkpoints/ \
  --save_steps 25 \
  --n_sample_r 4 \
  --n_sample_c 4 \
  --max_tokens 4096 \
  --reasoning_device cuda:0 \
  --code_device cuda:1 \
  > train_1.5B_base.log 2>&1

Standard output and standard error are both redirected to train_1.5B_base.log.
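
Since training runs can be long, a quick check of the dataset file before launching may save time. The sketch below assumes that math_train.jsonl follows the usual JSON Lines convention (one JSON value per line), which the .jsonl extension suggests but this README does not state. It makes no assumption about field names and only reports the record count and the top-level keys it finds. The function summarize_jsonl is hypothetical and not part of the framework.

```python
import json
from pathlib import Path


def summarize_jsonl(path):
    """Count records and collect top-level keys in a JSON Lines file.

    Hypothetical helper: assumes one JSON value per non-blank line.
    Raises ValueError naming the first line that fails to parse.
    """
    n_records = 0
    keys = set()
    with Path(path).open(encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(
                    f"{path}:{line_no}: invalid JSON ({exc.msg})"
                ) from exc
            n_records += 1
            if isinstance(record, dict):
                keys.update(record)  # adds the dict's keys
    return n_records, sorted(keys)
```

For example, `summarize_jsonl("math_train.jsonl")` returns a tuple of the number of records and the sorted list of keys seen across all records, which can be compared against what train.py expects.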

Evaluation

Run the following command to evaluate a trained model:

python framework/eval.py \
  --reasoning_model_path Qwen2.5-Math-1.5B-Instruct/ \
  --code_model_path Qwen2.5-1.5B-Instruct/ \
  --dataset_path math_test.jsonl \
  --output_path eval/ \
  --batch_size 64 \
  --num_reasonings 1 \
  --max_tokens 4096 \
  --device cuda:0

Evaluation results will be stored under the eval/ directory.
