GitHub - superfarther/verl

What is this repo

This repo contains the framework used by K2V to train models. We developed this framework based on verl.

Installation

We recommend installing verl and its dependencies in a fresh conda environment.

conda create --name verl python=3.11 -y
conda activate verl

Install the necessary dependencies.

git clone https://github.com/superfarther/verl.git
cd verl
pip install -r requirements_K2V.txt

Install verl from source.

pip install --no-deps -e .

K2V uses vLLM as the inference framework. Note that vLLM often pins a specific PyTorch version and will overwrite the PyTorch you already have installed. To avoid conflicts, we recommend installing vLLM first, together with the PyTorch version it requires. Overall, make sure the versions of the following dependencies match those specified in requirements_K2V.txt:

- torch and the torch series of packages
- vLLM
- pyarrow
- tensordict
- nvidia-cudnn-cu12
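To confirm that the installed versions match the pins, a small check like the one below can help. It is only a sketch and not part of this repo: it assumes that requirements_K2V.txt pins these packages with plain `name==version` lines, and the helper names (`parse_pins`, `same_version`, `find_mismatches`) are made up for illustration.

```python
import re
from importlib import metadata

# Packages singled out above; add other torch-series packages as needed.
WATCHED = ("torch", "vllm", "pyarrow", "tensordict", "nvidia-cudnn-cu12")


def normalize(name: str) -> str:
    """Normalize a distribution name (case-insensitive, '-', '_' and '.' equivalent)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def parse_pins(requirements_text: str) -> dict[str, str]:
    """Return {package: version} for every `name==version` line (assumed format)."""
    pins = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        match = re.fullmatch(r"([A-Za-z0-9_.\-]+)\s*==\s*([^\s;]+).*", line)
        if match:
            pins[normalize(match.group(1))] = match.group(2)
    return pins


def same_version(pinned: str, installed) -> bool:
    """Compare versions; ignore a local suffix such as '+cu124' unless the pin has one."""
    if installed is None:
        return False
    if "+" not in pinned:
        installed = installed.split("+", 1)[0]
    return pinned == installed


def find_mismatches(pins: dict[str, str], names=WATCHED) -> dict:
    """Return {package: (pinned, installed)} for watched packages that differ."""
    mismatches = {}
    for name in map(normalize, names):
        if name not in pins:
            continue  # not pinned in the file, nothing to compare
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if not same_version(pins[name], installed):
            mismatches[name] = (pins[name], installed)
    return mismatches
```

From the repository root, calling `find_mismatches(parse_pins(open("requirements_K2V.txt").read()))` returns an empty dict when every watched pin is satisfied.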

Quick Start

Deploy a judge model using vLLM to verify the model's reasoning process. For example, we can use Qwen2.5-7B-Instruct as the judge model.
```
CUDA_VISIBLE_DEVICES=4,5,6,7 vllm serve Qwen/Qwen2.5-7B-Instruct --tensor-parallel-size 4 --gpu-memory-utilization 0.7
```
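Before pointing training at the judge model, it can be worth checking that the server is reachable. The sketch below queries the model-listing route of the OpenAI-compatible API that `vllm serve` exposes. It assumes the default port 8000 on localhost and a base URL of the form `http://host:port` or `http://host:port/v1`; the helper names are illustrative and not part of this repo.

```python
import json
from urllib import request


def models_url(base_url: str) -> str:
    """Build the model-listing URL from a base such as http://localhost:8000 or .../v1."""
    base = base_url.rstrip("/")
    if not base.endswith("/v1"):
        base += "/v1"
    return base + "/models"


def parse_model_ids(payload: dict) -> list[str]:
    """Extract model ids from an OpenAI-style model list response."""
    return [entry["id"] for entry in payload.get("data", [])]


def served_models(base_url: str, timeout: float = 5.0) -> list[str]:
    """Ask the server which models it serves; raises URLError if it is unreachable."""
    with request.urlopen(models_url(base_url), timeout=timeout) as resp:
        return parse_model_ids(json.load(resp))
```

With the server from the command above running, `served_models("http://localhost:8000")` should include `Qwen/Qwen2.5-7B-Instruct`. Note that model loading can take a while, so a connection error right after launch does not necessarily mean the deployment failed.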
We provide example data in K2V-example/data and an example configuration file at K2V-example/config.sh. Before starting training, fill in the relevant paths in the configuration file:
- train_files: Path to the training data.
- val_files: Path to the validation data.
- rollout_data_dir: Directory where rollout data generated during training is saved.
- validation_data_dir: Directory where validation results are saved.
- default_local_dir: Directory where checkpoints are saved.
- log_file: Path to the log file.
- checklist_judge_model_url: Service endpoint of the judge model deployed with vLLM.
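A quick sanity check of these settings before launching can save a failed run. The sketch below is a hypothetical helper, not part of this repo: it assumes the values have been copied into a Python dict, that `train_files` and `val_files` each point to a single existing file, and that the three output directories may be created up front.

```python
from pathlib import Path

# Keys taken from the list above; the grouping is an assumption of this sketch.
INPUT_FILES = ("train_files", "val_files")
OUTPUT_DIRS = ("rollout_data_dir", "validation_data_dir", "default_local_dir")
OTHER_REQUIRED = ("log_file", "checklist_judge_model_url")


def preflight(settings: dict[str, str]) -> list[str]:
    """Return a list of problems; create missing output directories as a side effect."""
    problems = []
    for key in INPUT_FILES:
        value = settings.get(key, "")
        if not value:
            problems.append(f"{key} is not set")
        elif not Path(value).is_file():
            problems.append(f"{key} does not exist: {value}")
    for key in OUTPUT_DIRS:
        value = settings.get(key, "")
        if not value:
            problems.append(f"{key} is not set")
        else:
            # Create the directory (and parents) so training can write to it.
            Path(value).mkdir(parents=True, exist_ok=True)
    for key in OTHER_REQUIRED:
        if not settings.get(key):
            problems.append(f"{key} is not set")
    return problems
```

An empty list from `preflight(...)` means every key is filled in, both data files exist, and the output directories are in place.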
Start training:
```
bash K2V-example/config.sh
```
