The codebase is not yet fully open-sourced, so some features may not be supported. We will release the complete implementation progressively.
conda create -n codepbt python=3.11 -y
conda activate codepbt
pip install -r requirements.txt
- Download a code generation benchmark dataset (such as LiveCodeBench) from Hugging Face.
- To unify the format of HumanEval and MBPP with LiveCodeBench, either convert them manually after downloading or directly download the release we provide.
- Download a model (such as DeepSeek-R1) from Hugging Face.
- To run inference, set the model name, path, or API key in ./lcb_runner/lm_styles.py, and set the data path in ./lcb_runner/benchmarks/code_generation.py (line 142).
- Use the following command to perform code generation:
bash script/quick_run.sh [GPU_NUMS, default=1] [MODEL_NAME in lm_styles.py, default="model/DeepSeek-R1-Distill-Qwen-32B"] [DATASET_NAME, default="release_v5" (in LiveCodeBench)]
- Please check the ./lcb_runner/runner/parser.py file and the ./script/quick_run.sh file for more details on the flags.
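The format unification for HumanEval mentioned in the steps above could look roughly like the following sketch. The target field names (`question_title`, `question_content`, `question_id`, `platform`, `starter_code`, `public_test_cases`, `metadata`) are assumptions modeled on the public LiveCodeBench code generation dataset, and `humaneval_to_lcb` is a hypothetical helper that is not part of this repository. The provided release may use a different layout, so check it before relying on this mapping.

```python
import json


def humaneval_to_lcb(record: dict) -> dict:
    """Map one HumanEval record onto an assumed LiveCodeBench-style schema."""
    return {
        # Assumed target fields; verify against the dataset release you use.
        "question_title": record["entry_point"],
        "question_content": record["prompt"],
        "question_id": record["task_id"],
        "platform": "humaneval",
        "starter_code": record["prompt"],
        # HumanEval ships a check(candidate) function instead of
        # input/output pairs, so no public cases are listed here.
        "public_test_cases": json.dumps([]),
        "metadata": json.dumps(
            {"func_name": record["entry_point"], "test": record["test"]}
        ),
    }


# Toy record in the HumanEval field layout (not a real benchmark task).
example = {
    "task_id": "Toy/0",
    "prompt": "def add(a, b):\n    \"\"\"Return a + b.\"\"\"\n",
    "entry_point": "add",
    "test": "def check(candidate):\n    assert candidate(1, 2) == 3\n",
}
converted = humaneval_to_lcb(example)
print(converted["question_id"], converted["question_title"])
```

MBPP would need its own mapping, since its records use different field names than HumanEval.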
Note: The following requirements apply if you are running the model locally and not through an API.
Local execution of this model relies on the vLLM library.
- GPU Requirement:
  - A minimum of 1 GPU is required to run the model.
  - For optimal performance, running on a single NVIDIA A100 GPU is recommended.
- Supported GPU Count: The current configuration supports execution on 1 to 8 GPUs.
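As a rough illustration of how the positional arguments of script/quick_run.sh and the supported range of 1 to 8 GPUs fit together, the sketch below assembles the command line in Python. `build_quick_run_command` is a hypothetical helper introduced only for this example. It assumes the script takes its arguments in the order shown in the usage line (GPU count, model name, dataset name) and falls back to its own defaults for trailing arguments that are omitted.

```python
import shlex
from typing import List, Optional


def build_quick_run_command(
    gpu_nums: int = 1,
    model_name: Optional[str] = None,
    dataset_name: Optional[str] = None,
) -> List[str]:
    """Assemble the argv for script/quick_run.sh (hypothetical helper)."""
    # The current configuration supports 1 to 8 GPUs.
    if not 1 <= gpu_nums <= 8:
        raise ValueError(f"gpu_nums must be between 1 and 8, got {gpu_nums}")
    # Arguments are positional, so a dataset name needs a model name before it.
    if dataset_name is not None and model_name is None:
        raise ValueError("dataset_name requires model_name to be given as well")
    cmd = ["bash", "script/quick_run.sh", str(gpu_nums)]
    if model_name is not None:
        cmd.append(model_name)
    if dataset_name is not None:
        cmd.append(dataset_name)
    return cmd


cmd = build_quick_run_command(2, "model/DeepSeek-R1-Distill-Qwen-32B")
print(shlex.join(cmd))
# prints: bash script/quick_run.sh 2 model/DeepSeek-R1-Distill-Qwen-32B
```

To actually launch the run from the repository root, the resulting list can be passed to `subprocess.run(cmd, check=True)`.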
LiveCodeBench: the codebase we built upon.