Official implementation of AdapT from the AAAI 2024 paper.
Python 3.7+ / CUDA 11+ / PyTorch 1.10+ / DeepSpeed 0.6+ are required. Install with:
cd AdapT
pip install -e .
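After installation, you can optionally sanity-check the environment with a short Python snippet (a minimal check; it only verifies imports and GPU visibility):

# Optional sanity check: verify that PyTorch and DeepSpeed import
# and that a CUDA device is visible.
import torch
import deepspeed

print("PyTorch:", torch.__version__)
print("DeepSpeed:", deepspeed.__version__)
print("CUDA available:", torch.cuda.is_available())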
Apply for the model weights through this link. Once approved, you will receive urls.txt by mail, which contains temporary download links.
To download the model weights, use aria2 with the following command:
aria2c -x 16 -s 16 -j 4 --continue=true -i urls.txt
Run the following commands to merge the parts and extract the full model weights:
cat codegeex_13b.tar.gz.* > codegeex_13b.tar.gz
tar xvf codegeex_13b.tar.gz
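Extraction of a 13B checkpoint can take a while; if you want to sanity-check the merged archive first, here is a minimal Python sketch that reads only the first tar entry:

import tarfile

# Read just the first entry of the merged archive to confirm it is a
# valid gzipped tar, without extracting everything.
with tarfile.open("codegeex_13b.tar.gz", "r:gz") as tar:
    first = tar.next()
    print("First entry:", first.name if first else "archive is empty")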
Sampling results generated with AdapT on HumanEval and MBPP are provided in the output files (i.e., output/humaneval_adapt_samples.jsonl and output/mbpp_adapt_samples.jsonl).
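Each line of these files is a JSON object. Assuming the standard HumanEval-style schema with task_id and completion fields (an assumption; check the files to confirm), you can inspect a few samples like this:

import json

# Peek at the first five AdapT samples; field names (task_id, completion)
# are assumed to follow the standard HumanEval sample schema.
with open("output/humaneval_adapt_samples.jsonl") as f:
    for i, line in enumerate(f):
        sample = json.loads(line)
        print(sample.get("task_id"), repr(sample.get("completion"))[:80])
        if i >= 4:
            break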
You can follow the instructions below to reproduce these results:
To run AdapT sampling on the HumanEval dataset, generating 15 candidates with adaptive temperatures of [0.8, 0.6], run:
sh scripts/inference_adapt.sh 0 inputs/HumanEval.jsonl inputs/human_eval_stop_words.json he_samples.jsonl 0.5 0.8 0.6
To run AdapT sampling on the MBPP dataset, generating 20 candidates with adaptive temperatures of [0.8, 0.3], run:
sh scripts/inference_adapt.sh 0 inputs/mbpp_test.jsonl inputs/mbpp_stop_words.json mbpp_samples.jsonl 0.5 0.6 0.5
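Conceptually, AdapT switches between the two temperatures on a per-token basis instead of sampling with one fixed value. The sketch below is illustrative only: the is_challenging flag is a hypothetical stand-in for the paper's actual criterion, which is implemented by the code that scripts/inference_adapt.sh invokes.

import torch

def adaptive_temperature_sample(logits, is_challenging, t_high=0.8, t_low=0.6):
    # Use the higher temperature at positions deemed challenging and the
    # lower one elsewhere; `is_challenging` is a hypothetical predicate,
    # not the paper's actual criterion.
    temperature = t_high if is_challenging else t_low
    probs = torch.softmax(logits / temperature, dim=-1)
    return torch.multinomial(probs, num_samples=1)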
To evaluate the generated results, install the mxeval repository:
git clone https://github.com/amazon-science/mxeval.git
pip install -e mxeval
To get the evaluation results on the HumanEval dataset, run the following command:
evaluate_functional_correctness output/humaneval_adapt_samples.jsonl --problem_file inputs/HumanEval.jsonl --k 1,5,10,15
To get the evaluation results on the MBPP dataset, run the following command:
evaluate_functional_correctness output/mbpp_adapt_samples.jsonl --problem_file inputs/mbpp_test.jsonl --k 1,5,10,15
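For reference, the pass@k numbers printed by evaluate_functional_correctness follow the unbiased estimator from the HumanEval paper (Chen et al., 2021); a minimal reimplementation:

import numpy as np

def pass_at_k(n, c, k):
    # Unbiased pass@k estimator: n = samples per problem, c = correct samples.
    if n - c < k:
        return 1.0
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

# Example: 15 candidates per problem, 3 of which pass the tests.
print(pass_at_k(15, 3, 1))   # estimated pass@1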