FastCoder: Accelerating Repository-level Code Generation via Efficient Retrieval and Verification

FastCoder is a simple yet highly efficient approach for accelerating LLM inference, designed specifically for code generation, without compromising the quality of the output.

Benchmark results

Performance on repository-level code generation (DevEval & RepoEval)
Performance on standalone-level code generation (HumanEval)

Below is an example. FastCoder completes the inference in just 4.2 seconds, while REST and autoregressive decoding take 6.2 seconds and 13.5 seconds, respectively. FastCoder thus achieves a 3.21x speedup over autoregressive decoding and a 1.48x speedup over REST.
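The speedup figures follow directly from the wall-clock times quoted in this example. A quick check, using only those three timings (the `speedup` helper is a name chosen here for illustration):

```python
# Wall-clock times in seconds, as quoted in the example above.
times = {"FastCoder": 4.2, "REST": 6.2, "autoregressive": 13.5}


def speedup(baseline: str, method: str = "FastCoder") -> float:
    """Return how many times faster `method` is than `baseline`."""
    return times[baseline] / times[method]


print(f"vs autoregressive: {speedup('autoregressive'):.2f}x")  # 3.21x
print(f"vs REST: {speedup('REST'):.2f}x")  # 1.48x
```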

The inference speeds of FastCoder and REST are comparable at the beginning. At approximately 2.5 seconds, however, FastCoder's context- and LLM preference-aware cache is activated, and FastCoder is substantially faster for the remainder of inference.
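For readers unfamiliar with retrieval-based acceleration, the general idea can be illustrated with a toy example. The sketch below is not FastCoder's implementation (its cache and verification logic live in the fastcoder and DraftRetriever directories). It is a generic draft-and-verify loop under simplifying assumptions: tokens are plain strings, the datastore is a flat token list, and the model is a hypothetical `next_token` callable that returns its greedy choice. All function names are illustrative.

```python
from typing import Callable, List, Sequence, Tuple

Token = str
NextToken = Callable[[List[Token]], Token]  # toy stand-in for one greedy LLM step


def retrieve_draft(datastore: Sequence[Token], context: Sequence[Token],
                   suffix_len: int = 2, draft_len: int = 4) -> List[Token]:
    """Return the tokens that followed the most recent datastore match
    of the context's last `suffix_len` tokens (empty list if no match)."""
    suffix = list(context[-suffix_len:])
    for i in range(len(datastore) - suffix_len, -1, -1):
        if list(datastore[i:i + suffix_len]) == suffix:
            start = i + suffix_len
            return list(datastore[start:start + draft_len])
    return []


def verify(draft: Sequence[Token], context: Sequence[Token],
           next_token: NextToken) -> List[Token]:
    """Keep draft tokens only while they equal the model's own greedy choice.
    A real system checks the whole draft in one batched forward pass;
    this toy version calls the model once per token for clarity."""
    accepted: List[Token] = []
    for tok in draft:
        if next_token(list(context) + accepted) != tok:
            break
        accepted.append(tok)
    return accepted


def generate(prompt: Sequence[Token], datastore: Sequence[Token],
             next_token: NextToken, max_new: int = 6) -> Tuple[List[Token], int]:
    """Generate up to `max_new` tokens; return (tokens, draft-verify rounds)."""
    out = list(prompt)
    rounds = 0
    while len(out) - len(prompt) < max_new:
        draft = retrieve_draft(datastore, out)
        out.extend(verify(draft, out, next_token))
        out.append(next_token(out))  # the model always contributes one token
        rounds += 1
    return out[:len(prompt) + max_new], rounds
```

Because every accepted token equals the model's own greedy choice, the output matches plain greedy decoding; a good draft only reduces the number of rounds needed, which is why this style of acceleration does not change output quality.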

Additionally, we have developed a VSCode plugin demo based on FastCoder, with more functionalities currently under development.

Installation

conda create -n fastcoder python=3.9
conda activate fastcoder
pip install -r requirements.txt
pip install DraftRetriever/wheels/draftretriever-0.1.0-cp39-cp39-linux_x86_64.whl
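The prebuilt wheel's filename carries the tags cp39 and linux_x86_64, so it targets CPython 3.9 on 64-bit x86 Linux. A rough pre-install check, assuming those filename tags are the only constraints (the helper name `wheel_is_compatible` is hypothetical, and pip performs its own, stricter tag matching):

```python
import platform
import sys


def wheel_is_compatible(py_version=None, system=None, machine=None) -> bool:
    """Compare the interpreter against the wheel filename tags:
    cp39 means CPython 3.9, linux_x86_64 means Linux on x86-64."""
    py_version = tuple(sys.version_info[:2]) if py_version is None else tuple(py_version)
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    return py_version == (3, 9) and system == "Linux" and machine == "x86_64"


if __name__ == "__main__":
    print("prebuilt wheel usable:", wheel_is_compatible())
```

If the check prints False, the wheel in DraftRetriever/wheels will most likely be rejected by pip on that machine.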

Build datastore

Build the common datastore

Build a common datastore from The Stack

cd datastore
python get_common_datastore.py --model-path deepseek-ai/deepseek-coder-6.7b-base

Build the repo datastore

Build a repo datastore for each repository from its pre-processed source code (with the portions to be generated excluded).

Users can download the original DevEval repositories from DevEval_Source_Code (or the RepoEval repositories from RepoEval_Source_Code). To exclude the ground truth from the original repositories, first place the downloaded archive in the designated directory, dataset/DevEval (or dataset/RepoEval), rename it to source_code.zip, and extract its contents. Then execute the following command; afterwards, the dataset/DevEval/source_code directory will contain the original repositories with all ground truth content removed.

cd dataset/DevEval
python filter_source_code.py
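The manual steps that precede filter_source_code.py (placing the archive, renaming it to source_code.zip, and extracting it) can also be scripted. A minimal sketch, assuming the downloaded archive unpacks into a top-level source_code folder; the function name and the example download path are hypothetical:

```python
import shutil
import zipfile
from pathlib import Path


def prepare_source_code(downloaded_archive, dataset_dir) -> Path:
    """Copy the archive to <dataset_dir>/source_code.zip and extract it there."""
    dataset_dir = Path(dataset_dir)
    dataset_dir.mkdir(parents=True, exist_ok=True)
    target = dataset_dir / "source_code.zip"
    # Skip the copy when the archive is already in place under the right name.
    if Path(downloaded_archive).resolve() != target.resolve():
        shutil.copyfile(downloaded_archive, target)
    with zipfile.ZipFile(target) as zf:
        zf.extractall(dataset_dir)
    return target


# Example call (hypothetical download location), run from the repository root:
# prepare_source_code("~/Downloads/DevEval_Source_Code.zip", "dataset/DevEval")
```

The same function works for RepoEval by passing dataset/RepoEval as the second argument.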

Alternatively, users can directly download our preprocessed source code (with ground truth removed) from the link, place it in the designated directory, and extract it.

Then, use the following commands to build a repo datastore.

cd datastore/DevEval
unzip source_code.zip
python3 get_repo_datastore.py --model-path deepseek-ai/deepseek-coder-6.7b-base --dataset DevEval

Inference

Inference on DevEval

cd evaluation
CUDA_VISIBLE_DEVICES=0 python deveval_test.py --model-path deepseek-ai/deepseek-coder-6.7b-base --p 0.5 --l 50 --s 20 --weights [1,1]

Inference on RepoEval

cd evaluation
CUDA_VISIBLE_DEVICES=0 python repoeval_test.py --model-path deepseek-ai/deepseek-coder-6.7b-base --p 0.5 --l 50 --s 20 --weights [1,1]

Inference on HumanEval

cd evaluation
CUDA_VISIBLE_DEVICES=0 python humaneval_test.py --model-path deepseek-ai/deepseek-coder-6.7b-base --p 0.5 --l 50 --s 20
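The three inference commands share the same model path and the same --p, --l, and --s values; only the script name differs, and --weights is passed for the two repository-level benchmarks but not for HumanEval. A small driver that assembles the commands exactly as listed above might look like this (a sketch; `build_command` and `run_all` are hypothetical helpers, not part of the repository):

```python
import os
import subprocess

MODEL = "deepseek-ai/deepseek-coder-6.7b-base"
COMMON = ["--model-path", MODEL, "--p", "0.5", "--l", "50", "--s", "20"]
SCRIPTS = {
    "DevEval": "deveval_test.py",
    "RepoEval": "repoeval_test.py",
    "HumanEval": "humaneval_test.py",
}


def build_command(benchmark: str) -> list:
    """Assemble the argument list for one benchmark, mirroring the commands above."""
    cmd = ["python", SCRIPTS[benchmark]] + COMMON
    if benchmark != "HumanEval":  # the HumanEval command above omits --weights
        cmd += ["--weights", "[1,1]"]
    return cmd


def run_all(gpu: str = "0", dry_run: bool = True) -> None:
    """Print each command; execute it inside evaluation/ unless dry_run is set."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu)
    for name in SCRIPTS:
        cmd = build_command(name)
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, cwd="evaluation", env=env, check=True)
```

Calling run_all() from the repository root prints the three commands without executing them; run_all(dry_run=False) runs them one after another on the selected GPU.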
