CodeGeeX FasterTransformer

This repository provides the FasterTransformer implementation of the CodeGeeX model.

Get Started

First, download and set up the following Docker environment, replacing <WORK_DIR> with the directory of this repo:

docker pull nvcr.io/nvidia/pytorch:21.11-py3
docker run -p 9114:5000 --cpus 12 --gpus '"device=0"' -it -v <WORK_DIR>:/workspace/codegeex-fastertransformer --ipc=host  --name=test nvcr.io/nvidia/pytorch:21.11-py3

Second, install the following packages inside the Docker container:

pip3 install transformers
pip3 install sentencepiece
cd codegeex-fastertransformer
sh make_all.sh  # Remember to specify the SM version (compute capability) according to the GPU; see the sketch below.
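
The SM value set during the build is the CUDA compute capability of the target GPU. As a sketch of what make_all.sh forwards to cmake (an assumption about the script's contents; check the script for the exact invocation):

cmake -DSM=80 ..  # assumption: 80 = A100; use 70 for V100, 75 for T4, 86 for RTX 30xx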

Then, convert the initial checkpoint (download here) to the FasterTransformer format using get_ckpt_ft.py.
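
A hypothetical invocation (the argument names and order are assumptions; the actual interface is defined in get_ckpt_ft.py):

python3 get_ckpt_ft.py <downloaded-ckpt-path> <ft-ckpt-output-dir>  # adjust paths to your checkpoint locations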

Finally, run api.py to start the server and post.py to send requests:

nohup python3 api.py > test.log 2>&1 &
python3 post.py
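
As a sketch of the request post.py sends (the endpoint path and JSON fields below are assumptions; see post.py for the actual schema), note that host port 9114 is mapped to port 5000 inside the container:

curl -X POST http://localhost:9114/ \
     -H 'Content-Type: application/json' \
     -d '{"prompt": "def quick_sort(arr):", "max_tokens": 128}'  # hypothetical fields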

Inference performance

The following figure compares the performance of pure PyTorch, Megatron, and FasterTransformer under INT8 and FP16. The fastest implementation is FasterTransformer with INT8, with an average generation time of under 15 ms per token.

License

Our code is licensed under the Apache-2.0 license.
