Doc for Zebra-Inference

This Repo is release for play with Zebra-7b-v2-lit model. This Repo is forked from Huggingface Transformers-Bloom-Inference. We Implement the zebra model as a patch to original transformers and directly utilize the mentioned repo as an interface to our model.

Resource

Citation

@article{song2023zebra,
      title={Zebra: Extending Context Window with Layerwise Grouped Local-Global Attention}, 
      author={Kaiqiang Song and Xiaoyang Wang and Sangwoo Cho and Xiaoman Pan and Dong Yu},
      year={2023}
}

Play with Zebra

Step 1: Enviorment

Build a conda enviorment throught the below command line.

conda create -n zebra-inference python=3.9
conda activate zebra-inference
conda install -c anaconda cmake -y

pip install torch==1.12.1+cu116 --extra-index-url https://download.pytorch.org/whl/cu116 \
    transformers==4.31.0 \
    deepspeed==0.7.6 \
    accelerate==0.20.3 \
    gunicorn==20.1.0 \
    flask==2.3.0 \
    werzeug==2.3.0 \
    flask_api \
    fastapi==0.89.1 \
    uvicorn==0.19.0 \
    jinja2==3.1.2 \
    pydantic==1.10.2 \
    grpcio-tools==1.50.0 \
    sentencepiece \
    --no-cache-dir

Step 2: Download the model from HF-models

Download the model with git.

git lfs install
git clone https://huggingface.co/kqsong/zebra-7b-lcat-v2-lit

Step 3: Launch the Zebra

bash launch_zebra_all.sh <path/to/model>

This will launch both frontend(port:5001) and backend(port:5000).

Step 4: Open Web Browser to play with Zebra

Please visit the localhost with port, or through your ip address and the port.

https://0.0.0.0:5051
https://<ip_address>:5051

LICENSE

The code licensed under the Apache-2.0 License
The model licensed under the Llama-2 License

Disclaimer

This repo is only for research purpose. It is not an officially supported Tencent product.

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
assets		assets
bloom-inference-scripts		bloom-inference-scripts
inference_server		inference_server
static		static
templates		templates
zebra_patch		zebra_patch
.gitignore		.gitignore
Dockerfile		Dockerfile
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
launch_zebra_all.sh		launch_zebra_all.sh
launch_zebra_backend.sh		launch_zebra_backend.sh
launch_zebra_frontend.sh		launch_zebra_frontend.sh
setup.cfg		setup.cfg
ui.py		ui.py

License

tencent-ailab/zebra-inference

Folders and files

Latest commit

History

Repository files navigation

Doc for Zebra-Inference

Resource

Citation

Play with Zebra

Step 1: Enviorment

Step 2: Download the model from HF-models

Step 3: Launch the Zebra

Step 4: Open Web Browser to play with Zebra

LICENSE

Disclaimer

About

Resources

License

Stars

Watchers

Forks

Languages