
Hugging Face VSCode Endpoint Server

StarCoder server for the huggingface-vscode custom endpoint.

Can't handle distributed inference very well yet.

Usage

See this cool gist for more details on how to use this repository to run a bigcode/starcoder code completion server, with NF4 4-bit quantization (fits into ~11GB VRAM).

pip install -r requirements.txt
python -m main --model_name_or_path bigcode/starcoder --trust_remote_code --bf16
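
The gist describes loading with NF4 4-bit quantization. As a rough, hypothetical sketch of that kind of loading (not this repository's actual code; the flags accepted by main.py may differ), StarCoder can be quantized at load time with transformers and bitsandbytes:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Sketch only: quantize bigcode/starcoder to NF4 4-bit at load time,
# which is what lets the model fit into roughly 11GB of VRAM.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained("bigcode/starcoder")
model = AutoModelForCausalLM.from_pretrained(
    "bigcode/starcoder",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)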

Enter http://localhost:8000/api/generate/ as Hugging Face Code > Model ID or Endpoint in VSCode's settings.

API

curl -X POST http://localhost:8000/api/generate/ -d '{"inputs": "", "parameters": {"max_new_tokens": 64}}'
# response = {"generated_text": ""}

Acknowledgements

Includes MIT-licensed code copied from Artidoro Pagnoni's qlora, and Apache-licensed code copied from MosaicML's mpt-7b-chat Hugging Face Space.
