
support starcoder2 #1587

Closed
rburgstaller opened this issue Feb 29, 2024 · 20 comments

@rburgstaller

https://techcrunch.com/2024/02/28/starcoder-2-is-a-code-generating-ai-that-runs-on-most-gpus

https://github.com/bigcode-project/starcoder2

@wsxiaoys wsxiaoys transferred this issue from TabbyML/registry-tabby Feb 29, 2024
@wsxiaoys wsxiaoys added the good first issue (Good for newcomers) and enhancement (New feature or request) labels Feb 29, 2024
@wsxiaoys
Member

ref: #1230

@wsxiaoys
Member

also pending on ggerganov/llama.cpp#5795

@wsxiaoys
Member

wsxiaoys commented Mar 2, 2024

added https://github.com/wsxiaoys/registry-tabby, runnable with --model wsxiaoys/StarCoder-3B for nightly build.

@aaronstevenson408

@wsxiaoys I'm having trouble getting it to run. I can run the normal models on the nightly build, but for some reason can't get StarCoder2 to work. I used your repo and my fork, and I've also tried manually downloading it.

https://huggingface.co/brittlewis12/starcoder2-3b-GGUF
https://huggingface.co/second-state/StarCoder2-3B-GGUF
are the repos I've downloaded from.
It keeps erroring with exit code 1, but that isn't much to go on.
I'm running a Docker nightly image updated 8 hours ago.
Any ideas on getting it to run? I'm excited to try it out.

@wsxiaoys
Member

wsxiaoys commented Mar 2, 2024

Hey @aaronstevenson408, could you try this Docker image: ghcr.io/tabbyml/tabby:main-ef15f97? I just realized the Docker tag for the nightly build hasn't been updated yet.

@aaronstevenson408

Seems like it worked, thank you!

@rudiservo

Is the prompt_template the same as StarCoder 1?
It seems to have trouble autocompleting in the middle of a word, only triggering on a new line.
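For context: StarCoder2 keeps the same fill-in-the-middle special tokens as StarCoder 1 (`<fim_prefix>`, `<fim_suffix>`, `<fim_middle>`), so a registry entry's prompt template would look something like the fragment below. The field names follow the registry-tabby convention; treat the exact schema as an assumption and check the registry repo for the authoritative layout.

```json
{
  "name": "StarCoder2-3B",
  "prompt_template": "<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"
}
```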

@CleyFaye

CleyFaye commented Mar 4, 2024

added https://github.com/wsxiaoys/registry-tabby, runnable with --model wsxiaoys/StarCoder-3B for nightly build.

I am not sure if this is the right place to report it, but all three sizes of StarCoder2 have the name "StarCoder2-3B" there.

@wsxiaoys
Member

wsxiaoys commented Mar 4, 2024


Thanks for reporting - fixed

@wsxiaoys wsxiaoys added the fixed-in-next-release label and removed the good first issue (Good for newcomers) label Mar 6, 2024
@wsxiaoys wsxiaoys self-assigned this Mar 6, 2024
@wsxiaoys wsxiaoys closed this as completed Mar 7, 2024
@xunfeng1980

The server exits during startup:

docker run --rm -it -p 8001:8000  --gpus='"device=7"' -v /mnt/data/tabby-data:/data  tabbyml/tabby:main-60310d4 serve --model /data/starcoder2-7b --device cuda
2024-03-11T10:28:20.357969Z  INFO tabby::services::model: crates/tabby/src/services/model/mod.rs:121: Loading model from local path /data/starcoder2-7b
2024-03-11T10:28:20.358014Z  INFO tabby::serve: crates/tabby/src/serve.rs:118: Starting server, this might take a few minutes...
ggml_init_cublas: GGML_CUDA_FORCE_MMQ:   no
ggml_init_cublas: CUDA_USE_TENSOR_CORES: yes
ggml_init_cublas: found 1 CUDA devices:
  Device 0: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes

@xunfeng1980

Figured it out - the local model needs to be in GGUF format.

@jeromepl

jeromepl commented Apr 3, 2024

Hi @wsxiaoys, is it now possible to run StarCoder v2 from tabby directly with something like:
tabby serve --device metal --model TabbyML/StarCoderV2-3B? If not, what is the best way to run StarCoder v2 today?

@rudiservo

@jeromepl You can either make your own registry-tabby repo or use one that's already available:
https://github.com/wsxiaoys/registry-tabby

tabby serve --device metal --model wsxiaoys/StarCoder2-15B
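A forked registry's models.json entry for the 15B might look roughly like this. The URL path and checksum are placeholders, and the field names are an assumption based on the registry-tabby layout; the nold/starcoder2-15b-GGUF repo is the one mentioned later in this thread.

```json
{
  "name": "StarCoder2-15B",
  "prompt_template": "<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>",
  "urls": [
    "https://huggingface.co/nold/starcoder2-15b-GGUF/resolve/main/<quantized-file>.gguf"
  ],
  "sha256": "<checksum of the file above>"
}
```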

@RiQuY

RiQuY commented Apr 8, 2024

Is this still pending to be implemented on the official model registry?

@wsxiaoys
Member

wsxiaoys commented Apr 8, 2024

Just added StarCoder2-3B / StarCoder2-7B to the official registry. Enjoy!

@rudiservo

@wsxiaoys could you add the 15B? It's actually quite usable on a 4090 or a 7900XTX.

@wsxiaoys
Member

wsxiaoys commented Apr 9, 2024

@rudiservo you could still maintain that in a forked registry. For the Tabby official registry, we prefer not to include models beyond 10B at the moment.

@rudiservo

@wsxiaoys Yes, I already have one; I was just suggesting it since more people might be interested, and in my experience the quality difference between 7B and 15B is noticeable.

It is quite usable with Tabby, but you do need more than 16GB of VRAM; currently the 7900XTX handles it quite well.

@den-run-ai

I see StarCoder2-7B and StarCoder2-15B in the Tabby leaderboard, but only 7B is available on Mac:

tabby serve --device metal --model StarCoder2-15B
thread 'main' panicked at crates/tabby-common/src/registry.rs:92:32:
Invalid model_id <TabbyML/StarCoder2-15B>
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace

@den-run-ai

It looks like a quantized StarCoder2 15B model is available on Hugging Face:

https://huggingface.co/nold/starcoder2-15b-GGUF
