
Stability AI's Stable Code 3B support #1230

Closed
clickclack777 opened this issue Jan 17, 2024 · 18 comments


clickclack777 commented Jan 17, 2024

Please describe the feature you want
Please add Stability AI's Stable Code 3B
https://huggingface.co/stabilityai/stable-code-3b

Please reply with a 👍 if you want this feature.

clickclack777 added the enhancement (New feature or request) label Jan 17, 2024
Member

wsxiaoys commented Jan 17, 2024

Since the Stability AI folks provide GGUF quantizations, it's easy to integrate by following: https://slack.tabbyml.com/Gd5zV1P69JN/how-can-i-indicate-a-custom-model-to-tabbyml

wsxiaoys added the good first issue (Good for newcomers) label Jan 17, 2024
@clickclack777
Author

Can I point to a local GGUF file like this? How about the "sha256"?

(Screenshot attached: 2024-01-17 at 23 02 17)

Member

wsxiaoys commented Jan 17, 2024

For running Tabby on a downloaded model, you can refer to https://github.com/TabbyML/tabby/blob/main/MODEL_SPEC.md directly.
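
For reference, a local model directory following the spec looks roughly like this (an illustrative sketch; the path is a placeholder, and the exact requirements are in MODEL_SPEC.md):

$ find /path/to/StableCode-3B
/path/to/StableCode-3B
/path/to/StableCode-3B/tabby.json
/path/to/StableCode-3B/ggml
/path/to/StableCode-3B/ggml/q8_0.v2.gguf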

Author

clickclack777 commented Jan 17, 2024

Can the GGUF file be located somewhere other than the Tabby model folder? I want to use the GGUF file LM Studio has already downloaded, to save disk space.

Member

wsxiaoys commented Jan 17, 2024

Yes, but you still need to organize it into the directory format specified in the model spec.

I want to use the GGUF file LM Studio has already downloaded, to save disk space.

Would it work to create a symbolic link?

Author

clickclack777 commented Jan 17, 2024

So if I:

  1. Set up a folder called "StableCode-3B" with a sub-folder "ggml" and place the symbolic link there.
  2. Copy another "tabby.json" file into the "StableCode-3B" folder and modify the model name. Should the URL link point to the symbolic file? How about the "sha256" value?
  3. Have the Model.json URL also point to the symbolic link.

it should work? (See the rough sketch below.)
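
A sketch of steps 1-3 (paths and the source filename are illustrative, and the q8_0.v2.gguf name mirrors the existing model folders):

$ mkdir -p ~/.tabby/models/TabbyML/StableCode-3B/ggml
$ cp ~/.tabby/models/TabbyML/DeepseekCoder-6.7B/tabby.json ~/.tabby/models/TabbyML/StableCode-3B/   # then edit the model name inside tabby.json
$ ln -s /path/to/lm-studio/stable-code-3b.Q8_0.gguf ~/.tabby/models/TabbyML/StableCode-3B/ggml/q8_0.v2.gguf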

@clickclack777
Author

Apparently not. It deletes the model rows pointing to the local files:

thread 'main' panicked at crates/tabby-common/src/registry.rs:87:9:
Invalid model id TabbyML/StableCode-3B/
stack backtrace:
0: 0x105b4d720 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h6d4268b2ed62fb94
1: 0x105b7115c - core::fmt::write::h5d55d44549819258
2: 0x105b49f50 - std::io::Write::write_fmt::hc515897f91abd6cf
3: 0x105b4d560 - std::sys_common::backtrace::print::h2c300c1ebedfc73c
4: 0x105b4ee4c - std::panicking::default_hook::{{closure}}::h0aa9be5c44269370
5: 0x105b4eb78 - std::panicking::default_hook::h2c0ef097934ee9e6
6: 0x105b4f394 - std::panicking::rust_panic_with_hook::h84c8637cb6e56008
7: 0x105b4f2a0 - std::panicking::begin_panic_handler::{{closure}}::h25482adda06c7b7f
8: 0x105b4dbac - std::sys_common::backtrace::__rust_end_short_backtrace::h0c6f3beb22324a29
9: 0x105b4f004 - _rust_begin_unwind
10: 0x105c5d0f4 - core::panicking::panic_fmt::h9072a0246ecafd14
11: 0x10554b800 - tabby_common::registry::parse_model_id::h36a479eafd05fc23
12: 0x104b80154 - tabby_download::download_model::{{closure}}::h9eebbf65aa472130
13: 0x104b9fc14 - tabby::services::model::download_model_if_needed::{{closure}}::hbeb5a21d1fa4180a
14: 0x104ba0128 - tabby::serve::main::{{closure}}::h7cb653b915a7cc63
15: 0x104b95f68 - tokio::runtime::runtime::Runtime::block_on::h63288b1efbfe5842
16: 0x104ca2ffc - tabby::main::ha5dc08e503a2bd27
17: 0x104b89c38 - std::sys_common::backtrace::__rust_begin_short_backtrace::h0b5db4848f3c85bc
18: 0x104d520b4 - std::rt::lang_start::{{closure}}::h42a0649d95a186a0
19: 0x105b42a54 - std::rt::lang_start_internal::hadaf077a6dd0140b
20: 0x104ca30fc - _main

@wsxiaoys
Member

Hi, could you share:

  1. the structure of the local model dir (maybe the output of find)
  2. the command you used to invoke tabby

Author

clickclack777 commented Jan 18, 2024

URL to original file
file:///Users/click/.cache/lm-studio/models/TheBloke/deepseek-coder-6.7B-instruct-GGUF/deepseek-coder-6.7b-instruct.Q8_0.gguf

URL to symbolic link
file:///Users/click/.tabby/models/TabbyML/DeepseekCoder-6.7B/ggml/deepseek-coder-6.7b-instruct.Q8_0.gguf

RUST_BACKTRACE=full tabby serve --device metal --model TabbyML/DeepseekCoder-6.7B/

Member

wsxiaoys commented Jan 18, 2024

  1. It seems your directory structure doesn't follow https://github.com/TabbyML/tabby/blob/main/MODEL_SPEC.md.
  2. When passing a local directory to --model, please use the absolute full path as the argument, e.g. the command sketched below.
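
For example (an illustrative command; substitute the absolute path to your model directory, which should contain tabby.json and the ggml/ folder, rather than pointing at the .gguf file itself):

$ tabby serve --device metal --model /Users/click/.tabby/models/TabbyML/DeepseekCoder-6.7B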

@fungiboletus

I'm not able to load the model.

I tried the q8 quantised files of both https://huggingface.co/brittlewis12/stable-code-3b-GGUF and https://huggingface.co/TheBloke/stable-code-3b-GGUF but llama.cpp is unable to load the file.

$ TABBY_DISABLE_USAGE_COLLECTION=1 tabby serve --device metal --model /Users/fungiboletus/Desktop/StableCode-3B
2024-01-18T08:07:16.405013Z  INFO tabby::services::model: crates/tabby/src/services/model.rs:80: Loading model from local path /Users/fungiboletus/Desktop/StableCode-3B
2024-01-18T08:07:16.405333Z  INFO tabby::serve: crates/tabby/src/serve.rs:111: Starting server, this might takes a few minutes...
2024-01-18T08:07:16.420777Z ERROR llama_cpp_bindings: crates/llama-cpp-bindings/src/lib.rs:62: Unable to load model: /Users/fungiboletus/Desktop/StableCode-3B/ggml/q8_0.v2.gguf

$ ls -R /Users/fungiboletus/Desktop/StableCode-3B   
ggml		tabby.json

./ggml:
q8_0.v2.gguf

I also tried to convert the GGUF model again using llama.cpp, in case the format has changed, but this wasn't helpful.

$ /Users/fungiboletus/Desktop/llama.cpp/quantize q8_0.v2.old.gguf q8_0.v2.gguf COPY

Author

clickclack777 commented Jan 18, 2024

file:///Users/click/.tabby/models/TabbyML/DeepseekCoder-6.7B/ggml/deepseek-coder-6.7b-instruct.Q8_0.gguf

  1. How should I structure it otherwise? Please explain in comprehensive steps with reference to the URLs I've provided.

  2. This doesn't work, if that is what you were referring to: "RUST_BACKTRACE=full tabby serve --device metal --model file:///Users/click/.tabby/models/TabbyML/DeepseekCoder-6.7B/ggml/deepseek-coder-6.7b-instruct.Q8_0.gguf"

thread 'main' panicked at crates/tabby-common/src/registry.rs:87:9:
Invalid model id file:///Users/click/.tabby/models/TabbyML/DeepseekCoder-6.7B/ggml/deepseek-coder-6.7b-instruct.Q8_0.gguf
stack backtrace:
0: 0x103c39720 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h6d4268b2ed62fb94
1: 0x103c5d15c - core::fmt::write::h5d55d44549819258
2: 0x103c35f50 - std::io::Write::write_fmt::hc515897f91abd6cf
3: 0x103c39560 - std::sys_common::backtrace::print::h2c300c1ebedfc73c
4: 0x103c3ae4c - std::panicking::default_hook::{{closure}}::h0aa9be5c44269370
5: 0x103c3ab78 - std::panicking::default_hook::h2c0ef097934ee9e6
6: 0x103c3b394 - std::panicking::rust_panic_with_hook::h84c8637cb6e56008
7: 0x103c3b2a0 - std::panicking::begin_panic_handler::{{closure}}::h25482adda06c7b7f
8: 0x103c39bac - std::sys_common::backtrace::__rust_end_short_backtrace::h0c6f3beb22324a29
9: 0x103c3b004 - _rust_begin_unwind
10: 0x103d490f4 - core::panicking::panic_fmt::h9072a0246ecafd14
11: 0x103637800 - tabby_common::registry::parse_model_id::h36a479eafd05fc23
12: 0x102c6c154 - tabby_download::download_model::{{closure}}::h9eebbf65aa472130
13: 0x102c8bc14 - tabby::services::model::download_model_if_needed::{{closure}}::hbeb5a21d1fa4180a
14: 0x102c8c128 - tabby::serve::main::{{closure}}::h7cb653b915a7cc63
15: 0x102c81f68 - tokio::runtime::runtime::Runtime::block_on::h63288b1efbfe5842
16: 0x102d8effc - tabby::main::ha5dc08e503a2bd27
17: 0x102c75c38 - std::sys_common::backtrace::__rust_begin_short_backtrace::h0b5db4848f3c85bc
18: 0x102e3e0b4 - std::rt::lang_start::{{closure}}::h42a0649d95a186a0
19: 0x103c2ea54 - std::rt::lang_start_internal::hadaf077a6dd0140b
20: 0x102d8f0fc - _main

@anoldguy
Contributor

I get the same as @fungiboletus with my custom registry.

tabby serve --device metal --model anoldguy/StableCode-3B
Writing to new file.
🎯 Downloaded https://huggingface.co/stabilityai/stable-code-3b/resolve/main/stable-code-3b-Q6_K.gguf to /Users/nathan/.tabby/models/anoldguy/StableCode-3B/ggml/q8_0.v2.gguf.tmp
   00:00:23 ▕████████████████▏ 2.14 GiB/2.14 GiB  93.90 MiB/s  ETA 0s.
2024-01-18T14:05:47.674093Z  INFO tabby::serve: crates/tabby/src/serve.rs:111: Starting server, this might takes a few minutes...
2024-01-18T14:05:47.674908Z  INFO tabby::services::code: crates/tabby/src/services/code.rs:53: Index is ready, enabling server...    
2024-01-18T14:05:47.772956Z ERROR llama_cpp_bindings: crates/llama-cpp-bindings/src/lib.rs:62: Unable to load model: /Users/nathan/.tabby/models/anoldguy/StableCode-3B/ggml/q8_0.v2.gguf

I'm using this as the definition, but I'm unsure about the prompt template.

{
    "name": "StableCode-3B",
    "license_name": "STABILITY AI NON-COMMERCIAL RESEARCH COMMUNITY LICENSE",
    "license_url": "https://huggingface.co/stabilityai/stable-code-3b/blob/main/LICENSE",
    "prompt_template": "<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>",
    "provider_url": "https://huggingface.co/stabilityai/stable-code-3b",
    "urls": [
      "https://huggingface.co/stabilityai/stable-code-3b/resolve/main/stable-code-3b-Q6_K.gguf"
    ],
    "sha256": "9749daf176491c33a7318660f1637c97674b0070d81740be8763b2811c495bfc"
  }
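
If helpful, the sha256 value above can be verified against the downloaded file; a quick check (shasum ships with macOS, sha256sum on Linux):

$ shasum -a 256 stable-code-3b-Q6_K.gguf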

Member

wsxiaoys commented Jan 18, 2024

Thanks for doing the experiment. It seems the reason is that StableLM support was added after our current checkpoint of llama.cpp.

This should be fixed after #434 is done.

@SachiaLanlus

Thanks for doing the experiment. It seems the reason is that StableLM support was added after our current checkpoint of llama.cpp.

This should be fixed after #434 is done.

Is there any plan to support Stable Code 3B officially in the future?
Or do we still need to use a custom model registry to do the magic?

@wsxiaoys
Member

We've bumped the llama.cpp version, and it has been released in https://github.com/TabbyML/tabby/releases/tag/nightly.

Please give it a try to see if it works with StableCode-3B.


HFrost0 commented Jan 27, 2024

I tried the nightly and it works for me with my registry.

wsxiaoys added the fixed-in-next-release label and removed the good first issue label Jan 29, 2024
wsxiaoys closed this as completed Feb 4, 2024
Member

wsxiaoys commented Feb 4, 2024

Fixed in v0.8.0
