Add support for using GGUF tokenizer #345
Conversation
Code Metrics Report

```
===============================================================================
 Language            Files        Lines         Code     Comments       Blanks
===============================================================================
 Dockerfile              1           34           25            0            9
 Happy                   1          442          369            0           73
 JSON                    5            9            9            0            0
 Python                 21          741          622           21           98
 TOML                   15          388          351            1           36
-------------------------------------------------------------------------------
 Jupyter Notebooks       1            0            0            0            0
 |- Markdown             1           60           30           22            8
 |- Python               1           96           87            1            8
 (Total)                            156          117           23           16
-------------------------------------------------------------------------------
 Markdown               15         1028            0          761          267
 |- BASH                 6          205          192            0           13
 |- Python               6          121          110            0           11
 |- Rust                 3          185          172            9            4
 (Total)                           1539          474          770          295
-------------------------------------------------------------------------------
 Rust                   84        27992        25630          365         1997
 |- Markdown            41          426            0          414           12
 (Total)                          28418        25630          779         2009
===============================================================================
 Total                 144        30634        27006         1148         2480
===============================================================================
```
What remains on this PR? I need GGUF tokenizer support, so I'm happy to contribute.
Currently, it doesn't work. In this PR I tried to convert the GGUF tokenizer to an HF tokenizer for easy integration with the rest of mistral.rs, but I ran into some problems with how the decoder/post-processor/normalizer parts of the HF tokenizer are set up. Additionally, it looks like the Mistral GGUF doesn't contain any merges, but the HF tokenizer itself does. I'm not sure if there are sensible defaults, or ways to calculate those values from the token types, that I can use. So, the current state of this PR is that it is half working. If you could perhaps take a look and see if you can get it to work, that would be amazing!
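Roughly, the conversion I have in mind looks like the sketch below (assuming candle's `gguf_file` metadata accessors and `Unigram::from` from the `tokenizers` crate; since the Mistral GGUF ships tokens and scores but no merges, treating it as a Unigram/SentencePiece vocab seems more natural than BPE). The decoder/normalizer/post-processor wiring is left out here because that is exactly the part that isn't working yet:

```rust
use candle_core::quantized::gguf_file::Content;
use tokenizers::{models::unigram::Unigram, ModelWrapper, Tokenizer};

// Sketch only: key names follow the GGUF tokenizer doc; the exact
// `Unigram::from` signature depends on the `tokenizers` version.
fn gguf_to_hf_tokenizer(content: &Content) -> anyhow::Result<Tokenizer> {
    let md = &content.metadata;

    // GGUF stores the vocab as parallel arrays of token strings and scores.
    let model = md["tokenizer.ggml.model"].to_string()?;
    anyhow::ensure!(model == "llama", "only SentencePiece-style vocabs handled here");

    let tokens: Vec<String> = md["tokenizer.ggml.tokens"]
        .to_vec()?
        .iter()
        .map(|v| v.to_string().map(|s| s.clone()))
        .collect::<Result<Vec<_>, _>>()?;
    let scores: Vec<f32> = md["tokenizer.ggml.scores"]
        .to_vec()?
        .iter()
        .map(|v| v.to_f32())
        .collect::<Result<Vec<_>, _>>()?;
    let unk_id = md["tokenizer.ggml.unknown_token_id"].to_u32()? as usize;

    // No merges in the Mistral GGUF, so build a Unigram vocab rather than BPE.
    let vocab: Vec<(String, f64)> = tokens
        .into_iter()
        .zip(scores)
        .map(|(t, s)| (t, s as f64))
        .collect();
    let unigram = Unigram::from(vocab, Some(unk_id), true)?;

    // TODO: decoder / normalizer / post-processor setup is still broken.
    Ok(Tokenizer::new(ModelWrapper::Unigram(unigram)))
}
```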
What example GGUF are you using for Mistral? I don't see any reference to Mistral in ggerganov/ggml.
I'm using this one: https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF/tree/main
Mistral uses a
@Jeadie, I made some progress! It mostly works now, and I think there is just one small bug left. With this PR you can run models fully locally, specifying paths for the chat template and GGUF file:
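For example, something along these lines (illustrative only; the flag names and paths here are placeholders and may not match exactly what is on the branch):

```bash
./mistralrs_server --chat-template ./chat_templates/mistral.json gguf \
  -m /path/to/local/gguf/dir \
  -f mistral-7b-instruct-v0.1.Q4_K_M.gguf
```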
This adds support for using a GGUF tokenizer as documented here:
https://github.com/ggerganov/ggml/blob/master/docs/gguf.md#tokenizer
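For reference, the tokenizer-related metadata keys that document defines, which are roughly what gets read from the GGUF file here, are:

```rust
/// Metadata keys from the GGUF tokenizer section (non-exhaustive; see the linked doc).
const GGUF_TOKENIZER_KEYS: &[&str] = &[
    "tokenizer.ggml.model",             // e.g. "llama" (SentencePiece) or "gpt2" (BPE)
    "tokenizer.ggml.tokens",            // the vocabulary, as an array of strings
    "tokenizer.ggml.scores",            // per-token scores (SentencePiece-style models)
    "tokenizer.ggml.token_type",        // normal / unknown / control / user-defined / byte
    "tokenizer.ggml.merges",            // BPE merges; optional (e.g. absent in the Mistral GGUF above)
    "tokenizer.ggml.bos_token_id",
    "tokenizer.ggml.eos_token_id",
    "tokenizer.ggml.unknown_token_id",
    "tokenizer.ggml.separator_token_id",
    "tokenizer.ggml.padding_token_id",
];
```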