Bug: GGUF of Llama 3 8B appears to use smaug-bpe pretokenizer? #7724
Comments
Should the pretokenizer not be llama-bpe?
It appears this happens when the model's creator uses the wrong tokenizer config. You can fix it using this script:
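The script itself is not preserved in this thread. llama.cpp ships `gguf-py/scripts/gguf-new-metadata.py` for edits of this kind; conceptually, the fix just rewrites one key/value pair, `tokenizer.ggml.pre` (a real GGUF metadata key), which the following sketch illustrates over a plain dict rather than an actual GGUF file:

```python
def fix_pretokenizer(metadata, correct="llama-bpe"):
    """Return a copy of GGUF-style metadata with the pre-tokenizer overridden.

    `metadata` stands in for the GGUF key/value section; a real fix would
    read the file, rewrite this one key, and write a new GGUF.
    """
    fixed = dict(metadata)
    fixed["tokenizer.ggml.pre"] = correct
    return fixed

# Example: a mistagged Llama 3 merge.
meta = {"general.name": "Llama-3-8B-merge", "tokenizer.ggml.pre": "smaug-bpe"}
fixed = fix_pretokenizer(meta)
```

All other metadata is carried over unchanged; only the pre-tokenizer tag is replaced.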
Confirmed that the following works with the version I pulled yesterday, and the corrected GGUF will load in current ooba. That still leaves the question of why smaug-bpe was selected for Llama 3 8B instead of llama-bpe. The decision appears to be made in convert-hf-to-gguf.py, and there is no apparent command-line option to override it.
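For context on how that decision is made: convert-hf-to-gguf.py fingerprints the model's HF tokenizer by encoding a fixed test string, hashing the resulting token IDs with SHA-256, and looking the hash up in a hard-coded table. The sketch below mimics that logic; the fingerprint entries here are hypothetical placeholders, not the real hashes from the script.

```python
import hashlib

def chkhsh(token_ids):
    """Fingerprint a token-ID sequence the way the converter does:
    sha256 over the string form of the ID list."""
    return hashlib.sha256(str(token_ids).encode()).hexdigest()

# Hypothetical fingerprint table (hash -> pre-tokenizer name).
# In the real script these hashes come from known tokenizers.
KNOWN_PRETOKENIZERS = {
    chkhsh([128000, 9906, 1917]): "llama-bpe",
    chkhsh([128000, 9906, 0]): "smaug-bpe",
}

def detect_pretokenizer(token_ids):
    """Map a tokenizer fingerprint to a pre-tokenizer name.

    A wrong tokenizer config in the source repo changes the encoding,
    so the fingerprint can match a different model's entry -- which is
    how a Llama 3 merge can end up tagged smaug-bpe.
    """
    return KNOWN_PRETOKENIZERS.get(chkhsh(token_ids), "unknown")
```

Because the lookup is keyed purely on the tokenizer's behavior, there is nothing on the conversion command line to correct; the fix has to happen in the source tokenizer config or in the GGUF metadata afterwards.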
I'll close the issue, since the problem is on the model creator's side.
Sure. I have a more focused question about convert-hf-to-gguf.py and its choice of default pre-tokenizer, but I'll explore the script further before asking it.
What happened?
Although running convert-hf-to-gguf.py and then quantize completed without errors and appears to generate GGUFs of the correct size for Llama 3 8B, the results use the smaug-bpe pre-tokenizer. They will not load in current ooba. Unsure if this is a temporary mismatch.
Example of broken quants here:
https://huggingface.co/grimjim/Llama-3-Luminurse-v0.1-OAS-8B-GGUF/tree/main
I tried two methods, but the incompatible result happened for both cases:
python llama.cpp/convert-hf-to-gguf.py ./text-generation-webui/models/xp98-8B --outfile temp.gguf --outtype f32
llama.cpp\build\bin\release\quantize temp.gguf ./text-generation-webui/models/xp98b.Q8_0.gguf q8_0
python llama.cpp/convert-hf-to-gguf.py ./text-generation-webui/models/xp98-8B --outfile temp.gguf --outtype bf16
llama.cpp\build\bin\release\quantize temp.gguf ./text-generation-webui/models/xp98b.Q8_0.gguf q8_0
Name and Version
version: 3070 (3413ae2)
built with MSVC 19.39.33521.0 for x64
What operating system are you seeing the problem on?
No response
Relevant log output
No response