
Llama 3 GGUF Tokenizer #350

Closed · sashokbg opened this issue Apr 25, 2024 · 5 comments

Comments

@sashokbg

Hello, I want to test the new Llama 3 8B model locally, but I am unable to run it in the playground since I cannot find a suitable tokenizer.

I run my server like this:

lmql serve-model llama.cpp:/home/alexander/Games2/models/Meta-Llama-3-8B-Instruct.Q5_K_M.gguf \
  --cuda \
  --port 9999 \
  --n_ctx 4096 \
  --n_gpu_layers 35

and have the following in my playground

from 
    lmql.model("llama.cpp:/home/alexander/Games2/models/Meta-Llama-3-8B-Instruct.Q5_K_M.gguf",
    endpoint="localhost:9999")

But I get the error message that there is no tokenizer available:

File "/home/alexander/projects/lmql/.venv/lib/python3.11/site-packages/lmql/runtime/tokenizer.py", line 366, in tokenizer_not_found_error
    raise TokenizerNotAvailableError("Failed to locate a suitable tokenizer implementation for '{}' (Make sure your current environment provides a tokenizer backend like 'transformers', 'tiktoken' or 'llama.cpp' for this model)".format(model_identifier))
lmql.runtime.tokenizer.TokenizerNotAvailableError: Failed to locate a suitable tokenizer implementation for 'huggyllama/llama-7b' (Make sure your current environment provides a tokenizer backend like 'transformers', 'tiktoken' or 'llama.cpp' for this model)
App finished running with exit code 1

Any tips on which tokenizer should be used?

@ChairGraveyard commented Apr 25, 2024

Passing it the Hugging Face ID of the regular (non-GGUF/quantized) repo works for getting the tokenizer. So for Meta-Llama-3-8B-Instruct you'd pass tokenizer="meta-llama/Meta-Llama-3-8B-Instruct" to your LMQL functions.

from 
    lmql.model("llama.cpp:/home/alexander/Games2/models/Meta-Llama-3-8B-Instruct.Q5_K_M.gguf",
    tokenizer="meta-llama/Meta-Llama-3-8B-Instruct",
    endpoint="localhost:9999")

@sashokbg (Author)

Hello @ChairGraveyard, I set the tokenizer as you proposed. I also had to accept Meta's license and put my Hugging Face token in ~/.cache/huggingface/token.

Thank you for your help!

@sashokbg (Author)

Hello, I am coming back to this issue to add some additional info.
You also need to install the "transformers" dependency, which allows lmql to download the tokenizer from Hugging Face:

pip install transformers
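
Once transformers is installed, a quick sanity check (my own sketch, not from lmql's docs; it assumes you have accepted Meta's license and have a valid HF token configured) is to load the tokenizer directly:

# Sketch: verify that transformers can fetch the gated tokenizer.
# Requires an accepted license on the meta-llama repo and a Hugging Face
# token with read access available to huggingface_hub.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
print(tok.encode("Hello, world!"))

If this succeeds, lmql should be able to locate the tokenizer as well.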

@EdwardSJ151

~/.cache/huggingface/token

Hi, I don't have this token folder/file. Is it a txt file or something similar? How do I add it?

@sashokbg (Author)

Just create it as a plain text file and put your HF token inside:

https://huggingface.co/docs/hub/en/security-tokens
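
For example (a sketch assuming the default cache location; the token value is a placeholder), you can create the file from the shell, or let huggingface-cli login from the huggingface_hub package write it for you:

mkdir -p ~/.cache/huggingface
echo "hf_xxxxxxxxxxxxxxxxxxxx" > ~/.cache/huggingface/token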
