Code Llama can't find tokenizer #973
Comments
Ok, thanks. So I can't use the provided containers directly (1.0.3), but have to build my own?
@silvanmelchior: https://huggingface.co/blog/codellama#transformers — please directly install the latest `transformers` as described there.
Thanks for the reply. I am not sure I fully understand: I use the provided docker container. This, for example, worked with the Llama models; I then got an endpoint (in my case on port 8080) where I could get predictions from the model. However, it does not work with Code Llama, as the container does not even start because of the missing tokenizer. So does this mean I cannot use the provided containers, but would somehow need to build my own, which has the latest release of `transformers`?
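(For reference, the endpoint described above can be queried via TGI's standard `/generate` route; a minimal sketch, assuming the container is serving on `localhost:8080` — the prompt text is purely illustrative:)

```python
import json
import urllib.request

# Build a request against TGI's /generate route (assumes the container
# from this thread is serving on localhost:8080).
payload = {
    "inputs": "def fibonacci(n):",          # illustrative prompt
    "parameters": {"max_new_tokens": 64},   # cap the generated length
}
req = urllib.request.Request(
    "http://localhost:8080/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# Uncomment to actually call a running server:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["generated_text"])
```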
@ArthurZucker for visibility (this broke old transformers too, didn't it?)
I don't think I understand the issue since:
Is this still an issue? I'm running this command on a g5 just fine: `docker run --gpus all --shm-size 1g -v /data:/data -p 8080:80 ghcr.io/huggingface/text-generation-inference:1.0.3 --model-id codellama/CodeLlama-34b-Instruct-hf --num-shard 4`
Then old TGI works, as proved by Olivier.
I am facing the same error with `ghcr.io/huggingface/text-generation-inference:1.0.3`. Below are the detailed logs:
@silvanmelchior were you able to resolve this at your end?
@sarthak405, it seems your error is related to the vocab:
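(One quick way to diagnose this class of error is to check which tokenizer files a downloaded snapshot actually contains; a sketch, with the snapshot directory as a placeholder — a missing `tokenizer.model` is the symptom resolved below:)

```python
from pathlib import Path

# Files a Llama-family checkpoint typically ships for its tokenizer.
EXPECTED = ("tokenizer.model", "tokenizer.json", "tokenizer_config.json")

def missing_tokenizer_files(snapshot_dir):
    """Return the expected tokenizer files absent from a model snapshot."""
    snapshot = Path(snapshot_dir)
    return [name for name in EXPECTED if not (snapshot / name).is_file()]
```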
Right @OlivierDehaene, it was indeed a missing `tokenizer.model` file. Thank you for the quick resolution.
System Info
TGI 1.0.3, running on Azure "STANDARD_ND40RS_V2"
Reproduction
I am running the following command:
`docker run --gpus all --shm-size 1g -p 8080:80 -e HUGGING_FACE_HUB_TOKEN=... ghcr.io/huggingface/text-generation-inference:1.0.3 --model-id "codellama/CodeLlama-34b-Instruct-hf"`
The container starts, downloads the model, tries to start it, but then fails with the following message:
`Tokenizer class "CodeLlamaTokenizer" does not exist or is not currently imported.`
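(This error arises because the tokenizer class named in the checkpoint's `tokenizer_config.json` must exist in the installed `transformers` release, and older releases have no `CodeLlamaTokenizer`. A toy illustration of that lookup — not the library's actual code:)

```python
# Toy model of AutoTokenizer-style class resolution: the class name read
# from tokenizer_config.json is looked up among the available tokenizers.
OLD_RELEASE = {"LlamaTokenizer": object}                            # pre-Code-Llama
NEW_RELEASE = {"LlamaTokenizer": object, "CodeLlamaTokenizer": object}

def resolve_tokenizer_class(name, available):
    """Look up a tokenizer class by name, mimicking the error in this issue."""
    cls = available.get(name)
    if cls is None:
        raise ValueError(
            f'Tokenizer class "{name}" does not exist or is not currently imported.'
        )
    return cls
```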
Expected behavior
The model starts up and serves requests