[QUESTION] I was able to download the LLM in llama.cpp, but I cannot download the LLM in Transformers. #1734
Python 3.10.13

Paste the error from the xinference backend, not just from the command line.

OK!
Llama-3 is a gated model that requires authentication. Refer to https://inference.readthedocs.io/en/latest/getting_started/troubleshooting.html for how to set the required environment variable.
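Following the troubleshooting guide linked above, the access token has to be exported in the shell that starts the xinference backend. A minimal sketch, assuming the variable name `HF_TOKEN` (older Hugging Face tooling also reads `HUGGING_FACE_HUB_TOKEN`); the token value below is a placeholder, not a real credential:

```shell
# Export a Hugging Face access token so gated models such as Llama-3
# can be downloaded. Replace the placeholder with your real token
# (created at huggingface.co/settings/tokens).
export HF_TOKEN="hf_your_token_here"

# Confirm the variable is visible before launching the xinference backend.
echo "HF_TOKEN is set: ${HF_TOKEN:+yes}"
```

The export must happen in the same shell (or service environment) that launches the backend process, otherwise the download will still fail with an authentication error.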
Thank you for your answer.
I want to run the 70B llama-3-instruct model on a GPU, so I am first trying the 8B model on a GPU.
I have confirmed that there is already a reported bug that prevents GPU use with the gguf model under llama.cpp.
So I wanted to use the PyTorch model with Transformers instead, but I cannot download it.
This may have already been reported, or it may be a configuration error on my part, but I would appreciate your help.
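Before retrying the Transformers download, it can help to verify that the backend process actually sees the token. A small stdlib-only sketch; the variable names are assumptions based on common Hugging Face conventions, and the placeholder value stands in for a real token:

```python
import os

# Placeholder for illustration only -- in practice the token should be
# exported in the environment before the xinference backend starts.
os.environ.setdefault("HF_TOKEN", "hf_your_token_here")

# Check both the current and the legacy variable names.
token = os.environ.get("HF_TOKEN") or os.environ.get("HUGGING_FACE_HUB_TOKEN")
print("token visible to this process:", bool(token))
```

If this prints `False` inside the environment that runs the backend, the gated-model download will fail regardless of what is set in the interactive shell.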
[settings and error output attached as screenshots]