could not load model - all backends returned error #1037
Comments
I'm having the same problem on Ubuntu 23.04. Same exact issue where the gallery download didn't fetch the model, and I'm getting all the same RPC errors as you. I disabled ufw and reloaded the container; the model loaded and is receiving requests, but not responding to anything. This happens even with the ggml-gpt4all-j model from the getting-started docs. I have tried multiple llama-2-7b-chat.ggmlv3 variants as well, all with the same result. Here are the logs from when I managed to get gpt4all-j loaded but it didn't respond to any requests, along with some of the RPC errors.
EDIT: It is something with Ubuntu or Linux. The same exact setup was followed on Windows 11 and it runs fine: same model (llama-2-7b-chat.ggmlv3.q4_K_M.bin), same GPU, and the same install steps.
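For anyone reproducing the firewall check described above, the steps were roughly the following (the container name is a placeholder, not from the original report):

```bash
# Temporarily disable the host firewall, then restart the LocalAI container.
# "local-ai" is an assumed container name; substitute your own.
sudo ufw disable
docker restart local-ai
```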
Hi, guys. Thanks for your feedback. For @aaron13100, the issue may be that the model file is incomplete; I can see that the service cannot load the model. I suggest you first run a simple example from the gallery as a test, make sure everything is fine, and then try custom models. For @Mafyuh, the log shows everything going well. Do you use a GPU on Ubuntu? If CPU only, how long did you wait for the request?

As for the content of the log: I know it may be a little confusing. There is a related issue about it here: #1076.
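A minimal gallery smoke test like the one suggested above might look like this (a sketch based on the LocalAI docs of that era; the gallery URL and model name are illustrative, so check `curl http://localhost:8080/models/available` for what your build actually offers):

```bash
# Install a small known-good model from the gallery.
curl http://localhost:8080/models/apply -H "Content-Type: application/json" -d '{
  "url": "github:go-skynet/model-gallery/gpt4all-j.yaml"
}'

# Once the download finishes, send a simple chat completion against it.
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
  "model": "ggml-gpt4all-j",
  "messages": [{"role": "user", "content": "How are you?"}],
  "temperature": 0.9
}'
```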
LocalAI version:
According to git, the last commit is from Sun Sep 3 02:38:52 2023 -0700 and says "added Linux Mint".
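That information can be pulled straight from the checkout, e.g.:

```bash
# Show the most recent commit's date and subject line.
git log -1 --format='%ad %s'
```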
Environment, CPU architecture, OS, and Version:
Linux instance-7 6.2.0-1013-gcp #13-Ubuntu SMP Tue Aug 29 23:07:20 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
I gave the VM 8 cores and 64 GB of RAM. Ubuntu 23.04.
Describe the bug
To Reproduce
I tried to specify the model from https://huggingface.co/TheBloke/Llama-2-70B-Chat-GGML/tree/main. The model does appear when querying curl http://localhost:8080/models/available, and it does start downloading that way. The download didn't complete, so I downloaded the file separately and placed it in the /models directory.
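Listing the gallery and triggering the download would have looked roughly like this (a sketch; the exact gallery entry id for the TheBloke model is an assumption, not taken from the original report):

```bash
# List every model the configured galleries know about.
curl http://localhost:8080/models/available

# Ask LocalAI to download and install one of the listed entries.
# The id below is illustrative; use an id exactly as it appears in the listing.
curl http://localhost:8080/models/apply -H "Content-Type: application/json" -d '{
  "id": "huggingface@thebloke__llama-2-70b-chat-ggml__llama-2-70b-chat.ggmlv3.q5_k_m.bin"
}'
```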
I then sent a completion request against the model but got an error instead of a response. I also tried two variations of the request, with the same outcome.
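For reference, a chat-completion request of that shape looks roughly like this (a sketch following the LocalAI getting-started docs; the prompt and temperature are illustrative, and the model name matches the file placed in /models):

```bash
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
  "model": "llama-2-70b-chat.ggmlv3.q5_K_M.bin",
  "messages": [{"role": "user", "content": "Hello, who are you?"}],
  "temperature": 0.7
}'
```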
Expected behavior
Some kind of answer from the model and a non-error message.
Logs
Client side:
The file does exist. I added symbolic links at build/models/llama-2-70b-chat.ggmlv3.q5_K_M.bin and /build/models/llama-2-70b-chat.ggmlv3.q5_K_M.bin, and the errors at the end changed a bit.
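Assuming the download landed in /models as described above, creating those links would have been along these lines (a sketch; the two link locations are from the report, while the source path assumes the file was placed in /models):

```bash
# Point both probed paths at the actually downloaded file.
ln -s /models/llama-2-70b-chat.ggmlv3.q5_K_M.bin build/models/llama-2-70b-chat.ggmlv3.q5_K_M.bin
ln -s /models/llama-2-70b-chat.ggmlv3.q5_K_M.bin /build/models/llama-2-70b-chat.ggmlv3.q5_K_M.bin
```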
Server side:
The log is quite long and I'm not sure what to include, but it looks like it goes through various ways of trying to load the model, and they all fail.
Etc
Maybe there's a different file/format I'm supposed to use?
It does load and run the example from the docs, wizardlm-13b-v1.0-superhot-8k.ggmlv3.q4_K_M.bin.
thanks