only 4 threads are used #498
Comments
good catch, thanks for filing an issue! That looks like a regression; somehow the patch got lost when upstreaming the binding. I've opened a PR upstream too: nomic-ai/gpt4all#836
Is your branch itself supposed to be runnable? I built a new version with it, but unfortunately I get an error that it cannot load the model.
thanks for the heads up, I've included a fix for this issue in #507
thank you, the changes work for me, but unfortunately the output is still extremely slow. If I run the same model on the same device with the software from gpt4all, I get a near-instant response (about 3 seconds); with LocalAI it takes 1 minute. That is a huge difference.
You shouldn't overbook threads, but rather match the number of physical cores. Here I get 120 ms per token, but my hardware isn't very capable either.
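For reference, a quick way to check the physical core count on Linux (as opposed to the logical CPU count reported by `nproc`) is the one-liner below; this is a generic sketch, not something from the thread:

```sh
# lscpu -p emits one line per logical CPU; deduplicating on the
# Core,Socket pair leaves one line per physical core
lscpu -p=Core,Socket | grep -v '^#' | sort -u | wc -l
```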
I have also tried it with 12 cores (the actual number of physical cores), but the result is the same: it takes about 1 minute until even the first word arrives.
I run it via Docker, and something seems to go wrong: it doesn't look like anything is loaded into memory. When I make a request, nothing really happens for 1 minute, and then the first tokens arrive.
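A minimal way to check whether the model is actually resident in memory is to watch the container while issuing a request (the container name `local-ai` is an assumption here):

```sh
# Resident memory should jump by roughly the model file size
# once the model has been loaded
docker stats --no-stream local-ai
```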
I get the same tensor error.
LocalAI version:
latest docker image
Environment, CPU architecture, OS, and Version:
Ryzen 9 3900X -> 12 cores / 24 threads
Windows 10 -> WSL2 (5.15.90.1-microsoft-standard-WSL2), Docker
Describe the bug
I have the model ggml-gpt4all-l13b-snoozy.bin, but only a maximum of 4 threads are used.
docker-compose.yml
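(The attached file itself is not preserved here. As an illustrative sketch only, a minimal LocalAI docker-compose setup that pins the thread count might look like this; the image tag and `THREADS` value are assumptions, not the reporter's actual file:)

```yaml
version: '3.6'
services:
  api:
    image: quay.io/go-skynet/local-ai:latest
    ports:
      - 8080:8080
    environment:
      # assumption: match the 12 physical cores of the 3900X
      - THREADS=12
      - MODELS_PATH=/models
    volumes:
      - ./models:/models
```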
model file
models/gpt-3.5-turbo.yaml
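(This attachment is also not preserved. A hedged sketch of a model definition that overrides the thread count, with all values illustrative:)

```yaml
name: gpt-3.5-turbo
parameters:
  model: ggml-gpt4all-l13b-snoozy.bin
  temperature: 0.2
# assumption: per-model override of the thread count
threads: 12
context_size: 1024
```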
template
e.g. command
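(For example, a request against LocalAI's OpenAI-compatible endpoint, assuming the default port 8080 and the model name above:)

```sh
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "How are you?"}]}'
```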
thanks