Nvidia 555 driver does not work with Ollama #4563
Comments
Interestingly, I experienced a similar phenomenon when I upgraded from driver version 535 to 550: my CPU usage remained high until I rebooted the host machine.
What's the output of
Please add
I don't believe it's related to Ollama. I had this issue too, and discovered it while setting up a container unrelated to Ollama. It's the new 555 drivers, and it affects any CUDA/GPU-related container (I tested several, including the base PyTorch CUDA Docker image). For example, trying to list CUDA capability in the PyTorch image gives: "CUDA initialization: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error?" The NVIDIA compute container benchmark said 1 device requested, 0 available. The issue was instantly fixed by reverting to the prior drivers.
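For anyone who wants to reproduce that check, here is a minimal sketch using the PyTorch CUDA image mentioned in this thread (the image tag and flags are just one example setup, not an official test):

```sh
# Query CUDA device visibility from inside a GPU-enabled container.
# On the broken 555 drivers this fails with the cudaGetDeviceCount() error.
docker run --rm --gpus all pytorch/pytorch:2.3.0-cuda12.1-cudnn8-runtime \
  python -c "import torch; print(torch.cuda.is_available(), torch.cuda.device_count())"
```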
Thank you very much @brodieferguson! That seemed to do the trick. Nevertheless, I am not quite happy about downgrading my GPU drivers to make Ollama work. For that reason I wouldn't consider this issue resolved, and I will cooperate to provide more info to solve this problem if needed. @dhiltgen I just downgraded my drivers to the immediately preceding version and Ollama started using the GPU (RTX 4070 Ti) instantly. There might be an incompatibility with the new drivers, as @brodieferguson had the same problem and probably many other users do too. Thank you very much, both. I hope I can enjoy Ollama with the latest drivers soon.
Same issue here:
Later I noticed that Ollama no longer uses my GPU: it was much slower, and looking at resources, the GPU memory was not being used. The newly available ollama ps command confirmed the same thing, while nvidia-smi clearly showed the GPU is available. I will downgrade drivers as well, but there clearly is an issue with Ollama and these drivers.
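For anyone verifying the same thing, the two checks mentioned above look like this (a quick sketch using the standard commands from this thread):

```sh
# Show which models Ollama has loaded and whether they sit on GPU or CPU
ollama ps

# Show driver version and per-process GPU memory use on the host
nvidia-smi
```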
Confirmed: after reverting to the 552.42 drivers, the GPU is used again.
@nerdpudding thanks for your contribution. We can confirm this is not an isolated issue: NVIDIA driver 555.85 causes Ollama not to use the GPU for some reason.
Can confirm, no CUDA Docker image works with 555; downgrading to 552 fixes the issue. This is unrelated to Ollama and needs to be fixed by Docker/NVIDIA.
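A common way to verify this claim independently of Ollama is NVIDIA's usual container smoke test (a sketch; the CUDA image tag is just an example):

```sh
# If the driver/toolkit integration is healthy, this prints the usual
# nvidia-smi table; on the 555 drivers it fails to see any device.
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```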
Hi folks, it seems the 555 NVIDIA driver branch is not working with Ollama (and other projects that integrate llama.cpp). We're working to resolve this; in the meantime, downgrading to a prior version will fix the issue. So sorry about this, and we will post more updates here.
Hi all, this seems to be from the new drivers not loading the NVIDIA kernel modules, requiring you to load them manually. A fix is coming with the install script. Also, adding the module to your modules config will keep it loaded.
Hi folks, it seems this is from the new driver packages not loading the nvidia_uvm kernel module; modprobe will load it manually, and then, to keep it loaded, edit the config for modules loaded at boot.
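A minimal sketch of that workaround, assuming (per the later comments in this thread) that the missing module is nvidia_uvm and that the distro uses systemd's standard modules-load.d mechanism:

```sh
# Load the NVIDIA kernel modules manually (one-time fix)
sudo modprobe nvidia
sudo modprobe nvidia_uvm

# Keep nvidia_uvm loaded across reboots (standard systemd location;
# the filename is arbitrary, adjust the path for your distro)
echo "nvidia_uvm" | sudo tee /etc/modules-load.d/nvidia_uvm.conf
```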
@jmorganca Thanks a lot for the fix! 💙
Not sure how to apply these on a WSL Docker installation. I downgraded my driver for now and it is working again.
Same issue here. The modprobe nvidia fix doesn't seem to work with WSL 2. I'm not an expert, but I tried different Docker builds using the install script with pytorch:2.3.0-cuda12.1-cudnn8-runtime as the base image. Everything works fine with older NVIDIA drivers, but not with 555.85. Even though nvidia-smi shows the GPU is available (both in WSL and in the running container), Ollama defaults to CPU. The debug logs look like this:

```
time=2024-06-03T11:59:56.076Z level=DEBUG source=gpu.go:355 msg="Unable to load nvcuda" library=/usr/lib/x86_64-linux-gnu/libcuda.so.1 error="nvcuda init failure: 500"
```

Any ideas or suggestions on what might be causing this with the latest drivers? Or should we just wait and see whether a newer NVIDIA driver release fixes it?
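In case it helps others capture the same logs, debug output can be enabled when starting the official container (a sketch based on the standard docker run invocation from the Ollama docs; flags may vary per setup):

```sh
# Run the official Ollama image with GPU access and debug logging
docker run --rm --gpus all -e OLLAMA_DEBUG=1 \
  -v ollama:/root/.ollama -p 11434:11434 ollama/ollama
```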
@nerdpudding Did you try using Ollama v0.1.41 (the latest release)?
Yes. First I tried pulling the latest official Docker image. Then this Dockerfile:

```dockerfile
# Use the official PyTorch image with CUDA support
FROM pytorch/pytorch:2.3.0-cuda12.1-cudnn8-runtime

# Install dependencies
RUN apt-get update && apt-get install -y

# Create a non-root user
RUN useradd -m -s /bin/bash ollama

# Install Ollama
RUN curl -fsSL https://ollama.com/install.sh | sh

# Ensure the volumes are correctly set up
VOLUME ["/root/.ollama"]

# Set environment variables for CUDA and debugging
ENV OLLAMA_DEBUG=1

# Ensure correct library links
RUN sudo mkdir -p /usr/local/cuda/lib64/stubs && sudo ln -s /usr/lib/x86_64-linux-gnu/libcuda.so.1 /usr/local/cuda/lib64/stubs/libcuda.so

# Run Ollama
CMD ["ollama", "serve"]
```

This works fine with the old drivers, not with the new.
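For completeness, building and running a file like that with GPU access would look something like this (a sketch; the tag name is arbitrary):

```sh
# Build the image from the Dockerfile above and run it with the GPU exposed
docker build -t ollama-pytorch .
docker run --rm --gpus all -p 11434:11434 ollama-pytorch
```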
@nerdpudding Are you using Docker Desktop? nvidia-smi worked for me too, but not the rest. NVIDIA/nvidia-container-toolkit#520
Yes, Docker Desktop with Windows 11 and WSL 2 indeed. Older drivers still work fine. I missed that the workaround with nvidia_uvm was not possible with Docker Desktop, and I was (pointlessly) trying to look for something that does work... Thanks for pointing out they are still working on a solution for Docker Desktop users; I'll just be more patient :-)
Is there a separate issue for that (this one got closed)? I would like to keep my eye on it, so I would know when to update :D
Not sure, but to my understanding it is an NVIDIA issue, so... https://www.nvidia.com/en-us/geforce/forums/game-ready-drivers/13/543951/geforce-grd-55599-feedback-thread-released-6424/ I just installed, rebooted, and tested with that newer driver using both my own Dockerfile (which uses pytorch:2.3.0-cuda12.1-cudnn8-runtime and then installs Ollama via the install.sh script) and the latest official Ollama Docker image. Both still only use the CPU. I reverted to 551.44 and the GPU was immediately used again. So apparently that 'general fix' does not apply to WSL 2 with Docker Desktop yet, but maybe only to Docker CE. I'm not sure if there is another open issue on it here, but I guess it is an NVIDIA issue, so we probably just have to watch the forum threads there and wait until they release a fix in newer drivers.
I subscribed to this one
Docker has released an update for Docker Desktop; see https://docs.docker.com/desktop/release-notes/ I just tested it, and the GPU is used again with NVIDIA drivers 555.99 after upgrading Docker Desktop to 4.31.0. This fixed it for me! So if you are using Docker on Windows with WSL 2 (now not only with Docker CE, but also with Docker Desktop), it will work again after updating.
What is the issue?
I just updated the NVIDIA drivers on my host to this version. I have an RTX 4070 Ti.
[Screenshot showing the updated driver version]
Then, when I run Ollama inside my container (the container runs Ubuntu 20.04), Ollama is not using the GPU (I can tell because GPU usage sits at 1% while it responds).
This is my log when running Ollama
Edit: in addition, my container successfully detects the GPU passthrough when running nvidia-smi.
OS: Docker
GPU: Nvidia
CPU: AMD
Ollama version: 0.1.38