
Jetson Nano Orin: libnvidia-ml.so not found, msg="no GPU detected" #3098

Closed
gab0220 opened this issue Mar 13, 2024 · 12 comments
Labels: nvidia (Issues relating to Nvidia GPUs and CUDA)

gab0220 commented Mar 13, 2024

Hello everyone! I'm using a Jetson Nano Orin to run Ollama.

  • I'm using the jetson-containers image dustynv/langchain:r35.3.1.
  • To run this container:
    docker run -it --runtime=nvidia --gpus 'all,"capabilities=graphics,compute,utility,video,display"' --net host --name ollama -e NVIDIA_VISIBLE_DEVICES=all -e DISPLAY=$DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix --privileged dustynv/langchain:r35.3.1
    To verify the availability of the GPU within the container, I ran CUDA-Samples, and the tests passed successfully (see the sketch after this list).
  • Inside the container, I installed Ollama using curl -fsSL https://ollama.com/install.sh | sh.
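
For reference, a minimal GPU sanity check inside the container might look like the following; the cuda-samples path and build steps are assumptions and vary by CUDA version and image:

# Hypothetical check: build and run deviceQuery from NVIDIA's cuda-samples repo
git clone https://github.com/NVIDIA/cuda-samples.git
cd cuda-samples/Samples/1_Utilities/deviceQuery
make
./deviceQuery   # should list the Orin's integrated GPU if the runtime is wired up correctly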

The first time I ran the install script, there was an error in the output during installation:

...
>>> Installing NVIDIA repository ...
curl: (22) The requested URL returned error: 404
...
>>> Install complete. Run "ollama" from the command line.

I resolved this error by following #2302.

When I run ollama serve, I get:

time=2024-03-13T09:14:02.769Z level=INFO source=images.go:710 msg="total blobs: 0"
time=2024-03-13T09:14:02.769Z level=INFO source=images.go:717 msg="total unused blobs removed: 0"
time=2024-03-13T09:14:02.770Z level=INFO source=routes.go:1021 msg="Listening on 127.0.0.1:11434 (version 0.1.28)"
time=2024-03-13T09:14:02.770Z level=INFO source=payload_common.go:107 msg="Extracting dynamic libraries..."
time=2024-03-13T09:14:07.602Z level=INFO source=payload_common.go:146 msg="Dynamic LLM libraries [cuda_v11 cpu]"
time=2024-03-13T09:14:07.602Z level=INFO source=gpu.go:94 msg="Detecting GPU type"
time=2024-03-13T09:14:07.602Z level=INFO source=gpu.go:265 msg="Searching for GPU management library libnvidia-ml.so"
time=2024-03-13T09:14:07.607Z level=INFO source=gpu.go:311 msg="Discovered GPU libraries: []"
time=2024-03-13T09:14:07.607Z level=INFO source=gpu.go:265 msg="Searching for GPU management library librocm_smi64.so"
time=2024-03-13T09:14:07.608Z level=INFO source=gpu.go:311 msg="Discovered GPU libraries: []"
time=2024-03-13T09:14:07.608Z level=INFO source=cpu_common.go:18 msg="CPU does not have vector extensions"
time=2024-03-13T09:14:07.608Z level=INFO source=routes.go:1044 msg="no GPU detected"

I attempted: find / -name "*libnvidia-ml*"
/usr/local/cuda-11.4/targets/aarch64-linux/lib/stubs/libnvidia-ml.so

If I copy or move this library into one of the directories where Ollama searches for it and then run ollama serve:

time=2024-03-13T09:26:04.586Z level=INFO source=images.go:710 msg="total blobs: 0"
time=2024-03-13T09:26:04.587Z level=INFO source=images.go:717 msg="total unused blobs removed: 0"
time=2024-03-13T09:26:04.587Z level=INFO source=routes.go:1021 msg="Listening on 127.0.0.1:11434 (version 0.1.28)"
time=2024-03-13T09:26:04.587Z level=INFO source=payload_common.go:107 msg="Extracting dynamic libraries..."
time=2024-03-13T09:26:09.371Z level=INFO source=payload_common.go:146 msg="Dynamic LLM libraries [cuda_v11 cpu]"
time=2024-03-13T09:26:09.371Z level=INFO source=gpu.go:94 msg="Detecting GPU type"
time=2024-03-13T09:26:09.371Z level=INFO source=gpu.go:265 msg="Searching for GPU management library libnvidia-ml.so"
time=2024-03-13T09:26:09.375Z level=INFO source=gpu.go:311 msg="Discovered GPU libraries: [/usr/lib/libnvidia-ml.so]"

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
WARNING:

You should always run with libnvidia-ml.so that is installed with your
NVIDIA Display Driver. By default it's installed in /usr/lib and /usr/lib64.
libnvidia-ml.so in GDK package is a stub library that is attached only for
build purposes (e.g. machine that you build your application doesn't have
to have Display Driver installed).
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
time=2024-03-13T09:26:09.377Z level=INFO source=gpu.go:323 msg="Unable to load CUDA management library /usr/lib/libnvidia-ml.so: nvml vram init failure: 9"
time=2024-03-13T09:26:09.377Z level=INFO source=gpu.go:265 msg="Searching for GPU management library librocm_smi64.so"
time=2024-03-13T09:26:09.378Z level=INFO source=gpu.go:311 msg="Discovered GPU libraries: []"
time=2024-03-13T09:26:09.378Z level=INFO source=cpu_common.go:18 msg="CPU does not have vector extensions"
time=2024-03-13T09:26:09.378Z level=INFO source=routes.go:1044 msg="no GPU detected"

I also attempted to use the Ollama container, but encountered the same result.

How can I resolve this issue?

BruceMacD added the nvidia (Issues relating to Nvidia GPUs and CUDA) label Mar 13, 2024
remy415 (Contributor) commented Mar 25, 2024

@gab0220 Hello, sorry I missed your issue. I've been working on a patch for this, which was just merged today in #2279. The next major release should include the changes that re-enable support for Jetsons. If you would like to run it sooner, you can pull the latest Ollama commit and compile it locally.

Note: Jetsons + GPU acceleration + containers is a bit unusual compared to other machines. I'd advise checking out dusty-nv on GitHub if you're interested in learning how to get containers working properly on Jetson devices. I'm not sure the Ollama container is built to support Jetsons, and I haven't personally worked on it yet (coming soon).

In the meantime, you should be able to pull the repo and build locally. Check the documentation for more detailed instructions, and feel free to ping me if you run into any issues.

remy415 (Contributor) commented Mar 25, 2024

I attempted: find / -name "*libnvidia-ml*"
/usr/local/cuda-11.4/targets/aarch64-linux/lib/stubs/libnvidia-ml.so

The reason libnvidia-ml.so doesn't exist on Jetson devices is that NVidia didn't include the NVML library in JetPack prior to JP 6. What you found there is a "stub" library, which exists purely so you can compile with NVML support without having the NVML library in your driver (so the binary can be used on another machine). The stub doesn't actually contain any implementation, so it won't work for the purposes of this project on devices running JP < 6.
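
As an illustration (not from the original thread), one way to see that the stub exports NVML symbols without a working implementation is to list its dynamic symbols:

# The stub exports symbols such as nvmlInit_v2, but every call fails at runtime,
# matching the "nvml vram init failure: 9" error in the logs above.
nm -D /usr/local/cuda-11.4/targets/aarch64-linux/lib/stubs/libnvidia-ml.so | grep nvmlInit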

gab0220 (Author) commented Mar 26, 2024

Hello @remy415, thank you for your response.

I've upgraded JetPack to JP 6, so libnvidia-ml.so is now present in its expected directory. Following the manual installation steps in the Linux installation guide and the accompanying tutorial for NVIDIA Jetson devices, I installed Ollama. The output was:

Mar 26 09:41:03 ubuntu systemd[1]: Started Ollama Service.
Mar 26 09:41:03 ubuntu ollama[964]: time=2024-03-26T09:41:03.379+01:00 level=INFO source=images.go:806 msg="total blobs: 5"
Mar 26 09:41:03 ubuntu ollama[964]: time=2024-03-26T09:41:03.382+01:00 level=INFO source=images.go:813 msg="total unused blobs removed: 0"
Mar 26 09:41:03 ubuntu ollama[964]: time=2024-03-26T09:41:03.382+01:00 level=INFO source=routes.go:1110 msg="Listening on 127.0.0.1:11434 (version 0.1.29)"
Mar 26 09:41:03 ubuntu ollama[964]: time=2024-03-26T09:41:03.386+01:00 level=INFO source=payload_common.go:112 msg="Extracting dynamic libraries to /tmp/ollama1468253564/runners ..."
Mar 26 09:41:08 ubuntu ollama[964]: time=2024-03-26T09:41:08.954+01:00 level=INFO source=payload_common.go:139 msg="Dynamic LLM libraries [cpu cuda_v11]"
Mar 26 09:41:08 ubuntu ollama[964]: time=2024-03-26T09:41:08.954+01:00 level=INFO source=gpu.go:77 msg="Detecting GPU type"
Mar 26 09:41:08 ubuntu ollama[964]: time=2024-03-26T09:41:08.956+01:00 level=INFO source=gpu.go:191 msg="Searching for GPU management library libnvidia-ml.so"
Mar 26 09:41:08 ubuntu ollama[964]: time=2024-03-26T09:41:08.964+01:00 level=INFO source=gpu.go:237 msg="Discovered GPU libraries: [/usr/lib/libnvidia-ml.so]"
Mar 26 09:41:08 ubuntu ollama[964]: time=2024-03-26T09:41:08.977+01:00 level=INFO source=gpu.go:82 msg="Nvidia GPU detected"
Mar 26 09:41:08 ubuntu ollama[964]: time=2024-03-26T09:41:08.977+01:00 level=INFO source=cpu_common.go:18 msg="CPU does not have vector extensions"
Mar 26 09:41:08 ubuntu ollama[964]: time=2024-03-26T09:41:08.988+01:00 level=INFO source=gpu.go:109 msg="error looking up CUDA GPU memory: device memory info lookup failure 0: 3"
Mar 26 09:41:08 ubuntu ollama[964]: time=2024-03-26T09:41:08.988+01:00 level=INFO source=routes.go:1133 msg="no GPU detected"
Mar 26 09:58:27 ubuntu systemd[1]: Stopping Ollama Service...
Mar 26 09:58:27 ubuntu systemd[1]: ollama.service: Deactivated successfully.
Mar 26 09:58:27 ubuntu systemd[1]: Stopped Ollama Service.
Mar 26 09:58:27 ubuntu systemd[1]: ollama.service: Consumed 8.728s CPU time.

I'll keep you informed.

remy415 (Contributor) commented Mar 26, 2024

Following the manual installation steps outlined in the Linux installation guide and the accompanying tutorial for NVIDIA Jetson Devices, I successfully installed Ollama.

Their current release is 0.1.29 from two weeks ago, which predates the Jetson support merge. If you want to run this now, you'll have to compile the binary yourself on the system or wait for the 0.1.30 release. Please note that JetPack already comes with the NVIDIA toolkit installed.

To do this, follow the instructions here:

  1. Ensure the following are installed:
     cmake version 3.24 or higher
     go version 1.22 or higher
     gcc version 11.4.0 or higher
  2. Ensure the LD_LIBRARY_PATH variable is set with your CUDA library paths. This is typically export LD_LIBRARY_PATH=/usr/local/cuda/lib64 but may vary if your system has been reconfigured.
  3. Ensure you have the git CLI tool installed; see the GitHub documentation.
  4. Clone the GitHub repo:
     cd <project base directory>
     git clone https://github.com/ollama/ollama.git
     cd ollama
  5. Build the binary. Note: setting export OLLAMA_SKIP_CPU_GENERATE="1" will speed up the build for you, as the ARM CPUs don't have the AVX extensions anyway.
     Note the 3 dots in go generate ./...
     go generate ./... && go build .

You should now be able to run Ollama with ./ollama serve and, in another terminal, ./ollama run <model>.
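
Putting those steps together, a condensed sketch (the paths are typical JetPack defaults and may differ on a reconfigured system):

export LD_LIBRARY_PATH=/usr/local/cuda/lib64   # typical CUDA library location on JetPack
export OLLAMA_SKIP_CPU_GENERATE="1"            # skip AVX CPU variants; Jetson ARM CPUs lack AVX
git clone https://github.com/ollama/ollama.git
cd ollama
go generate ./...                              # note the three dots
go build .
./ollama serve                                 # then, in another terminal: ./ollama run <model>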

gab0220 (Author) commented Mar 26, 2024

Hello @remy415, thank you for your support.

I've compiled and built Ollama from commit dfc6721b203fdf2a91f022f61170d26306dbae63. It works!

I also created a new Modelfile to allocate more layers on the GPU:

FROM dolphin-phi
PARAMETER num_gpu 33
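
For context (the exact commands weren't shown in the thread), a Modelfile like this is typically registered and run with ollama create; the model name below is illustrative:

ollama create dolphin-phi-gpu -f Modelfile   # "dolphin-phi-gpu" is an illustrative name
ollama run dolphin-phi-gpu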

This is the result running this model and using jtop:

(screenshots: jtop output showing GPU utilization while the model runs)

Note: I'm using the setup shown in this screenshot:
(screenshot dated 2024-03-26)

Thank you so much!

remy415 (Contributor) commented Mar 26, 2024

Just a note: you shouldn't need to adjust the num_gpu parameter, as it should be configured automatically. You should be able to just use the default dolphin model. If not all layers are offloaded, that means either the model is too big or there's an underlying issue with the application.

dhiltgen (Collaborator) commented May 2, 2024

@remy415 how are we looking on Jetsons in 0.1.33? If you set LD_LIBRARY_PATH to point to the host's CUDA lib dir, are we still having problems?

remy415 (Contributor) commented May 2, 2024

@dhiltgen last time I built it, it worked just fine. I'll give it another run now.

remy415 (Contributor) commented May 3, 2024

@dhiltgen It works great, but when I set LD_LIBRARY_PATH it doesn't seem to use the host libs:

(screenshot: serve logs showing the bundled libraries being selected)

Unless I did something wrong when compiling? I did a standard compile with no special options.
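
For reference, the kind of invocation under discussion (the path is the typical JetPack location, an assumption):

LD_LIBRARY_PATH=/usr/local/cuda/lib64 ./ollama serve   # point Ollama at the host CUDA libs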

remy415 (Contributor) commented May 3, 2024

Or did you mean from the binary? I forgot to check that. I installed the latest binary and ran it with LD_LIBRARY_PATH set; same result as above: it worked just fine and seemed to use the bundled libs.

dhiltgen (Collaborator) commented May 4, 2024

That's great news that the official binary is working without having to set LD_LIBRARY_PATH. Ultimately the goal is not to require any special settings if possible, so it sounds like we've achieved that.

I'm going to go ahead and close this ticket. Let me know if you think there are any lingering glitches we need to re-open it for.

dhiltgen closed this as completed May 4, 2024
remy415 (Contributor) commented May 4, 2024

@dhiltgen according to the logs, it did a GPU library search, found the bundled library and the libraries in the "default" directories, then selected the bundled library. I ran lsof and it does have the libraries in /usr/local/cuda/lib64 open, but it all works, so I think it's good as far as I can tell.
