Report more information about GPUs in verbose mode #2162

dhiltgen · 2024-01-23T19:43:51Z

This PR adds calls to both the CUDA and ROCm management libraries to discover more attributes of the GPU(s) detected in the system, and wires up runtime verbosity selection. When users hit problems with GPUs, we can ask them to run with `OLLAMA_DEBUG=1 ollama serve` and share the server log.

Example output on a CUDA laptop:

% OLLAMA_DEBUG=1 ./ollama-linux-amd64 serve
...
time=2024-01-23T11:31:22.828-08:00 level=INFO source=/go/src/github.com/jmorganca/ollama/gpu/gpu.go:256 msg="Discovered GPU libraries: [/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.545.23.08]"
CUDA driver version: 545.23.08
time=2024-01-23T11:31:22.859-08:00 level=INFO source=/go/src/github.com/jmorganca/ollama/gpu/gpu.go:96 msg="Nvidia GPU detected"
[0] CUDA device name: NVIDIA GeForce GTX 1650 with Max-Q Design
[0] CUDA part number:
nvmlDeviceGetSerial failed: 3
[0] CUDA vbios version: 90.17.31.00.26
[0] CUDA brand: 5
[0] CUDA totalMem 4294967296
[0] CUDA usedMem 3789357056
time=2024-01-23T11:31:22.865-08:00 level=INFO source=/go/src/github.com/jmorganca/ollama/gpu/gpu.go:137 msg="CUDA Compute Capability detected: 7.5"

Example output on a ROCm GPU system:

% OLLAMA_DEBUG=1 ./ollama-linux-amd64 serve
...
time=2024-01-23T19:24:55.162Z level=INFO source=/go/src/github.com/jmorganca/ollama/gpu/gpu.go:256 msg="Discovered GPU libraries: [/opt/rocm/lib/librocm_smi64.so.6.0.60000 /opt/rocm-6.0.0/lib/librocm_smi64.so.6.0.60000]"
time=2024-01-23T19:24:55.163Z level=INFO source=/go/src/github.com/jmorganca/ollama/gpu/gpu.go:106 msg="Radeon GPU detected"
discovered 1 ROCm GPU Devices
[0] ROCm device name: Navi 31 [Radeon RX 7900 XT/7900 XTX]
[0] ROCm GPU brand: Navi 31 [Radeon RX 7900 XT/7900 XTX]
[0] ROCm GPU vendor: Advanced Micro Devices, Inc. [AMD/ATI]
[0] ROCm GPU VRAM vendor: samsung
[0] ROCm GPU S/N: 43cfeecf3446fbf7
[0] ROCm GPU subsystem name: NITRO+ RX 7900 XTX Vapor-X
[0] ROCm GPU vbios version: 113-4E4710U-T4Y
[0] ROCm totalMem 25753026560
[0] ROCm usedMem 27852800
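When triaging a shared log, the per-device memory lines in the output above can be pulled out mechanically. The following is a rough sketch, assuming the lines keep the exact `[index] backend field value` shape shown in these examples; the type `memLine` and the function `parseMemLine` are hypothetical names, not part of Ollama.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// memLine is a hypothetical record for one memory line in the debug output.
type memLine struct {
	Index   int    // device index inside the square brackets
	Backend string // "CUDA" or "ROCm"
	Field   string // "totalMem" or "usedMem"
	Bytes   uint64 // reported value in bytes
}

// Assumption: the line format matches the sample output in this PR.
var memLineRE = regexp.MustCompile(`^\[(\d+)\] (CUDA|ROCm) (totalMem|usedMem) (\d+)$`)

// parseMemLine returns the parsed record and true when the line matches.
func parseMemLine(line string) (memLine, bool) {
	m := memLineRE.FindStringSubmatch(strings.TrimSpace(line))
	if m == nil {
		return memLine{}, false
	}
	idx, err := strconv.Atoi(m[1])
	if err != nil {
		return memLine{}, false
	}
	n, err := strconv.ParseUint(m[4], 10, 64)
	if err != nil {
		return memLine{}, false
	}
	return memLine{Index: idx, Backend: m[2], Field: m[3], Bytes: n}, true
}

func main() {
	lines := []string{
		"[0] CUDA totalMem 4294967296",
		"[0] ROCm usedMem 27852800",
		"[0] ROCm GPU VRAM vendor: samsung", // not a memory line, skipped
	}
	for _, l := range lines {
		if rec, ok := parseMemLine(l); ok {
			fmt.Printf("%s device %d %s = %d bytes\n", rec.Backend, rec.Index, rec.Field, rec.Bytes)
		}
	}
}
```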

This also implements the TODO on ROCm to handle multiple GPUs reported by the management library.
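As an illustration of what handling multiple devices can involve, the sketch below sums memory across a list of per-device records. This is an assumption about one reasonable approach, not the PR's actual implementation (which talks to the management library from C); `deviceMem` and `aggregate` are hypothetical names, and the second device in the example is invented by duplicating the values from the log above.

```go
package main

import "fmt"

// deviceMem is a hypothetical per-device memory record.
type deviceMem struct {
	Index int
	Total uint64 // bytes
	Used  uint64 // bytes
}

// aggregate sums total and free memory over all devices.
// A device reporting Used > Total contributes no free memory,
// which avoids unsigned underflow on inconsistent readings.
func aggregate(devices []deviceMem) (total, free uint64) {
	for _, d := range devices {
		total += d.Total
		if d.Used <= d.Total {
			free += d.Total - d.Used
		}
	}
	return total, free
}

func main() {
	// Values for device 0 come from the ROCm example output;
	// device 1 is an invented duplicate for illustration only.
	devices := []deviceMem{
		{Index: 0, Total: 25753026560, Used: 27852800},
		{Index: 1, Total: 25753026560, Used: 27852800},
	}
	total, free := aggregate(devices)
	fmt.Printf("devices=%d total=%d free=%d\n", len(devices), total, free)
}
```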

dhiltgen · 2024-01-23T20:27:50Z

Note: we could consider combining this with #2163 and bubbling up the GPU count and perhaps the model name.

dhiltgen mentioned this pull request Jan 24, 2024

AMD GPU & ROCm support #738

Closed

jmorganca approved these changes Jan 24, 2024

dhiltgen merged commit f63dc2d into ollama:main Jan 24, 2024
10 checks passed

dhiltgen deleted the rocm_real_gpus branch January 24, 2024 01:45

dhiltgen mentioned this pull request Jan 24, 2024

ROCm v5 crash - free(): invalid pointer #2165

Closed

Eelviny mentioned this pull request Jan 24, 2024

ROCm container CUDA error #2166

Closed
