Issue description
RELATED: #479 (feat: builtin ROCm support)
Expected Behavior
In src/bindings/Llama.ts, bindings.loadBackends(backendsPath) is called regardless of buildGpu, so that custom GPU backends compiled via NODE_LLAMA_CPP_CMAKE_OPTION_* are loaded and used at runtime. If no backend initialises, getGpuType() returns false and the existing fallback path proceeds unchanged.
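A minimal sketch of that intent in plain TypeScript (the Bindings interface here is a stand-in for the native addon, not the project's real type):

// Minimal model of the intended probe order. The point: buildGpu records how
// the binary was compiled, not which backend libraries are present on disk,
// so the probe should always run.
interface Bindings {
    loadBackends(dir: string): void;
    getGpuType(): string | false | null;
}

function probeBackends(bindings: Bindings, backendsPath: string, fallbackDir: string): string | false | null {
    bindings.loadBackends(backendsPath); // probe the binary's own directory first
    const gpu = bindings.getGpuType();
    if (gpu == null || gpu === false)
        bindings.loadBackends(fallbackDir); // then the fallback directory, unchanged
    return bindings.getGpuType(); // false only if nothing initialised
}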
Actual Behavior
In src/bindings/Llama.ts, the call to loadBackends(backendsPath) is guarded by
buildGpu !== false. When a binary is built with --gpu false, buildGpu is false
and the guard makes the entire block a no-op. Any backend .so placed in the binary's
Release/ directory by a custom cmake build (e.g. libggml-hip.so via GGML_HIP=ON)
is never loaded. Inference silently falls back to CPU with no warning or error.
// src/bindings/Llama.ts — v3.18.1 (affected code)
let loadedGpu = bindings.getGpuType();
if (loadedGpu == null || (loadedGpu === false && buildGpu !== false)) {
    const backendsPath = path.dirname(bindingPath);
    const fallbackBackendsDir = path.join(extBackendsPath ?? backendsPath, "fallback");
    bindings.loadBackends(backendsPath); // ← never reached when buildGpu === false
    loadedGpu = bindings.getGpuType();
    if (loadedGpu == null || (loadedGpu === false && buildGpu !== false))
        bindings.loadBackends(fallbackBackendsDir);
}
loadBackends(backendsPath) is never called when the binary was built with --gpu false.
llama.gpu is false even when a valid GPU backend (e.g. libggml-hip.so) was
compiled into the binary directory. Inference runs on CPU.
Steps to reproduce
# 1. Set cmake options to compile a custom GPU backend
export NODE_LLAMA_CPP_CMAKE_OPTION_GGML_HIP=ON
export NODE_LLAMA_CPP_CMAKE_OPTION_AMDGPU_TARGETS=gfx1200
# 2. Build with --gpu false
node node-llama-cpp/dist/cli/cli.js source download --gpu false --noUsageExample
# 3. Confirm the backend .so was compiled
find ~/.cache/node-llama-cpp -name "libggml-hip.so"
# → file exists in Release/
# 4. Check llama.gpu at runtime
node -e "
const { getLlama } = require('node-llama-cpp');
getLlama({ gpu: false }).then(l => console.log('gpu:', l.gpu));
"
# Expected: gpu: cuda (ROCm maps its device names to "cuda" internally)
# Actual: gpu: false (libggml-hip.so was never loaded)
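Steps 3 and 4 can also be wrapped in a small standalone script (a sketch; it uses only the getLlama({gpu: false}) call and llama.gpu property from step 4):

// check-gpu.mjs: run after the build from step 2
import {getLlama} from "node-llama-cpp";

// gpu: false selects the binary that was built with --gpu false
const llama = await getLlama({gpu: false});

// Before the fix this prints "gpu: false" even though libggml-hip.so sits
// next to the binding; after the fix it should report the loaded backend
console.log("gpu:", llama.gpu);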
My Environment
| Dependency | Version |
| --- | --- |
| node-llama-cpp | 3.18.1 |
| llama.cpp release | b8390 |
| Node.js | 22.22.2 |
| OS | Ubuntu 24.04.4 LTS (Docker, rocm/dev-ubuntu-24.04:latest) |
| GPU | AMD RX 9060 XT, gfx1200 (RDNA 4) |
| ROCm | 7.2.2 |
Additional Context
The buildGpu !== false guard is redundant: loadBackends(backendsPath) already has
no effect if no backend is found — getGpuType() simply returns false again and the
fallback path proceeds. The guard only prevents the probe from being attempted.
Proposed fix — remove buildGpu !== false from both checks:
let loadedGpu = bindings.getGpuType();
if (loadedGpu == null || loadedGpu === false) {
    const backendsPath = path.dirname(bindingPath);
    const fallbackBackendsDir = path.join(extBackendsPath ?? backendsPath, "fallback");
    bindings.loadBackends(backendsPath);
    loadedGpu = bindings.getGpuType();
    if (loadedGpu == null || loadedGpu === false)
        bindings.loadBackends(fallbackBackendsDir);
}
This fix is a prerequisite for any --gpu false + cmake workaround for ROCm/HIP while
native support is pending (#479). It also affects any other custom GPU backend injected
via NODE_LLAMA_CPP_CMAKE_OPTION_* on non-NVIDIA/non-Apple hardware.
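Once fixed, the workaround reduces to the repro steps plus ordinary usage. A sketch, assuming loadModel's gpuLayers option behaves for this backend the way it does for the built-in GPU builds (the model path is a placeholder):

// rocm-workaround.mjs: after building with GGML_HIP=ON and --gpu false (steps 1 and 2)
import {getLlama} from "node-llama-cpp";

const llama = await getLlama({gpu: false}); // selects the custom-built binary
console.log("gpu:", llama.gpu); // with the fix: the HIP backend instead of false

// Layer offloading should now reach the GPU instead of silently running on CPU
const model = await llama.loadModel({
    modelPath: "/path/to/model.gguf", // placeholder: any local GGUF model
    gpuLayers: "max" // assumption: offload as many layers as the backend allows
});
console.log("model loaded on:", llama.gpu);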
Relevant Features Used
Are you willing to resolve this issue by submitting a Pull Request?
No, I don’t have the time and I’m okay to wait for the community / maintainers to resolve this issue.