bug: detectVulkanSupport false positive on Linux when Vulkan loader is present but no ICD is installed — triggers unnecessary cmake rebuild LABELS: bug #600

@Zighy

Issue description

RELATED: #554

Expected Behavior

On Linux, detectVulkanSupport (in src/bindings/utils/detectAvailableComputeLayers.ts)
checks only whether libvulkan.so or libvulkan.so.1 is present on the library path.
This file is the Vulkan loader (libvulkan1 package) — it is a transitive dependency
of Mesa, ROCm, AMDGPU-PRO, and many desktop packages. Its presence does not mean a
Vulkan ICD (GPU driver manifest) is installed and working.

detectVulkanSupport should return false on systems where the Vulkan loader is
installed but no ICD is configured. The gpu:false (or correct GPU) binary should
then be selected directly, with no cmake rebuild triggered.

Actual Behavior

When the function returns true on a system with the loader but no ICD:

  1. node-llama-cpp selects the linux-x64-vulkan-* prebuilt binary.
  2. Vulkan initialisation calls the loader, which finds no ICD in /etc/vulkan/icd.d/ or
    /usr/share/vulkan/icd.d/ and reports zero devices.
  3. node-llama-cpp treats this as "no compatible binary" and schedules a cmake rebuild
    (5–30 min, or longer on slow machines).
  4. If Vulkan SDK headers are absent, the rebuild fails too — adding another wasted pass
    before finally falling back to the gpu:false binary that was available all along.

detectVulkanSupport returns true because libvulkan.so.1 is present. The Vulkan
binary is selected, fails at runtime (no ICD), and a cmake rebuild is triggered on
every cold start.

Steps to reproduce

On any system where libvulkan1 is installed but no Vulkan ICD is present (e.g. a
ROCm Docker container, or a headless server with Mesa-dev packages):

# Confirm loader present, ICD absent
dpkg -l libvulkan1            # → installed
ls /etc/vulkan/icd.d/         # → No such file or directory
ls /usr/share/vulkan/icd.d/   # → No such file or directory

# Start any node-llama-cpp workload (e.g. getLlama())
# → observe "Building llama.cpp from source" in logs despite a valid binary existing
# → cmake build takes minutes and may fail

Confirmed on rocm/dev-ubuntu-24.04:latest (Ubuntu 24.04.4, libvulkan1
1.3.275.0-1build1 installed as ROCm dependency, no Vulkan ICD):

$ find /usr/lib -name "libvulkan.so.1"
/usr/lib/x86_64-linux-gnu/libvulkan.so.1   ← loader present

$ ls /etc/vulkan/icd.d/ 2>/dev/null || echo "no ICD"
no ICD                                      ← no driver
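
The manual check above can also be scripted. This is a minimal sketch that assumes only the two ICD directories named in this issue; `findIcdManifests` is a hypothetical diagnostic helper, not part of node-llama-cpp:

```typescript
import {readdir} from "node:fs/promises";
import * as path from "node:path";

// Hypothetical diagnostic helper (not part of node-llama-cpp).
// Returns the full paths of all *.json files found in the given ICD directories.
async function findIcdManifests(
    icdDirs: string[] = ["/etc/vulkan/icd.d", "/usr/share/vulkan/icd.d"]
): Promise<string[]> {
    const manifests: string[] = [];
    for (const dir of icdDirs) {
        try {
            for (const file of await readdir(dir)) {
                if (file.endsWith(".json")) manifests.push(path.posix.join(dir, file));
            }
        } catch {
            // Directory missing or unreadable: treat it as containing no manifests.
        }
    }
    return manifests;
}
```

On a container like the one described here, where both directories are absent, `await findIcdManifests()` should resolve to an empty array.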

My Environment

node-llama-cpp: 3.18.1
Node.js: 22.22.2
OS: Ubuntu 24.04.4 LTS (Docker, rocm/dev-ubuntu-24.04:latest)
libvulkan1: 1.3.275.0-1build1 (installed as ROCm transitive dep)
Vulkan ICD: none
GPU: AMD RX 9060 XT, gfx1200 (RDNA 4)
ROCm: 7.2.2

Additional Context

Proposed fix: require at least one ICD JSON manifest in addition to the loader.
A working Vulkan installation needs both, so systems that have only the loader would
return false and skip the cmake detour entirely.

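// src/bindings/utils/detectAvailableComputeLayers.ts (Linux branch)
// Assumes `fs` (promise-based, e.g. fs-extra) and `path` are already imported in this file.
else if (platform === "linux") {
    const loaderExists = await asyncSome([
        hasFileInPath("libvulkan.so", librarySearchPaths),
        hasFileInPath("libvulkan.so.1", librarySearchPaths)
    ]);
    if (!loaderExists) return false;

    // Also require at least one ICD manifest. Without an ICD the loader
    // finds no device, the binary fails, and a cmake rebuild is triggered unnecessarily.
    // VK_DRIVER_FILES (and its deprecated predecessor VK_ICD_FILENAMES) hold a
    // path-delimited list of manifest files and/or directories, so split them first.
    const envEntries = [process.env.VK_DRIVER_FILES, process.env.VK_ICD_FILENAMES]
        .flatMap((value) => (value ?? "").split(path.delimiter))
        .filter((entry) => entry !== "");
    const icdLocations = ["/etc/vulkan/icd.d", "/usr/share/vulkan/icd.d", ...envEntries];

    for (const location of icdLocations) {
        try {
            const stat = await fs.stat(location);
            if (stat.isFile()) {
                // An override entry may point directly at a manifest file.
                if (location.endsWith(".json")) return true;
                continue;
            }
            const files = await fs.readdir(location);
            if (files.some((f) => f.endsWith(".json"))) return true;
        } catch { /* location absent or unreadable: skip */ }
    }
    return false;
}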
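
A stricter variant of the check above could also confirm that a manifest is parseable and names a driver library, since a stray or empty .json file would otherwise count as an ICD. This is a minimal sketch, assuming the usual manifest layout with an `ICD.library_path` string. `isUsableIcdManifest` is a hypothetical name, and the sketch does not verify that the library can actually be loaded:

```typescript
import {readFile} from "node:fs/promises";

// Hypothetical helper (not part of node-llama-cpp).
// Returns true when the file is valid JSON containing a non-empty
// ICD.library_path string; false for anything else.
async function isUsableIcdManifest(manifestPath: string): Promise<boolean> {
    try {
        const parsed: unknown = JSON.parse(await readFile(manifestPath, "utf8"));
        if (typeof parsed !== "object" || parsed === null) return false;

        const icd = (parsed as {ICD?: unknown}).ICD;
        if (typeof icd !== "object" || icd === null) return false;

        const libraryPath = (icd as {library_path?: unknown}).library_path;
        return typeof libraryPath === "string" && libraryPath.length > 0;
    } catch {
        // Missing file, unreadable file, or invalid JSON.
        return false;
    }
}
```

Whether the extra file reads are worth it during detection is a judgment call. The plain filename check is cheaper and already covers the loader-only case reported here.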

ICD locations reference:
Vulkan Loader Interface Architecture
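
The two hard-coded directories in the proposed fix are only part of what the loader searches by default on Linux. Below is a sketch of a fuller candidate list, assuming the XDG-based locations described in the loader documentation. The exact set and order are an assumption and should be checked against that document; `defaultIcdDirs` is a hypothetical helper:

```typescript
import * as path from "node:path";

// Hypothetical helper (not part of node-llama-cpp).
// Builds candidate ICD manifest directories from XDG variables and their
// conventional defaults. The list and its order are assumptions.
function defaultIcdDirs(env: Record<string, string | undefined>, homeDir: string): string[] {
    // Use the variable when set and non-empty, else the fallback; both may be ":"-separated.
    const split = (value: string | undefined, fallback: string): string[] =>
        (value != null && value.length > 0 ? value : fallback)
            .split(":")
            .filter((entry) => entry !== "");

    const roots = [
        ...split(env.XDG_CONFIG_HOME, path.posix.join(homeDir, ".config")),
        ...split(env.XDG_CONFIG_DIRS, "/etc/xdg"),
        "/etc",
        ...split(env.XDG_DATA_HOME, path.posix.join(homeDir, ".local/share")),
        ...split(env.XDG_DATA_DIRS, "/usr/local/share:/usr/share")
    ];

    // Append the Vulkan subdirectory and drop duplicates, preserving order.
    return Array.from(new Set(roots.map((root) => path.posix.join(root, "vulkan", "icd.d"))));
}
```

A call such as `defaultIcdDirs(process.env, os.homedir())` (with `os` imported from "node:os") could replace the fixed two-entry list, so that drivers installed under /usr/local or a user's home directory are not reported as missing.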

Relevant Features Used

  • Metal support
  • CUDA support
  • Vulkan support
  • Grammar
  • Function calling

Are you willing to resolve this issue by submitting a Pull Request?

No, I don’t have the time and I’m okay to wait for the community / maintainers to resolve this issue.

Metadata

Labels: bug (Something isn't working), requires triage (Requires triaging)
