Description
AMD ROCm support on WSL requires a stub library that redirects ROCm clients to use the special WSL mechanisms rather than the amdgpu
kernel driver. This is provided as a package called hsa-runtime-rocr4wsl-amdgpu
. NixOS-WSL should provide this package to allow more complete integration with AMD GPUs.
Problem
Without this library, ROCm clients like rocminfo
will fail to start:
$ rocminfo
WSL environment detected.
hsa api call failure at: /build/source/rocminfo.cc:1282
Call returned HSA_STATUS_ERROR_OUT_OF_RESOURCES: The runtime failed to allocate the necessary resources. This error may also occur when the core runtime library needs to spawn threads or create internal OS-specific events.
Solution
By getting the library from another distribution's package, patching it with patchelf
, and loading it through LD_LIBRARY_PATH
alongside /run/opengl-driver/lib
:
environment.sessionVariables =
let
hsa-runtime-rocr4wsl-amdgpu = pkgs.stdenv.mkDerivation rec {
name = "hsa-runtime-rocr4wsl-amdgpu";
version = "25.10-2149029.24.04";
src = pkgs.fetchzip {
url = "https://repo.radeon.com/amdgpu/6.4/ubuntu/pool/main/h/{$name}/${name}_${version}_amd64.deb";
hash = "sha256-DNvxgXJ8ln66cGC5DxMGnDw3gFz50srPalxpo9oLPKg=";
nativeBuildInputs = [ pkgs.dpkg ];
};
installPhase = ''
runHook preInstall
mkdir -p $out/lib
# TODO I'm guessing I should care about the specific ROCm version here...
mv opt/rocm-*/lib/* $out/lib
runHook postInstall
'';
buildInputs = [
pkgs.stdenv.cc.cc.lib
pkgs.elfutils
];
nativeBuildInputs = [ pkgs.autoPatchelfHook ];
# this needs to get dynamically loaded in with another LD_LIBRARY_PATH
autoPatchelfIgnoreMissingDeps = [ "libdxcore.so" ];
};
in
{
LD_LIBRARY_PATH = "/run/opengl-driver/lib:${hsa-runtime-rocr4wsl-amdgpu}/lib";
};
ROCm is now able to enumerate successfully:
$ rocminfo
WSL environment detected.
=====================
HSA System Attributes
=====================
Runtime Version: 1.1
Runtime Ext Version: 1.6
System Timestamp Freq.: 1000.000000MHz
Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model: LARGE
System Endianness: LITTLE
Mwaitx: DISABLED
DMAbuf Support: YES
==========
HSA Agents
==========
*******
Agent 1
*******
Name: AMD Ryzen 7 5800X 8-Core Processor
Uuid: CPU-XX
Marketing Name: AMD Ryzen 7 5800X 8-Core Processor
Vendor Name: CPU
Feature: None specified
Profile: FULL_PROFILE
Float Round Mode: NEAR
Max Queue Number: 0(0x0)
Queue Min Size: 0(0x0)
Queue Max Size: 0(0x0)
Queue Type: MULTI
Node: 0
Device Type: CPU
Cache Info:
L1: 32768(0x8000) KB
Chip ID: 0(0x0)
Cacheline Size: 64(0x40)
Internal Node ID: 0
Compute Unit: 16
SIMDs per CU: 0
Shader Engines: 0
Shader Arrs. per Eng.: 0
Memory Properties:
Features: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: FINE GRAINED
Size: 32821572(0x1f4d144) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 2
Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED
Size: 32821572(0x1f4d144) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 3
Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED
Size: 32821572(0x1f4d144) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 4
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 32821572(0x1f4d144) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
ISA Info:
*******
Agent 2
*******
Name: gfx1100
Marketing Name: AMD Radeon RX 7900 XTX
Vendor Name: AMD
Feature: KERNEL_DISPATCH
Profile: BASE_PROFILE
Float Round Mode: NEAR
Max Queue Number: 128(0x80)
Queue Min Size: 64(0x40)
Queue Max Size: 131072(0x20000)
Queue Type: MULTI
Node: 1
Device Type: GPU
Cache Info:
L1: 32(0x20) KB
L2: 6144(0x1800) KB
L3: 98304(0x18000) KB
Chip ID: 29772(0x744c)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 2371
Internal Node ID: 1
Compute Unit: 96
SIMDs per CU: 2
Shader Engines: 6
Shader Arrs. per Eng.: 2
Coherent Host Access: FALSE
Memory Properties:
Features: KERNEL_DISPATCH
Fast F16 Operation: TRUE
Wavefront Size: 32(0x20)
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Max Waves Per CU: 32(0x20)
Max Work-item Per CU: 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
Max fbarriers/Workgrp: 32
Packet Processor uCode:: 372
SDMA engine uCode:: 24
IOMMU Support:: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 25078440(0x17eaaa8) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:2048KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 2
Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED
Size: 25078440(0x17eaaa8) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:2048KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 3
Segment: GROUP
Size: 64(0x40) KB
Allocatable: FALSE
Alloc Granule: 0KB
Alloc Recommended Granule:0KB
Alloc Alignment: 0KB
Accessible by all: FALSE
ISA Info:
ISA 1
Name: amdgcn-amd-amdhsa--gfx1100
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
FBarrier Max Size: 32
ISA 2
Name: amdgcn-amd-amdhsa--gfx11-generic
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
FBarrier Max Size: 32
*** Done ***
Note that this relies on a recent-ish version of ROCm (6.1.3+, I think), so this is the oldest nixpkgs commit that will work: NixOS/nixpkgs@4d2c7ad
I have not yet tested this further than getting successful device enumeration.