
SYCL USM program hangs on exit in vmbus_teardown_gpadl on WSL2 with Lunar Lake Arc 140V #21991

@linsalrob

Describe the bug

A minimal SYCL program using Intel GPU offload under WSL2 hangs during cleanup/exit on an Intel Lunar Lake integrated GPU.

The system can see the GPU via sycl-ls, and a simple SYCL kernel runs correctly. However, the process does not terminate cleanly. Depending on the backend selected:

  • With ONEAPI_DEVICE_SELECTOR=opencl:gpu, the program prints "normal end", but the process remains stuck in uninterruptible sleep with WCHAN=vmbus_teardown_gpadl.
  • With ONEAPI_DEVICE_SELECTOR=level_zero:gpu, the program prints the device name and kernel result but never reaches "normal end"; it appears to hang in or immediately around sycl::free() during cleanup.

The same failure mode also affects downstream users such as PyTorch XPU: a minimal PyTorch XPU allocation succeeds, but the Python process hangs on exit in vmbus_teardown_gpadl. The SYCL reproducer in the next section shows that the problem sits beneath PyTorch rather than in PyTorch itself.

To reproduce

Minimal reproducer:

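#include <sycl/sycl.hpp>
#include <iostream>

int main() {
    // Select the GPU; the backend is chosen at run time via ONEAPI_DEVICE_SELECTOR.
    sycl::queue q{sycl::gpu_selector_v};

    std::cout << "device: "
              << q.get_device().get_info<sycl::info::device::name>()
              << std::endl;

    // One shared USM integer, accessible from both host and device.
    int *x = sycl::malloc_shared<int>(1, q);
    x[0] = 0;

    // Trivial kernel: write 1 from the device, then wait for completion.
    q.submit([&](sycl::handler& h) {
        h.single_task([=]() { x[0] = 1; });
    }).wait();

    std::cout << "x: " << x[0] << std::endl;

    // With level_zero:gpu, the last output appears before this call.
    sycl::free(x, q);

    // std::endl flushes, so seeing this line means sycl::free returned.
    std::cout << "normal end" << std::endl;
    return 0;
}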

Save as sycl_min.cpp, then compile with

source /opt/intel/oneapi/setvars.sh
icpx -fsycl sycl_min.cpp -o sycl_min

Launch with OpenCL GPU backend

ONEAPI_DEVICE_SELECTOR=opencl:gpu ./sycl_min

Observed output:

device: Intel(R) Graphics [0x64a0]
x: 1
normal end

However, the process does not exit. In another terminal:

ps -eo pid,ppid,stat,wchan:40,cmd | grep -E 'sycl_min|PID'

shows:

PID    PPID STAT WCHAN                                    CMD
700     292 Dl+  vmbus_teardown_gpadl                     ./sycl_min
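
The same check can be scripted, which may help when collecting reports from several machines. The following is a minimal sketch, assuming a Linux guest whose /proc/<pid>/stat and /proc/<pid>/wchan files are readable by the current user; the function names parse_stat_state and is_stuck_in are hypothetical helpers introduced here, not part of any oneAPI tool.

```python
from pathlib import Path


def parse_stat_state(stat_line: str) -> str:
    """Return the one-letter process state from a /proc/<pid>/stat line.

    The command name (field 2) is wrapped in parentheses and may itself
    contain spaces or ')', so split only after the last ')'.
    """
    after_comm = stat_line[stat_line.rfind(")") + 1:]
    return after_comm.split()[0]


def is_stuck_in(pid: int, symbol: str = "vmbus_teardown_gpadl") -> bool:
    """True if `pid` is in uninterruptible sleep (state D) waiting in `symbol`."""
    proc_dir = Path("/proc") / str(pid)
    try:
        state = parse_stat_state((proc_dir / "stat").read_text())
        wchan = (proc_dir / "wchan").read_text().strip()
    except OSError:
        return False  # process is gone, or the /proc entry is unreadable
    return state == "D" and wchan == symbol
```

For the process shown in the ps output, is_stuck_in(700) would be expected to return True while the hang persists.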

Launch with Level Zero GPU backend

ONEAPI_DEVICE_SELECTOR=level_zero:gpu ./sycl_min

Observed output:

device: Intel(R) Graphics [0x64a0]
x: 1

The program never prints "normal end", so it appears to hang before or during cleanup, likely around sycl::free(x, q) or related runtime cleanup.

What was expected

The program should print:

device: Intel(R) Graphics [0x64a0]
x: 1
normal end

and then exit normally with status 0.

What is wrong

The SYCL kernel runs correctly, but the process does not terminate cleanly. It becomes stuck in uninterruptible sleep, with WCHAN=vmbus_teardown_gpadl, and cannot be killed with kill -9. Recovery requires:

wsl --shutdown

from Windows PowerShell.
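
Because the stuck process cannot be killed, a test harness that kills the child and then waits for it will block as well; as far as I can tell, this includes Python's subprocess.run(..., timeout=...), which kills and then waits on timeout. The sketch below avoids that by waiting with a timeout and otherwise leaving the child alone. The name run_with_watchdog and the returned dictionary layout are assumptions made for illustration only.

```python
import subprocess
import sys
import tempfile


def run_with_watchdog(cmd, timeout_s=10.0, env=None):
    """Run `cmd`, wait at most `timeout_s` seconds, and never block on a stuck child."""
    with tempfile.TemporaryFile(mode="w+") as log:
        # Output goes to a file rather than a pipe, so it can be read
        # while the child is still alive.
        proc = subprocess.Popen(cmd, stdout=log, stderr=subprocess.STDOUT, env=env)
        try:
            returncode = proc.wait(timeout=timeout_s)
        except subprocess.TimeoutExpired:
            returncode = None  # still running; deliberately no kill-and-wait here
        log.seek(0)
        output = log.read()
    return {
        "proc": proc,
        "returncode": returncode,
        "reached_normal_end": "normal end" in output,
        "output": output,
    }


if __name__ == "__main__":
    # Stand-in child; replace with ["./sycl_min"] on the affected system.
    demo = run_with_watchdog([sys.executable, "-c", "print('normal end')"])
    print(demo["returncode"], demo["reached_normal_end"])
```

On the affected system, the backend can be selected by passing env={**os.environ, "ONEAPI_DEVICE_SELECTOR": "level_zero:gpu"} (after importing os). A returncode of None together with reached_normal_end set to True would correspond to the OpenCL behaviour described above, and None with False to the Level Zero behaviour.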

Environment

  • OS
Host: Windows, WSL2
Guest: Ubuntu 24.04 under WSL2
  • Target device and vendor
Intel GPU
Intel Core Ultra 7 268V
Intel Arc 140V GPU
Lunar Lake
PCI device ID: 8086:64A0
  • Windows PowerShell:
Get-CimInstance Win32_Processor | Select-Object Name
Get-CimInstance Win32_VideoController | Select-Object Name,PNPDeviceID,DriverVersion,DriverDate

Output:

Name
----
Intel(R) Core(TM) Ultra 7 268V

Name                             PNPDeviceID                                                  DriverVersion DriverDate
----                             -----------                                                  ------------- ----------
Intel(R) Arc(TM) 140V GPU (16GB) PCI\VEN_8086&DEV_64A0&SUBSYS_0CE31028&REV_04\3&11583659&0&10 32.0.101.8737 29/04/2026 9:30:00 AM
  • DPC++ version
source /opt/intel/oneapi/setvars.sh
icpx --version

On this system:

Intel(R) oneAPI DPC++/C++ Compiler 2026.0.0 (2026.0.0.20260331)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/intel/oneapi/compiler/2026.0/bin/compiler
Configuration file: /opt/intel/oneapi/compiler/2026.0/bin/compiler/../icpx.cfg
  • Dependencies / device discovery
source /opt/intel/oneapi/setvars.sh
which icpx
which sycl-ls
sycl-ls

Output:

/opt/intel/oneapi/compiler/2026.0/bin/icpx
/opt/intel/oneapi/compiler/2026.0/bin/sycl-ls
[level_zero:gpu][level_zero:0] Intel(R) oneAPI Unified Runtime over Level-Zero V2, Intel(R) Graphics [0x64a0] 20.4.4 [1.15.37833+4]
[opencl:cpu][opencl:0] Intel(R) OpenCL, Intel(R) Core(TM) Ultra 7 268V OpenCL 3.0 (Build 0) [2026.21.3.0.31_160000]
[opencl:gpu][opencl:1] Intel(R) OpenCL Graphics, Intel(R) Graphics [0x64a0] OpenCL 3.0 NEO  [26.14.37833.4]
  • Linux GPU runtime packages:
apt-cache policy intel-opencl-icd libze1 libze-intel-gpu1 xpu-smi libxpum1 intel-gsc intel-metrics-discovery

Relevant installed versions:

intel-opencl-icd:          26.09.37435.12-1~24.04~ppa1
libze1:                    1.28.0-1~24.04~ppa1
libze-intel-gpu1:          26.09.37435.12-1~24.04~ppa1
xpu-smi:                   1.3.6-1~24.04~ppa1
libxpum1:                  1.3.6-1~24.04~ppa1
intel-gsc:                 0.9.5-1~24.04~ppa2
intel-metrics-discovery:   1.14.183-1~24.04~ppa1

Additional context

This issue was first observed while trying to use PyTorch XPU and a downstream bioinformatics tool that uses a Transformer model on XPU. The same hang occurs with a minimal PyTorch reproducer, but the SYCL example above shows that the problem is not specific to PyTorch.

Minimal PyTorch reproducer:

import gc

import torch

print("torch", torch.__version__, flush=True)
print("xpu available", torch.xpu.is_available(), flush=True)
print("device", torch.xpu.get_device_name(0), flush=True)

# Small allocation on the XPU, read back to the host.
a = torch.ones((10, 10), device="xpu")
torch.xpu.synchronize()
print(a[0, 0].cpu(), flush=True)

# Release everything explicitly before exit.
del a
gc.collect()
torch.xpu.empty_cache()
torch.xpu.synchronize()

print("normal end", flush=True)

Observed with both:

torch 2.8.0+xpu
torch 2.11.0+xpu

The PyTorch script prints normal end, but the Python process remains stuck:

STAT=Dl+
WCHAN=vmbus_teardown_gpadl
CMD=python xpu_normal_exit_test.py

The process cannot be killed with kill -9; recovery requires restarting WSL:

wsl --shutdown

xpu-smi discovery reports no device on this system, but both sycl-ls and PyTorch detect the GPU correctly:

Intel(R) Graphics [0x64a0]

This may be related to WSL2 / DXG / VMBus teardown of GPU resources after SYCL USM allocation or cleanup on Lunar Lake / Arc 140V.

Labels: bug
