Conversation
| Sum.combine(In[NDIt.get_global_linear_id()]); | ||
| }); | ||
| }); | ||
| }).wait_and_throw(); |
There was a problem hiding this comment.
The test is incorrect as buffer destructor is not blocking (because of no-writeback) and there's no q.wait(). In its current state, the test is flakily segfaulting because of race between kernel execution and application termination. Adding q.wait_and_throw() fixes the race.
After fixing the race, the test fails deterministically because of L0 leak check due to same reason as in #20852
| // https://github.com/intel/llvm/issues/21520 | ||
| // RUN-IF: level_zero && linux, env SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=0 UR_L0_V2_FORCE_BATCHED=1 %{run} %t2.out |
There was a problem hiding this comment.
Disabling this batched submission test on Windows because there seems to be a race between the batch being enqueued and L0 loader teardown. See #21520
There was a problem hiding this comment.
Pull request overview
This PR fixes a deadlock on Windows caused by the Level Zero adapter eagerly initializing GlobalAdapter during DLL loading (DLLMain), which could spawn threads—an illegal operation within DLLMain. The fix makes Windows use the same lazy initialization path that Linux already uses.
Changes:
- Remove the
#ifdef _WIN32eager allocation ofGlobalAdapterinadapter.cpp, unifying initialization to the existing lazy path inurAdapterGet. - Broaden test
UNSUPPORTEDmarkers and addwait_and_throw()inreduction_resource_leak_dw.cppto address Windows test failures. - Convert a
%ifconditional RUN line toRUN-IFwith alinuxrestriction inbuffer.cppto skip a force-batched submission test on Windows.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| unified-runtime/source/adapters/level_zero/adapter.cpp | Remove Windows-specific eager GlobalAdapter allocation, unify to lazy init |
| sycl/test-e2e/Regression/reduction_resource_leak_dw.cpp | Broaden UNSUPPORTED to all Windows (not just v2 adapter) and add wait_and_throw() |
| sycl/test-e2e/Basic/buffer/buffer.cpp | Restrict force-batched submission test to Linux using RUN-IF syntax |
There was a problem hiding this comment.
Pull request overview
This PR fixes a deadlock on Windows caused by eager initialization of GlobalAdapter during DLL loading. The ur_adapter_handle_t_ constructor calls zeInit, which may spawn threads—something that is illegal from DLLMain. The fix makes GlobalAdapter initialization lazy on Windows, matching the existing Linux behavior.
Changes:
- Removed the Windows-specific eager
GlobalAdapterinitialization inadapter.cpp, relying on the existing lazy initialization inurAdapterGet. - Adjusted test expectations: broadened the
UNSUPPORTEDmarker forreduction_resource_leak_dw.cppon Windows, added.wait_and_throw()to avoid resource leaks. - Changed
buffer.cppto restrict a force-batched submission test to Linux only usingRUN-IF, with a TODO tracking Windows enablement.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| unified-runtime/source/adapters/level_zero/adapter.cpp | Removes Windows-specific eager GlobalAdapter allocation, unifying with lazy init path |
| sycl/test-e2e/Regression/reduction_resource_leak_dw.cpp | Broadens UNSUPPORTED to all Windows (not just v2 adapter) and adds wait_and_throw() |
| sycl/test-e2e/Basic/buffer/buffer.cpp | Restricts force-batched submission test to Linux via RUN-IF |
Problem
ur_win_proxy_loaderloads L0 adapter DLL in itsDLLMain. Global variables are initialized during DLL loading and so isGlobalAdapter- which tries to initialize L0 driver (zeInit).zeInitmay spawn threads and it's illegal to spawn threads directly or indirectly fromDLLMain.This is causing deadlock with some L0 driver versions.
Proposed Solution
Lazily initialize
GlobalAdapter, just like what we do on Linux.