Describe the bug
A mix of several in-order queues and ext_oneapi_submit_barrier can trigger ur_die: urEventWait must not be called for an internal event from a valid code.
To reproduce
$ clang++ -fsycl -g sycl_event_barrier_shenanigans.cpp && ONEAPI_DEVICE_SELECTOR=level_zero:0 ./a.out
Device: Intel(R) Arc(TM) A770 Graphics
Driver: 1.14.37020+3
ur_die: urEventWait must not be called for an internal event
terminate called without an active exception
Aborted (core dumped)
UR_L0_SERIALIZE=2 makes the code pass; UR_L0_DISABLE_EVENTS_CACHING=1 makes the code segfault.
sycl_event_barrier_shenanigans.cpp
#include <sycl/sycl.hpp>
#include <optional>
#include <cstdio>
int main() {
sycl::device dev;
printf("Device: %s\n", dev.get_info<sycl::info::device::name>().c_str());
printf("Driver: %s\n", dev.get_info<sycl::info::device::driver_version>().c_str());
sycl::context ctx(dev);
sycl::property_list props{sycl::property::queue::in_order()};
sycl::queue computeQ(ctx, dev, props);
sycl::queue auxQ(ctx, dev, props);
sycl::queue drainQ(ctx, dev, props);
constexpr int N = 64;
int *d_buf = sycl::malloc_device<int>(N, dev, ctx);
std::optional<sycl::event> stale_event;
for (int iter = 0; iter < 2000; iter++) {
if (stale_event) {
stale_event->wait();
stale_event.reset();
}
computeQ.memset(d_buf, 0, N * sizeof(int));
sycl::event auxEv = auxQ.memset(d_buf, 0, sizeof(int));
sycl::event barrier = computeQ.ext_oneapi_submit_barrier({auxEv});
sycl::event opt1 = computeQ.ext_oneapi_submit_barrier();
sycl::event opt2 = computeQ.ext_oneapi_submit_barrier();
sycl::event opt3 = computeQ.ext_oneapi_submit_barrier();
barrier.wait();
stale_event = std::move(opt3);
{ [[maybe_unused]] auto _ = std::move(barrier); }
{ [[maybe_unused]] auto _ = std::move(opt1); }
{ [[maybe_unused]] auto _ = std::move(opt2); }
{ [[maybe_unused]] auto _ = std::move(auxEv); }
for (int i = 0; i < 4; i++) {
sycl::event dep = drainQ.memset(d_buf, 0, sizeof(int));
sycl::event flush = computeQ.ext_oneapi_submit_barrier({dep});
flush.wait();
}
}
if (stale_event) {
stale_event->wait();
stale_event.reset();
}
sycl::free(d_buf, ctx);
printf("PASSED (no crash)\n");
return 0;
}
Environment
- OS: Ubuntu 24.04
- Target device and vendor: Arc A770
- DPC++ version: a826f6a
- Dependencies version:
[level_zero:gpu] Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) Arc(TM) A770 Graphics 12.55.8 [1.14.37020+3]
Additional context
The bug originally found in GROMACS which uses this pattern heavily.
Describe the bug
A mix of several in-order queues and
ext_oneapi_submit_barriercan triggerur_die: urEventWait must not be called for an internal eventfrom a valid code.To reproduce
UR_L0_SERIALIZE=2makes the code pass;UR_L0_DISABLE_EVENTS_CACHING=1makes the code segfault.sycl_event_barrier_shenanigans.cpp
Environment
[level_zero:gpu] Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) Arc(TM) A770 Graphics 12.55.8 [1.14.37020+3]Additional context
The bug originally found in GROMACS which uses this pattern heavily.