Closed
Description
Summary
In sumpy.tools.run_opencl_fft, the NVIDIA CUDA queue validation condition appears inverted.
Current code on main:
    for evt in wait_for:
        if not evt.command_queue != queue:
            raise RuntimeError("Different queues not supported with NVIDIA CUDA")

not (evt.command_queue != queue) is equivalent to evt.command_queue == queue, so this raises for the same queue (the safe/expected case), and does not raise for different queues (the unsafe case).
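The inversion can be checked without a GPU. The sketch below uses stand-in classes (FakeQueue and FakeEvent are hypothetical names, not part of sumpy or pyopencl) that compare by identity. It assumes real pyopencl queues compare equal when they refer to the same queue.

```python
class FakeQueue:
    """Hypothetical stand-in for a command queue; compared by identity."""


class FakeEvent:
    """Hypothetical stand-in for an event that remembers its queue."""

    def __init__(self, command_queue):
        self.command_queue = command_queue


def buggy_condition(evt, queue):
    # Same expression as in the quoted code. `not` binds more loosely than
    # `!=`, so this parses as `not (evt.command_queue != queue)`.
    return not evt.command_queue != queue


queue = FakeQueue()
other_queue = FakeQueue()

same_queue_result = buggy_condition(FakeEvent(queue), queue)
different_queue_result = buggy_condition(FakeEvent(other_queue), queue)

# The condition is true exactly when the queues match, so the quoted code
# would raise for the same-queue event and stay silent for the other one.
print("same queue triggers raise:", same_queue_result)
print("different queue triggers raise:", different_queue_result)
```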
Expected behavior
Raise only when an event in wait_for belongs to a different queue:
        if evt.command_queue != queue:
            raise RuntimeError("Different queues not supported with NVIDIA CUDA")

Why this matters
On NVIDIA CUDA OpenCL, valid FFT calls with same-queue wait_for events fail with:
RuntimeError: Different queues not supported with NVIDIA CUDA
This also breaks downstream CUDA tests (example observed in volumential):
- test/test_volume_fmm.py::test_volume_fmm_laplace fails before the fix
- passes after changing the condition to evt.command_queue != queue
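A GPU-free regression check for the corrected condition might look like the sketch below. The helper check_wait_for_queues and the Fake* classes are hypothetical and exist only for illustration. The actual check lives inline in sumpy.tools.run_opencl_fft and operates on pyopencl objects.

```python
class FakeQueue:
    """Hypothetical stand-in for a command queue; compared by identity."""


class FakeEvent:
    """Hypothetical stand-in for an event that remembers its queue."""

    def __init__(self, command_queue):
        self.command_queue = command_queue


def check_wait_for_queues(queue, wait_for):
    # Corrected condition: reject only events from a different queue.
    for evt in wait_for:
        if evt.command_queue != queue:
            raise RuntimeError(
                "Different queues not supported with NVIDIA CUDA")


queue = FakeQueue()
other_queue = FakeQueue()

# Same-queue events must pass without raising.
check_wait_for_queues(queue, [FakeEvent(queue), FakeEvent(queue)])

# A single event from another queue must be rejected.
try:
    check_wait_for_queues(queue, [FakeEvent(queue), FakeEvent(other_queue)])
except RuntimeError as exc:
    mixed_queue_error = str(exc)
else:
    mixed_queue_error = None

print("mixed-queue error:", mixed_queue_error)
```

An empty wait_for list also passes, since the loop body never runs.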
Additional context
This looks like a logic typo introduced in the NVIDIA marker handling path (Fix enqueue_marker for NVIDIA CUDA, commit 9d1d611), likely unintended.