Open
Labels: bug, cuda.core, triage
Description
Tracking the failure below.
xref: #1242 (comment)
All details are in the full logs:
qa_bindings_windows_2025-11-18+102913_build_log.txt
qa_bindings_windows_2025-11-18+102913_tests_log.txt
The only non-obvious detail:
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0 was installed from cuda_13.0.1_windows.exe
EDIT: The exact same error appeared when retesting with v13.0 installed from cuda_13.0.2_windows.exe.
C:\Users\rgrossekunst\forked\cuda-python>nvidia-smi
Tue Nov 18 10:31:56 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 591.34                 Driver Version: 591.34         CUDA Version: 13.1     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA RTX A6000             WDDM  |   00000000:C1:00.0 Off |                  Off |
| 30%   31C    P8             19W /  300W |    1778MiB /  49140MiB |      2%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
================================== FAILURES ===================================
___________________ test_vmm_allocator_policy_configuration ___________________
    def test_vmm_allocator_policy_configuration():
        """Test VMM allocator with different policy configurations.
        This test verifies that VirtualMemoryResource can be configured
        with different allocation policies and that the configuration affects
        the allocation behavior.
        """
        device = Device()
        device.set_current()
        # Skip if virtual memory management is not supported
        if not device.properties.virtual_memory_management_supported:
            pytest.skip("Virtual memory management is not supported on this device")
        # Skip if GPU Direct RDMA is supported (we want to test the unsupported case)
        if not device.properties.gpu_direct_rdma_supported:
            pytest.skip("This test requires a device that doesn't support GPU Direct RDMA")
        # Test with custom VMM config
        custom_config = VirtualMemoryResourceOptions(
            allocation_type="pinned",
            location_type="device",
            granularity="minimum",
            gpu_direct_rdma=True,
            handle_type="posix_fd" if not IS_WINDOWS else "win32_kmt",
            peers=(),
            self_access="rw",
            peer_access="rw",
        )
        vmm_mr = VirtualMemoryResource(device, config=custom_config)
        # Verify configuration is applied
        assert vmm_mr.config == custom_config
        assert vmm_mr.config.gpu_direct_rdma is True
        assert vmm_mr.config.granularity == "minimum"
        # Test allocation with custom config
        buffer = vmm_mr.allocate(8192)
        assert buffer.size >= 8192
        assert buffer.device_id == device.device_id
        # Test policy modification
        new_config = VirtualMemoryResourceOptions(
            allocation_type="pinned",
            location_type="device",
            granularity="recommended",
            gpu_direct_rdma=False,
            handle_type="posix_fd" if not IS_WINDOWS else "win32_kmt",
            peers=(),
            self_access="r",  # Read-only access
            peer_access="r",
        )
        # Modify allocation policy
>       modified_buffer = vmm_mr.modify_allocation(buffer, 16384, config=new_config)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

tests\test_memory.py:440:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
cuda\core\experimental\_memory\_virtual_memory_resource.py:230: in modify_allocation
    raise_if_driver_error(res)
cuda\core\experimental\_utils\cuda_utils.pyx:67: in cuda.core.experimental._utils.cuda_utils._check_driver_error
    cpdef inline int _check_driver_error(cydriver.CUresult error) except?-1 nogil:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
>   raise CUDAError(f"{name.decode()}: {expl}")
E   cuda.core.experimental._utils.cuda_utils.CUDAError: CUDA_ERROR_UNKNOWN: This indicates that an unknown internal error has occurred.

cuda\core\experimental\_utils\cuda_utils.pyx:78: CUDAError
=========================== short test summary info ===========================
SKIPPED [6] tests\example_tests\utils.py:37: cupy not installed, skipping related tests
SKIPPED [1] tests\example_tests\utils.py:37: torch not installed, skipping related tests
SKIPPED [1] tests\example_tests\utils.py:43: skip C:\Users\rgrossekunst\forked\cuda-python\cuda_core\tests\example_tests\..\..\examples\thread_block_cluster.py
SKIPPED [5] tests\memory_ipc\test_errors.py:20: Device does not support IPC
SKIPPED [1] tests\memory_ipc\test_event_ipc.py:20: Device does not support IPC
SKIPPED [1] tests\memory_ipc\test_event_ipc.py:91: Device does not support IPC
SKIPPED [2] tests\memory_ipc\test_event_ipc.py:106: Device does not support IPC
SKIPPED [8] tests\memory_ipc\test_event_ipc.py:123: Device does not support IPC
SKIPPED [1] tests\memory_ipc\test_leaks.py:26: mempool allocation handle is not using fds or psutil is unavailable
SKIPPED [12] tests\memory_ipc\test_leaks.py:82: mempool allocation handle is not using fds or psutil is unavailable
SKIPPED [1] tests\memory_ipc\test_memory_ipc.py:16: Device does not support IPC
SKIPPED [1] tests\memory_ipc\test_memory_ipc.py:53: Device does not support IPC
SKIPPED [1] tests\memory_ipc\test_memory_ipc.py:103: Device does not support IPC
SKIPPED [1] tests\memory_ipc\test_memory_ipc.py:153: Device does not support IPC
SKIPPED [2] tests\memory_ipc\test_send_buffers.py:18: Device does not support IPC
SKIPPED [1] tests\memory_ipc\test_serialize.py:24: Device does not support IPC
SKIPPED [1] tests\memory_ipc\test_serialize.py:79: Device does not support IPC
SKIPPED [1] tests\memory_ipc\test_serialize.py:125: Device does not support IPC
SKIPPED [2] tests\memory_ipc\test_workerpool.py:29: Device does not support IPC
SKIPPED [2] tests\memory_ipc\test_workerpool.py:65: Device does not support IPC
SKIPPED [2] tests\memory_ipc\test_workerpool.py:109: Device does not support IPC
SKIPPED [1] tests\test_device.py:327: Test requires at least 2 CUDA devices
SKIPPED [1] tests\test_device.py:375: Test requires at least 2 CUDA devices
SKIPPED [1] tests\test_launcher.py:92: Driver or GPU not new enough for thread block clusters
SKIPPED [1] tests\test_launcher.py:122: Driver or GPU not new enough for thread block clusters
SKIPPED [2] tests\test_launcher.py:274: cupy not installed
SKIPPED [1] tests\test_linker.py:113: nvjitlink requires lto for ptx linking
SKIPPED [1] tests\test_memory.py:514: This test requires a device that doesn't support GPU Direct RDMA
SKIPPED [1] tests\test_memory.py:645: Driver rejects IPC-enabled mempool creation on this platform
SKIPPED [7] tests\test_module.py:345: Test requires numba to be installed
SKIPPED [2] tests\test_module.py:389: Device with compute capability 90 or higher is required for cluster support
SKIPPED [1] tests\test_module.py:404: Device with compute capability 90 or higher is required for cluster support
SKIPPED [2] tests\test_utils.py: got empty parameter set for (in_arr, use_stream)
SKIPPED [1] tests\test_utils.py: CuPy is not installed
FAILED tests/test_memory.py::test_vmm_allocator_policy_configuration - cuda.core.experimental._utils.cuda_utils.CUDAError: CUDA_ERROR_UNKNOWN: This indicates that an unknown internal error has occurred.
============ 1 failed, 518 passed, 75 skipped in 68.77s (0:01:08) =============
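For anyone who wants to try the failing sequence outside the pytest suite, here is a standalone sketch distilled from the test above. It is a sketch, not an authoritative repro: the import locations of `VirtualMemoryResource` and `VirtualMemoryResourceOptions` are assumed from the test's use of `cuda.core.experimental`, and the script degrades to a diagnostic string on machines without cuda.core or a CUDA-capable GPU.

```python
# Standalone repro sketch distilled from test_vmm_allocator_policy_configuration.
# Assumption: VirtualMemoryResource / VirtualMemoryResourceOptions are importable
# from cuda.core.experimental (as the test uses them). Without cuda.core or a GPU,
# the function returns a diagnostic string instead of raising.
import sys

IS_WINDOWS = sys.platform == "win32"


def run_repro() -> str:
    try:
        from cuda.core.experimental import (
            Device,
            VirtualMemoryResource,
            VirtualMemoryResourceOptions,
        )
    except ImportError as exc:
        return f"cuda.core not available: {exc}"
    try:
        device = Device()
        device.set_current()
        if not device.properties.virtual_memory_management_supported:
            return "VMM not supported on this device"
        # Same options the failing test passes.
        handle_type = "posix_fd" if not IS_WINDOWS else "win32_kmt"
        custom = VirtualMemoryResourceOptions(
            allocation_type="pinned", location_type="device",
            granularity="minimum", gpu_direct_rdma=True,
            handle_type=handle_type, peers=(),
            self_access="rw", peer_access="rw",
        )
        mr = VirtualMemoryResource(device, config=custom)
        buf = mr.allocate(8192)
        new = VirtualMemoryResourceOptions(
            allocation_type="pinned", location_type="device",
            granularity="recommended", gpu_direct_rdma=False,
            handle_type=handle_type, peers=(),
            self_access="r", peer_access="r",
        )
        # The call that raises CUDA_ERROR_UNKNOWN on the reported setup.
        mr.modify_allocation(buf, 16384, config=new)
        return "modify_allocation succeeded"
    except Exception as exc:
        return f"failed: {exc}"


if __name__ == "__main__":
    print(run_repro())
```

On the affected Windows setup this should print a `failed: ... CUDA_ERROR_UNKNOWN ...` line at the `modify_allocation` call, matching the traceback above.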