test_vmm_allocator_policy_configuration failure: Windows / A6000 / WDDM #1264

Description

@rwgk

Tracking the failure below.

xref: #1242 (comment)

All details are in the full logs:

qa_bindings_windows_2025-11-18+102913_build_log.txt

qa_bindings_windows_2025-11-18+102913_tests_log.txt

The only non-obvious detail:

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0 was installed from cuda_13.0.1_windows.exe.

EDIT: The exact same error appeared when retesting with v13.0 installed from cuda_13.0.2_windows.exe.
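
For triage, the failure reduces to the standalone sketch below, extracted from the test body in the traceback that follows. The import paths are an assumption on my part; everything else is copied from the test, with the Windows branch of the handle_type ternary inlined. The final modify_allocation call is the one that raises.

    from cuda.core.experimental import Device
    from cuda.core.experimental import VirtualMemoryResource, VirtualMemoryResourceOptions

    device = Device()
    device.set_current()

    # Same initial config as the test (Windows branch of handle_type).
    config = VirtualMemoryResourceOptions(
        allocation_type="pinned",
        location_type="device",
        granularity="minimum",
        gpu_direct_rdma=True,
        handle_type="win32_kmt",
        peers=(),
        self_access="rw",
        peer_access="rw",
    )
    vmm_mr = VirtualMemoryResource(device, config=config)
    buffer = vmm_mr.allocate(8192)

    # Switch to recommended granularity and read-only access; this is the
    # call that raises CUDA_ERROR_UNKNOWN on the A6000/WDDM machine.
    new_config = VirtualMemoryResourceOptions(
        allocation_type="pinned",
        location_type="device",
        granularity="recommended",
        gpu_direct_rdma=False,
        handle_type="win32_kmt",
        peers=(),
        self_access="r",
        peer_access="r",
    )
    modified_buffer = vmm_mr.modify_allocation(buffer, 16384, config=new_config)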

C:\Users\rgrossekunst\forked\cuda-python>nvidia-smi
Tue Nov 18 10:31:56 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 591.34                 Driver Version: 591.34         CUDA Version: 13.1     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA RTX A6000             WDDM  |   00000000:C1:00.0 Off |                  Off |
| 30%   31C    P8             19W /  300W |    1778MiB /  49140MiB |      2%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
================================== FAILURES ===================================
___________________ test_vmm_allocator_policy_configuration ___________________

    def test_vmm_allocator_policy_configuration():
        """Test VMM allocator with different policy configurations.
    
        This test verifies that VirtualMemoryResource can be configured
        with different allocation policies and that the configuration affects
        the allocation behavior.
        """
        device = Device()
        device.set_current()
    
        # Skip if virtual memory management is not supported
        if not device.properties.virtual_memory_management_supported:
            pytest.skip("Virtual memory management is not supported on this device")
    
        # Skip if GPU Direct RDMA is supported (we want to test the unsupported case)
        if not device.properties.gpu_direct_rdma_supported:
            pytest.skip("This test requires a device that doesn't support GPU Direct RDMA")
    
        # Test with custom VMM config
        custom_config = VirtualMemoryResourceOptions(
            allocation_type="pinned",
            location_type="device",
            granularity="minimum",
            gpu_direct_rdma=True,
            handle_type="posix_fd" if not IS_WINDOWS else "win32_kmt",
            peers=(),
            self_access="rw",
            peer_access="rw",
        )
    
        vmm_mr = VirtualMemoryResource(device, config=custom_config)
    
        # Verify configuration is applied
        assert vmm_mr.config == custom_config
        assert vmm_mr.config.gpu_direct_rdma is True
        assert vmm_mr.config.granularity == "minimum"
    
        # Test allocation with custom config
        buffer = vmm_mr.allocate(8192)
        assert buffer.size >= 8192
        assert buffer.device_id == device.device_id
    
        # Test policy modification
        new_config = VirtualMemoryResourceOptions(
            allocation_type="pinned",
            location_type="device",
            granularity="recommended",
            gpu_direct_rdma=False,
            handle_type="posix_fd" if not IS_WINDOWS else "win32_kmt",
            peers=(),
            self_access="r",  # Read-only access
            peer_access="r",
        )
    
        # Modify allocation policy
>       modified_buffer = vmm_mr.modify_allocation(buffer, 16384, config=new_config)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

tests\test_memory.py:440: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
cuda\core\experimental\_memory\_virtual_memory_resource.py:230: in modify_allocation
    raise_if_driver_error(res)
cuda\core\experimental\_utils\cuda_utils.pyx:67: in cuda.core.experimental._utils.cuda_utils._check_driver_error
    cpdef inline int _check_driver_error(cydriver.CUresult error) except?-1 nogil:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

>   raise CUDAError(f"{name.decode()}: {expl}")
E   cuda.core.experimental._utils.cuda_utils.CUDAError: CUDA_ERROR_UNKNOWN: This indicates that an unknown internal error has occurred.

cuda\core\experimental\_utils\cuda_utils.pyx:78: CUDAError
=========================== short test summary info ===========================
SKIPPED [6] tests\example_tests\utils.py:37: cupy not installed, skipping related tests
SKIPPED [1] tests\example_tests\utils.py:37: torch not installed, skipping related tests
SKIPPED [1] tests\example_tests\utils.py:43: skip C:\Users\rgrossekunst\forked\cuda-python\cuda_core\tests\example_tests\..\..\examples\thread_block_cluster.py
SKIPPED [5] tests\memory_ipc\test_errors.py:20: Device does not support IPC
SKIPPED [1] tests\memory_ipc\test_event_ipc.py:20: Device does not support IPC
SKIPPED [1] tests\memory_ipc\test_event_ipc.py:91: Device does not support IPC
SKIPPED [2] tests\memory_ipc\test_event_ipc.py:106: Device does not support IPC
SKIPPED [8] tests\memory_ipc\test_event_ipc.py:123: Device does not support IPC
SKIPPED [1] tests\memory_ipc\test_leaks.py:26: mempool allocation handle is not using fds or psutil is unavailable
SKIPPED [12] tests\memory_ipc\test_leaks.py:82: mempool allocation handle is not using fds or psutil is unavailable
SKIPPED [1] tests\memory_ipc\test_memory_ipc.py:16: Device does not support IPC
SKIPPED [1] tests\memory_ipc\test_memory_ipc.py:53: Device does not support IPC
SKIPPED [1] tests\memory_ipc\test_memory_ipc.py:103: Device does not support IPC
SKIPPED [1] tests\memory_ipc\test_memory_ipc.py:153: Device does not support IPC
SKIPPED [2] tests\memory_ipc\test_send_buffers.py:18: Device does not support IPC
SKIPPED [1] tests\memory_ipc\test_serialize.py:24: Device does not support IPC
SKIPPED [1] tests\memory_ipc\test_serialize.py:79: Device does not support IPC
SKIPPED [1] tests\memory_ipc\test_serialize.py:125: Device does not support IPC
SKIPPED [2] tests\memory_ipc\test_workerpool.py:29: Device does not support IPC
SKIPPED [2] tests\memory_ipc\test_workerpool.py:65: Device does not support IPC
SKIPPED [2] tests\memory_ipc\test_workerpool.py:109: Device does not support IPC
SKIPPED [1] tests\test_device.py:327: Test requires at least 2 CUDA devices
SKIPPED [1] tests\test_device.py:375: Test requires at least 2 CUDA devices
SKIPPED [1] tests\test_launcher.py:92: Driver or GPU not new enough for thread block clusters
SKIPPED [1] tests\test_launcher.py:122: Driver or GPU not new enough for thread block clusters
SKIPPED [2] tests\test_launcher.py:274: cupy not installed
SKIPPED [1] tests\test_linker.py:113: nvjitlink requires lto for ptx linking
SKIPPED [1] tests\test_memory.py:514: This test requires a device that doesn't support GPU Direct RDMA
SKIPPED [1] tests\test_memory.py:645: Driver rejects IPC-enabled mempool creation on this platform
SKIPPED [7] tests\test_module.py:345: Test requires numba to be installed
SKIPPED [2] tests\test_module.py:389: Device with compute capability 90 or higher is required for cluster support
SKIPPED [1] tests\test_module.py:404: Device with compute capability 90 or higher is required for cluster support
SKIPPED [2] tests\test_utils.py: got empty parameter set for (in_arr, use_stream)
SKIPPED [1] tests\test_utils.py: CuPy is not installed
FAILED tests/test_memory.py::test_vmm_allocator_policy_configuration - cuda.core.experimental._utils.cuda_utils.CUDAError: CUDA_ERROR_UNKNOWN: This indicates that an unknown internal error has occurred.
============ 1 failed, 518 passed, 75 skipped in 68.77s (0:01:08) =============
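
A possible next triage step: a sketch (untested here) that drives the raw VMM driver calls through cuda.bindings, to check whether the WDDM driver itself rejects the WIN32_KMT handle type or the read-only (PROT_READ) access descriptor that the failing new_config asks for. The call sequence below is an assumption about what modify_allocation does internally, and check is a hypothetical helper, so take it as a starting point only.

    from cuda.bindings import driver


    def check(label, res):
        # cuda.bindings driver calls return (CUresult, *outputs).
        err, *out = res
        if err != driver.CUresult.CUDA_SUCCESS:
            _, name = driver.cuGetErrorName(err)
            raise RuntimeError(f"{label} failed: {name.decode()}")
        return out[0] if out else None


    check("cuInit", driver.cuInit(0))
    dev = check("cuDeviceGet", driver.cuDeviceGet(0))
    ctx = check("cuDevicePrimaryCtxRetain", driver.cuDevicePrimaryCtxRetain(dev))
    check("cuCtxSetCurrent", driver.cuCtxSetCurrent(ctx))

    # Mirror the failing config: pinned device memory, WIN32_KMT handle type.
    prop = driver.CUmemAllocationProp()
    prop.type = driver.CUmemAllocationType.CU_MEM_ALLOCATION_TYPE_PINNED
    prop.location.type = driver.CUmemLocationType.CU_MEM_LOCATION_TYPE_DEVICE
    prop.location.id = 0
    prop.requestedHandleTypes = driver.CUmemAllocationHandleType.CU_MEM_HANDLE_TYPE_WIN32_KMT

    gran = check(
        "cuMemGetAllocationGranularity",
        driver.cuMemGetAllocationGranularity(
            prop,
            driver.CUmemAllocationGranularity_flags.CU_MEM_ALLOC_GRANULARITY_RECOMMENDED,
        ),
    )
    size = ((16384 + gran - 1) // gran) * gran  # round 16384 up to a full granule

    handle = check("cuMemCreate", driver.cuMemCreate(size, prop, 0))
    ptr = check("cuMemAddressReserve", driver.cuMemAddressReserve(size, 0, 0, 0))
    check("cuMemMap", driver.cuMemMap(ptr, size, 0, handle, 0))

    # Read-only access, as requested by new_config's self_access="r".
    desc = driver.CUmemAccessDesc()
    desc.location.type = driver.CUmemLocationType.CU_MEM_LOCATION_TYPE_DEVICE
    desc.location.id = 0
    desc.flags = driver.CUmemAccess_flags.CU_MEM_ACCESS_FLAGS_PROT_READ
    check("cuMemSetAccess", driver.cuMemSetAccess(ptr, size, [desc], 1))
    print("raw driver VMM sequence with WIN32_KMT + PROT_READ succeeded")

    check("cuMemUnmap", driver.cuMemUnmap(ptr, size))
    check("cuMemAddressFree", driver.cuMemAddressFree(ptr, size))
    check("cuMemRelease", driver.cuMemRelease(handle))

If this fails at cuMemSetAccess (or cuMemCreate), the problem is most likely in the driver/WDDM path rather than in cuda.core.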


Labels: bug (Something isn't working), cuda.core (Everything related to the cuda.core module), triage (Needs the team's attention)
