
Conversation

SubGlitch1 commented Sep 15, 2025

Add VMMAllocatedMemoryResource for Virtual Memory Management APIs

Summary

This PR implements a new VMMAllocatedMemoryResource class that provides access to CUDA's Virtual Memory Management (VMM) APIs through the cuda.core memory resource interface. This addresses the feature request (#967) for using cuMemCreate, cuMemMap, and related APIs in advanced memory management scenarios.

Changes

Core Implementation

  • New VMMAllocatedMemoryResource class in cuda/core/experimental/_memory.pyx

    • Implements the MemoryResource abstract interface
    • Uses VMM APIs: cuMemCreate, cuMemAddressReserve, cuMemMap, cuMemSetAccess, cuMemUnmap, cuMemAddressFree, cuMemRelease
    • Provides proper allocation tracking and cleanup
    • Validates device VMM support during initialization
  • Device integration in cuda/core/experimental/_device.py

    • Added Device.create_vmm_memory_resource() convenience method
    • Full integration with existing memory resource infrastructure
  • Module exports in cuda/core/experimental/__init__.py

    • Added VMMAllocatedMemoryResource to public API
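The overall shape of such a resource can be sketched as follows. This is an illustration only: the `MemoryResource` base is stood in for by a local ABC (the real interface lives in `cuda.core.experimental`), and the driver calls are elided to comments, since the exact implementation is internal to this PR:

```python
from abc import ABC, abstractmethod


class MemoryResourceBase(ABC):
    """Stand-in for cuda.core.experimental.MemoryResource (assumption)."""

    @abstractmethod
    def allocate(self, size, stream=None): ...

    @abstractmethod
    def deallocate(self, ptr, size, stream=None): ...


class VMMAllocatedMemoryResource(MemoryResourceBase):
    """Sketch of a VMM-backed memory resource."""

    def __init__(self, device_id=0):
        # A real implementation would also validate
        # CU_DEVICE_ATTRIBUTE_VIRTUAL_MEMORY_MANAGEMENT_SUPPORTED here.
        self.device_id = device_id
        self._allocations = {}  # ptr -> allocation metadata for cleanup

    def allocate(self, size, stream=None):
        # cuMemCreate -> cuMemAddressReserve -> cuMemMap -> cuMemSetAccess
        raise NotImplementedError("requires CUDA driver bindings")

    def deallocate(self, ptr, size, stream=None):
        # cuMemUnmap -> cuMemAddressFree -> cuMemRelease
        raise NotImplementedError("requires CUDA driver bindings")
```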

Testing & Examples

  • Comprehensive test suite in tests/test_vmm_memory_resource.py

    • Tests creation, allocation/deallocation, multiple allocations
    • Tests different allocation types and error conditions
    • All tests pass on VMM-capable hardware
  • Working example in examples/vmm_memory_example.py

    • Demonstrates basic and advanced usage patterns
    • Shows integration with Device and Buffer APIs

Technical Details

Memory Management Flow

  1. Allocation: cuMemCreate → cuMemAddressReserve → cuMemMap → cuMemSetAccess
  2. Tracking: Internal dictionary maintains allocation metadata for proper cleanup
  3. Deallocation: cuMemUnmap → cuMemAddressFree → cuMemRelease
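The flow above can be sketched against the low-level driver bindings from cuda-python. The names below follow `cuda.bindings.driver` as an assumption, and per-call CUresult error checking is omitted for brevity; the actual helper in this PR may differ:

```python
def vmm_allocate(size, device_id=0):
    """Allocate device memory via the CUDA VMM driver APIs (sketch)."""
    from cuda.bindings import driver  # assumed binding module

    prop = driver.CUmemAllocationProp()
    prop.type = driver.CUmemAllocationType.CU_MEM_ALLOCATION_TYPE_PINNED
    prop.location.type = driver.CUmemLocationType.CU_MEM_LOCATION_TYPE_DEVICE
    prop.location.id = device_id

    # 1. Create the physical allocation.
    _, handle = driver.cuMemCreate(size, prop, 0)
    # 2. Reserve a virtual address range.
    _, ptr = driver.cuMemAddressReserve(size, 0, 0, 0)
    # 3. Map the physical allocation into the reserved range.
    driver.cuMemMap(ptr, size, 0, handle, 0)
    # 4. Grant the device read/write access to the mapped range.
    desc = driver.CUmemAccessDesc()
    desc.location = prop.location
    desc.flags = driver.CUmemAccess_flags.CU_MEM_ACCESS_FLAGS_PROT_READWRITE
    driver.cuMemSetAccess(ptr, size, [desc], 1)
    return ptr, handle


def vmm_free(ptr, handle, size):
    """Tear-down mirrors allocation in reverse order (sketch)."""
    from cuda.bindings import driver

    driver.cuMemUnmap(ptr, size)
    driver.cuMemAddressFree(ptr, size)
    driver.cuMemRelease(handle)
```

Keeping `(ptr, handle, size)` together per allocation, as the internal tracking dictionary does, is what makes the reverse teardown possible.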

Key Features

  • Granularity-aware: Respects CUDA allocation granularity requirements using cuMemGetAllocationGranularity
  • Error handling: Comprehensive error checking with proper cleanup on failures
  • Device validation: Automatically checks CU_DEVICE_ATTRIBUTE_VIRTUAL_MEMORY_MANAGEMENT_SUPPORTED
  • Resource tracking: Maintains allocation state for proper cleanup in destructor
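Granularity awareness amounts to rounding every requested size up to the device's allocation granularity before calling cuMemCreate. In the real resource the granularity comes from cuMemGetAllocationGranularity; the helper below takes it as a plain parameter for illustration:

```python
def round_up_to_granularity(size, granularity):
    """Round size up to the next multiple of the allocation granularity.

    Typical device granularity is 2 MiB; here it is an explicit argument
    rather than a cuMemGetAllocationGranularity query.
    """
    if granularity <= 0:
        raise ValueError("granularity must be positive")
    return ((size + granularity - 1) // granularity) * granularity
```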

API Design

# Direct usage
device = Device()
vmm_mr = device.create_vmm_memory_resource()
buffer = vmm_mr.allocate(size)

# As default memory resource
device.memory_resource = vmm_mr
buffer = device.allocate(size)  # Now uses VMM

Testing

All tests pass on VMM-capable hardware:

===================================== test session starts =====================================
tests/test_vmm_memory_resource.py::TestVMMAllocatedMemoryResource::test_vmm_memory_resource_creation PASSED
tests/test_vmm_memory_resource.py::TestVMMAllocatedMemoryResource::test_vmm_memory_resource_allocation_deallocation PASSED
tests/test_vmm_memory_resource.py::TestVMMAllocatedMemoryResource::test_vmm_memory_resource_multiple_allocations PASSED
tests/test_vmm_memory_resource.py::TestVMMAllocatedMemoryResource::test_vmm_memory_resource_with_different_allocation_types PASSED
tests/test_vmm_memory_resource.py::TestVMMAllocatedMemoryResource::test_vmm_memory_resource_invalid_device PASSED
================================ 5 passed, 1 skipped in 0.07s =================================

Files Changed

  • cuda_core/cuda/core/experimental/_memory.pyx - Core implementation
  • cuda_core/cuda/core/experimental/_device.py - Device integration
  • cuda_core/cuda/core/experimental/__init__.py - Module exports
  • cuda_core/tests/test_vmm_memory_resource.py - Test suite
  • cuda_core/examples/vmm_memory_example.py - Usage example


copy-pr-bot bot commented Sep 15, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.


leofang commented Sep 15, 2025

@SubGlitch1 Thank you for your interest in contributing. However, this task has been discussed internally in NVIDIA, assigned to @benhg (from NV), and already has a WIP (#968). Please kindly allow us to complete the work there.

In the future, we kindly encourage you to discuss with us prior to contributing, to ensure we don't run into such conflicts again. Thanks! 🙂

@leofang leofang closed this Sep 15, 2025
