@Andy-Jost
Contributor

Summary

Adds tests verifying that peer access settings are not preserved when memory resources are shared via IPC, and that buffers imported via IPC can be accessed from peer devices after setting peer access on the imported memory resource.

Related to #479

Changes

  • New test file: tests/memory_ipc/test_peer_access.py

    • TestPeerAccessNotPreservedOnImport: Verifies peer access settings are not preserved on IPC import (as per CUDA documentation section 15.11.2)
    • TestBufferPeerAccessAfterImport: Verifies buffers can be accessed from peer devices after setting peer access, and that access can be revoked
  • Test infrastructure improvements:

    • Move mempool_device_x2 and mempool_device_x3 fixtures to conftest.py for reuse across test files
    • Update device fixtures (ipc_device, mempool_device) to explicitly use Device(0) for consistency
    • Remove duplicate fixture definitions from test_memory_peer_access.py
  • Test refactoring:

    • Refactor test_object_passing to use class-based structure matching other IPC tests

Test Coverage

The tests verify:

  1. Peer access settings are not preserved when a memory resource is sent via IPC (documented CUDA behavior per Programming Guide section 15.11.2)
  2. Peer access can be set after importing a memory resource via IPC
  3. Buffers imported via IPC can be accessed from peer devices after setting peer access
  4. Peer access can be revoked after being granted
  5. Tests run with and without peer access granted in the parent process (parametrized)
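The lifecycle these tests exercise can be illustrated with a toy model. This is only a sketch: `ToyMemoryResource`, `export_handle`, `import_handle`, `can_access_from`, and `run_lifecycle` are hypothetical names invented for illustration, not the cuda.core API or the code in tests/memory_ipc/test_peer_access.py. It uses no GPU and no real IPC; it models only the rule that peer access settings do not travel with an exported memory resource.

```python
class ToyMemoryResource:
    """Toy stand-in for an IPC-enabled memory resource (hypothetical, not cuda.core)."""

    def __init__(self, device_id):
        self.device_id = device_id
        self.peer_accessible_by = set()  # IDs of devices granted peer access

    def export_handle(self):
        # Only the owning device is exported; peer access settings are
        # deliberately left out, mirroring the behavior the tests verify.
        return {"device_id": self.device_id}

    @classmethod
    def import_handle(cls, handle):
        # The importing side always starts with no peer access granted.
        return cls(handle["device_id"])

    def can_access_from(self, device_id):
        return device_id == self.device_id or device_id in self.peer_accessible_by


def run_lifecycle(grant_in_parent):
    """import -> grant access -> verify access -> revoke access -> verify revocation"""
    parent_mr = ToyMemoryResource(device_id=0)
    if grant_in_parent:
        parent_mr.peer_accessible_by = {1}

    # "Child" side: import the resource from the exported handle.
    imported = ToyMemoryResource.import_handle(parent_mr.export_handle())
    observed = {"after_import": imported.can_access_from(1)}

    imported.peer_accessible_by = {1}  # grant access on the imported resource
    observed["after_grant"] = imported.can_access_from(1)

    imported.peer_accessible_by = set()  # revoke access
    observed["after_revoke"] = imported.can_access_from(1)
    return observed
```

In this model, `run_lifecycle(False)` and `run_lifecycle(True)` observe the same sequence on the importing side, which is the point of the parametrization in item 5: whatever the parent granted has no effect on the imported resource.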

Related Work

Testing

All tests pass on a 2-GPU system with peer access enabled.

@Andy-Jost Andy-Jost added this to the cuda.core beta 10 milestone Dec 3, 2025
@Andy-Jost Andy-Jost added the P0 (High priority - Must do!), feature (New feature or request), and cuda.core (Everything related to the cuda.core module) labels Dec 3, 2025

@Andy-Jost
Contributor Author

/ok to test 732f7ae

- Convert test_object_passing function to TestObjectPassing class
- Follow the same pattern as other IPC tests (test_main/child_main methods)
- Update .cursorrules to note build requirement before running tests
Update test fixtures to set device 0 current (avoids inter-test dependencies with multiple devices available).

- Create tests/memory_ipc/test_peer_access.py with TestPeerAccessNotPreservedOnImport
- Verify peer access settings are not preserved when MR is sent via IPC
- Verify peer access can be set after import
- Move mempool_device_x2/x3 fixtures to conftest.py for reuse
- Update device fixtures to explicitly use Device(0) for consistency
- Remove duplicate fixture definitions from test_memory_peer_access.py
- Add TestBufferPeerAccessAfterImport test class
- Verify buffers imported via IPC can be accessed from peer devices after setting peer access
- Verify peer access can be revoked after being granted
- Parametrize test to run with and without peer access granted in parent process
- Test verifies full lifecycle: import -> grant access -> verify access -> revoke access -> verify revocation
- Uses PatternGen.verify_buffer directly (simpler than manual scratch buffers)
@Andy-Jost
Contributor Author

/ok to test 03ef0d1

@Andy-Jost
Contributor Author

/ok to test 11edd04

@Andy-Jost
Contributor Author

/ok to test 818ea1c

@Andy-Jost Andy-Jost self-assigned this Dec 4, 2025
@Andy-Jost Andy-Jost merged commit 5c42278 into NVIDIA:main Dec 4, 2025
61 checks passed
@Andy-Jost Andy-Jost deleted the ipc-peer-support branch December 4, 2025 14:49