Add tests for peer device support with IPC-shared buffers #1308

Andy-Jost · 2025-12-03T19:49:33Z

Summary

Adds tests verifying that peer access settings are not preserved when memory resources are shared via IPC, and that buffers imported via IPC can be accessed from peer devices after setting peer access on the imported memory resource.

Related to #479

Changes

New test file: tests/memory_ipc/test_peer_access.py
- TestPeerAccessNotPreservedOnImport: Verifies peer access settings are not preserved on IPC import (as per CUDA documentation section 15.11.2)
- TestBufferPeerAccessAfterImport: Verifies buffers can be accessed from peer devices after setting peer access, and that access can be revoked
Test infrastructure improvements:
- Move mempool_device_x2 and mempool_device_x3 fixtures to conftest.py for reuse across test files
- Update device fixtures (ipc_device, mempool_device) to explicitly use Device(0) for consistency
- Remove duplicate fixture definitions from test_memory_peer_access.py
Test refactoring:
- Refactor test_object_passing to use class-based structure matching other IPC tests

Test Coverage

The tests verify:

- Peer access settings are not preserved when a memory resource is sent via IPC (documented CUDA behavior per Programming Guide section 15.11.2)
- Peer access can be set after importing a memory resource via IPC
- Buffers imported via IPC can be accessed from peer devices after setting peer access
- Peer access can be revoked after being granted
- Tests run with and without peer access granted in the parent process (parametrized)
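
For readers who want the expected semantics in one place, here is a minimal pure-Python sketch of the lifecycle listed above. It is a toy model only: `ToyMemoryResource`, `ExportedHandle`, and `child_lifecycle` are hypothetical names invented for illustration, and nothing here calls cuda.core or touches a GPU. The one assumption it encodes is the behavior described in this PR: the peer access list is per-process state that does not travel with the exported handle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExportedHandle:
    """Hypothetical stand-in for what crosses the process boundary."""
    owner_device: int
    pool_id: int
    # Deliberately no peer-access field: the access list stays in the exporter.


@dataclass
class ToyMemoryResource:
    """Toy model of a device memory pool (not the cuda.core class)."""
    owner_device: int
    pool_id: int
    peer_accessible_by: frozenset = frozenset()

    def export(self) -> ExportedHandle:
        return ExportedHandle(self.owner_device, self.pool_id)

    @classmethod
    def from_handle(cls, handle: ExportedHandle) -> "ToyMemoryResource":
        # An imported resource always starts with an empty peer access list.
        return cls(handle.owner_device, handle.pool_id)

    def can_access(self, device: int) -> bool:
        return device == self.owner_device or device in self.peer_accessible_by


def child_lifecycle(handle: ExportedHandle, peer_device: int) -> list:
    """Return what a child would observe: after import, grant, and revoke."""
    mr = ToyMemoryResource.from_handle(handle)
    observed = [mr.can_access(peer_device)]           # after import
    mr.peer_accessible_by = frozenset({peer_device})  # grant
    observed.append(mr.can_access(peer_device))
    mr.peer_accessible_by = frozenset()               # revoke
    observed.append(mr.can_access(peer_device))
    return observed


# Mirror the parametrization: with and without access granted in the parent.
for granted_in_parent in (False, True):
    parent = ToyMemoryResource(owner_device=0, pool_id=1)
    if granted_in_parent:
        parent.peer_accessible_by = frozenset({1})
    assert child_lifecycle(parent.export(), peer_device=1) == [False, True, False]
```

In this model the child sees the same sequence whether or not the parent granted access, which is the property the parametrized tests are described as checking.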

Related Work

- Builds on #1289 (Add peer access control for DeviceMemoryResource)
- Builds on #1299 (Allow buffers imported via IPC to be re-exported)
- Lays the foundation for implementing peer device support for IPC-shared buffers

Testing

All tests pass on a 2-GPU system with peer access enabled.

copy-pr-bot · 2025-12-03T19:49:37Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Andy-Jost · 2025-12-03T19:53:30Z

/ok to test 732f7ae

- Convert test_object_passing function to TestObjectPassing class
- Follow the same pattern as other IPC tests (test_main/child_main methods)
- Update .cursorrules to note build requirement before running tests

Update test fixtures to set device 0 current (avoids inter-test dependencies with multiple devices available).
- Create tests/memory_ipc/test_peer_access.py with TestPeerAccessNotPreservedOnImport
- Verify peer access settings are not preserved when MR is sent via IPC
- Verify peer access can be set after import
- Move mempool_device_x2/x3 fixtures to conftest.py for reuse
- Update device fixtures to explicitly use Device(0) for consistency
- Remove duplicate fixture definitions from test_memory_peer_access.py

- Add TestBufferPeerAccessAfterImport test class
- Verify buffers imported via IPC can be accessed from peer devices after setting peer access
- Verify peer access can be revoked after being granted
- Parametrize test to run with and without peer access granted in parent process
- Test verifies full lifecycle: import -> grant access -> verify access -> revoke access -> verify revocation
- Uses PatternGen.verify_buffer directly (simpler than manual scratch buffers)

Andy-Jost · 2025-12-03T20:14:55Z

/ok to test 03ef0d1

Andy-Jost · 2025-12-03T20:37:40Z

/ok to test 11edd04

Andy-Jost · 2025-12-03T20:42:51Z

/ok to test 818ea1c

github-actions · 2025-12-04T14:59:23Z

Doc Preview CI
Preview removed because the pull request was closed or merged.

Andy-Jost added this to the cuda.core beta 10 milestone Dec 3, 2025

Andy-Jost added the P0 (High priority - Must do!), feature (New feature or request), and cuda.core (Everything related to the cuda.core module) labels Dec 3, 2025

Andy-Jost requested review from leofang, rparolin and rwgk December 3, 2025 19:49

Andy-Jost added 4 commits December 3, 2025 12:14

Refactor test_object_passing to use class-based structure

6fbc0b3

- Convert test_object_passing function to TestObjectPassing class
- Follow the same pattern as other IPC tests (test_main/child_main methods)
- Update .cursorrules to note build requirement before running tests

Fix SEGV by explicitly closing CUDA resources in child processes

03ef0d1

Andy-Jost force-pushed the ipc-peer-support branch from 2929d77 to 03ef0d1 December 3, 2025 20:14

Merge remote-tracking branch 'origin/main' into ipc-peer-support

818ea1c

Andy-Jost force-pushed the ipc-peer-support branch from 11edd04 to 818ea1c December 3, 2025 20:42

Andy-Jost self-assigned this Dec 4, 2025

leofang approved these changes Dec 4, 2025

Andy-Jost merged commit 5c42278 into NVIDIA:main Dec 4, 2025
61 checks passed

Andy-Jost deleted the ipc-peer-support branch December 4, 2025 14:49
