[WIP][UR][CUDA][TEST] Add P2P initialization to multi-device test #21311

Draft
kekaczma wants to merge 2 commits into sycl from multi-device-test

@kekaczma
Contributor

Initialize P2P access between device pairs in
urEnqueueKernelLaunchIncrementMultiDeviceTest to enable cross-device USM memcpy operations on CUDA.

  • Add urUsmP2PEnablePeerAccessExp calls in SetUp()
  • Add urUsmP2PDisablePeerAccessExp calls in TearDown()
  • Skip P2P for duplicate device handles (single GPU case)
  • Handle already-enabled and unsupported device pairs
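The bullets above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `DeviceHandle`, `Result`, `EnablePeerFn`, and `enablePeerAccessForAllPairs` are hypothetical stand-ins, and the callback takes the place of `urUsmP2PEnablePeerAccessExp` so the snippet compiles without the UR headers. It assumes that an `InvalidOperation` result means "already enabled or unsupported" and can be tolerated.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Stand-ins for the Unified Runtime types (assumption: the real test uses
// ur_device_handle_t and ur_result_t from the UR headers).
using DeviceHandle = int;
enum class Result { Success, InvalidOperation, Other };

// Hypothetical callback standing in for
// urUsmP2PEnablePeerAccessExp(commandDevice, peerDevice).
using EnablePeerFn = std::function<Result(DeviceHandle, DeviceHandle)>;

// Tries to enable peer access for every ordered pair of distinct handles.
// Returns false only if a call fails with something other than
// InvalidOperation, which this sketch treats as a tolerated outcome.
inline bool enablePeerAccessForAllPairs(const std::vector<DeviceHandle> &devices,
                                        const EnablePeerFn &enablePeer) {
  for (std::size_t i = 0; i < devices.size(); ++i) {
    for (std::size_t j = 0; j < devices.size(); ++j) {
      if (devices[i] == devices[j])
        continue; // same entry or a duplicate handle (single GPU case)
      const Result r = enablePeer(devices[i], devices[j]);
      if (r != Result::Success && r != Result::InvalidOperation)
        return false;
    }
  }
  return true;
}
```

In a fixture, a `SetUp()` method would call a helper like this once with the list of device handles and fail the test if it returns false.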

This is a test commit to validate the fix on multi-GPU hardware.

Related to #19033

@kekaczma kekaczma changed the title [UR][CUDA][TEST] Add P2P initialization to multi-device test [WIP][UR][CUDA][TEST] Add P2P initialization to multi-device test Feb 18, 2026

Fix CUDA adapter to properly map P2P access errors:
- Map CUDA_ERROR_PEER_ACCESS_ALREADY_ENABLED to UR_RESULT_ERROR_INVALID_OPERATION
- Map CUDA_ERROR_PEER_ACCESS_NOT_ENABLED to UR_RESULT_ERROR_INVALID_OPERATION
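The mapping described above could look something like the sketch below. The enums are local stand-ins (an assumption made so the snippet is self-contained); the actual adapter would switch on `CUresult` values from `cuda.h` and return `ur_result_t`. The function name `mapPeerAccessError` and the fallback to an unknown error are illustrative only.

```cpp
// Local stand-ins; the numeric values are not those of cuda.h or the UR headers.
enum class CudaError {
  Success,
  PeerAccessAlreadyEnabled, // stands in for CUDA_ERROR_PEER_ACCESS_ALREADY_ENABLED
  PeerAccessNotEnabled,     // stands in for CUDA_ERROR_PEER_ACCESS_NOT_ENABLED
  Unknown
};

enum class UrResult {
  Success,
  ErrorInvalidOperation, // stands in for UR_RESULT_ERROR_INVALID_OPERATION
  ErrorUnknown
};

// Maps both peer-access errors to the same "invalid operation" result so that
// callers can handle "already enabled" and "not enabled" uniformly.
constexpr UrResult mapPeerAccessError(CudaError err) {
  switch (err) {
  case CudaError::Success:
    return UrResult::Success;
  case CudaError::PeerAccessAlreadyEnabled:
  case CudaError::PeerAccessNotEnabled:
    return UrResult::ErrorInvalidOperation;
  default:
    return UrResult::ErrorUnknown;
  }
}
```

Folding both errors into one result code keeps the caller's check simple, at the cost of not distinguishing the two cases.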

Initialize P2P access in urEnqueueKernelLaunchIncrementMultiDeviceTest:
- Add urUsmP2PEnablePeerAccessExp calls in SetUp() for cross-device memcpy
- Add urUsmP2PDisablePeerAccessExp calls in TearDown() for cleanup
- Skip P2P operations for duplicate device handles (single GPU case)
- Accept INVALID_OPERATION for already-enabled or unsupported pairs

This fixes test failures on multi-GPU CUDA systems where P2P must be
explicitly enabled before cross-device USM memory operations.

Fixes #19033

Changes:
- Track enabled P2P pairs in member variable enabledP2PPairs
- SetUp: Record only the pairs this test enabled itself (both calls returned SUCCESS)
- TearDown: Disable P2P bidirectionally for our pairs, ignore errors
- Removes global P2P state dependency between test instances
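One way to picture the tracking scheme above is the sketch below. It is an assumption-laden illustration rather than the PR's code: `P2PPairTracker`, `DeviceHandle`, `Result`, and `PeerFn` are hypothetical, the callbacks stand in for `urUsmP2PEnablePeerAccessExp` and `urUsmP2PDisablePeerAccessExp`, and "both SUCCESS" is read here as "the enable call succeeded in both directions". Only the member name `enabledP2PPairs` comes from the description.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Stand-ins for the UR handle and result types (assumption).
using DeviceHandle = int;
enum class Result { Success, InvalidOperation };

// Hypothetical callback standing in for the UR enable/disable entry points.
using PeerFn = std::function<Result(DeviceHandle, DeviceHandle)>;

class P2PPairTracker {
public:
  // Enables access in both directions and records the pair only when both
  // calls succeed, i.e. when this instance is the one that enabled it.
  void enable(DeviceHandle a, DeviceHandle b, const PeerFn &enablePeer) {
    const Result ab = enablePeer(a, b);
    const Result ba = enablePeer(b, a);
    if (ab == Result::Success && ba == Result::Success)
      enabledP2PPairs.emplace_back(a, b);
  }

  // Disables every recorded pair in both directions, ignoring errors,
  // then forgets the pairs.
  void disableAll(const PeerFn &disablePeer) {
    for (const auto &p : enabledP2PPairs) {
      (void)disablePeer(p.first, p.second);
      (void)disablePeer(p.second, p.first);
    }
    enabledP2PPairs.clear();
  }

  const std::vector<std::pair<DeviceHandle, DeviceHandle>> &pairs() const {
    return enabledP2PPairs;
  }

private:
  std::vector<std::pair<DeviceHandle, DeviceHandle>> enabledP2PPairs;
};
```

A fixture would call `enable` from `SetUp()` and `disableAll` from `TearDown()`. Note that in this sketch a pair where only one direction succeeded is not recorded, so that direction stays enabled after teardown; whether the PR handles that case differently is not stated.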

Works for both:
- 2 physical GPUs duplicated 4× (8 logical devices)
- 8 distinct physical GPUs

Fixes #19033