[SYCL][E2E][USM] Add P2P USM residency test for L0 v2 adapter #21944

Draft
ldorau wants to merge 7 commits into intel:sycl from ldorau:SYCLE2EUSM_Add_P2P_USM_residency_test_for_L0_v2_adapter

@ldorau (Contributor) commented May 6, 2026

This PR adds a SYCL E2E test for #21889.

@ldorau force-pushed the SYCLE2EUSM_Add_P2P_USM_residency_test_for_L0_v2_adapter branch 2 times, most recently from 07849b5 to 9bdacf8, May 7, 2026 08:11
ldorau added 2 commits May 7, 2026 09:09
- Skip peers with disabled P2P in makeProvider (USM pool creation)
- Add urUsmP2PEnablePeerAccessExp / urUsmP2PDisablePeerAccessExp
- Track per-device peer status in ur_device_handle_t_::peers[]
- Update existing USM pool residency on P2P enable/disable

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
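
The peer-tracking idea in the commit above can be illustrated with a minimal sketch. It assumes a simplified model: `Device`, `PeerStatus`, and `residencyTargets` are hypothetical stand-ins, not the adapter's real types or the actual `makeProvider` logic, and the layout of `peers` is an assumption.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration; not the adapter's real types.
enum class PeerStatus { Disabled, Enabled };

struct Device {
  std::size_t id;
  // peers[i] holds the P2P status recorded for device i (assumed layout).
  std::vector<PeerStatus> peers;
};

// Returns the ids of the devices on which an allocation owned by `owner`
// would be made resident in this model: the owner itself plus every peer
// whose status is Enabled. Peers with disabled P2P are skipped.
std::vector<std::size_t> residencyTargets(const Device &owner) {
  std::vector<std::size_t> targets{owner.id};
  for (std::size_t i = 0; i < owner.peers.size(); ++i) {
    if (i != owner.id && owner.peers[i] == PeerStatus::Enabled)
      targets.push_back(i);
  }
  return targets;
}
```

In this model, flipping one entry of `peers` and recomputing the targets corresponds to updating residency when P2P is enabled or disabled.
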
- Fill in three placeholder multi-device tests in memory_residency.cpp
- Tests verify P2P-driven residency: absent-on-peer without P2P,
  enable/disable state machine checks, end-to-end data transfer

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
@ldorau force-pushed the SYCLE2EUSM_Add_P2P_USM_residency_test_for_L0_v2_adapter branch from 9bdacf8 to 16546c8, May 8, 2026 09:30
@ldorau changed the title from "[DRAFT] [SYCL][E2E][USM] Add P2P USM residency test for L0 v2 adapter" to "[SYCL][E2E][USM] Add P2P USM residency test for L0 v2 adapter", May 11, 2026
Contributor


Is there any check that allocations on srcDev can no longer be accessed at this point?

I see only one memcpy, and it is performed while P2P access was enabled.

Contributor Author


@lslusarczyk Added. Done.

ldorau added 2 commits May 11, 2026 13:07
Extract common logic from ext_oneapi_enable_peer_access and
ext_oneapi_disable_peer_access into a templated p2pAccessHelper
function to avoid code duplication.

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
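
One way such a shared helper could look is sketched below. This is only an illustration of the deduplication pattern, assuming a helper templated on the underlying call; `Result`, `Device`, `enableStub`, and `disableStub` are hypothetical stand-ins, and the real `p2pAccessHelper` signature is not shown in this PR description.

```cpp
#include <stdexcept>

// Hypothetical stand-ins for illustration only.
enum class Result { Success, Failure };

struct Device {
  int id;
  bool peerEnabled; // simplified: tracks access to a single peer
};

// Stubs standing in for the backend enable/disable entry points.
inline Result enableStub(Device &self, const Device &) {
  self.peerEnabled = true;
  return Result::Success;
}

inline Result disableStub(Device &self, const Device &) {
  if (!self.peerEnabled)
    return Result::Failure;
  self.peerEnabled = false;
  return Result::Success;
}

// Shared logic: invoke the selected backend call and turn a failure
// into an exception. The call to use is a template parameter.
template <Result (*ApiFn)(Device &, const Device &)>
void p2pAccessHelper(Device &self, const Device &peer) {
  if (ApiFn(self, peer) != Result::Success)
    throw std::runtime_error("P2P access call failed");
}

// The two public entry points reduce to one-line wrappers.
void enablePeerAccess(Device &self, const Device &peer) {
  p2pAccessHelper<enableStub>(self, peer);
}

void disablePeerAccess(Device &self, const Device &peer) {
  p2pAccessHelper<disableStub>(self, peer);
}
```
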
The disablePeerAccessStateMachineAndSourceAllocationPersists test was
failing intermittently because deferred frees from the preceding test
complete asynchronously, causing UR_DEVICE_INFO_GLOBAL_MEM_FREE to
report more free memory than the baseline captured at the start of the
test.

Remove the unreliable source-device free-memory assertion and the
allocation it required, keeping only the state-machine checks (disable
succeeds, double-disable returns UR_RESULT_ERROR_INVALID_OPERATION).
The source-device allocation property is already covered by
allocatingDeviceMemoryWillResultInOOM which runs first in isolation.
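
The state-machine checks that remain can be modelled with a small sketch. `PeerAccessState` and `Result` are hypothetical names used for illustration; only the double-disable behaviour is taken from the commit message, and the sketch makes no claim about what a repeated enable returns in the adapter.

```cpp
// Hypothetical model of the enable/disable state machine.
enum class Result { Success, InvalidOperation };

class PeerAccessState {
public:
  // Enabling marks the peer as accessible. Behaviour of a repeated
  // enable in the real adapter is not modelled here.
  Result enable() {
    enabled_ = true;
    return Result::Success;
  }

  // Disabling succeeds only when access is currently enabled; a second
  // disable in a row is rejected, mirroring the
  // UR_RESULT_ERROR_INVALID_OPERATION case described above.
  Result disable() {
    if (!enabled_)
      return Result::InvalidOperation;
    enabled_ = false;
    return Result::Success;
  }

  bool enabled() const { return enabled_; }

private:
  bool enabled_ = false;
};
```

Unlike a free-memory comparison, these checks depend only on return codes, so they are not affected by deferred frees completing asynchronously.
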
@ldorau force-pushed the SYCLE2EUSM_Add_P2P_USM_residency_test_for_L0_v2_adapter branch from 16546c8 to 51cabf7, May 11, 2026 14:58
@ldorau requested a review from lslusarczyk, May 11, 2026 15:00
ldorau added 3 commits May 11, 2026 15:58
Adds sycl/test-e2e/USM/P2P/p2p_usm_residency.cpp to verify that
the Level Zero v2 adapter restricts USM device memory residency to
only those peer devices for which P2P access has been explicitly
enabled via ext_oneapi_enable_peer_access.

Phase 1 (P2P disabled): allocates 1 MB on dev0 and checks that
dev1 free memory does not decrease, proving the allocation is not
made resident on dev1.

Phase 2 (P2P enabled): allocates 1 MB on dev0 and checks that
dev1 free memory decreases by at least the allocation size,
proving the allocation is resident on dev1.

Also adds the 'two-or-more-gpu-devices' lit feature to
lit.cfg.py, set when sycl-ls reports at least two GPU devices.
The test uses this feature to skip on single-GPU machines.

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
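
The pass conditions for Phases 1 and 2 reduce to two comparisons on dev1's free memory, sketched below. The function names and `kAllocSize` are hypothetical; how the actual test queries free memory is not shown here, so the values are passed in as plain integers.

```cpp
#include <cstdint>

// 1 MB, the allocation size named in the commit message.
constexpr std::uint64_t kAllocSize = 1024 * 1024;

// Phase 1 (P2P disabled): the allocation on dev0 must not reduce
// dev1's free memory.
bool notResidentOnPeer(std::uint64_t freeBefore, std::uint64_t freeAfter) {
  return freeAfter >= freeBefore;
}

// Phase 2 (P2P enabled): dev1's free memory must drop by at least the
// allocation size. The first comparison guards the unsigned subtraction.
bool residentOnPeer(std::uint64_t freeBefore, std::uint64_t freeAfter,
                    std::uint64_t allocSize) {
  return freeBefore >= freeAfter && freeBefore - freeAfter >= allocSize;
}
```

For example, with 8 GiB free before and 8 GiB minus 1 MB free after, `residentOnPeer` holds and `notResidentOnPeer` does not.
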
…isable

Add Phase 3 to p2p_usm_residency.cpp that enables then disables P2P
access from dev1 to dev0, then attempts a memcpy via dev1's queue.
The test passes if the memcpy throws an exception or if the copied
data does not match the original fill pattern, confirming that
ext_oneapi_disable_peer_access actually revokes access.
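
The Phase 3 pass condition (exception or mismatching data) can be sketched as a single predicate. `accessRevoked` and its callback parameter are hypothetical; in the real test the copy would be a memcpy submitted to dev1's queue, which is replaced here by an arbitrary callable.

```cpp
#include <cstddef>
#include <cstdint>
#include <exception>
#include <functional>
#include <vector>

// Returns true when access appears to be revoked: either the copy
// throws, or the destination does not hold the expected fill pattern.
bool accessRevoked(
    const std::function<void(std::vector<std::uint8_t> &)> &copyFromPeer,
    std::size_t size, std::uint8_t pattern) {
  // Pre-fill with the inverted pattern so an untouched buffer counts
  // as a mismatch rather than an accidental match.
  std::vector<std::uint8_t> dst(size, static_cast<std::uint8_t>(~pattern));
  try {
    copyFromPeer(dst);
  } catch (const std::exception &) {
    return true; // the copy was rejected outright
  }
  for (std::uint8_t byte : dst) {
    if (byte != pattern)
      return true; // data did not arrive intact
  }
  return false; // the copy succeeded with correct data: access still works
}
```
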