-
Notifications
You must be signed in to change notification settings - Fork 806
memory resident limit to enabled peers #19257
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: sycl
Are you sure you want to change the base?
Conversation
ZeOffsetToImageHandleMap; | ||
|
||
// devices which user enabled p2p access by urUsmP2P(Enable|Disable)PeerAccessExp | ||
std::unordered_set<DeviceId> p2pDeviceIds; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not a vector?
{ | ||
std::scoped_lock<ur_shared_mutex> Lock(Platform->ContextsMutex); | ||
auto It = std::find(Contexts.begin(), Contexts.end(), hContext); | ||
UR_ASSERT(It != Contexts.end(), UR_RESULT_ERROR_INVALID_CONTEXT); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this happen? should this be just a regular assert?
|
||
void cleanupPools(); | ||
void cleanupPoolsForQueue(void *hQueue); | ||
void addResidentDevice(ur_device_handle_t hDevice, ur_device_handle_t peerDevice); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is this function called from?
continue; | ||
} | ||
|
||
if (!p2pDevicesEnabledByUser.has_value()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why isn't this outside of the loop?
{ | ||
std::scoped_lock<ur_shared_mutex> Lock(Platform->ContextsMutex); | ||
for (auto Context : Platform->Contexts) { | ||
Context->addResidentDevice(commandDevice, peerDevice); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if some context does not contain commandDevice
?
|
||
assert(0 = std::count_if( | ||
std::begin(pDevices), std::end(pDevices), | ||
[&](const auto pDevice) { return newPeerDevice->Id == pDevice->Id; })); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cuCtxEnablePeerAccess returns CUDA_ERROR_PEER_ACCESS_ALREADY_ENABLED
. Could we have this error mapped to a UR error? It would help make the peer-to-peer SYCL API more consistent (See #19787.)
ab67572
to
f1124c5
Compare
By default, the Level Zero adapter makes all device allocations resident on all "peer" devices, even if the application doesn't use those peer devices at all. USM residency adds significant overhead.
For v2 unfied-runtime adapter we switch in this PR to not calling zeContextMakeMemoryResident on all devices in the context, just the one that was specified during allocation and those specified using ext_oneapi_enable_peer_access.
When ext_oneapi_enable_peer_access / ext_oneapi_disable_peer_access is called, we retroactively call zeContextMakeMemoryResident or zeContextEvictMemory respectively, on all the active pages in the heap. This is done by calling ctl call in UMF drafted here: oneapi-src/unified-memory-framework@e6d5043
UMF part was done in oneapi-src/unified-memory-framework#1517