-
Notifications
You must be signed in to change notification settings - Fork 787
[SYCL] Do not lock unconditionally while access queue_iml::MInOrderExternalEvent #17575
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL] Do not lock unconditionally while access queue_iml::MInOrderExternalEvent #17575
Conversation
queue_impl::MMissedCleanupRequests and queue_impl::MInOrderExternalEvent are empty on hot path, check a flag instead or before the locking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly, I'd prefer it to be two different PRs (in case something would regress, easier bisecting/reverts).
Also, I'd rather see a bit extra boilerplate code via abstracting
class Helper{ atomic flag; other_data b; methods... };
so that we could ensure that whenever b
is set/unset the flag
is always updated properly.
sycl/source/detail/queue_impl.hpp
Outdated
@@ -1040,6 +1047,8 @@ class queue_impl { | |||
// the fallback implementation of profiling info | |||
bool MFallbackProfiling = false; | |||
|
|||
// Is value presented in MInOrderExternalEvent? | |||
std::atomic_bool MHasValueInOrderExternalEvent = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the name is confusing, maybe MInOrderExteranEventIsSet
would be better?
Second, is it correct to assume that we still want the optional
because default-constructed/empty event is heavy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Second, is it correct to assume that we still want the optional because default-constructed/empty event is heavy?
Yep, one is heavy. I hope one day we able to implement an empty sycl::event
with just empty std::shared_ptr<detail::event_impl> impl
, then optional became unneeded.
sycl/source/detail/queue_impl.cpp
Outdated
@@ -801,6 +802,7 @@ void queue_impl::revisitUnenqueuedCommandsState( | |||
else { | |||
std::lock_guard<std::mutex> RequestLock(MMissedCleanupRequestsMtx); | |||
MMissedCleanupRequests.push_back(CompletedHostTask->getCommandGraph()); | |||
MAreCleanupRequestsMissed.store(true, std::memory_order_release); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the following not an issue:
Thread A Thread B
acquire mtx
check atomic without mutex, empty
push_back
atomic_store_true
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, this is exactly the question I would like to discuss with the fellow sycl experts!
According to my understanding, the change is correct, because the situation you describing is undistinguishing from a situation when Thread A just staying, say, at the beginning of queue_impl::revisitUnenqueuedCommandsState()
, i.e. there is no reason for Thread B to expect something in MMissedCleanupRequests
at this point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sergey-semenov , @KseniyaTikhomirova would be much more knowledgeable about this than I am.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Chatted with @KseniyaTikhomirova about this today. Like Alexandr said, the case is indistinguishable from the first thread being earlier in its execution path.
With the synchronization being made less strict, it will be slightly more likely that we'll exit queue::wait
with some dependency related data still present in the queue. But that's not a functional problem since that data will be handled during the next cleanup call.
Sure, will do, after (or if) the correctness issue would be resolved. |
If you are ok with the change, it's turn out 3 PR: with the template addition and 2 with enabling it for in-order event and for missed requests. Is it sound good? |
Adding yet unused helper in a separate PR will mean a unittest needs to be introduced together with it :) I think if you'd have two different PRs that introduce the same helper in otherwise unmodified area of the code, they won't actually cause any merge conflicts, regardless of the order in which they'll go in. |
Interesting. To be honest, I have a different model here. I thought about single PR with 3 commits inside (adding But your solution with 2 PRs is also possible. Are you ok with the code, so I can start splitting it to two PRs? |
We don't do that in this project (I'm not even sure if that's enabled). |
sycl/source/detail/queue_impl.hpp
Outdated
MDataPresent.store(true, std::memory_order_release); | ||
func(MData); | ||
} | ||
template <typename F> void get(F &&func) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think modifications on get
might be unexpected. Would pop
work as a name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Then, it might be reasonable to change put
to pop
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you mean push
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, typo. Yes, push/pop instead of put/get.
sycl/source/detail/queue_impl.hpp
Outdated
@@ -850,12 +855,12 @@ class queue_impl { | |||
auto EventRet = Handler.finalize(); | |||
EventImplPtr EventRetImpl = getSyclObjImpl(EventRet); | |||
if (Type == CGType::CodeplayHostTask) | |||
Deps.UnenqueuedCmdEvents.push_back(EventRetImpl); | |||
Deps.UnenqueuedCmdEvents.push_back(std::move(EventRetImpl)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getSyclObjImpl(EventRet)
vs std::move(EventRetImpl)
is almost equal, I don't think the temporary variable is justified anymore. IMO, inline it and maybe add a comment for the future as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's turn out this was already modified on sycl branch (reference used instead). So, dropped from the patch.
sycl/source/detail/queue_impl.hpp
Outdated
std::lock_guard<std::mutex> Lock(MInOrderExternalEventMtx); | ||
MInOrderExternalEvent = Event; | ||
MInOrderExternalEvent.put( | ||
[&Event](std::optional<event> &InOrderExternalEvent) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't mind [&](auto &InOrderExternalEvent) { ... }
here and similarly elsewhere, but won't insist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
I am. @sergey-semenov , any objections? |
@sergey-semenov, what do you think? |
No objections from me. |
Colleagues @aelovikov-intel , @sergey-semenov , what do you think? |
This might be hard to read. Maybe "use a flag to avoid locking when empty"? |
Sure, done. |
queue_impl::MInOrderExternalEvent is empty on hot path, use a flag to avoid locking when empty.