Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL] Do not lock unconditionally while access queue_iml members #17575

Open
wants to merge 1 commit into
base: sycl
Choose a base branch
from

Conversation

Alexandr-Konovalov
Copy link
Contributor

queue_impl::MMissedCleanupRequests and queue_impl::MInOrderExternalEvent are empty on hot path, check a flag instead or before the locking.

queue_impl::MMissedCleanupRequests and queue_impl::MInOrderExternalEvent are
empty on hot path, check a flag instead or before the locking.
Copy link
Contributor

@aelovikov-intel aelovikov-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, I'd prefer it to be two different PRs (in case something would regress, easier bisecting/reverts).

Also, I'd rather see a bit extra boilerplate code via abstracting
class Helper{ atomic flag; other_data b; methods... };

so that we could ensure that whenever b is set/unset the flag is always updated properly.

@@ -1040,6 +1047,8 @@ class queue_impl {
// the fallback implementation of profiling info
bool MFallbackProfiling = false;

// Is value presented in MInOrderExternalEvent?
std::atomic_bool MHasValueInOrderExternalEvent = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the name is confusing, maybe MInOrderExteranEventIsSet would be better?

Second, is it correct to assume that we still want the optional because default-constructed/empty event is heavy?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Second, is it correct to assume that we still want the optional because default-constructed/empty event is heavy?

Yep, one is heavy. I hope one day we able to implement an empty sycl::event with just empty std::shared_ptr<detail::event_impl> impl, then optional became unneeded.

@@ -801,6 +802,7 @@ void queue_impl::revisitUnenqueuedCommandsState(
else {
std::lock_guard<std::mutex> RequestLock(MMissedCleanupRequestsMtx);
MMissedCleanupRequests.push_back(CompletedHostTask->getCommandGraph());
MAreCleanupRequestsMissed.store(true, std::memory_order_release);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the following not an issue:

Thread A        Thread B
acquire mtx
                check atomic without mutex, empty
push_back
atomic_store_true

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, this is exactly the question I would like to discuss with the fellow sycl experts!

According to my understanding, the change is correct, because the situation you describing is undistinguishing from a situation when Thread A just staying, say, at the beginning of queue_impl::revisitUnenqueuedCommandsState(), i.e. there is no reason for Thread B to expect something in MMissedCleanupRequests at this point.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sergey-semenov , @KseniyaTikhomirova would be much more knowledgeable about this than I am.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chatted with @KseniyaTikhomirova about this today. Like Alexandr said, the case is indistinguishable from the first thread being earlier in its execution path.
With the synchronization being made less strict, it will be slightly more likely that we'll exit queue::wait with some dependency related data still present in the queue. But that's not a functional problem since that data will be handled during the next cleanup call.

@Alexandr-Konovalov
Copy link
Contributor Author

Honestly, I'd prefer it to be two different PRs (in case something would regress, easier bisecting/reverts).

Sure, will do, after (or if) the correctness issue would be resolved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants