[SYCL][Graph] Reuse recording queue when available for finalize #20106
node_impl &add(std::shared_ptr<dynamic_command_group_impl> &DynCGImpl,
               nodes_range Deps);

std::weak_ptr<sycl::detail::queue_impl> getQueue() const;
I think this should return `std::shared_ptr` instead (or maybe a reference to weak). Creating a copy of the `weak_ptr` requires atomics and isn't much faster than creating a `std::shared_ptr` (if at all, AFAIK). As such, creating a copy here followed by the lock/dtor on the use side is like triple atomics vs. a single one when returning a shared pointer.
I might be totally wrong here, but that's my best understanding of how those things work.
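To make the comparison concrete, here is a minimal sketch of the two getter shapes. `Queue`, `Graph`, `getQueueWeak`, `getQueueShared`, `useWeak` and `useShared` are hypothetical names for illustration, and storing the queue as a `weak_ptr` member is an assumption, not the actual `graph_impl` layout. It only shows where the extra reference-count traffic comes from; it does not measure it.

```cpp
#include <memory>

struct Queue { int id = 0; };  // hypothetical stand-in for queue_impl

class Graph {  // hypothetical holder, not the real graph_impl
public:
  explicit Graph(const std::shared_ptr<Queue> &Q) : MQueue(Q) {}

  // Shape under review: returning by value copies the weak_ptr (a weak-count
  // update), and the caller still has to lock() and then destroy the copy.
  std::weak_ptr<Queue> getQueueWeak() const { return MQueue; }

  // Suggested shape: a single lock(); yields nullptr if the queue is gone.
  std::shared_ptr<Queue> getQueueShared() const { return MQueue.lock(); }

private:
  std::weak_ptr<Queue> MQueue;  // assumption: the queue is tracked weakly
};

// Call site for the weak getter: copy, lock, then destroy the temporary.
int useWeak(const Graph &G) {
  if (auto Q = G.getQueueWeak().lock())
    return Q->id;
  return -1;  // queue already destroyed
}

// Call site for the shared getter: the result is usable directly.
int useShared(const Graph &G) {
  if (auto Q = G.getQueueShared())
    return Q->id;
  return -1;  // queue already destroyed
}
```

Both call sites observe the same queue while it is alive and both see "no queue" after the last owner releases it; the difference is only in how many reference-count operations happen on the way.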
Good catch and thanks for your feedback!
Let's use `shared_ptr` since it works better with the common case.
I don't think a reference to weak is easily applicable, because we don't necessarily have a `weak_ptr` to return, even though I like the idea.
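One way to picture why a reference to weak does not fit: if the getter can be served from more than one source, there may be no stored `weak_ptr` object to hand out by reference. The sketch below assumes a hypothetical `Graph` with a weakly tracked recording queue and a strongly held fallback; these members and setters are invented for illustration and are not taken from the PR.

```cpp
#include <memory>
#include <utility>

struct Queue { int id = 0; };  // hypothetical stand-in for queue_impl

class Graph {  // hypothetical, not the real graph_impl
public:
  void setRecordingQueue(const std::shared_ptr<Queue> &Q) { MRecording = Q; }
  void setFallbackQueue(std::shared_ptr<Queue> Q) { MFallback = std::move(Q); }

  // Returning shared_ptr works for both sources. A `const std::weak_ptr<Queue> &`
  // return could not be served from MFallback without first materializing
  // a weak_ptr object somewhere for the reference to bind to.
  std::shared_ptr<Queue> getQueue() const {
    if (auto Q = MRecording.lock())
      return Q;        // recording queue is still alive: reuse it
    return MFallback;  // otherwise the fallback (may be nullptr)
  }

private:
  std::weak_ptr<Queue> MRecording;   // assumption: tracked weakly
  std::shared_ptr<Queue> MFallback;  // assumption: held strongly
};
```

With this shape the caller gets a ready-to-use pointer or `nullptr` regardless of which member supplied it.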
af7807d to 7cfd0dd
LGTM
@intel/llvm-gatekeepers please consider merging
There are two issues with the current implementation:
finalize()
to common path.finalize()
.