-
Notifications
You must be signed in to change notification settings - Fork 165
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Safe to change streams on rocblas_handle
without synchronization?
#1253
Comments
The following link has documentation on Handle, Stream, and Device management. https://docs.amd.com/bundle/rocBLAS-User-Guide---rocBLAS-documentation/page/Programmers_Guide_15.html#handle-stream-and-device-management A paragraph from the documentation is below:
Below is sample code from the documentation:
Is your request that we remove the requirement for the |
Thanks @amcamd for the response! I wasn't aware of that documentation, and it answers my question perfectly. Do you think it'd be worth lifting some of that information into more user-facing documentation (e.g. the documentation for
You've answered my primary request, which was just finding out if our understanding of the behaviour is correct. For consistency with CUDA it would be nice if rocBLAS behaved the same (or at least hipBLAS, but I suppose making hipBLAS behave the same practically means making rocBLAS behave the same...), but I understand that this would likely not be a trivial change. Also, with this knowledge we can already work around this behaviour (with one stream per handle or manually allocated workspaces) so for the time being we're good. From my perspective this issue can be closed (since you've answered my documentation question). |
Thanks @msimberg for comments on improving documentation. We will plan to move some of the content of the "Programmer's Guide" to make it more accessible by users. |
I'm wondering if it is safe to change streams on
rocblas_handle
s without intermediate synchronization? We have a library where we use thread-local handles but more than one stream per thread for concurrency. For each successive use of a handle we go round-robin through the streams and set the stream on the handle. However, we're seeing errors in the results with this setup. Browsing through the rocblas source code and the documentation (e.g. https://rocblas.readthedocs.io/en/latest/API_Reference_Guide.html#device-memory-allocation-in-rocblas) everything seems to point at the fact that it's not safe to change the stream because then concurrent work on different streams may end up using the workspace simultaneously.Can you confirm if this is the case? If yes, is this planned to change? Will the behaviour be explicitly documented? The library is currently already using cuBLAS where it is safe to change the stream to launch concurrent work on the same handle.
The text was updated successfully, but these errors were encountered: