Mixing cublas and rustacuda ? #28
Could you post a code snippet showing off this failure? I can hack on that.
On Mon, Jan 21, 2019 at 4:52 PM zeroexcuses wrote:
> Can we please have sample code that
>
> 1. allocates some memory
> 2. calls A = B * C
> 3. calls some kernel on A
> 4. calls sgemm D = E * A
>
> ?
>
> I have some tensor code that runs great in CPU mode, but fails in GPU mode
> (so the algorithm is correct). All CPU vs GPU unit tests pass -- so it
> seems I am running into a synchronization issue.
>
> I am using stream.synchronize after all kernel calls -- so it seems the
> remaining culprit is that the kernels run on streamA while cublas runs on
> streamB, and it's not clear to me how to synchronize the two.
I think I got it working via the following changes:
This appears to cause the BLAS calls to run on the same stream as the kernels. However, I'm a bit uneasy, as I'm brute-force casting a sys::cuda::CUstream* to a sys::cudart::CUstream*, and I'm not sure about the difference between the two.
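On the cast itself, a minimal sketch may help show why it is probably harmless. In the CUDA C headers, the driver API's `CUstream` and the runtime API's `cudaStream_t` are both typedefs for a pointer to the same opaque `struct CUstream_st`, and `cublasSetStream` takes a `cudaStream_t`. Bindings generated separately for each header tend to produce two distinct Rust types for that one C type, which is why a cast is needed at all. The `cuda` and `cudart` modules below are hypothetical stand-ins for such generated bindings, not the actual crate definitions, and the handle value is fake.

```rust
/// Hypothetical stand-in for bindings generated from the driver API header.
#[allow(non_camel_case_types, dead_code)]
mod cuda {
    /// Opaque stream struct; never constructed from Rust.
    #[repr(C)]
    pub struct CUstream_st {
        _private: [u8; 0],
    }
    pub type CUstream = *mut CUstream_st;
}

/// Hypothetical stand-in for bindings generated from the runtime API headers.
#[allow(non_camel_case_types, dead_code)]
mod cudart {
    /// Same C struct, but a distinct type as far as Rust is concerned.
    #[repr(C)]
    pub struct CUstream_st {
        _private: [u8; 0],
    }
    pub type CUstream = *mut CUstream_st;
}

/// Reinterpret a driver-API stream handle as a runtime-API stream handle.
/// Only the pointer's static type changes; the address is preserved.
fn as_runtime_stream(stream: cuda::CUstream) -> cudart::CUstream {
    stream as cudart::CUstream
}

fn main() {
    // Fake handle for illustration only; a real one comes from stream creation.
    let driver_stream = 0x1000usize as cuda::CUstream;
    let runtime_stream = as_runtime_stream(driver_stream);

    // Both handles refer to the same address.
    assert_eq!(driver_stream as usize, runtime_stream as usize);
}
```

Wrapping the cast in one small function like `as_runtime_stream` keeps the reinterpretation in a single, documented place instead of scattering raw casts across call sites. Whether the two APIs may share a stream in a given setup is a separate question that the cast alone does not answer.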