Add launch bounds to TransposeBatch kernel to avoid cudaErrorLaunchOutOfResources #2971
…tOfResources
- Adds launch bounds to the TransposeBatch kernel in fft_postprocess.cuh so that cudaErrorLaunchOutOfResources does not occur on some GPUs.
Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
!build
CI MESSAGE: [2382032]: BUILD STARTED
CI MESSAGE: [2382036]: BUILD STARTED
CI MESSAGE: [2382037]: BUILD STARTED
CI MESSAGE: [2382037]: BUILD PASSED
CI MESSAGE: [2382036]: BUILD PASSED
CI MESSAGE: [2382032]: BUILD FAILED
CI MESSAGE: [2382032]: BUILD PASSED
@@ -143,7 +143,9 @@ __global__ void ConvertTimeMajorSpectrogram(
 }

 template <typename Out, typename In, typename Convert = identity>
-__global__ void TransposeBatch(
+__global__ void
+__launch_bounds__(32*kBlock)
-__launch_bounds__(32*kBlock)
+__launch_bounds__(kBlock*kBlock)
That's how this kernel is invoked in L345 - but now that I look at it, it would make sense to modify it a tiny bit (not in this PR, I guess).
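For context, a minimal sketch of why the bound should match the launch configuration. Without `__launch_bounds__`, the compiler may give each thread enough registers that a block of `kBlock*kBlock` threads needs more than a multiprocessor provides, and the launch then fails with `cudaErrorLaunchOutOfResources`. The code below is not the DALI kernel: `kTile`, `TransposeSketch` and `RunTransposeSketch` are hypothetical names, and `kTile = 32` is an assumed stand-in for `kBlock`.

```cuda
#include <cassert>
#include <cstddef>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical tile size; an assumed stand-in for kBlock in fft_postprocess.cuh.
constexpr int kTile = 32;

// The bound equals the number of threads per block used at the launch site
// (kTile x kTile), so the compiler caps per-thread register usage to fit it.
__global__ void __launch_bounds__(kTile * kTile)
TransposeSketch(float *out, const float *in, int rows, int cols) {
  __shared__ float tile[kTile][kTile + 1];  // extra column avoids bank conflicts

  int x = blockIdx.x * kTile + threadIdx.x;  // input column
  int y = blockIdx.y * kTile + threadIdx.y;  // input row
  if (x < cols && y < rows)
    tile[threadIdx.y][threadIdx.x] = in[y * cols + x];
  __syncthreads();

  int tx = blockIdx.y * kTile + threadIdx.x;  // output column (= input row)
  int ty = blockIdx.x * kTile + threadIdx.y;  // output row (= input column)
  if (tx < rows && ty < cols)
    out[ty * rows + tx] = tile[threadIdx.x][threadIdx.y];
}

// Hypothetical host helper: transposes a rows x cols matrix on the GPU and
// returns true only if every CUDA call succeeded and the result is correct.
bool RunTransposeSketch(int rows, int cols) {
  assert(rows > 0 && cols > 0);
  std::size_t n = static_cast<std::size_t>(rows) * cols;
  std::size_t bytes = n * sizeof(float);
  std::vector<float> h_in(n), h_out(n, -1.0f);
  for (std::size_t i = 0; i < n; i++) h_in[i] = static_cast<float>(i);

  float *d_in = nullptr, *d_out = nullptr;
  if (cudaMalloc(&d_in, bytes) != cudaSuccess) return false;
  if (cudaMalloc(&d_out, bytes) != cudaSuccess) { cudaFree(d_in); return false; }

  cudaError_t err = cudaMemcpy(d_in, h_in.data(), bytes, cudaMemcpyHostToDevice);
  if (err == cudaSuccess) {
    dim3 block(kTile, kTile);  // kTile * kTile threads, matching the bound
    dim3 grid((cols + kTile - 1) / kTile, (rows + kTile - 1) / kTile);
    TransposeSketch<<<grid, block>>>(d_out, d_in, rows, cols);
    // Launch failures such as cudaErrorLaunchOutOfResources are reported here.
    err = cudaGetLastError();
  }
  if (err == cudaSuccess) err = cudaDeviceSynchronize();
  if (err == cudaSuccess)
    err = cudaMemcpy(h_out.data(), d_out, bytes, cudaMemcpyDeviceToHost);
  cudaFree(d_in);
  cudaFree(d_out);
  if (err != cudaSuccess) return false;

  // Check out[c][r] == in[r][c] for every element.
  for (int r = 0; r < rows; r++)
    for (int c = 0; c < cols; c++)
      if (h_out[static_cast<std::size_t>(c) * rows + r] !=
          h_in[static_cast<std::size_t>(r) * cols + c])
        return false;
  return true;
}
```

Calling `RunTransposeSketch(70, 45)` from a host program exercises partial tiles in both dimensions. A bound smaller than the launched block size (for example `32*kTile` with a larger tile) would make the launch itself invalid, which is why tying the bound to the same constant as the launch configuration is the safer choice.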
Done
Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
!build
CI MESSAGE: [2386250]: BUILD STARTED
CI MESSAGE: [2386250]: BUILD PASSED
Why we need this PR?
To make sure that cudaErrorLaunchOutOfResources does not occur on some GPUs.
What happened in this PR?
Adds launch bounds to the TransposeBatch kernel in fft_postprocess.cuh so that cudaErrorLaunchOutOfResources does not occur on some GPUs.
fft_postprocess.cuh
NA
CI
NA
JIRA TASK: [NA]