
Cuda launch bounds performance regression bug #1140

Closed
hcedwar opened this issue Oct 3, 2017 · 3 comments


hcedwar commented Oct 3, 2017

As noted in #1139, the Cuda launch bounds PR #909 (fixed in #912) introduced a performance regression. Revise the Cuda launch bounds functionality so that no bounds are set at all when none are specified, which was the behavior prior to that PR.

@hcedwar hcedwar added the Bug Broken / incorrect code; it could be Kokkos' responsibility, or others’ (e.g., Trilinos) label Oct 3, 2017
@hcedwar hcedwar added this to the 2017 October milestone Oct 3, 2017

hcedwar commented Oct 3, 2017

impl/Kokkos_AnalyzePolicy.hpp should have a default launch bounds trait of void, and Cuda/Kokkos_CudaExec.hpp should then not set the launch bounds parameters when the launch bounds trait is void.
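
A minimal host-side sketch of that dispatch, to make the idea concrete. It is not the Kokkos implementation: `LaunchBounds`, `Policy`, `KernelChoice`, `SelectKernel`, and `select_kernel` are hypothetical stand-ins, and the returned record only describes which kernel variant the launch code would instantiate (one declared with `__launch_bounds__`, one without).

```cpp
#include <type_traits>

// Hypothetical stand-in for an explicit launch bounds specification.
template <unsigned MaxThreads, unsigned MinBlocks>
struct LaunchBounds {
  static constexpr unsigned max_threads = MaxThreads;
  static constexpr unsigned min_blocks  = MinBlocks;
};

// Hypothetical policy: the launch bounds trait defaults to void,
// meaning "the user did not specify any bounds".
template <class Bounds = void>
struct Policy {
  using launch_bounds = Bounds;
};

// Describes which kernel variant the launch code would pick.
struct KernelChoice {
  bool     use_launch_bounds;
  unsigned max_threads;
  unsigned min_blocks;
};

// Primary template: only valid for a void trait. Selects the kernel
// variant that is declared without __launch_bounds__.
template <class Bounds>
struct SelectKernel {
  static_assert(std::is_void<Bounds>::value,
                "launch bounds trait must be void or LaunchBounds<T, B>");
  static constexpr KernelChoice value() { return {false, 0u, 0u}; }
};

// Explicit bounds: selects the variant declared with
// __launch_bounds__(T, B) and forwards the two parameters.
template <unsigned T, unsigned B>
struct SelectKernel<LaunchBounds<T, B>> {
  static constexpr KernelChoice value() { return {true, T, B}; }
};

// Maps a policy to its kernel choice at compile time.
template <class P>
constexpr KernelChoice select_kernel(const P&) {
  return SelectKernel<typename P::launch_bounds>::value();
}
```

With this shape, `select_kernel(Policy<>{})` reports that no bounds are applied, while `select_kernel(Policy<LaunchBounds<256, 2>>{})` reports bounds of 256 threads and 2 blocks. In real CUDA code the two branches would correspond to two separately declared `__global__` kernel templates, since `__launch_bounds__` is part of the kernel declaration and cannot be toggled at run time.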


hcedwar commented Oct 3, 2017

@stanmoore1 confirmed this is the root cause, as noted in #1139.
High priority to fix for the next promotion.

@hcedwar hcedwar added the Blocks Promotion Overview issue for release-blocking bugs label Oct 3, 2017
@hcedwar hcedwar self-assigned this Oct 3, 2017

mhoemmen commented Oct 4, 2017

@prwolfe @rrdrake

hcedwar added a commit that referenced this issue Oct 5, 2017
Set default CUDA launch bounds to <0,0> and do not use
CUDA __launch_bounds__ unless CUDA launch bounds are explicitly specified.
crtrott added a commit that referenced this issue Oct 5, 2017
Fix for #1139 performance regression bug (and #1140 for tracking).
@crtrott crtrott closed this as completed Oct 28, 2017