CUDA 8 has 64bit __shfl #361

Closed
hcedwar opened this issue Jul 20, 2016 · 3 comments
Comments

@hcedwar
Contributor

hcedwar commented Jul 20, 2016

The Kokkos::shfl overloads should use the CUDA 8 intrinsic __shfl for 64-bit types as well as 32-bit types. This will have to be protected with compiler version detection.
This is an undocumented feature, so wait for CUDA 8.5.
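
A minimal sketch of what such a version-guarded overload might look like, assuming the toolchains discussed at the time (CUDA 8 and older). Every name here (`SHFL_SKETCH_FN`, `Halves`, `split64`, `join64`, `shfl64_sketch`) is hypothetical and is not the Kokkos API. The 64-bit `__shfl` call is an assumption taken from this issue, since the feature is reported as undocumented. The fallback branch moves a 64-bit value as two 32-bit shuffles.

```cpp
#include <cstdint>

// Hypothetical helper macro: lets the split/join helpers compile both under
// nvcc (host + device) and under a plain C++ compiler (host only).
#if defined(__CUDACC__)
#define SHFL_SKETCH_FN __host__ __device__ inline
#else
#define SHFL_SKETCH_FN inline
#endif

// Two 32-bit words that together hold one 64-bit value.
struct Halves {
  std::uint32_t lo;
  std::uint32_t hi;
};

// Split a 64-bit value into its low and high 32-bit words.
SHFL_SKETCH_FN Halves split64(std::uint64_t v) {
  Halves h;
  h.lo = static_cast<std::uint32_t>(v & 0xffffffffull);
  h.hi = static_cast<std::uint32_t>(v >> 32);
  return h;
}

// Reassemble the 64-bit value from its two words.
SHFL_SKETCH_FN std::uint64_t join64(Halves h) {
  return (static_cast<std::uint64_t>(h.hi) << 32) | h.lo;
}

#if defined(__CUDACC__)
// Hypothetical 64-bit shuffle, selected by compiler version detection.
__device__ inline long long shfl64_sketch(long long v, int src_lane,
                                          int width = 32) {
#if __CUDACC_VER_MAJOR__ >= 8
  // Assumption from the issue: CUDA 8 accepts a 64-bit argument here.
  return __shfl(v, src_lane, width);
#else
  // Older toolkits: shuffle each 32-bit half separately, then rejoin.
  Halves h = split64(static_cast<std::uint64_t>(v));
  h.lo = static_cast<std::uint32_t>(
      __shfl(static_cast<int>(h.lo), src_lane, width));
  h.hi = static_cast<std::uint32_t>(
      __shfl(static_cast<int>(h.hi), src_lane, width));
  return static_cast<long long>(join64(h));
#endif
}
#endif  // __CUDACC__
```

The split and join helpers are plain integer arithmetic, so they can be checked on the host without a GPU. Only the guarded `shfl64_sketch` needs nvcc.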

@hcedwar hcedwar added the Enhancement Improve existing capability; will potentially require voting label Jul 20, 2016
@hcedwar hcedwar added this to the Backlog milestone Jul 20, 2016
@hcedwar
Contributor Author

hcedwar commented Jun 20, 2017

CUDA 8.5 never happened. Perhaps CUDA 9.

@srajama1

Checking the status on this.

@ibaned ibaned self-assigned this Mar 14, 2018
@ibaned ibaned modified the milestones: Backlog, 2018 April Mar 14, 2018
@ibaned
Contributor

ibaned commented Apr 3, 2018

It looks like CUDA 9's __shfl_sync and friends support long long. Working on adding this to Kokkos.
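
As a rough illustration of the CUDA 9 route, the sketch below sends any 8-byte trivially copyable type through the `long long` overload of `__shfl_sync` by copying its bits in and out. The names `SHFL_SYNC_SKETCH_FN`, `to_bits64`, `from_bits64`, and `shfl_sync64_sketch` are hypothetical, and this is not necessarily how Kokkos implemented it. The full-warp default mask is also an assumption; callers with divergent warps would need to pass the mask of participating lanes.

```cpp
#include <cstring>
#include <type_traits>

// Hypothetical helper macro so the bit-copy helpers compile with or without nvcc.
#if defined(__CUDACC__)
#define SHFL_SYNC_SKETCH_FN __host__ __device__ inline
#else
#define SHFL_SYNC_SKETCH_FN inline
#endif

// Copy the bits of an 8-byte value into a long long.
template <class T>
SHFL_SYNC_SKETCH_FN long long to_bits64(const T& v) {
  static_assert(sizeof(T) == sizeof(long long), "T must be 8 bytes wide");
  static_assert(std::is_trivially_copyable<T>::value,
                "T must be trivially copyable");
  long long bits;
  std::memcpy(&bits, &v, sizeof(bits));
  return bits;
}

// Copy the bits of a long long back into an 8-byte value of type T.
template <class T>
SHFL_SYNC_SKETCH_FN T from_bits64(long long bits) {
  static_assert(sizeof(T) == sizeof(long long), "T must be 8 bytes wide");
  T v;
  std::memcpy(&v, &bits, sizeof(v));
  return v;
}

#if defined(__CUDACC__) && __CUDACC_VER_MAJOR__ >= 9
// Hypothetical wrapper: read lane src_lane's value of v via the long long
// overload of __shfl_sync. The default mask assumes all 32 lanes participate.
template <class T>
__device__ inline T shfl_sync64_sketch(const T& v, int src_lane,
                                       int width = 32,
                                       unsigned mask = 0xffffffffu) {
  return from_bits64<T>(__shfl_sync(mask, to_bits64(v), src_lane, width));
}
#endif
```

The bit-copy helpers run on the host, so the round trip can be verified without a device. Only `shfl_sync64_sketch` requires CUDA 9 or newer.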
