-
Notifications
You must be signed in to change notification settings - Fork 405
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add CUDA Ada architecture support #5867
Conversation
Can one of the admins verify this patch? |
cmake/kokkos_arch.cmake
Outdated
@@ -866,7 +868,7 @@ IF (KOKKOS_ARCH_VOLTA70 OR KOKKOS_ARCH_VOLTA72) | |||
SET(KOKKOS_ARCH_VOLTA ON) | |||
ENDIF() | |||
|
|||
IF (KOKKOS_ARCH_AMPERE80 OR KOKKOS_ARCH_AMPERE86) | |||
IF (KOKKOS_ARCH_AMPERE80 OR KOKKOS_ARCH_AMPERE86 OR KOKKOS_ARCH_ADA89) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That does not look right.
We do not define KOKKOS_ENABLE_VOLTA
when using Turing75 and I fail to see why that is different.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ping
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You may be correct in principle, but my line of thinking is that there is not likely a need for specific optimizations for those consumer grade GPUs and go through the extra maintenance effort. Then grouping this with the closest feature set seems better than not grouping it into any of those at all. On the Kokkos slack it was confirmed by some Nvidia person that this assignment would be acceptable (I forgot who).
Bottom line, as far as I am concerned, you can take this, drop it, or modify it as you see fit. I personally don't see a need for a change and will not follow up any further. I've primarily submitted it since I consider it best practice to submit changes that we include in bundled software to the upstream version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood. Will check how other developers feel about this and push changes if requested. Thx.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also update pre_initialize_internal()
in core/src/impl/Kokkos_Core.cpp
.
@@ -868,7 +868,7 @@ IF (KOKKOS_ARCH_VOLTA70 OR KOKKOS_ARCH_VOLTA72) | |||
SET(KOKKOS_ARCH_VOLTA ON) | |||
ENDIF() | |||
|
|||
IF (KOKKOS_ARCH_AMPERE80 OR KOKKOS_ARCH_AMPERE86 OR KOKKOS_ARCH_ADA89) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dalg24 Did you go through al all places that check if a CUDA
architecture has been specified? I could quickly find
core/src/Cuda/Kokkos_Cuda_Half_Conversion.hpp:266: !(defined(KOKKOS_ARCH_AMPERE) || defined(KOKKOS_ARCH_HOPPER)))
core/src/Cuda/Kokkos_Cuda_Half_Conversion.hpp:394: ((defined(KOKKOS_ARCH_AMPERE) || defined(KOKKOS_ARCH_HOPPER)))
core/src/OpenACC/Kokkos_OpenACC_Traits.hpp:26: defined(KOKKOS_ARCH_AMPERE) || defined(KOKKOS_ARCH_HOPPER)
core/src/OpenMPTarget/Kokkos_OpenMPTarget_Instance.cpp:97: defined(KOKKOS_ARCH_TURING75) || defined(KOKKOS_ARCH_AMPERE) || \
core/src/SYCL/Kokkos_SYCL.cpp:134: !defined(KOKKOS_ARCH_AMPERE) && !defined(KOKKOS_ARCH_HOPPER)
core/src/SYCL/Kokkos_SYCL_Parallel_Team.hpp:309: defined(KOKKOS_ARCH_TURING75) || defined(KOKKOS_ARCH_AMPERE) || \
core/src/SYCL/Kokkos_SYCL_Parallel_Team.hpp:342: defined(KOKKOS_ARCH_TURING75) || defined(KOKKOS_ARCH_AMPERE) || \
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good question. It likely needs to be added in the first location you mention, although I expect it would compile w/o it.
For all the others in ACC, OMP, and SYCL, we discussed yesterday adding a KOKKOS_ARCH_NVIDIA_GPU
macro because that is really what was needed there.
Lets take this PR over also in light of the other one introducing general CUDA arch gpu thingy. |
This change adds support to compile Kokkos for Ada generation (sm_89) consumer GPUs (RTX40x0)