Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CUDA Ada architecture support #5867

Merged
merged 3 commits into from
Mar 8, 2023

Conversation

akohlmey
Copy link
Contributor

@akohlmey akohlmey commented Feb 8, 2023

This change adds support to compile Kokkos for Ada generation (sm_89) consumer GPUs (RTX40x0)

@dalg24-jenkins
Copy link
Collaborator

Can one of the admins verify this patch?

@@ -866,7 +868,7 @@ IF (KOKKOS_ARCH_VOLTA70 OR KOKKOS_ARCH_VOLTA72)
SET(KOKKOS_ARCH_VOLTA ON)
ENDIF()

IF (KOKKOS_ARCH_AMPERE80 OR KOKKOS_ARCH_AMPERE86)
IF (KOKKOS_ARCH_AMPERE80 OR KOKKOS_ARCH_AMPERE86 OR KOKKOS_ARCH_ADA89)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That does not look right.
We do not define KOKKOS_ENABLE_VOLTA when using Turing75 and I fail to see why that is different.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping

Copy link
Contributor Author

@akohlmey akohlmey Feb 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may be correct in principle, but my line of thinking is that there is not likely a need for specific optimizations for those consumer grade GPUs and go through the extra maintenance effort. Then grouping this with the closest feature set seems better than not grouping it into any of those at all. On the Kokkos slack it was confirmed by some Nvidia person that this assignment would be acceptable (I forgot who).

Bottom line, as far as I am concerned, you can take this, drop it, or modify it as you see fit. I personally don't see a need for a change and will not follow up any further. I've primarily submitted it since I consider it best practice to submit changes that we include in bundled software to the upstream version.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood. Will check how other developers feel about this and push changes if requested. Thx.

Copy link
Contributor

@masterleinad masterleinad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also update pre_initialize_internal() in core/src/impl/Kokkos_Core.cpp.

@@ -868,7 +868,7 @@ IF (KOKKOS_ARCH_VOLTA70 OR KOKKOS_ARCH_VOLTA72)
SET(KOKKOS_ARCH_VOLTA ON)
ENDIF()

IF (KOKKOS_ARCH_AMPERE80 OR KOKKOS_ARCH_AMPERE86 OR KOKKOS_ARCH_ADA89)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dalg24 Did you go through al all places that check if a CUDA architecture has been specified? I could quickly find

core/src/Cuda/Kokkos_Cuda_Half_Conversion.hpp:266:     !(defined(KOKKOS_ARCH_AMPERE) || defined(KOKKOS_ARCH_HOPPER)))
core/src/Cuda/Kokkos_Cuda_Half_Conversion.hpp:394:    ((defined(KOKKOS_ARCH_AMPERE) || defined(KOKKOS_ARCH_HOPPER)))
core/src/OpenACC/Kokkos_OpenACC_Traits.hpp:26:    defined(KOKKOS_ARCH_AMPERE) || defined(KOKKOS_ARCH_HOPPER)
core/src/OpenMPTarget/Kokkos_OpenMPTarget_Instance.cpp:97:    defined(KOKKOS_ARCH_TURING75) || defined(KOKKOS_ARCH_AMPERE) || \
core/src/SYCL/Kokkos_SYCL.cpp:134:    !defined(KOKKOS_ARCH_AMPERE) && !defined(KOKKOS_ARCH_HOPPER)
core/src/SYCL/Kokkos_SYCL_Parallel_Team.hpp:309:    defined(KOKKOS_ARCH_TURING75) || defined(KOKKOS_ARCH_AMPERE) || \
core/src/SYCL/Kokkos_SYCL_Parallel_Team.hpp:342:    defined(KOKKOS_ARCH_TURING75) || defined(KOKKOS_ARCH_AMPERE) || \

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good question. It likely needs to be added in the first location you mention, although I expect it would compile w/o it.
For all the others in ACC, OMP, and SYCL, we discussed yesterday adding a KOKKOS_ARCH_NVIDIA_GPU macro because that is really what was needed there.

@crtrott
Copy link
Member

crtrott commented Mar 7, 2023

Lets take this PR over also in light of the other one introducing general CUDA arch gpu thingy.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants