Fix Atomic{Min,Max} for Kepler30 #3780

janciesko · 2021-02-04T16:59:43Z

Maintains support for Kepler (compute capability 3.0) by using fallbacks for the atomic min and max operations.
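
For readers unfamiliar with this kind of fallback, one common approach is a compare-and-swap retry loop. The sketch below is a host-side C++ analog built on `std::atomic`, not the code added by this PR; the names `fetch_max_cas` and `fetch_min_cas` are hypothetical and chosen only for illustration.

```cpp
#include <atomic>

// Hypothetical helper (not part of Kokkos): emulate an atomic fetch-max
// with a compare-and-swap loop when no native instruction is available.
template <typename T>
T fetch_max_cas(std::atomic<T>& dest, T val) {
  T observed = dest.load();
  // Stop as soon as the stored value is already >= val, or our CAS succeeds.
  // On failure, compare_exchange_weak refreshes `observed` with the current value.
  while (observed < val && !dest.compare_exchange_weak(observed, val)) {
  }
  return observed;  // value seen before any update, as in a fetch-style op
}

// Hypothetical helper (not part of Kokkos): the mirror image for fetch-min.
template <typename T>
T fetch_min_cas(std::atomic<T>& dest, T val) {
  T observed = dest.load();
  while (observed > val && !dest.compare_exchange_weak(observed, val)) {
  }
  return observed;
}
```

Calling `fetch_max_cas(a, 9ULL)` on an atomic holding 5 returns 5 and leaves 9 stored; calling it again with a smaller value returns the current value and leaves it unchanged.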

masterleinad · 2021-02-04T18:58:15Z

core/src/Cuda/Kokkos_Cuda_View.hpp

@@ -139,7 +139,7 @@ struct CudaLDGFetch {

  template <typename iType>
  KOKKOS_INLINE_FUNCTION ValueType operator[](const iType& i) const {
-#ifdef __CUDA_ARCH__
+#if defined(__CUDA_ARCH__) && !defined(KOKKOS_ARCH_KEPLER30)


What about KOKKOS_ARCH_KEPLER32?

KOKKOS_ARCH_KEPLER32 works as is; no changes are needed.

dalg24

You forgot Kepler32

dalg24 · 2021-02-04T19:01:23Z

core/src/Cuda/Kokkos_Cuda_View.hpp

@@ -139,7 +139,7 @@ struct CudaLDGFetch {

  template <typename iType>
  KOKKOS_INLINE_FUNCTION ValueType operator[](const iType& i) const {
-#ifdef __CUDA_ARCH__
+#if defined(__CUDA_ARCH__) && !defined(KOKKOS_ARCH_KEPLER30)


Suggested change

-#if defined(__CUDA_ARCH__) && !defined(KOKKOS_ARCH_KEPLER30)
+#if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ >= 350)

The documentation says "only supported by devices of compute capability 3.5 and higher"
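
To illustrate the difference, a guard on the numeric value of `__CUDA_ARCH__` covers every architecture below 3.5 without naming each `KOKKOS_ARCH_*` macro. The following is a minimal sketch, assuming a hypothetical helper `load_read_only` and a hypothetical macro `SKETCH_HOST_DEVICE`; it is not the Kokkos `CudaLDGFetch` code. `__ldg` is the CUDA read-only cache load intrinsic, which is overloaded for built-in types only.

```cpp
#include <cstddef>

// Expand to CUDA function attributes only under a CUDA compiler, so the
// sketch also builds as plain C++ (hypothetical macro, not a Kokkos one).
#if defined(__CUDACC__)
#define SKETCH_HOST_DEVICE __host__ __device__
#else
#define SKETCH_HOST_DEVICE
#endif

// Hypothetical helper: use the read-only cache load where the hardware
// supports it, and an ordinary load everywhere else (including the host).
template <typename T>
SKETCH_HOST_DEVICE inline T load_read_only(const T* ptr, std::size_t i) {
#if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ >= 350)
  return __ldg(&ptr[i]);  // device code, compute capability 3.5 and higher
#else
  return ptr[i];          // host code, or device code below 3.5
#endif
}
```

With this form, a Kepler32 build takes the ordinary-load branch automatically, since its `__CUDA_ARCH__` value is below 350.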

dalg24 · 2021-02-04T19:04:37Z

core/src/impl/Kokkos_Atomic_MinMax.hpp

@@ -101,6 +101,52 @@ inline __host__ unsigned long long int atomic_fetch_max(

 #endif

+#if defined(KOKKOS_ARCH_KEPLER30)


Suggested change

-#if defined(KOKKOS_ARCH_KEPLER30)
+#if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ < 350)

Same as above.

dalg24 · 2021-02-04T19:06:32Z

core/src/impl/Kokkos_Atomic_MinMax.hpp

+                                 dest, val);
+}
+
+#else  //(!KOKKOS_ARCH_KEPLER30)


Suggested change

-#else //(!KOKKOS_ARCH_KEPLER30)
+#else // supported by devices of compute capability 3.5 and higher

dalg24 · 2021-02-04T19:07:01Z

core/src/impl/Kokkos_Atomic_MinMax.hpp

@@ -178,6 +226,52 @@ inline __host__ unsigned long long int atomic_max_fetch(
 }
 #endif

+#if defined(KOKKOS_ARCH_KEPLER30)


Suggested change

-#if defined(KOKKOS_ARCH_KEPLER30)
+#if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ < 350)

dalg24 · 2021-02-04T19:07:29Z

core/src/impl/Kokkos_Atomic_MinMax.hpp

+                                 dest, val);
+}
+
+#else  //(!KOKKOS_ARCH_KEPLER30)


Suggested change

-#else //(!KOKKOS_ARCH_KEPLER30)
+#else // supported by devices of compute capability 3.5 and higher

crtrott · 2021-02-05T17:28:32Z

OK, I checked the 9.0 documentation: it looks like it's just the 64-bit integer atomics that are not supported before compute capability 3.5; the 32-bit ones seem to be fine (min/max/or/xor, etc.).
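
Read literally, that observation means a fallback would only be needed when both conditions hold: the target is below compute capability 3.5 and the operand is 64 bits wide. A minimal sketch of that rule, assuming a hypothetical predicate `needs_minmax_fallback` that is not part of Kokkos:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical predicate: `cuda_arch` uses the __CUDA_ARCH__ encoding
// (300 for compute capability 3.0, 350 for 3.5), `value_size` is in bytes.
constexpr bool needs_minmax_fallback(int cuda_arch, std::size_t value_size) {
  return cuda_arch < 350 && value_size == 8;
}

// 64-bit integers on compute capability 3.0: fallback needed.
static_assert(needs_minmax_fallback(300, sizeof(std::uint64_t)),
              "64-bit min/max needs a fallback below 3.5");
// 32-bit integers on compute capability 3.0: native atomics suffice.
static_assert(!needs_minmax_fallback(300, sizeof(std::uint32_t)),
              "32-bit min/max is natively supported");
// 64-bit integers on compute capability 3.5: native atomics suffice.
static_assert(!needs_minmax_fallback(350, sizeof(std::uint64_t)),
              "64-bit min/max is natively supported from 3.5");
```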

janciesko mentioned this pull request Feb 4, 2021

Remove support for CUDA devices with compute capability less than 3.5 #3764

Closed

janciesko requested a review from masterleinad February 4, 2021 17:16

masterleinad reviewed Feb 4, 2021

dalg24 requested changes Feb 4, 2021

janciesko requested a review from dalg24 February 10, 2021 18:33

Fix Atomic{Min,Max} for Kepler30

b968927

janciesko force-pushed the FixMinMaxKepler30 branch from 43872e1 to b968927 Compare February 10, 2021 20:27

dalg24 approved these changes Feb 10, 2021

DavidPoliakoff approved these changes Feb 11, 2021

dalg24 merged commit ee8ff17 into kokkos:develop Feb 11, 2021

dalg24 mentioned this pull request Feb 11, 2021

Bump minimum supported CUDA compute capability to >=3.5 #3761

Closed

dalg24 mentioned this pull request Mar 22, 2022

CUDA LDG fetch never happens due to typo _CUDA_ARCH__ #4892

Closed
