Misleading Kokkos::Cuda::initialize ERROR message when compiled for wrong GPU architecture #1944

wcohen · 2018-12-18T15:30:34Z

I built the kokkos-tutorials Intro-Short/Exercises/02/Solution on a Lenovo P50 laptop with a Maxwell-based Quadro M1000M GPU running RHEL7. When run, it prints the following misleading error message:

Kokkos::Cuda::initialize ERROR: running kernels compiled for compute capability 0.0 (< 5.0) on device with compute capability 5.0 (>=5.0), this would give incorrect results!
Aborted (core dumped)

The problem is caused by KOKKOS_ARCH being set to Volta70 in the Makefile. If KOKKOS_ARCH is changed to Maxwell, the example works properly on the machine. Shouldn't the error message state that the kernel has been compiled for 7.0 rather than 0.0?

dsunder · 2018-12-18T17:47:11Z

The problem is that Kokkos runs a kernel to detect the architecture the code was compiled for, but the kernel fails to launch, so the architecture is detected incorrectly. The error message is misleading, and we should investigate whether there is a better way to detect the value of __CUDA_ARCH__ without running a kernel. At a minimum, we can improve the error message to say that we were unable to determine the architecture.
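To illustrate why a failed probe could surface as "0.0", here is a minimal host-only sketch. It is not the Kokkos code, which is not shown in this thread. It assumes the probe kernel would write the `__CUDA_ARCH__` value (700 for compute capability 7.0) into a zero-initialized variable; `ComputeCapability` and `decode_cuda_arch` are hypothetical names introduced for illustration.

```cpp
// Hypothetical types and helpers, for illustration only.
struct ComputeCapability {
  int cc_major;
  int cc_minor;
};

// Split a __CUDA_ARCH__-style value (e.g. 700) into major and minor parts.
constexpr ComputeCapability decode_cuda_arch(int cuda_arch) {
  return ComputeCapability{cuda_arch / 100, (cuda_arch % 100) / 10};
}

// Assumption: the value starts at 0 and is only overwritten if the probe
// kernel actually runs. A kernel that never launches leaves it at 0,
// which decodes to compute capability 0.0.
static_assert(decode_cuda_arch(0).cc_major == 0, "failed probe: major 0");
static_assert(decode_cuda_arch(0).cc_minor == 0, "failed probe: minor 0");

// A probe that did run in a Volta70 build would report 700, i.e. 7.0.
static_assert(decode_cuda_arch(700).cc_major == 7, "Volta70: major 7");
static_assert(decode_cuda_arch(700).cc_minor == 0, "Volta70: minor 0");
```

Under that assumption, a separate "could not determine the architecture" state would be needed to tell a failed probe apart from a real value.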

ian-bertolacci · 2019-07-10T18:47:05Z

What's the status of this?
I am getting a similar error:

Kokkos::Cuda::initialize ERROR: running kernels compiled for compute capability 3.5 (< 5.0) on device with compute capability 6.1 (>=5.0), this would give incorrect results!

Output from CUDA deviceQuery:

./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce GTX 1080"
  CUDA Driver Version / Runtime Version          9.0 / 8.0
  CUDA Capability Major/Minor version number:    6.1
  Total amount of global memory:                 8114 MBytes (8508145664 bytes)
  (20) Multiprocessors, (128) CUDA Cores/MP:     2560 CUDA Cores
  GPU Max Clock rate:                            1797 MHz (1.80 GHz)
  Memory Clock rate:                             5005 Mhz
  Memory Bus Width:                              256-bit
  L2 Cache Size:                                 2097152 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 2 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.0, CUDA Runtime Version = 8.0, NumDevs = 1, Device0 = GeForce GTX 1080
Result = PASS

Are there work-arounds?

dsunder · 2019-07-11T19:35:06Z

@ian-bertolacci To avoid this error, you need to set the architecture flag correctly when you configure Kokkos. For a GeForce GTX 1080, the architecture flag should be set to Pascal.

ian-bertolacci · 2019-07-11T20:10:49Z

@dsunder Brilliant! Fixed my blocking issues. Thank you so much for your help.

crtrott · 2020-03-12T15:31:52Z

Fixed the error and warning messages. Note that I also strengthened the error criteria: if you compile for X.Y and run on M.N, it will error out for X!=M||Y>N, which is what CUDA technically requires. That is, you can run code compiled for an older minor architecture revision on a newer minor revision, but you can't run across major revisions.
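The rule stated in words above (same major revision, and a compiled minor revision no newer than the device's) can be sketched as a small predicate. This is an illustration rather than the actual Kokkos check; `can_run_on_device` is a hypothetical name.

```cpp
// Hypothetical helper, not the Kokkos implementation: returns true when
// code compiled for compute capability X.Y may run on a device with
// compute capability M.N (same major revision, compiled minor revision
// not newer than the device's).
constexpr bool can_run_on_device(int compiled_major, int compiled_minor,
                                 int device_major, int device_minor) {
  return compiled_major == device_major && compiled_minor <= device_minor;
}

// Cases reported in this thread: both cross a major revision.
static_assert(!can_run_on_device(7, 0, 5, 0), "7.0 build on a 5.0 device");
static_assert(!can_run_on_device(3, 5, 6, 1), "3.5 build on a 6.1 device");

// Older minor revision on a newer minor revision of the same major: allowed.
static_assert(can_run_on_device(6, 0, 6, 1), "6.0 build on a 6.1 device");

// Newer minor revision on an older one: rejected.
static_assert(!can_run_on_device(6, 1, 6, 0), "6.1 build on a 6.0 device");
```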

mhoemmen · 2020-03-12T15:37:35Z

@micahahoward @rrdrake @sebrowne @vbrunini FYI this will affect Trilinos' CMake options, once the changes hit Trilinos. We'll need more platform specificity on the architecture choice.

sebrowne · 2020-03-12T17:23:12Z

@mhoemmen wrote:

@micahahoward @rrdrake @sebrowne @vbrunini FYI this will affect Trilinos' CMake options, once the changes hit Trilinos. We'll need more platform specificity on the architecture choice.

Meaning "Pascal60, Volta70, etc.?"

mhoemmen · 2020-03-13T16:48:38Z

@sebrowne wrote:

Meaning "Pascal60, Volta70, etc.?"

Yup -- those warnings will become errors. It may just mean more Trilinos build scripts and/or module options.

sebrowne · 2020-03-13T17:55:39Z

Awesome, thanks

jwwtc · 2022-11-02T18:58:08Z

@dsunder wrote:

@ian-bertolacci To avoid this error you need to correctly set the architecture flag when you configure Kokkos. For a GeForce GTX 1080 the architecture flag should be set to Pascal.

How can I find the correct architecture flag?
For example, is there a proper flag for Quadro RTX 4000?

Thank you.

dalg24 · 2022-11-02T19:19:46Z

@jwwtc wrote:

How can I find the correct architecture flag? For example, is there a proper flag for Quadro RTX 4000?

The NVIDIA specs for that model tell you it has the Turing architecture. You can also search for its "compute capability", which would tell you "7.5".
Then, if you look at the available arch options in Kokkos for NVIDIA GPUs, you will find Kokkos_ARCH_TURING75.
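The lookup described above (compute capability first, then the matching arch option) can be sketched as a small table. This is a hypothetical helper, not part of Kokkos, and it lists only a handful of entries. The TURING75, AMPERE80, Pascal60 and Volta70 names appear in this thread; the MAXWELL50 and PASCAL61 spellings are assumed to follow the same pattern and should be checked against the Kokkos documentation.

```cpp
#include <string>

// Hypothetical lookup, for illustration only: maps a compute capability
// (major, minor) to a Kokkos_ARCH_* option name. Returns an empty string
// for capabilities this sketch does not cover.
std::string kokkos_arch_option(int cc_major, int cc_minor) {
  const int cc = cc_major * 10 + cc_minor;
  switch (cc) {
    case 50: return "Kokkos_ARCH_MAXWELL50";  // e.g. Quadro M1000M (assumed spelling)
    case 60: return "Kokkos_ARCH_PASCAL60";
    case 61: return "Kokkos_ARCH_PASCAL61";   // e.g. GeForce GTX 1080 (assumed spelling)
    case 70: return "Kokkos_ARCH_VOLTA70";
    case 75: return "Kokkos_ARCH_TURING75";   // e.g. Quadro RTX 4000
    case 80: return "Kokkos_ARCH_AMPERE80";   // e.g. A100
    default: return "";                       // not covered by this sketch
  }
}
```

For example, `kokkos_arch_option(7, 5)` returns `"Kokkos_ARCH_TURING75"`. The compute capability itself can be read from the "CUDA Capability Major/Minor version number" line of deviceQuery output, as in the listing earlier in this thread.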

jwwtc · 2023-02-11T17:31:16Z

@dalg24 Thanks. And AMPERE80 is the Kokkos_ARCH for A100, right?

masterleinad · 2023-02-11T18:48:29Z

@jwwtc wrote:

@dalg24 Thanks. And AMPERE80 is the Kokkos_ARCH for A100, right?

Yes.

crtrott added the Question For Kokkos internal and external contributors and users label Dec 19, 2018

ndellingwood added this to the 2019 April milestone Feb 7, 2019

crtrott assigned dsunder Aug 21, 2019

crtrott added Enhancement Improve existing capability; will potentially require voting and removed Question For Kokkos internal and external contributors and users labels Aug 21, 2019

crtrott removed this from the 2019 April milestone Aug 21, 2019

dsunder added this to the Tentative 3.1 Release milestone Sep 4, 2019

dalg24 assigned crtrott Mar 4, 2020

crtrott added the InDevelop label Mar 12, 2020

crtrott closed this as completed Apr 14, 2020

BenWibking mentioned this issue Jan 24, 2023

No CUDA aware MPI parthenon-hpc-lab/athenapk#15

Closed
