Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Basic kokkos #206

Merged
merged 29 commits into from
Jun 22, 2022
Merged

Basic kokkos #206

merged 29 commits into from
Jun 22, 2022

Conversation

ajpowelsnl
Copy link
Contributor

This PR represents a split from PR #188. Here, we provide Kokkos basic variants. Also, we made minimal extensions to the build system for Kokkos integration. Moving forward, we will have a PR for Kokkos variants of each kernel group.

@crtrott
Copy link
Contributor

crtrott commented Jan 24, 2022

@davidbeckingsale hey David can you take a look at this?

CMakeLists.txt Outdated

# The statement below is required for Kokkos compilation.
if(ENABLE_KOKKOS)
include_directories(SYSTEM ${CMAKE_CURRENT_SOURCE_DIR}/tpl/RAJA/include/)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this for basic type parameterization that the perf suite inherits from RAJA?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We removed the include_directories(SYSTEM ${CMAKE_CURRENT_SOURCE_DIR}/tpl/RAJA/include/) line from this bloc. This line was likely inserted as a hack in the early prototyping stages.

CMakeLists.txt Outdated Show resolved Hide resolved
@crtrott
Copy link
Contributor

crtrott commented Feb 8, 2022

Amy is gonna revert the formatting changes on the couple original files where she applied clang-format accidentally. Other than that I think this is good now.

@ajpowelsnl
Copy link
Contributor Author

Good morning @davidbeckingsale -- The commits this morning (for DAXPY_ATOMIC.hpp, DAXPY_ATOMIC.cpp) suggest significant formatting changes, relative to the version of the files we're comparing to in this PR. This difference is, however, a bit misleading, because I actually took the version of the these files from your most recent develop branch. Just give a shout if you have any questions, and many thanks for your help.

@crtrott
Copy link
Contributor

crtrott commented Feb 21, 2022

@davidbeckingsale @rhornung67 can you take a look at this?

ajpowelsnl and others added 10 commits May 20, 2022 14:54
co-authored-by: Christian Trott <crtrott@sandia.gov>
co-authored-by: David Zoeller Poliakoff <dzpolia@sandia.gov>

Adds optional Kokkos dependency and Kokkos
as a submodule
co-authored-by: Christian Trott <crtrott@sandia.gov>
co-authored-by: David Zoeller Poliakoff <dzpolia@sandia.gov>
co-authored-by: Christian Trott <crtrott@sandia.gov>
co-authored-by: David Zoeller Poliakoff <dzpolia@sandia.gov>

Note: This makes it so that one only compiles the
basic kernel group when Kokkos is enabled.

This required a few judicous ifdefs in some places in common.
Eventually when the Kokkos variants for all the kernel
groups are merged, those ifdefs are not needed.
…e format

This commit effectively removes clang formatting inappropriately applied to
these files.
…rmatted file

This commit effectively reverts clang formatting applied to this file in later commits, thus minimizing changes to the original file.
ajpowelsnl and others added 4 commits June 22, 2022 07:55
Accept this change

Co-authored-by: Jason Burmark <MrBurmark@users.noreply.github.com>
Co-authored-by: Jason Burmark <MrBurmark@users.noreply.github.com>
Suggestion accepted

Co-authored-by: Jason Burmark <MrBurmark@users.noreply.github.com>
Suggestion accepted

Co-authored-by: Jason Burmark <MrBurmark@users.noreply.github.com>
src/CMakeLists.txt Outdated Show resolved Hide resolved
@MrBurmark MrBurmark mentioned this pull request Jun 22, 2022
@artv3
Copy link
Member

artv3 commented Jun 22, 2022

I'm having some trouble configuring with this PR.
I am using cmake version: 3.23.1

and I am getting the following error:

RAJAPerf/build/tpl/kokkos/compile_tests/CMakeFiles/CMakeTmp/CMakeLists.txt:
  Target "cmTC_5076b" requires the language dialect "CXX17" .  But the
  current compiler "GNU" does not support this, or CMake does not know the
  flags to enable it.
  
CMake Error at tpl/kokkos/cmake/kokkos_functions.cmake:917 (TRY_COMPILE):
  Failed to generate test project build system.
Call Stack (most recent call first):
  tpl/kokkos/cmake/kokkos_functions.cmake:937 (KOKKOS_CXX_COMPILER_CUDA_TEST)
  tpl/kokkos/cmake/kokkos_compiler_id.cmake:65 (kokkos_compilation)
  tpl/kokkos/cmake/kokkos_tribits.cmake:204 (INCLUDE)
  tpl/kokkos/CMakeLists.txt:170 (KOKKOS_SETUP_BUILD_ENVIRONMENT)  
  

@artv3
Copy link
Member

artv3 commented Jun 22, 2022

I was able to get to configure by using a newer GNU - 8.3.1, but I then get warnings of the following type:

CMake Warning (dev) in test/CMakeLists.txt:
  Policy CMP0104 is not set: CMAKE_CUDA_ARCHITECTURES now detected for NVCC,
  empty CUDA_ARCHITECTURES not allowed.  Run "cmake --help-policy CMP0104"
  for policy details.  Use the cmake_policy command to set the policy and
  suppress this warning

then I get build errors:

RAJAPerf/tpl/kokkos/core/src/Cuda/Kokkos_Cuda_Half.hpp(667): error: more than one conversion function from "Kokkos::Experimental::half_t::impl_type" to a built-in type applies:
            function "__half::operator float() const"
            function "__half::operator short() const"
            function "__half::operator unsigned short() const"
            function "__half::operator int() const"
            function "__half::operator unsigned int() const"
            function "__half::operator long long() const"
            function "__half::operator unsigned long long() const"
            function "__half::operator __nv_bool() const"

@ajpowelsnl
Copy link
Contributor Author

I'm having some trouble configuring with this PR. I am using cmake version: 3.23.1

and I am getting the following error:

RAJAPerf/build/tpl/kokkos/compile_tests/CMakeFiles/CMakeTmp/CMakeLists.txt:
  Target "cmTC_5076b" requires the language dialect "CXX17" .  But the
  current compiler "GNU" does not support this, or CMake does not know the
  flags to enable it.
  
CMake Error at tpl/kokkos/cmake/kokkos_functions.cmake:917 (TRY_COMPILE):
  Failed to generate test project build system.
Call Stack (most recent call first):
  tpl/kokkos/cmake/kokkos_functions.cmake:937 (KOKKOS_CXX_COMPILER_CUDA_TEST)
  tpl/kokkos/cmake/kokkos_compiler_id.cmake:65 (kokkos_compilation)
  tpl/kokkos/cmake/kokkos_tribits.cmake:204 (INCLUDE)
  tpl/kokkos/CMakeLists.txt:170 (KOKKOS_SETUP_BUILD_ENVIRONMENT)  
  

Hmmm... Thanks for bringing this up, Arturo. Let me look into this. I was using gcc/9.1 + cuda/11.2, and it was OK for me. Have you updated your Kokkos submodule recently, i.e., at the project root , git submodule update --recursive --init ? This smells like an out-of-date Kokkos dependency (in RAJAPerf/tpl/kokkos). I will check builds with gcc/8.3.1.

@artv3
Copy link
Member

artv3 commented Jun 22, 2022

I'm having some trouble configuring with this PR. I am using cmake version: 3.23.1
and I am getting the following error:

RAJAPerf/build/tpl/kokkos/compile_tests/CMakeFiles/CMakeTmp/CMakeLists.txt:
  Target "cmTC_5076b" requires the language dialect "CXX17" .  But the
  current compiler "GNU" does not support this, or CMake does not know the
  flags to enable it.
  
CMake Error at tpl/kokkos/cmake/kokkos_functions.cmake:917 (TRY_COMPILE):
  Failed to generate test project build system.
Call Stack (most recent call first):
  tpl/kokkos/cmake/kokkos_functions.cmake:937 (KOKKOS_CXX_COMPILER_CUDA_TEST)
  tpl/kokkos/cmake/kokkos_compiler_id.cmake:65 (kokkos_compilation)
  tpl/kokkos/cmake/kokkos_tribits.cmake:204 (INCLUDE)
  tpl/kokkos/CMakeLists.txt:170 (KOKKOS_SETUP_BUILD_ENVIRONMENT)  
  

Hmmm... Thanks for bringing this up, Arturo. Let me look into this. I was using gcc/9.1 + cuda/11.2, and it was OK for me. Have you updated your Kokkos submodule recently, i.e., at the project root , git submodule update --recursive --init ? This smells like an out-of-date Kokkos dependency (in RAJAPerf/tpl/kokkos). I will check builds with gcc/8.3.1.

Yes I tried: git submodule update --recursive --init no luck on my end.

I am using cuda-11.0.2

@ajpowelsnl
Copy link
Contributor Author

I'm having some trouble configuring with this PR. I am using cmake version: 3.23.1
and I am getting the following error:

RAJAPerf/build/tpl/kokkos/compile_tests/CMakeFiles/CMakeTmp/CMakeLists.txt:
  Target "cmTC_5076b" requires the language dialect "CXX17" .  But the
  current compiler "GNU" does not support this, or CMake does not know the
  flags to enable it.
  
CMake Error at tpl/kokkos/cmake/kokkos_functions.cmake:917 (TRY_COMPILE):
  Failed to generate test project build system.
Call Stack (most recent call first):
  tpl/kokkos/cmake/kokkos_functions.cmake:937 (KOKKOS_CXX_COMPILER_CUDA_TEST)
  tpl/kokkos/cmake/kokkos_compiler_id.cmake:65 (kokkos_compilation)
  tpl/kokkos/cmake/kokkos_tribits.cmake:204 (INCLUDE)
  tpl/kokkos/CMakeLists.txt:170 (KOKKOS_SETUP_BUILD_ENVIRONMENT)  
  

Hmmm... Thanks for bringing this up, Arturo. Let me look into this. I was using gcc/9.1 + cuda/11.2, and it was OK for me. Have you updated your Kokkos submodule recently, i.e., at the project root , git submodule update --recursive --init ? This smells like an out-of-date Kokkos dependency (in RAJAPerf/tpl/kokkos). I will check builds with gcc/8.3.1.

Yes I tried: git submodule update --recursive --init no luck on my end.

I am using cuda-11.0.2

@artv3 -- Here's the configuration I'm using on an NVIDIA server:

# Kokkos Cuda build
cmake \
-DENABLE_KOKKOS=ON \
-DRAJA_PERFSUITE_ENABLE_TESTS=OFF \
-DCUDA_ARCH=sm_70 \
-DENABLE_CUDA=ON \
-DKokkos_ARCH_VOLTA70=ON \
-DCMAKE_CUDA_ARCHITECTURES=70 \
-DCMAKE_BUILD_TYPE=Release .. \

Can you try this configuration? Again, this is with gcc/9.1 + CUDA/11.2

@artv3
Copy link
Member

artv3 commented Jun 22, 2022

Thanks, between that and switching the kokkos branch to master I was able to build. The hash for kokkos I got from git submodule was 2834f94af which gave me the following build error:

src/basic-kokkos/REDUCE3_INT-Kokkos.cpp(58): error: namespace "Kokkos::Experimental" has no member "min"

Running the suite via: ./bin/raja-perf.exe -k Basic_DAXPY
Gives me the following output for the timing-average.csv:

Mean Runtime Report (sec.)  ,  ,  ,  ,  ,  ,  ,                                                                                                                              
Kernel       , Base_Seq , Lambda_Seq , RAJA_Seq , Base_CUDA , Lambda_CUDA , RAJA_CUDA , Kokkos_Lambda                                                                        
Kernel       , default  , default    , default  , block_256 , block_256   , block_256 , default                                                                              
Basic_DAXPY  , 0.303750 ,   0.307818 , 0.309596 ,  0.016320 ,    0.016331 ,  0.016322 ,      0.016293  

Based on run-time it looks like kokkos is running with the CUDA backend. For a fair comparison, it would be nice if the number of threads per block were reported.

@artv3
Copy link
Member

artv3 commented Jun 22, 2022

Could the following be added to a script?

# Kokkos Cuda build
cmake \
-DENABLE_KOKKOS=ON \
-DRAJA_PERFSUITE_ENABLE_TESTS=OFF \
-DCUDA_ARCH=sm_70 \
-DENABLE_CUDA=ON \
-DKokkos_ARCH_VOLTA70=ON \
-DCMAKE_CUDA_ARCHITECTURES=70 \
-DCMAKE_BUILD_TYPE=Release .. \

@ajpowelsnl
Copy link
Contributor Author

Thanks, between that and switching the kokkos branch to master I was able to build. The hash for kokkos I got from git submodule was 2834f94af which gave me the following build error:

src/basic-kokkos/REDUCE3_INT-Kokkos.cpp(58): error: namespace "Kokkos::Experimental" has no member "min"

Running the suite via: ./bin/raja-perf.exe -k Basic_DAXPY Gives me the following output for the timing-average.csv:

Mean Runtime Report (sec.)  ,  ,  ,  ,  ,  ,  ,                                                                                                                              
Kernel       , Base_Seq , Lambda_Seq , RAJA_Seq , Base_CUDA , Lambda_CUDA , RAJA_CUDA , Kokkos_Lambda                                                                        
Kernel       , default  , default    , default  , block_256 , block_256   , block_256 , default                                                                              
Basic_DAXPY  , 0.303750 ,   0.307818 , 0.309596 ,  0.016320 ,    0.016331 ,  0.016322 ,      0.016293  

Based on run-time it looks like kokkos is running with the CUDA backend. For a fair comparison, it would be nice if the number of threads per block were reported.

Yes, I did most of the development on a CUDA machine. I will speak with @crtrott about the possibility of reporting the number of threads (per backend) block. Would it be possible to merge this PR now? We are going on a year now. If this work is not merged soon, then I will not be able to contribute the other Kokkos variants.

@artv3
Copy link
Member

artv3 commented Jun 22, 2022

Thanks, between that and switching the kokkos branch to master I was able to build. The hash for kokkos I got from git submodule was 2834f94af which gave me the following build error:
src/basic-kokkos/REDUCE3_INT-Kokkos.cpp(58): error: namespace "Kokkos::Experimental" has no member "min"
Running the suite via: ./bin/raja-perf.exe -k Basic_DAXPY Gives me the following output for the timing-average.csv:

Mean Runtime Report (sec.)  ,  ,  ,  ,  ,  ,  ,                                                                                                                              
Kernel       , Base_Seq , Lambda_Seq , RAJA_Seq , Base_CUDA , Lambda_CUDA , RAJA_CUDA , Kokkos_Lambda                                                                        
Kernel       , default  , default    , default  , block_256 , block_256   , block_256 , default                                                                              
Basic_DAXPY  , 0.303750 ,   0.307818 , 0.309596 ,  0.016320 ,    0.016331 ,  0.016322 ,      0.016293  

Based on run-time it looks like kokkos is running with the CUDA backend. For a fair comparison, it would be nice if the number of threads per block were reported.

Yes, I did most of the development on a CUDA machine. I will speak with @crtrott about the possibility of reporting the number of threads (per backend) block. Would it be possible to merge this PR now? We are going on a year now. If this work is not merged soon, then I will not be able to contribute the other Kokkos variants.

Approved! #248 (review)

@ajpowelsnl
Copy link
Contributor Author

@artv3 I am looking into this error:

src/basic-kokkos/REDUCE3_INT-Kokkos.cpp(58): error: namespace "Kokkos::Experimental" has no member "min" 

We recently moved min, max, etc. to the Kokkos namespace (and out of Kokkos::Experimental).

@ajpowelsnl
Copy link
Contributor Author

Could the following be added to a script?

# Kokkos Cuda build
cmake \
-DENABLE_KOKKOS=ON \
-DRAJA_PERFSUITE_ENABLE_TESTS=OFF \
-DCUDA_ARCH=sm_70 \
-DENABLE_CUDA=ON \
-DKokkos_ARCH_VOLTA70=ON \
-DCMAKE_CUDA_ARCHITECTURES=70 \
-DCMAKE_BUILD_TYPE=Release .. \

Sure, no problem. I will put it in the scripts directory.

@artv3
Copy link
Member

artv3 commented Jun 22, 2022

@artv3 I am looking into this error:

src/basic-kokkos/REDUCE3_INT-Kokkos.cpp(58): error: namespace "Kokkos::Experimental" has no member "min" 

We recently moved min, max, etc. to the Kokkos namespace (and out of Kokkos::Experimental).

Yes, I think the kokkos submodule has to be updated. It currently points at hash: 2834f94af9b01debf67c1aaa3f0eb0c903d72c8d

@artv3
Copy link
Member

artv3 commented Jun 22, 2022

@artv3 I am looking into this error:

src/basic-kokkos/REDUCE3_INT-Kokkos.cpp(58): error: namespace "Kokkos::Experimental" has no member "min" 

We recently moved min, max, etc. to the Kokkos namespace (and out of Kokkos::Experimental).

Yes, I think the kokkos submodule has to be updated. It currently points at hash: 2834f94af9b01debf67c1aaa3f0eb0c903d72c8d

One way to do this is to go to the kokkos tpl folder and checkout the correct version of kokkos, commit and push the change. Using master worked for me, but perhaps we should use a released version.

@MrBurmark MrBurmark merged commit 7bf4e8c into LLNL:develop Jun 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants