Kokkos backend in ADIOS2 #3446

Merged
merged 6 commits into from Feb 21, 2023

anagainaru
Contributor

@anagainaru anagainaru commented Jan 27, 2023

This PR contains several changes that allow ADIOS2 to have either a CUDA or a Kokkos backend:

  1. ADIOS2 will use a single GPU backend at a given time (there is currently no compiler that can support multiple, so there is no point in supporting this). This means the MemorySpace can only be Detect, Host or GPU, and based on what is enabled in ADIOS2 (CUDA or Kokkos) we use backend-specific functions. We have a variable, ADIOS2_HAVE_GPU_SUPPORT, that will be true if any GPU backend is activated.
     • If CUDA is enabled then either ADIOS2_HAVE_CUDA or ADIOS2_HAVE_Kokkos_CUDA is true.

  2. All GPU-related functions have been moved so that they are all stored in adios2::helper::adios{BACKEND}.

  3. The GPU logic is moved as low as possible for MinMax (in helper::adiosMath and helper::adiosMemory) so that we capture all configurations (RandomAccess, Spans, AOS, etc.).

  4. Option to build ADIOS2 with Kokkos and use the Kokkos backend (HIP and CUDA are currently supported). This requires building ADIOS2 with CXX_STANDARD=17, so I modified the cmake file to be able to do that. I added a script to build ADIOS2 with Kokkos on Summit here: https://github.com/anagainaru/ADIOS2-addons/blob/main/GPUAware/ADIOS.kokkos/biuld-adios-kokkos-summit.sh. I'll add something similar for Crusher. I'm not sure where to put these in the ADIOS2 repo (maybe in the documentation when I add it).
     • Kokkos 3.7 is required.
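
The dispatch described in items 1 to 3 could look roughly like the sketch below. It is not the ADIOS2 implementation: the `sketch` namespace, the `backend` stub, `Available`, `IsGPUbuffer` and `HostMinMax` are hypothetical names chosen for illustration, and the stub stands in for whichever single backend namespace (adios2::helper::adios{BACKEND}) is compiled in. Only the three MemorySpace values come from the description above.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <utility>

namespace sketch
{
// The three memory spaces named in the PR description.
enum class MemorySpace
{
    Detect,
    Host,
    GPU
};

// Hypothetical stand-in for the one backend namespace that is compiled in.
// This stub models a build without any GPU backend.
namespace backend
{
constexpr bool Available = false;

inline bool IsGPUbuffer(const void * /*ptr*/) { return false; }

template <class T>
std::pair<T, T> MinMax(const T * /*data*/, std::size_t /*n*/)
{
    throw std::runtime_error("no GPU backend compiled in");
}
} // namespace backend

// Host path: a plain scan over the buffer.
template <class T>
std::pair<T, T> HostMinMax(const T *data, std::size_t n)
{
    if (n == 0)
    {
        throw std::invalid_argument("empty buffer");
    }
    const auto mm = std::minmax_element(data, data + n);
    return {*mm.first, *mm.second};
}

// Single low-level entry point: every caller goes through here, so the
// memory-space decision is made once regardless of the calling path.
template <class T>
std::pair<T, T> MinMax(const T *data, std::size_t n,
                       MemorySpace space = MemorySpace::Detect)
{
    if (space == MemorySpace::Detect)
    {
        space = (backend::Available && backend::IsGPUbuffer(data))
                    ? MemorySpace::GPU
                    : MemorySpace::Host;
    }
    if (space == MemorySpace::GPU)
    {
        return backend::MinMax(data, n);
    }
    return HostMinMax(data, n);
}
} // namespace sketch
```

In this stub build, `sketch::MinMax(ptr, n)` with the default Detect resolves to Host, while an explicit `MemorySpace::GPU` request throws because no backend is present. Keeping the branch in one low-level helper is what lets higher-level code paths stay unaware of the backend.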

eisenhauer
eisenhauer previously approved these changes Jan 28, 2023
Member

@eisenhauer eisenhauer left a comment

The changes to the preprocessor conditional in KokkosView.h look a bit odd, switching from defined to define and the && to || seems not to fit in. But the rest looks good.

@anagainaru
Contributor Author

The changes to the preprocessor conditional in KokkosView.h look a bit odd, switching from defined to define and the && to || seems not to fit in. But the rest looks good.

Yes, my mistake. None of the tests covered that code, and I have corrected the bug. The operator change (|| instead of &&) is because we cannot build ADIOS2 with both the CUDA and Kokkos backends, so the two macros are never both true. I still need to test more thoroughly the scenario where ADIOS2 is used with a Kokkos application, but CUDA applications work (I'll add an example in Kokkos-Examples), and I'll update the documentation.
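
A minimal sketch of the kind of guard discussed here, assuming hypothetical macros `SKETCH_HAVE_CUDA` and `SKETCH_HAVE_KOKKOS` in place of the real ADIOS2_HAVE_* definitions (the actual condition in KokkosView.h is not reproduced in this thread). Because at most one backend macro is ever defined, a test for "some GPU backend is present" has to use ||; the same test written with && could never be true.

```cpp
#include <string>

// Hypothetical macros standing in for the build-generated ADIOS2_HAVE_* ones.
// Uncomment at most one line to emulate a build configuration.
// #define SKETCH_HAVE_CUDA
// #define SKETCH_HAVE_KOKKOS

// The two backends are mutually exclusive, so defining both is a build error.
#if defined(SKETCH_HAVE_CUDA) && defined(SKETCH_HAVE_KOKKOS)
#error "CUDA and Kokkos backends cannot be enabled together"
#endif

// `defined(X)` is the preprocessor operator for testing a macro.
// "Any GPU backend" must be an OR over the individual backend macros.
#if defined(SKETCH_HAVE_CUDA) || defined(SKETCH_HAVE_KOKKOS)
#define SKETCH_HAVE_GPU_SUPPORT 1
#else
#define SKETCH_HAVE_GPU_SUPPORT 0
#endif

namespace guard_sketch
{
constexpr bool HaveGpuSupport = (SKETCH_HAVE_GPU_SUPPORT == 1);

// Reports which (hypothetical) backend this translation unit was built for.
inline const char *BackendName()
{
#if defined(SKETCH_HAVE_CUDA)
    return "cuda";
#elif defined(SKETCH_HAVE_KOKKOS)
    return "kokkos";
#else
    return "none";
#endif
}
} // namespace guard_sketch
```

With both defines left commented out, `guard_sketch::HaveGpuSupport` is false and `BackendName()` returns "none"; enabling either macro flips both results, and enabling both stops compilation at the `#error`.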

@anagainaru anagainaru force-pushed the gpu-reorg-kokkos branch 2 times, most recently from c3b9b9f to 3ec03ef, January 29, 2023 15:17
@caitlinross
Collaborator

@anagainaru I was finally able to reproduce this in the container I'm running locally. Add this change in a commit:

--- a/scripts/ci/cmake-v2/ci-el8-cuda-serial.cmake
+++ b/scripts/ci/cmake-v2/ci-el8-cuda-serial.cmake
@@ -1,7 +1,7 @@
 # Client maintainer: vicente.bolea@kitware.com
 
 set(ENV{CC}  gcc)
-set(ENV{CXX} g++)
+set(ENV{CXX} nvcc_wrapper)
 set(ENV{FC}  gfortran)
 
 set(dashboard_cache "

That seems to be all that is needed.

@anagainaru
Contributor Author

@caitlinross Great! Testing this now.

@vicentebolea
Collaborator

@anagainaru should this go into the 2.9 release?

Collaborator

@vicentebolea vicentebolea left a comment

Thanks @anagainaru for waiting for my review. It looks great, and I like that you are making the code more extensible. This PR brings many changes, and I commented on some of them. In general it looks good, but I can see that it is still WIP.

examples/cuda/CMakeLists.txt (outdated, resolved)
source/adios2/CMakeLists.txt (outdated, resolved)
scripts/ci/cmake-v2/ci-el8-cuda-serial.cmake (outdated, resolved)
source/adios2/core/ADIOS.cpp (outdated, resolved)
source/adios2/toolkit/format/bp5/BP5Serializer.cpp (outdated, resolved)
testing/adios2/engine/bp/CMakeLists.txt (outdated, resolved)
@anagainaru
Contributor Author

@anagainaru should this go into the 2.9 release?

Welcome back from vacation! Hopefully it will go in the release. I'll leave comments to your questions in the corresponding places.

@vicentebolea vicentebolea added this to the v2.9.0 milestone Feb 7, 2023
@anagainaru anagainaru force-pushed the gpu-reorg-kokkos branch 4 times, most recently from 63de969 to 5acf962, February 9, 2023 23:49
@vicentebolea
Collaborator

@anagainaru can you allow other developers to push to this branch? I have changes I want to propose to help with this :)

@vicentebolea
Collaborator

diff --git a/.github/workflows/everything.yml b/.github/workflows/everything.yml
index bf00d264b..3ed2e0908 100644
--- a/.github/workflows/everything.yml
+++ b/.github/workflows/everything.yml
@@ -97,7 +97,7 @@ jobs:
       image: ornladios/adios2:ci-spack-el8-${{ matrix.compiler }}-${{ matrix.parallel }}
       options: --shm-size=1g
       env:
-        GH_YML_JOBNAME: ${{ matrix.os }}-${{ matrix.compiler }}-${{ matrix.parallel }}
+        GH_YML_JOBNAME: ${{ matrix.os }}-${{ matrix.gpu_backend }}${{ matrix.compiler }}-${{ matrix.parallel }}
         GH_YML_BASE_OS: Linux
         GH_YML_MATRIX_OS: ${{ matrix.os }}
         GH_YML_MATRIX_COMPILER: ${{ matrix.compiler }}
@@ -114,6 +114,11 @@ jobs:
             compiler: cuda
             parallel: serial
             constrains: build_only
+          - os: el8
+            compiler: cuda
+            parallel: serial
+            gpu_backend: kokkos-
+            constrains: build_only
           - os: el8
             compiler: gcc10
             parallel: mpich
diff --git a/CMakeLists.txt b/CMakeLists.txt
index 9f9c84cb7..808c8b8d2 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -149,8 +149,8 @@ adios_option(SZ         "Enable support for SZ transforms" AUTO)
 adios_option(LIBPRESSIO "Enable support for LIBPRESSIO transforms" AUTO)
 adios_option(MGARD      "Enable support for MGARD transforms" AUTO)
 adios_option(PNG        "Enable support for PNG transforms" AUTO)
-adios_option(CUDA       "Enable support for Cuda" AUTO)
-adios_option(Kokkos     "Enable support for Kokkos" AUTO)
+adios_option(CUDA       "Enable support for Cuda" OFF)
+adios_option(Kokkos     "Enable support for Kokkos" OFF)
 adios_option(MPI        "Enable support for MPI" AUTO)
 adios_option(DAOS       "Enable support for DAOS" AUTO)
 adios_option(DataMan    "Enable support for DataMan" AUTO)
diff --git a/cmake/DetectOptions.cmake b/cmake/DetectOptions.cmake
index 70a46a598..8fe512467 100644
--- a/cmake/DetectOptions.cmake
+++ b/cmake/DetectOptions.cmake
@@ -170,6 +170,10 @@ endif()
 
 set(mpi_find_components C)
 
+if(ADIOS2_USE_Kokkos AND ADIOS2_USE_CUDA)
+  message(FATAL_ERROR "ADIOS2_USE_Kokkos is incompatible with ADIOS2_USE_CUDA")
+endif()
+
 # Kokkos
 if(ADIOS2_USE_Kokkos)
   if(ADIOS2_USE_Kokkos STREQUAL AUTO)
@@ -177,20 +181,16 @@ if(ADIOS2_USE_Kokkos)
   else()
     find_package(Kokkos REQUIRED)
   endif()
-  if(NOT Kokkos_ENABLE_CUDA AND (ADIOS2_USE_CUDA STREQUAL ON))
-    set(ADIOS2_USE_Kokkos FALSE)
-  else()
-    if(Kokkos_FOUND)
-      set(ADIOS2_HAVE_Kokkos TRUE)
-    endif()
+  if(Kokkos_FOUND)
+    set(ADIOS2_HAVE_Kokkos TRUE)
     if(Kokkos_ENABLE_CUDA OR Kokkos_ENABLE_HIP OR Kokkos_ENABLE_SYCL)
-	    set(ADIOS2_HAVE_GPU_Support TRUE)
+      set(ADIOS2_HAVE_GPU_Support TRUE)
     endif()
   endif()
 endif()
 
 # CUDA
-if(ADIOS2_USE_CUDA AND NOT Kokkos_ENABLE_CUDA)
+if(ADIOS2_USE_CUDA)
   include(CheckLanguage)
   check_language(CUDA)
   if(ADIOS2_USE_CUDA STREQUAL AUTO)
@@ -199,6 +199,7 @@ if(ADIOS2_USE_CUDA AND NOT Kokkos_ENABLE_CUDA)
     find_package(CUDAToolkit REQUIRED)
   endif()
 endif()
+
 if(CMAKE_CUDA_COMPILER AND CUDAToolkit_FOUND)
   enable_language(CUDA)
   set(ADIOS2_HAVE_CUDA TRUE)
diff --git a/scripts/ci/cmake-v2/ci-el8-kokkos-cuda-serial.cmake b/scripts/ci/cmake-v2/ci-el8-kokkos-cuda-serial.cmake
new file mode 100644
index 000000000..d80a0c72b
--- /dev/null
+++ b/scripts/ci/cmake-v2/ci-el8-kokkos-cuda-serial.cmake
@@ -0,0 +1,26 @@
+# Client maintainer: vicente.bolea@kitware.com
+
+set(ENV{CC}  gcc)
+set(ENV{CXX} g++)
+set(ENV{FC}  gfortran)
+
+set(dashboard_cache "
+ADIOS2_USE_BZip2:BOOL=ON
+ADIOS2_USE_Blosc:BOOL=ON
+ADIOS2_USE_DataMan:BOOL=ON
+ADIOS2_USE_Fortran:BOOL=ON
+ADIOS2_USE_HDF5:BOOL=ON
+ADIOS2_USE_Python:BOOL=ON
+ADIOS2_USE_SZ:BOOL=ON
+ADIOS2_USE_ZeroMQ:STRING=ON
+ADIOS2_USE_ZFP:BOOL=ON
+ADIOS2_USE_Kokkos:BOOL=ON
+ADIOS2_USE_MPI:BOOL=OFF
+CMAKE_C_FLAGS:STRING=-Wall
+CMAKE_CXX_FLAGS:STRING=-Wall
+CMAKE_Fortran_FLAGS:STRING=-Wall
+")
+
+set(CTEST_CMAKE_GENERATOR "Ninja")
+list(APPEND CTEST_UPDATE_NOTES_FILES "${CMAKE_CURRENT_LIST_FILE}")
+include(${CMAKE_CURRENT_LIST_DIR}/ci-common.cmake)
diff --git a/source/adios2/CMakeLists.txt b/source/adios2/CMakeLists.txt
index e1bbe0c26..9a82d0f69 100644
--- a/source/adios2/CMakeLists.txt
+++ b/source/adios2/CMakeLists.txt
@@ -124,9 +124,12 @@ if(ADIOS2_HAVE_CUDA)
   set(maybe_adios2_core_cuda adios2_core_cuda)
 endif()
 
+set(maybe_adios2_core_kokkos)
 if(ADIOS2_HAVE_Kokkos)
-  target_sources(adios2_core PRIVATE helper/adiosKokkos.h helper/adiosKokkos.cpp)
-  target_link_libraries(adios2_core PRIVATE Kokkos::kokkos)
+  target_sources(adios2_core_kokkos PRIVATE helper/adiosKokkos.h helper/adiosKokkos.cpp)
+  target_link_libraries(adios2_core_kokkos PRIVATE Kokkos::kokkos)
+  target_link_libraries(adios2_core PRIVATE adios2_core_kokkos)
+  set(maybe_adios2_core_kokkos adios2_core_kokkos)
 endif()
 
 target_include_directories(adios2_core
@@ -417,7 +420,7 @@ install(DIRECTORY toolkit/
 )
 
 # Library installation
-install(TARGETS adios2_core ${maybe_adios2_core_mpi} ${maybe_adios2_core_cuda} EXPORT adios2Exports
+install(TARGETS adios2_core ${maybe_adios2_core_mpi} ${maybe_adios2_core_cuda} ${maybe_adios2_core_kokkos} EXPORT adios2Exports
   RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR} COMPONENT adios2_core-runtime
   LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR} COMPONENT adios2_core-libraries NAMELINK_COMPONENT adios2_core-development
   ARCHIVE DESTINATION ${CMAKE_INSTALL_LIBDIR} COMPONENT adios2_core-development

@vicentebolea
Collaborator

vicentebolea commented Feb 14, 2023

@anagainaru you can git cherry-pick the tip commit in #3480, which should fix your CI. If it works for you, please:

  • Create a single commit adding the CI stuff.
  • Create another commit adding the feature.
  • Rebase onto the commit that has the tag v2.0.0-rc1
  • Change the base branch in this PR to release_29.

I will do the backport to master later.

You can add me as a co-author by adding co-authored-by: vicente.bolea@kitware.com to the commit message, as shown here (https://docs.github.com/en/pull-requests/committing-changes-to-your-project/creating-and-editing-commits/creating-a-commit-with-multiple-authors)

@vicentebolea
Collaborator

@anagainaru I added the new image with Kokkos 3.7. Please amend the last commit and push to trigger the CI pipeline.

@vicentebolea
Collaborator

@anagainaru I added the new image with Kokkos 3.7. Please amend the last commit and push to trigger the CI pipeline.

Never mind, it was re-triggered.

CMakeLists.txt (outdated, resolved)
CMakeLists.txt (outdated, resolved)
cmake/DetectOptions.cmake (resolved)
examples/CMakeLists.txt (outdated, resolved)
CMakeLists.txt (outdated, resolved)
examples/CMakeLists.txt (outdated, resolved)
@anagainaru anagainaru force-pushed the gpu-reorg-kokkos branch 3 times, most recently from 4aaa5f2 to ef37b5e, February 18, 2023 16:17
@anagainaru
Copy link
Contributor Author

@vicentebolea I think this is ready to merge. Let me know if you think otherwise.

@anagainaru anagainaru merged commit a9ad351 into ornladios:release_29 Feb 21, 2023
vicentebolea pushed a commit to vicentebolea/ADIOS2 that referenced this pull request Feb 21, 2023
@anagainaru anagainaru deleted the gpu-reorg-kokkos branch March 2, 2023 13:07
5 participants