
Convert device-aware MPI implementation for LinearAlgebra::distributed::Vector #14571

Merged
1 change: 1 addition & 0 deletions .github/workflows/linux.yml
@@ -194,6 +194,7 @@ jobs:
-D DEAL_II_WITH_KOKKOS="ON" \
-D KOKKOS_DIR=${GITHUB_WORKSPACE}/../kokkos-install \
-D DEAL_II_WITH_MPI="ON" \
-D DEAL_II_MPI_WITH_DEVICE_SUPPORT="ON" \
-D DEAL_II_WITH_P4EST="ON" \
-D DEAL_II_COMPONENT_EXAMPLES="ON" \
..
10 changes: 10 additions & 0 deletions cmake/config/template-arguments.in
@@ -73,6 +73,16 @@
MPI_SCALARS := { int;
@DEAL_II_EXPAND_COMPLEX_SCALARS@;
}

// complex types and long double are typically not directly supported on GPUs
MPI_DEVICE_SCALARS := { int;
long int;
unsigned int;
unsigned long int;
unsigned long long int;
float;
double;
}

// template names for serial vectors that we can instantiate as T<S> where
// S=REAL_SCALARS for example
DEAL_II_VEC_TEMPLATES := { Vector; BlockVector }
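For context, expansion lists like `MPI_DEVICE_SCALARS` drive deal.II's `.inst.in` instantiation files. A hypothetical stanza using the new list could look like the following sketch (the function name and template arguments are illustrative, not taken from this PR):

```
// Hypothetical fragment of a .inst.in file; expand_instantiations
// substitutes each entry of MPI_DEVICE_SCALARS for S.
for (S : MPI_DEVICE_SCALARS)
  {
    template void Utilities::MPI::Partitioner::
      export_to_ghosted_array_start<S, MemorySpace::Default>(...);
  }
```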
14 changes: 9 additions & 5 deletions cmake/configure/configure_10_mpi.cmake
@@ -65,8 +65,12 @@ macro(feature_mpi_configure_external)
# (in Modules/FindMPI.cmake) at some point. For the time being this is an
# advanced configuration option.
#
option(DEAL_II_MPI_WITH_CUDA_SUPPORT "Enable MPI Cuda support" OFF)
mark_as_advanced(DEAL_II_MPI_WITH_CUDA_SUPPORT)
if(DEAL_II_MPI_WITH_CUDA_SUPPORT)
[Review comment — Member] Where does DEAL_II_MPI_WITH_CUDA_SUPPORT get set? AFAICT, based on the changes to cuda.html, we would not expect this to be set by a user any more.

[Reply — Member (Author)] Only for backward-compatibility.

option(DEAL_II_MPI_WITH_DEVICE_SUPPORT "Enable MPI Device support" ON)
else()
option(DEAL_II_MPI_WITH_DEVICE_SUPPORT "Enable MPI Device support" OFF)
endif()
mark_as_advanced(DEAL_II_MPI_WITH_DEVICE_SUPPORT)
endmacro()

macro(feature_mpi_error_message)
@@ -90,8 +94,8 @@
configure_feature(MPI)

if(NOT DEAL_II_WITH_MPI)
#
# Disable and hide the DEAL_II_MPI_WITH_CUDA_SUPPORT option
# Disable and hide the DEAL_II_MPI_WITH_DEVICE_SUPPORT option
#
set(DEAL_II_MPI_WITH_CUDA_SUPPORT)
unset(DEAL_II_MPI_WITH_CUDA_SUPPORT CACHE)
set(DEAL_II_MPI_WITH_DEVICE_SUPPORT)
unset(DEAL_II_MPI_WITH_DEVICE_SUPPORT CACHE)
endif()
2 changes: 1 addition & 1 deletion doc/doxygen/options.dox.in
@@ -209,7 +209,7 @@
PREDEFINED = DOXYGEN=1 \
DEAL_II_LAPACK_WITH_MKL=1 \
DEAL_II_WITH_METIS=1 \
DEAL_II_WITH_MPI=1 \
DEAL_II_MPI_WITH_CUDA_SUPPORT=1 \
DEAL_II_MPI_WITH_DEVICE_SUPPORT=1 \
DEAL_II_MPI_VERSION_MAJOR=3 \
DEAL_II_MPI_VERSION_MINOR=0 \
DEAL_II_WITH_MUPARSER=1 \
2 changes: 1 addition & 1 deletion doc/external-libs/cuda.html
@@ -46,7 +46,7 @@ <h1>Installing deal.II with CUDA</h1>

-DDEAL_II_WITH_CUDA=ON
-DDEAL_II_WITH_MPI=ON
-DDEAL_II_MPI_WITH_CUDA_SUPPORT=ON
-DDEAL_II_MPI_WITH_DEVICE_SUPPORT=ON
</pre>
Note, that there is no check that detects if the MPI implementation
really is CUDA-aware. Activating this flag for incompatible MPI libraries
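Because the build system performs no such check, a manual probe can help before enabling the flag. One possibility — assuming Open MPI, whose `ompi_info` tool reports how the library was built; other MPI implementations need different checks — is:

```shell
# Assumes Open MPI. If the library was built with CUDA support, this
# prints a line ending in ":value:true"; otherwise ":value:false".
ompi_info --parsable --all | grep mpi_built_with_cuda_support:value
```

At run time, Open MPI additionally exposes `MPIX_Query_cuda_support()` via `<mpi-ext.h>` for the same purpose.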
4 changes: 2 additions & 2 deletions examples/step-64/step-64.cc
@@ -358,8 +358,8 @@ namespace Step64
// memory space to use. There is also LinearAlgebra::CUDAWrappers::Vector
// that always uses GPU memory storage but doesn't work with MPI. It might
// be worth noticing that the communication between different MPI processes
// can be improved if the MPI implementation is CUDA-aware and the configure
// flag `DEAL_II_MPI_WITH_CUDA_SUPPORT` is enabled. (The value of this
// can be improved if the MPI implementation is GPU-aware and the configure
// flag `DEAL_II_MPI_WITH_DEVICE_SUPPORT` is enabled. (The value of this
// flag needs to be set at the time you call `cmake` when installing
// deal.II.)
//
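The configure-time requirement described in the comment above amounts to passing the flag when deal.II itself is built. A minimal configure invocation might look like this sketch (source and install paths are placeholders):

```shell
# Sketch of a deal.II configure line with device-aware MPI enabled.
# Directory layout and any additional feature flags are placeholders.
cmake -D DEAL_II_WITH_MPI="ON" \
      -D DEAL_II_WITH_KOKKOS="ON" \
      -D DEAL_II_MPI_WITH_DEVICE_SUPPORT="ON" \
      ../dealii
```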
3 changes: 3 additions & 0 deletions include/deal.II/base/config.h.in
@@ -448,7 +448,10 @@
# define DEAL_II_MPI_VERSION_GTE(major,minor) false
#endif

#cmakedefine DEAL_II_MPI_WITH_DEVICE_SUPPORT
#ifdef DEAL_II_MPI_WITH_DEVICE_SUPPORT
#cmakedefine DEAL_II_MPI_WITH_CUDA_SUPPORT
[Review comment — Member] Is this now just for backwards compatibility? I don't see anywhere in the library where it is used any more.

[Reply — Member (Author)] Only for backward-compatibility.

#endif

/***********************************************************************
* Two macro names that we put at the top and bottom of all deal.II files
12 changes: 5 additions & 7 deletions include/deal.II/base/partitioner.h
@@ -671,7 +671,7 @@ namespace Utilities
private:
/**
* Initialize import_indices_plain_dev from import_indices_data. This
* function is only used when using CUDA-aware MPI.
* function is only used when using device-aware MPI.
*/
void
initialize_import_indices_plain_dev() const;
@@ -722,15 +722,13 @@
/**
* The set of (local) indices that we are importing during compress(),
* i.e., others' ghosts that belong to the local range. The data stored is
* the same than in import_indices_data but the data is expanded in plain
* arrays. This variable is only used when using CUDA-aware MPI.
* the same as in import_indices_data but the data is expanded in plain
* arrays. This variable is only used when using device-aware MPI.
*/
// The variable is mutable to enable lazy initialization in
// export_to_ghosted_array_start(). This way partitioner does not have to
// be templated on the MemorySpaceType.
// export_to_ghosted_array_start().
mutable std::vector<
std::pair<std::unique_ptr<unsigned int[], void (*)(unsigned int *)>,
unsigned int>>
Kokkos::View<unsigned int *, MemorySpace::Default::kokkos_space>>
import_indices_plain_dev;

/**
Expand Down