[WIP] Add LinearAlgebra::SharedMPI::Vector #10872
Conversation
Force-pushed from 6658d3a to f6e1393: …rySpace::MemorySpaceData and use TBBPartitioner
I have collected some random comments. I agree with the basic concepts, since I have been discussing them with @peterrum before, but it would be good to add some additional comments.
```cpp
namespace LinearAlgebra
{
  namespace SharedMPI
  {
    /**
     * Partitioner base class to be used in context of
```
The namespaces are a bit of a mess; the generic `LA::dist::Vector` has its partitioner in base; maybe we should move that here or this one there?
What do you suggest? To move all the partitioners here?
I guess we can leave things here for the moment until someone feels a need to clean things up. I really like having the partitioner file named similarly and placed in the same location. For the basic partitioner, that ship has sailed (well, we can always deprecate things, but I don't think it is worth it).
```cpp
/**
 * Check whether the given partitioner is compatible with the
 * partitioner used for this vector.
 *
 * @note Not implemented yet.
 */
bool
partitioners_are_compatible(
  const Utilities::MPI::Partitioner &part) const;

/**
 * Check whether the given partitioner is compatible with the
 * partitioner used for this vector.
 *
 * @note Not implemented yet.
 */
bool
partitioners_are_globally_compatible(
  const Utilities::MPI::Partitioner &part) const;
```
Is this needed at all?
Yup. `MatrixFree` will complain otherwise (if I remember correctly).
The class does some checks depending on where the type traits of the vector send the evaluator. I do not feel strongly here, so we can leave things in this state for the moment until all members are filled in.
```cpp
Utilities::MPI::this_mpi_process(comm_shared);

MPI_Win *win       = new MPI_Win;
Number * data_this = (Number *)malloc(0);
```
This is also a question mark because I saw memory leaks on some MPI implementations.
I have fixed this. According to the source code of Open MPI,

```cpp
*((void **)baseptr) = base;
```

(see https://github.com/open-mpi/ompi/blob/6c46da32454553a52c6b0c30cae8d0075c43cd94/ompi/win/win.c#L323), the function `MPI_Win_allocate_shared` expects a `void **`, i.e., in our case `&data_this`...
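In other words, a minimal sketch under assumed names (`local_size`, `comm_shared`, and `win` stand in for the actual arguments; this is not the PR's exact code):

```cpp
// No pre-allocation with malloc(0) is needed: MPI_Win_allocate_shared
// allocates the shared segment itself and writes its base address through
// the output argument, which it interprets as void**, hence &data_this.
Number *data_this = nullptr;
MPI_Win_allocate_shared(local_size * sizeof(Number), // window size in bytes
                        sizeof(Number),              // displacement unit
                        MPI_INFO_NULL,
                        comm_shared,
                        &data_this,
                        win);
```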
Force-pushed from 6aa208a to 067a436
@kronbichler I have addressed many of your comments. There are some where I am not sure how to proceed.
I think this is a good step forward. Of course, we still need to fill interfaces for the vector before we can use it more generally, but I would be fine to leave that to another PR (if we manage to finalize things during the next month). It would be great if someone else can give it a look as well.
/rebuild
I have some fundamental questions (without having looked at the code very closely): …
I had the option to introduce 1) a new … The first two options would lead to the fact that … In my opinion, the cleanest approach is to introduce a vector, since it is different in its core. For instance, one would need to work with a second communicator and potentially deal with race conditions in user code. The major benefit of this class will be in performance-critical code paths, i.e., …
It would be great if you could write documentation for the new vector type. That would have helped with my misunderstanding of what this class is trying to do.
Also, we only require MPI 2, so some of this needs to be guarded by ifdefs, right?
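For reference, one way such a guard could look, assuming deal.II's `DEAL_II_MPI_VERSION_GTE` macro from `<deal.II/base/config.h>` (a sketch, not the PR's code):

```cpp
#include <deal.II/base/config.h>

#if DEAL_II_MPI_VERSION_GTE(3, 0)
// MPI >= 3.0: the shared-memory window functions are available, so the
// MPI_Win_allocate_shared-based implementation can be compiled.
#else
// MPI 2: the class cannot work; fail loudly at compile time.
#  error "LinearAlgebra::SharedMPI::Vector requires an MPI-3 implementation."
#endif
```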
```cpp
namespace LinearAlgebra
{
  namespace SharedMPI
```
namespace documentation?
```cpp
template <typename Number, typename MemorySpace = MemorySpace::Host>
class Vector : public ::dealii::LinearAlgebra::VectorSpaceVector<Number>,
```
uh, no documentation for the new class? Can you fix that please?
```cpp
 * Get pointers to the beginning of the values of the other
 * processes of the same shared-memory domain.
 *
 * TODO: name of the function?
```
?
So maybe I misunderstood what this vector class can do. Are you saying that this is to be used only within a node (using a separate communicator) and that there are no MPI communication capabilities? An overview of what this class does would be great to have as class documentation.
I am closing this PR since the SM feature has been introduced into …
LinearAlgebra::SharedMPI::Vector
This PR adds a new vector class `LinearAlgebra::SharedMPI::Vector` that is built around MPI 3.0 shared-memory features.

MPI 3.0 provides functions to allocate memory for an array with the following command:
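A minimal sketch of that allocation call, reconstructed from the review discussion above (the size arguments are assumptions):

```cpp
MPI_Win *win       = new MPI_Win;
Number * data_this = nullptr;
MPI_Win_allocate_shared(local_size * sizeof(Number), // bytes owned by this rank
                        sizeof(Number),              // displacement unit
                        MPI_INFO_NULL,
                        comm_shared,
                        &data_this, // MPI fills in the base pointer
                        win);
```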
and to query a pointer to the beginning of the array owned by processes on the shared-memory domain:

```cpp
MPI_Win_shared_query(*win, i, &ssize, &disp_unit, &data.others[i]);
```
The command to create a communicator `comm_shared` is:

```cpp
MPI_Comm_split_type(comm, MPI_COMM_TYPE_SHARED, rank, MPI_INFO_NULL, &comm_shared);
```
These functions allow us to access all vector entries on the same compute node (the same shared-memory domain) even in a purely MPI-parallelized program.
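Put together, a minimal standalone illustration of the three calls (variable names, sizes, and the simplistic synchronization are ours, not the PR's):

```cpp
#include <mpi.h>

#include <vector>

int main(int argc, char **argv)
{
  MPI_Init(&argc, &argv);

  int rank;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  // 1) split off a communicator containing the ranks of this node
  MPI_Comm comm_shared;
  MPI_Comm_split_type(
    MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, rank, MPI_INFO_NULL, &comm_shared);

  int rank_sm, size_sm;
  MPI_Comm_rank(comm_shared, &rank_sm);
  MPI_Comm_size(comm_shared, &size_sm);

  // 2) allocate this rank's part of the node-local array
  const MPI_Aint local_size = 10;
  double *       data_this;
  MPI_Win        win;
  MPI_Win_allocate_shared(local_size * sizeof(double),
                          sizeof(double),
                          MPI_INFO_NULL,
                          comm_shared,
                          &data_this,
                          &win);

  // 3) query the base pointers of all ranks on the same node
  std::vector<double *> others(size_sm);
  for (int i = 0; i < size_sm; ++i)
    {
      MPI_Aint ssize;
      int      disp_unit;
      MPI_Win_shared_query(win, i, &ssize, &disp_unit, &others[i]);
    }

  // every rank can now read its neighbors' entries directly; real code
  // needs proper window synchronization, a barrier only sketches the idea
  data_this[0] = rank_sm;
  MPI_Barrier(comm_shared);
  const double left = others[(rank_sm + size_sm - 1) % size_sm][0];
  (void)left;

  MPI_Win_free(&win);
  MPI_Finalize();
}
```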
For more implementation details, see Section 4 and Appendix A of the pre-print of the hyper.deal release paper (https://arxiv.org/abs/2002.08110).
LinearAlgebra::SharedMPI::PartitionerBase
A second feature of the new vector class is that users can pass their own partitioner implementation. The first implementation, `LinearAlgebra::SharedMPI::Partitioner`, is just like `Utilities::MPI::Partitioner` (built around `IndexSet`) but internally splits up `export_to_ghost_array` and `import_from_ghost_array` into two steps: one for remote data (which requires `MPI_Send`/`MPI_Recv`) and one for shared data (which is copied via `memcpy` if buffering is requested).

The interface should look familiar from `Utilities::MPI::Partitioner`, with the difference that we are working with vectors of pointers (which are returned by `MPI_Win_shared_query`).

In hyper.deal, we are using our own partitioner implementation, which is specially tailored for DG.
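To make the two-step idea concrete, here is a conceptual sketch with hypothetical names (this is not the PR's interface):

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical description of one ghost range: which shared-memory rank
// owns it, where it starts there, and where it goes in our ghost section.
struct GhostRange
{
  int         owner_sm;     // rank within comm_shared, or -1 if remote
  std::size_t owner_offset; // start within the owner's local array
  std::size_t ghost_offset; // start within our ghost section
  std::size_t size;         // number of entries
};

template <typename Number>
void
export_to_ghost_array_sketch(Number *                       ghost_begin,
                             const std::vector<Number *> &  sm_ptrs,
                             const std::vector<GhostRange> &ranges)
{
  // step 1: start non-blocking MPI communication for the ghost ranges
  // owned outside the shared-memory domain (requests/buffers omitted) ...

  // step 2: ghost data owned inside the shared-memory domain is simply
  // copied through the pointers returned by MPI_Win_shared_query
  for (const auto &r : ranges)
    if (r.owner_sm != -1)
      std::memcpy(ghost_begin + r.ghost_offset,
                  sm_ptrs[r.owner_sm] + r.owner_offset,
                  r.size * sizeof(Number));

  // step 3: wait for the remote transfers started in step 1 ...
}
```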
Motivation
In hyper.deal, we reached a speed-up of 25-30% on a single compute node (with ECL) by not using `MPI_Send`/`MPI_Recv`, and in CEED BP1 a speed-up of up to 10% for the operator evaluation. Our hope is to reach better results with DG.

Next steps
In the next step, we will extend the support of `MatrixFree` for the new vector class, in particular so that we can use it for DG. Furthermore, we would like to investigate partitioner strategies within `MatrixFree`, which requires some internal changes, so let's keep that out of this PR!

Note: I still need to clean up the code and add some documentation. Nevertheless, preliminary feedback would be much appreciated! Maybe someone has suggestions on how to integrate the new classes better into the existing deal.II concepts.
Related to hyperdeal/hyperdeal#18 and kronbichler/ceed_benchmarks_dealii#7.