[WIP] Add LinearAlgebra::SharedMPI::Vector #10872

Closed
wants to merge 62 commits

Conversation


@peterrum peterrum commented Sep 1, 2020

LinearAlgebra::SharedMPI::Vector

This PR adds a new vector class, LinearAlgebra::SharedMPI::Vector, built around MPI 3.0 shared-memory features.

MPI 3.0 provides a function to allocate memory for an array:

MPI_Win_allocate_shared(size, sizeof(Number), info, comm_shared, &data_this, win);

and a function to query a pointer to the beginning of the array owned by any given process on the same shared-memory domain:

MPI_Win_shared_query(*win, i, &ssize, &disp_unit, &data.others[i]);

The command to create a communicator comm_shared is:

MPI_Comm_split_type(
        comm, MPI_COMM_TYPE_SHARED, rank, MPI_INFO_NULL, &comm_shared);

These functions allow us to access all vector entries located on the same compute node (the same shared-memory domain) even in a purely MPI-parallelized program.
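Putting these pieces together, a minimal, self-contained sketch of the allocation pattern might look as follows (the array size of 100 entries and the absence of error handling are purely illustrative):

```cpp
#include <mpi.h>

#include <vector>

int main(int argc, char **argv)
{
  MPI_Init(&argc, &argv);

  // Split MPI_COMM_WORLD into shared-memory (compute-node) sub-communicators.
  int rank;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm comm_shared;
  MPI_Comm_split_type(
    MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, rank, MPI_INFO_NULL, &comm_shared);

  int size_shared;
  MPI_Comm_size(comm_shared, &size_shared);

  // Allocate the locally owned part of the array inside a shared-memory window.
  const MPI_Aint local_size = 100;
  double *       data_this  = nullptr;
  MPI_Win        win;
  MPI_Win_allocate_shared(local_size * sizeof(double),
                          sizeof(double),
                          MPI_INFO_NULL,
                          comm_shared,
                          &data_this,
                          &win);

  // Query pointers to the arrays owned by all processes on this node.
  std::vector<double *> others(size_shared);
  for (int i = 0; i < size_shared; ++i)
    {
      MPI_Aint ssize;
      int      disp_unit;
      MPI_Win_shared_query(win, i, &ssize, &disp_unit, &others[i]);
    }

  // After synchronization, every process can read others[i][j] directly.
  MPI_Barrier(comm_shared);

  MPI_Win_free(&win);
  MPI_Comm_free(&comm_shared);
  MPI_Finalize();
}
```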

For more implementation details, see Section 4 and Appendix A of the pre-print of the hyper.deal release paper (https://arxiv.org/abs/2002.08110).

LinearAlgebra::SharedMPI::PartitionerBase

A second feature of the new vector class is that users can pass their own partitioner implementation. The first implementation, LinearAlgebra::SharedMPI::Partitioner, is just like Utilities::MPI::Partitioner (built around IndexSet) but internally splits export_to_ghost_array and import_from_ghost_array into two steps: one for remote data (which requires MPI_Send/MPI_Recv) and one for shared data (which is copied via memcpy if buffering is requested).

The interface should look familiar: it resembles that of Utilities::MPI::Partitioner, with the difference that we are working with vectors of pointers (which are returned by MPI_Win_shared_query).
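To illustrate the idea of the two-step exchange, here is a sketch with hypothetical names (GhostRange, update_ghost_values) that are not the interface added by this PR:

```cpp
#include <mpi.h>

#include <cstring>
#include <vector>

// Hypothetical description of one contiguous ghost range.
struct GhostRange
{
  int         owner_rank;   // rank of the owning process
  std::size_t remote_start; // first entry to read in the owner's array
  std::size_t local_start;  // where to place the data in the local ghost region
  int         n_entries;    // number of contiguous entries
};

void
update_ghost_values(double *                       data_this,   // owned + ghost storage
                    const std::vector<double *> &  data_others, // from MPI_Win_shared_query
                    const std::vector<GhostRange> &remote_ranges,
                    const std::vector<GhostRange> &shared_ranges,
                    MPI_Comm                       comm)
{
  // Step 1: ghost entries owned on other compute nodes need real MPI messages.
  std::vector<MPI_Request> requests(remote_ranges.size());
  for (std::size_t i = 0; i < remote_ranges.size(); ++i)
    MPI_Irecv(data_this + remote_ranges[i].local_start,
              remote_ranges[i].n_entries,
              MPI_DOUBLE,
              remote_ranges[i].owner_rank,
              /*tag=*/0,
              comm,
              &requests[i]);
  // (The matching sends on the owning processes are omitted in this sketch.)

  // Step 2: ghost entries owned on the same node are copied directly out of
  // the owner's shared-memory window -- no MPI message needed. In practice,
  // the processes on the node have to be synchronized before reading.
  for (const GhostRange &r : shared_ranges)
    std::memcpy(data_this + r.local_start,
                data_others[r.owner_rank] + r.remote_start,
                r.n_entries * sizeof(double));

  MPI_Waitall(static_cast<int>(requests.size()),
              requests.data(),
              MPI_STATUSES_IGNORE);
}
```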

In hyper.deal, we are using our own partitioner implementation, which is specially tailored for DG.

Motivation

In hyper.deal, we reached a speedup of 25-30% on a single compute node (with ECL) by not using MPI_Send/MPI_Recv, and in CEED BP1 a speedup of up to 10% for the operator evaluation. Our hope is to reach better results with DG.

Next steps

In the next step, we will extend MatrixFree's support for the new vector class, in particular so that we can use it for DG. Furthermore, we would like to investigate partitioner strategies within MatrixFree, which requires some internal changes - so let's keep that out of this PR!

Note: I still need to clean up the code and add some documentation. Nevertheless, preliminary feedback would be much appreciated! Maybe someone has suggestions on how to better integrate the new classes into the existing deal.II concepts.

Related to hyperdeal/hyperdeal#18 and kronbichler/ceed_benchmarks_dealii#7.


@kronbichler kronbichler left a comment


I have collected some random comments. I agree with the basic concepts because I have been discussing them with @peterrum before, but it would be good to have some additional comments.

include/deal.II/lac/la_sm_partitioner.h (outdated; resolved)
Comment on lines +29 to +34
namespace LinearAlgebra
{
namespace SharedMPI
{
/**
* Partitioner base class to be used in context of
Member


The namespaces are a bit of a mess; the generic LA::dist::Vector has its partitioner in base; maybe we should move that here or this one there?

Member Author


What do you suggest? To move all the partitioners here?

Member


I guess we can leave things here for the moment until someone feels a need to clean up things. I really like having the partitioner file named similarly and located in the same place. For the basic partitioner, that ship has sailed (well, we can always deprecate things, but I don't think it is worth it).

include/deal.II/lac/la_sm_partitioner.h (outdated; resolved)
include/deal.II/lac/la_sm_partitioner.h (outdated; resolved)
include/deal.II/lac/la_sm_partitioner.h (outdated; resolved)
include/deal.II/lac/la_sm_vector.h (outdated; resolved)
Comment on lines +637 to +656
/**
* Check whether the given partitioner is compatible with the
* partitioner used for this vector.
*
* @note Not implemented yet.
*/
bool
partitioners_are_compatible(
const Utilities::MPI::Partitioner &part) const;

/**
* Check whether the given partitioner is compatible with the
* partitioner used for this vector.
*
* @note Not implemented yet.
*/
bool
partitioners_are_globally_compatible(
const Utilities::MPI::Partitioner &part) const;
Member


Is this needed at all?

Member Author


Yup. MatrixFree will complain otherwise (if I remember correctly).

Member


The class does some checks depending on which path the type traits of the vector send the evaluator down. I do not feel strongly here, so we can leave things in this state for the moment until we have all members filled in.

Utilities::MPI::this_mpi_process(comm_shared);

MPI_Win *win = new MPI_Win;
Number * data_this = (Number *)malloc(0);
Member


This is also a question mark because I saw memory leaks on some MPI implementations.

Member Author


I have fixed this. According to the source code of Open MPI

    *((void**) baseptr) = base;

see also https://github.com/open-mpi/ompi/blob/6c46da32454553a52c6b0c30cae8d0075c43cd94/ompi/win/win.c#L323, the function MPI_Win_allocate_shared expects a void**, i.e., in our case &data_this...
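A sketch of the corresponding call, reusing Number, comm_shared, and win from the snippet above and assuming a hypothetical local_size for the number of owned entries:

```cpp
// Sketch: the base-pointer argument is declared as void*, but the function
// writes the address of the allocated segment into it, so we pass the address
// of our pointer variable (&data_this) instead of a pre-allocated buffer.
Number * data_this = nullptr;
MPI_Win *win       = new MPI_Win;
MPI_Win_allocate_shared(local_size * sizeof(Number),
                        sizeof(Number),
                        MPI_INFO_NULL,
                        comm_shared,
                        &data_this, // interpreted as void** internally
                        win);
```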

include/deal.II/lac/la_sm_vector.templates.h (outdated; resolved)
include/deal.II/matrix_free/matrix_free.h (outdated; resolved)

peterrum commented Oct 3, 2020

@kronbichler I have addressed many of your comments. There are some where I am not sure how to proceed.


@kronbichler kronbichler left a comment


I think this is a good step forward. Of course, we still need to fill in the interfaces of the vector before we can use it more generally, but I would be fine with leaving that to another PR (if we manage to finalize things during the next month). It would be great if someone else could give it a look as well.

@kronbichler

/rebuild


tjhei commented Oct 7, 2020

I have some fundamental questions (without having looked at the code very closely):

  1. To me this seems like an optimization of the existing parallel vector (to grant "local" access to entries on the same node). Is there anything else that is new/different?
  2. Why would one not want to use this over the existing parallel vector?
  3. If there is no good reason to do so in 2), why is this not an extension of the existing vector class? It is not particularly user-friendly to introduce yet another vector type. Maybe this could be a setting/policy/compile-time setting?
  4. Does this mean that we need a lot of new template instantiations for the new vector type? That would worry me.
  5. I am not sure I like the name "SharedMPI" (not that I have a better idea, but it reminds me of shared::Tria).


peterrum commented Oct 8, 2020

To me this seems like an optimization of the existing parallel vector (to grant "local" access to entries on the same node). Is there anything else that is new/different?

I had the option to introduce 1) a new MemorySpace type, 2) a new mode/setting of L:d:V, or 3) a new vector class.

The first two options would mean that L:d:V would by default carry information regarding shared memory: a second communicator, functions which would only work under certain circumstances, ...

In my opinion, the cleanest approach is to introduce a new vector class, since it is different at its core. For instance, L:d:V is created with IndexSets (which are transformed into a Partitioner), and as such the vector has a global view of the data. The new vector, however, is much simpler: it is only a container of data of a specific size (and as such the data can only be accessed with local indices). The interpretation of the data is completely handled by a new set of (potentially user-provided) partitioners. For instance, this PR introduces a partitioner which is built around IndexSets, but in hyper.deal we have a partitioner tailored for DG (once this PR is accepted and merged, we will create a follow-up PR to move it here).
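Purely as a conceptual illustration (all names below are hypothetical and not part of this PR), the separation could be pictured like this:

```cpp
// Hypothetical sketch -- not the classes added by this PR.
#include <cstddef>
#include <vector>

// The vector itself is only storage of a given local size; entries are
// addressed through local indices, with no notion of global indices here.
struct LocalContainer
{
  std::vector<double> values; // locally owned (+ ghost) entries

  double &local_element(const std::size_t i) { return values[i]; }
};

// Everything concerning the interpretation of the entries -- which global
// index a local entry corresponds to, which entries are ghosts, and how data
// is exchanged (MPI messages vs. memcpy on the node) -- lives in a separate,
// possibly user-provided, partitioner.
struct PartitionerInterface
{
  virtual ~PartitionerInterface() = default;

  virtual std::size_t locally_stored_size() const                     = 0;
  virtual void        export_to_ghosted_array(LocalContainer &vec) const   = 0;
  virtual void        import_from_ghosted_array(LocalContainer &vec) const = 0;
};
```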

Why would one not want to use this over the existing parallel vector?

One would need to work with a second communicator and potentially deal with race conditions in user code.

If there is no good reason to do so in 2), why is this not an extension to the existing vector class? This is not particularly user-friendly to introduce another new vector type. Maybe this can be a setting/policy/compile time setting?

  • Setting: this is more than a setting: the user needs to provide a second communicator and a partitioner (and, depending on the setting, either one of the two partitioners (old/new) and/or different methods would be used).
  • The policy is the partitioner, but I don't think it is a good idea to give the old Utilities::MPI::Partitioner the same interface as the new one (which is specialized for the case that you have pointers to the data of other processes).
  • A compile-time setting has the same problems as the first point.

Does this mean that we need a lot of new template instantiations for the new vector type? That would worry me.

The major benefit of this class will be in performance-critical code paths, i.e., MatrixFree: we are working on native support for this vector class also for DG in MatrixFree, where we expect the largest benefit. So one could use MatrixFree to set up the vectors, or create an L:d:V and/or copy over the content once from an L:d:V, without the need to instantiate all the functions. The first approach works very well (in hyper.deal), and we should probably create a set of new functions in MatrixFreeTools (since probably all users of MatrixFree have implemented their own routines for computing norms, interpolating/projecting functions onto a vector, and computing the right-hand-side vector with MatrixFree).

I am not sure I like the name "SharedMPI" (not that I have a better idea, but it reminds me of shared::Tria).

That p:s:T has a misleading and unfortunate name should not be a reason to dismiss the name SharedMPI: p:s:T is not shared but simply replicated on all processes, which led to some confusion until I had a look at the implementation. The new vector is actually shared among processes via shared memory (provided by MPI).


@tjhei tjhei left a comment


It would be great if you could write documentation for the new vector type. That would have helped with my misunderstanding of what this class is trying to do.

Also, we only require MPI 2, so some of this needs to be guarded by ifdefs, right?
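For example, such a guard could look roughly like the following fragment (a sketch only, assuming comm and rank are in scope; MPI_VERSION is provided by mpi.h according to the MPI standard, and AssertThrow/ExcMessage are the usual deal.II error macros):

```cpp
#ifdef DEAL_II_WITH_MPI
#  if MPI_VERSION >= 3
  // MPI >= 3.0: the shared-memory window machinery is available.
  MPI_Comm comm_shared;
  MPI_Comm_split_type(
    comm, MPI_COMM_TYPE_SHARED, rank, MPI_INFO_NULL, &comm_shared);
#  else
  // Older MPI: refuse to construct the shared-memory vector.
  AssertThrow(false,
              ExcMessage(
                "LinearAlgebra::SharedMPI::Vector requires MPI >= 3.0."));
#  endif
#endif
```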


namespace LinearAlgebra
{
namespace SharedMPI
Member


namespace documentation?



template <typename Number, typename MemorySpace = MemorySpace::Host>
class Vector : public ::dealii::LinearAlgebra::VectorSpaceVector<Number>,
Member


uh, no documentation for the new class? Can you fix that please?

* Get pointers to the beginning of the values of the other
* processes of the same shared-memory domain.
*
* TODO: name of the function?
Member


?


tjhei commented Oct 8, 2020

One would need to work with a second communicator and potentially deal with race conditions in user code.

So maybe I misunderstood what this vector class can do. Are you saying that this is to be used only within a node (using a separate communicator) and there are no MPI communication capabilities? An overview of what this class does would be great to have as class documentation.

@peterrum peterrum changed the title Add LinearAlgebra::SharedMPI::Vector [WIP] Add LinearAlgebra::SharedMPI::Vector Oct 16, 2020
@peterrum peterrum added this to In progress in MPI-3 shared memory Oct 23, 2020

peterrum commented Nov 5, 2020

I am closing this PR since the shared-memory feature has been introduced into L:p:V in a sequence of PRs, beginning with PR #11074.

@peterrum peterrum closed this Nov 5, 2020