Replace std::vector<RequestType> by RequestType in consensus algorithms. Same for AnswerType. #13722

bangerth · 2022-05-11T23:12:22Z

As written right now, the consensus algorithms take a std::vector<RequestType>/std::vector<AnswerType> and transmit this data to other processes. However, they do this on a byte-by-byte basis, not using the MPI type mechanism. This is conceptually not great since (i) it doesn't ensure that MPI converts data types when transmitting from systems with different endianness (not generally a problem, nobody does work on systems that are heterogeneous in this way), but more importantly (ii) it essentially assumes that the bit representation of a type is all that matters. The implementation does not currently have a way to ensure that that is true, and as a consequence you can currently send things such as std::vector<std::string> or some such without an error -- but all you get at the other end is the same bit representation of the std::string object, which is not generally what you want and will contain pointers that point to random places in memory.
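To make point (ii) concrete, here is a minimal sketch using only the standard library. It shows the kind of compile-time check that separates types whose bytes can be copied verbatim from types that need real serialization. The helper name `can_send_as_raw_bytes` is hypothetical and not part of deal.II; the sketch also assumes that trivial copyability is the right criterion, which is a simplification.

```cpp
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical helper (not a deal.II function): true if copying the raw
// bytes of a T into another address space yields a valid T there.
template <typename T>
constexpr bool
can_send_as_raw_bytes()
{
  return std::is_trivially_copyable<T>::value;
}

// A plain aggregate of scalars carries no pointers to other memory.
struct IndexRange
{
  unsigned long long begin;
  unsigned long long end;
};

static_assert(can_send_as_raw_bytes<double>(), "scalars are fine");
static_assert(can_send_as_raw_bytes<IndexRange>(), "plain structs are fine");

// std::string and std::vector own heap memory; their object bytes hold
// pointers that are meaningless on the receiving process.
static_assert(!can_send_as_raw_bytes<std::string>(), "needs packing");
static_assert(!can_send_as_raw_bytes<std::vector<double>>(), "needs packing");
```

A check like this would reject std::vector<std::string> as an element-wise byte transfer at compile time rather than letting it through silently.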

This patch does two things:

1. It calls Utilities::pack/unpack() on the data so sent.
2. While there, it replaces std::vector<RequestType>/std::vector<AnswerType> by RequestType/AnswerType, that is, one can now exchange all data types via the consensus algorithms interface, not just std::vector objects of scalar objects.
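The idea behind packing can be sketched without any library support. The functions below are hypothetical stand-ins, not the implementation of Utilities::pack/unpack(): they encode a std::vector<std::string> into a std::vector<char> using length prefixes, so that the buffer contains the characters themselves rather than pointers to them. The sketch assumes sender and receiver agree on the size and byte order of std::size_t.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a pack() function: writes the number of
// strings, then each string as (length, characters).
std::vector<char>
pack_strings(const std::vector<std::string> &strings)
{
  std::vector<char> buffer;
  auto              append_size = [&buffer](const std::size_t n) {
    const char *p = reinterpret_cast<const char *>(&n);
    buffer.insert(buffer.end(), p, p + sizeof(n));
  };

  append_size(strings.size());
  for (const std::string &s : strings)
    {
      append_size(s.size());
      buffer.insert(buffer.end(), s.begin(), s.end());
    }
  return buffer;
}

// Hypothetical stand-in for the matching unpack() function.
std::vector<std::string>
unpack_strings(const std::vector<char> &buffer)
{
  std::size_t pos       = 0;
  auto        read_size = [&buffer, &pos]() {
    assert(pos + sizeof(std::size_t) <= buffer.size());
    std::size_t n;
    std::memcpy(&n, buffer.data() + pos, sizeof(n));
    pos += sizeof(n);
    return n;
  };

  const std::size_t        n_strings = read_size();
  std::vector<std::string> strings;
  for (std::size_t i = 0; i < n_strings; ++i)
    {
      const std::size_t length = read_size();
      assert(pos + length <= buffer.size());
      strings.emplace_back(buffer.data() + pos, length);
      pos += length;
    }
  return strings;
}
```

Calling unpack_strings(pack_strings(v)) returns a vector equal to v, and the intermediate buffer is a flat array of characters that can be handed to a send call.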

There are a couple of places in the library where the send/receive functions currently already pack non-vector objects into std::vector<char> arrays. This means that in those places, we currently pack/unpack twice. I will fix this in a follow-up patch, but wanted to get the basic functionality out as a stand-alone patch.

I will add that there is a cost associated with packing/unpacking. I believe that that cost is acceptable given the more flexible interface.

Part of #13208.

/rebuild

bangerth · 2022-05-11T23:13:20Z

include/deal.II/base/mpi_compute_index_owner_internal.h

-              std::pair<types::global_dof_index, types::global_dof_index>,
-              unsigned int>
+              std::vector<
+                std::pair<types::global_dof_index, types::global_dof_index>>,
+              std::vector<unsigned int>>


Previously, the consensus algorithms took the element types of vectors as template arguments. Now they take the overall type as template argument, which here means a vector-of-something.

bangerth · 2022-05-11T23:14:38Z

include/deal.II/base/mpi_consensus_algorithms.templates.h

-              send_buffer       = (create_request ? create_request(rank) :
-                                                    std::vector<RequestType>());
+              send_buffer =
+                (create_request ? Utilities::pack(create_request(rank)) :
+                                  std::vector<char>());

              // Post a request to send data
              auto ierr = MPI_Isend(send_buffer.data(),
-                                    send_buffer.size() * sizeof(RequestType),
-                                    MPI_BYTE,
+                                    send_buffer.size(),
+                                    MPI_CHAR,


This is one of the places where we previously sent stuff around as uninterpreted bytes. Now we properly pack up whatever object we get, and then send it around as MPI_CHAR. It is unpacked again below.

marcfehling · 2022-05-12T01:15:54Z

The OSX parallel 64bit worker complains:

In file included from /Users/runner/work/dealii/dealii/source/matrix_free/vector_data_exchange.cc:17:
In file included from /Users/runner/work/dealii/dealii/include/deal.II/base/mpi.h:22:
In file included from /Users/runner/work/dealii/dealii/include/deal.II/base/index_set.h:23:
In file included from /Users/runner/work/dealii/dealii/include/deal.II/base/utilities.h:45:
In file included from /Users/runner/work/dealii/dealii/bundled/boost-1.70.0/include/boost/archive/binary_iarchive.hpp:20:
In file included from /Users/runner/work/dealii/dealii/bundled/boost-1.70.0/include/boost/archive/binary_iarchive_impl.hpp:20:
In file included from /Users/runner/work/dealii/dealii/bundled/boost-1.70.0/include/boost/archive/basic_binary_iprimitive.hpp:53:
In file included from /Users/runner/work/dealii/dealii/bundled/boost-1.70.0/include/boost/serialization/array_wrapper.hpp:19:
In file included from /Users/runner/work/dealii/dealii/bundled/boost-1.70.0/include/boost/serialization/nvp.hpp:26:
In file included from /Users/runner/work/dealii/dealii/bundled/boost-1.70.0/include/boost/serialization/split_member.hpp:23:
/Users/runner/work/dealii/dealii/bundled/boost-1.70.0/include/boost/serialization/access.hpp:116:11: error: no member named 'serialize' in 'std::pair<unsigned long long, unsigned long long>'
        t.serialize(ar, file_version);
        ~ ^
/Users/runner/work/dealii/dealii/bundled/boost-1.70.0/include/boost/serialization/serialization.hpp:68:13: note: in instantiation of function template specialization 'boost::serialization::access::serialize<boost::archive::binary_oarchive, std::pair<unsigned long long, unsigned long long>>' requested here
    access::serialize(ar, t, static_cast<unsigned int>(file_version));
            ^
/Users/runner/work/dealii/dealii/bundled/boost-1.70.0/include/boost/serialization/serialization.hpp:126:5: note: in instantiation of function template specialization 'boost::serialization::serialize<boost::archive::binary_oarchive, std::pair<unsigned long long, unsigned long long>>' requested here
    serialize(ar, t, v);
    ^
/Users/runner/work/dealii/dealii/bundled/boost-1.70.0/include/boost/archive/detail/oserializer.hpp:153:27: note: in instantiation of function template specialization 'boost::serialization::serialize_adl<boost::archive::binary_oarchive, std::pair<unsigned long long, unsigned long long>>' requested here
    boost::serialization::serialize_adl(
                          ^
/Users/runner/work/dealii/dealii/bundled/boost-1.70.0/include/boost/serialization/singleton.hpp:147:5: note: in instantiation of member function 'boost::archive::detail::oserializer<boost::archive::binary_oarchive, std::pair<unsigned long long, unsigned long long>>::save_object_data' requested here
    singleton_wrapper(){
    ^
/Users/runner/work/dealii/dealii/bundled/boost-1.70.0/include/boost/serialization/singleton.hpp:171:47: note: in instantiation of member function 'boost::serialization::detail::singleton_wrapper<boost::archive::detail::oserializer<boost::archive::binary_oarchive, std::pair<unsigned long long, unsigned long long>>>::singleton_wrapper' requested here
        static detail::singleton_wrapper< T > t;
                                              ^
/Users/runner/work/dealii/dealii/bundled/boost-1.70.0/include/boost/serialization/singleton.hpp:196:16: note: (skipping 43 contexts in backtrace; use -ftemplate-backtrace-limit=0 to see all)
        return get_instance();
               ^
/Users/runner/work/dealii/dealii/include/deal.II/base/utilities.h:1274:5: note: in instantiation of function template specialization 'dealii::Utilities::pack<std::vector<std::pair<unsigned long long, unsigned long long>>>' requested here
    pack<T>(object, buffer, allow_compression);
    ^
/Users/runner/work/dealii/dealii/include/deal.II/base/mpi_consensus_algorithms.templates.h:265:46: note: in instantiation of function template specialization 'dealii::Utilities::pack<std::vector<std::pair<unsigned long long, unsigned long long>>>' requested here
                (create_request ? Utilities::pack(create_request(rank)) :
                                             ^
/Users/runner/work/dealii/dealii/include/deal.II/base/mpi_consensus_algorithms.templates.h:199:9: note: in instantiation of member function 'dealii::Utilities::MPI::ConsensusAlgorithms::NBX<std::vector<std::pair<unsigned long long, unsigned long long>>, std::vector<unsigned int>>::start_communication' requested here
        start_communication(targets, create_request, comm);
        ^
/Users/runner/work/dealii/dealii/include/deal.II/base/mpi_consensus_algorithms.templates.h:900:36: note: in instantiation of member function 'dealii::Utilities::MPI::ConsensusAlgorithms::NBX<std::vector<std::pair<unsigned long long, unsigned long long>>, std::vector<unsigned int>>::run' requested here
          consensus_algo.reset(new NBX<RequestType, AnswerType>());
                                   ^
/Users/runner/work/dealii/dealii/source/matrix_free/vector_data_exchange.cc:447:11: note: in instantiation of member function 'dealii::Utilities::MPI::ConsensusAlgorithms::Selector<std::vector<std::pair<unsigned long long, unsigned long long>>, std::vector<unsigned int>>::run' requested here
          consensus_algorithm(process, comm);
          ^

masterleinad

Would you have any data on the time pack/unpack takes compared to typical costs for using the consensus algorithms?

bangerth · 2022-05-12T16:14:06Z

None beyond what @kronbichler found out a while ago, namely that we shouldn't compress data when packing. (Which I had forgotten to disable -- updated patch now pushed.) I believe I remember that @kronbichler said that the packing was not measurable at the time, but I don't recall the details.

I have plans to address this in a generic way at some later point by writing Isend and Irecv functions that take general arguments but that don't pack/unpack if it's a std::vector. I want to do that because it would avoid this exact conversation altogether. My primary motivation with this patch is to get the interfaces in place given all of the work earlier this year on interfaces.

kronbichler

I think the code looks nicer this way. I guess we cannot be completely sure that the cost is really negligible, but it basically boils down to a memcpy for those simple types, so it is likely very small for typical use cases. Do you agree @peterrum? If we really wanted we could check the step-37 benchmark. Some setup routines are quite heavy on this function, so if nothing is noticeable there (which is what I believe), we should be fine.
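To illustrate the "boils down to a memcpy" argument, here is a minimal sketch of packing and unpacking a vector of trivially copyable elements. The names `pack_trivial` and `unpack_trivial` are hypothetical, and the sketch makes no claim that Utilities::pack() takes exactly this path internally.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical helper: copy the elements of a vector into a char buffer.
template <typename T>
std::vector<char>
pack_trivial(const std::vector<T> &data)
{
  static_assert(std::is_trivially_copyable<T>::value,
                "memcpy is only valid for trivially copyable types");
  std::vector<char> buffer(data.size() * sizeof(T));
  if (!data.empty())
    std::memcpy(buffer.data(), data.data(), buffer.size());
  return buffer;
}

// Hypothetical helper: the inverse of pack_trivial().
template <typename T>
std::vector<T>
unpack_trivial(const std::vector<char> &buffer)
{
  static_assert(std::is_trivially_copyable<T>::value,
                "memcpy is only valid for trivially copyable types");
  assert(buffer.size() % sizeof(T) == 0);
  std::vector<T> data(buffer.size() / sizeof(T));
  if (!buffer.empty())
    std::memcpy(data.data(), buffer.data(), buffer.size());
  return data;
}
```

Under these assumptions, the extra cost per message is one allocation and one copy on each side, which is why it is plausible that the overhead is small next to the communication itself.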

bangerth · 2022-05-12T20:48:26Z

Just for completeness, I plan (after the release) to work on functions such as

  // general template
  template <typename T>
  std::future<void>
  Isend (const T &t, ...);

  // specialization
  template <typename T,
            typename = std::enable_if_t<is_mpi_datatype<T>::value, void>>
  std::future<void>
  Isend (const std::vector<T> &vector_of_t, ...);

The first would call Utilities::pack(), the second would not. There would then be corresponding functions for Irecv. By using these signatures in the consensus algorithm (and other places), we can send data without packing where possible, and pack where necessary.

This patch here simply makes this possible, at the cost of introducing the packing for now.
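The overload selection described here can be tried out without MPI. In the sketch below, `is_mpi_datatype_sketch` is a hypothetical trait that simply treats arithmetic types as MPI-native, and `choose_send_path` is a hypothetical function that only reports which overload was picked instead of sending anything.

```cpp
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical trait: assume arithmetic types map onto MPI data types.
template <typename T>
struct is_mpi_datatype_sketch : std::is_arithmetic<T>
{};

enum class SendPath
{
  packed, // object would go through pack() first
  direct  // vector contents would be sent as they are
};

// General template: anything can be sent after packing.
template <typename T>
SendPath
choose_send_path(const T &)
{
  return SendPath::packed;
}

// More specialized overload: only participates for vectors whose
// element type is MPI-native, and then wins overload resolution.
template <typename T,
          typename = typename std::enable_if<
            is_mpi_datatype_sketch<T>::value>::type>
SendPath
choose_send_path(const std::vector<T> &)
{
  return SendPath::direct;
}
```

With these definitions, a std::vector<double> selects the direct path, while a std::vector<std::string> falls back to the packed path because the second overload is removed by SFINAE.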

tamiko · 2022-05-12T22:03:55Z

@kronbichler Our performance tests are happily running on our cluster (I have already accumulated about 1000 artifacts or so). How about we merge and I have a look at how step 37 is doing?

kronbichler · 2022-05-13T06:44:29Z

Yes, I agree. Let @peterrum make the final call.

peterrum

The change makes sense to be able to support more types.

@bangerth What other incompatible changes are you planning? This PR breaks a lot of my projects.

bangerth · 2022-05-13T20:10:21Z

This is the last change I have that is incompatible.

bangerth · 2022-05-13T20:16:21Z

The only other thing I want to do is to add the signature of a function that does not require an answer. I will for the moment implement it by just calling the other function, with more efficient implementations coming later.
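As a rough sketch of that forwarding approach: the two functions below are hypothetical and simulate the exchange inside a single process, so they show only the shape of the interface, not any actual communication or the signatures deal.II ends up with. The answer-free variant forwards to the general one with a dummy answer type.

```cpp
#include <functional>
#include <vector>

// Hypothetical two-sided exchange, simulated in one process: each
// request is created, answered, and the answer processed in turn.
template <typename RequestType, typename AnswerType>
void
exchange_with_answers(
  const std::vector<unsigned int>                &targets,
  const std::function<RequestType(unsigned int)> &create_request,
  const std::function<AnswerType(unsigned int, const RequestType &)>
    &answer_request,
  const std::function<void(unsigned int, const AnswerType &)>
    &process_answer)
{
  for (const unsigned int target : targets)
    {
      const RequestType request = create_request(target);
      const AnswerType  answer  = answer_request(target, request);
      process_answer(target, answer);
    }
}

// Hypothetical answer-free variant, implemented by calling the
// general function with an empty placeholder answer.
template <typename RequestType>
void
exchange_without_answers(
  const std::vector<unsigned int>                &targets,
  const std::function<RequestType(unsigned int)> &create_request,
  const std::function<void(unsigned int, const RequestType &)>
    &process_request)
{
  using EmptyAnswer = char;
  exchange_with_answers<RequestType, EmptyAnswer>(
    targets,
    create_request,
    [&process_request](unsigned int rank, const RequestType &request) {
      process_request(rank, request);
      return EmptyAnswer();
    },
    [](unsigned int, const EmptyAnswer &) {});
}
```

The placeholder answer is still created and passed along in this version, which is the inefficiency a later dedicated implementation could remove.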

Replace std::vector<RequestType> by RequestType.

87ad5f9

Same for AnswerType.

bangerth added the ready to test label May 11, 2022

bangerth commented May 11, 2022

Add a necessary include.

0139f45

masterleinad reviewed May 12, 2022

Do not compress data.

b247702

kronbichler approved these changes May 12, 2022

masterleinad approved these changes May 12, 2022

tamiko approved these changes May 12, 2022

peterrum approved these changes May 13, 2022

masterleinad merged commit c374906 into dealii:master May 13, 2022

bangerth deleted the ca-type-2 branch May 13, 2022 20:10

This was referenced May 13, 2022

Simplify the type used by ConsensusAlgorithm in compute_n_point_to_point_communications(). #13731

Merged

Avoid double packing/unpacking for CA algorithms. #13733

Merged

This was referenced May 14, 2022

Update deal.II hpsint/hpsint#111

Merged

Update deal.II hyperdeal/hyperdeal#99

Merged

Update deal.II exadg/exadg#232

Merged

kronbichler mentioned this pull request May 18, 2022

Fix compute_index_owner_01. #13752

Merged

bangerth mentioned this pull request May 20, 2022

Consensus Algorithms: Move template functions into .h file. #13768

Merged
