Reuse MPI contexts across different MPI calls #552
Merged
At every MPI call we were creating a new `ContextWrapper` instance. This, albeit not really necessary, wasn't harming performance much. However, inside the constructor we were calling `MpiWorldRegistry::getOrInitialiseWorld(int worldId)`, which acquires a lock to access the MPI world.

The mentioned method in the world registry is meant to be called at world creation time from each different rank, not on every MPI call. There is a lock-free method, `MpiWorldRegistry::getWorld(int worldId)`, that can be used when we know the world has been created (i.e. after we have called `MPI_Init`). Given that all MPI calls, in every thread, must happen after `MPI_Init`, it is safe to assume that the world will already exist.

In this PR I change the way we query for a world in the `ContextWrapper` and, while at it, re-use context wrappers per MPI rank. Preliminary results in high-contention environments show a 50% execution time reduction.
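
For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of the idea; it is not the code in this PR. `World`, `WorldRegistry`, `getExecutingContext` and the `lockedCalls` counter are hypothetical stand-ins, and the sketch assumes one MPI rank per thread and that no thread is still creating worlds once the lock-free path is in use.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <stdexcept>
#include <unordered_map>

// Hypothetical stand-in for an MPI world (not the project's real class).
struct World
{
    int id;
};

// Hypothetical registry mirroring the two access paths described above.
class WorldRegistry
{
  public:
    // Slow path: takes a lock and creates the world if it is missing.
    // Intended to be called once per rank, at world creation time.
    World& getOrInitialiseWorld(int worldId)
    {
        std::lock_guard<std::mutex> guard(mx);
        lockedCalls++;
        auto it = worlds.find(worldId);
        if (it == worlds.end()) {
            it = worlds
                   .emplace(worldId, std::make_unique<World>(World{ worldId }))
                   .first;
        }
        return *it->second;
    }

    // Fast path: no lock. Assumed safe only once the world exists and no
    // thread is concurrently inserting into the map.
    World& getWorld(int worldId)
    {
        auto it = worlds.find(worldId);
        if (it == worlds.end()) {
            throw std::runtime_error("World not initialised");
        }
        return *it->second;
    }

    // Counts how often the locking path was taken (illustration only).
    int lockedCalls = 0;

  private:
    std::mutex mx;
    std::unordered_map<int, std::unique_ptr<World>> worlds;
};

// Simplified context wrapper: looks the world up through the fast path.
class ContextWrapper
{
  public:
    ContextWrapper(WorldRegistry& reg, int worldId, int rankIn)
      : world(reg.getWorld(worldId))
      , rank(rankIn)
    {}

    World& world;
    int rank;
};

// Returns a wrapper cached per thread, rebuilding it only when the
// requested world or rank differs from the cached one.
ContextWrapper& getExecutingContext(WorldRegistry& reg, int worldId, int rank)
{
    thread_local std::unique_ptr<ContextWrapper> cached;
    if (!cached || cached->rank != rank || cached->world.id != worldId) {
        cached = std::make_unique<ContextWrapper>(reg, worldId, rank);
    }
    return *cached;
}
```

In this sketch, calling `getOrInitialiseWorld` once (standing in for `MPI_Init`) and then calling `getExecutingContext` repeatedly from the same thread returns the same wrapper object each time and never touches the mutex again.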