adapt block size in save_parallel #7268
Merged
Not all processes have to be actively involved in saving the content of a matrix.

Scenario: A `2*10^9 x 400` matrix is distributed on an `80 x 5` process grid using block size 32. Previously, the matrix was copied to a matrix distributed on a `1 x 400` process grid with a row block size of `2*10^9` and a column block size of `1`. Therefore, all processes were involved in saving the matrix. This caused a considerable amount of communication in this scenario and made saving the matrix very expensive.

With the change in this commit, only `ceil(400/32)=13` processes are actively involved in saving the matrix, and the amount of communication is reduced.

Note: The behavior for use cases in which both matrix dimensions are much larger than the block sizes remains unchanged.
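
The process counts in the scenario can be checked with a small sketch. It assumes a standard block-cyclic distribution of the columns over a `1 x P` process grid; the function name `active_writers` and its parameters are hypothetical and are not taken from the code changed in this pull request.

```python
import math

def active_writers(n_cols: int, col_block: int, n_procs: int) -> int:
    """Number of processes that own at least one column block.

    Assumes columns are split into blocks of `col_block` columns and the
    blocks are dealt out cyclically to `n_procs` processes (hypothetical
    helper, for illustration only).
    """
    n_blocks = math.ceil(n_cols / col_block)  # last block may be partial
    return min(n_blocks, n_procs)             # cannot exceed the process count

# Before: column block size 1 on a 1 x 400 grid, every process owns a column.
print(active_writers(400, 1, 400))
# After: column block size 32, only ceil(400/32) processes own data.
print(active_writers(400, 32, 400))
```

Under the same assumption, whenever the number of column blocks is at least the number of processes, the result equals the process count, which is consistent with the note that large-matrix use cases are unaffected.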