
Process indices in ComputeIndexOwner by intervals #8813

Merged · 1 commit merged into dealii:master on Sep 28, 2019

Conversation

@kronbichler (Member) commented on Sep 20, 2019

This PR builds on top of #8772, and only the last commit actually implements something new. The new functionality is:

  1. We process the owners of the dictionary sections in the compute index owner code interval by interval, not DoF by DoF (see the sketch below this description).
  2. A minimum grain size for the dictionary ranges has been introduced.

Fixes #8785.
Fixes #8293.
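
To illustrate the first point, here is a minimal sketch (my own illustration under the assumption of a uniform dictionary stride, not the actual deal.II code; all names are hypothetical) of how a contiguous index interval can be split by dictionary owner, so that one lookup per sub-interval replaces one lookup per index:

```cpp
// Hypothetical sketch, not the deal.II implementation: split a contiguous
// index interval into sub-intervals that each belong to a single dictionary
// rank. The uniform stride `dofs_per_process` and all names are assumptions.
#include <algorithm>
#include <cstdint>
#include <vector>

// Stand-in for deal.II's global DoF index type.
using global_index = std::uint64_t;

struct OwnedInterval
{
  unsigned int owner;
  global_index begin;
  global_index end; // exclusive
};

// Owner rank of a single index under a uniform dictionary layout.
inline unsigned int
dictionary_owner(const global_index index, const global_index dofs_per_process)
{
  return static_cast<unsigned int>(index / dofs_per_process);
}

// Split [begin, end) by dictionary owner.
std::vector<OwnedInterval>
split_interval_by_owner(global_index       begin,
                        const global_index end,
                        const global_index dofs_per_process)
{
  std::vector<OwnedInterval> result;
  while (begin < end)
    {
      const unsigned int owner = dictionary_owner(begin, dofs_per_process);
      // Last index (exclusive) still located on rank `owner`.
      const global_index owner_end =
        std::min(end,
                 (static_cast<global_index>(owner) + 1) * dofs_per_process);
      result.push_back({owner, begin, owner_end});
      begin = owner_end;
    }
  return result;
}
```

With such a splitting, the request for a whole sub-interval can be exchanged with its dictionary rank in a single message instead of one message per DoF.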

@kronbichler changed the title from "Partitioner use ranges" to "Process indices in ComputeIndexOwner by intervals" on Sep 20, 2019
@kronbichler (Member, Author) commented:
Just as a remark: the grain-size contribution is quite important on many ranks. Taking the example listed in #8293 (see there for the labels) with 49k MPI ranks, here is the difference between the patch applied and not.

Before: [scaling plot "scaling_september"]

With this patch: [scaling plot "scaling_september2"]

Compare, e.g., the mg transfer data point, where we call Utilities::MPI::compute_index_owner() and also set up two Utilities::MPI::Partitioner objects for each level. What happens: on level 2 we have 256 cells and, due to FE_Q<3>(5), 35301 DoFs. In the old setting, we would spread these indices over the dictionary with 1 DoF per rank (and some MPI ranks do not get any index at all). As only 256 out of these ~35k dictionary ranks actually own cells on that level, this is essentially a one-to-all communication, which hurts at that scale. With the grain size, this problem disappears.
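
For context, a minimal sketch of how such a grain size can enter the dictionary layout (my own reading of the idea, not the deal.II code; the function name and the grain-size value 64 are assumptions for illustration):

```cpp
// Hypothetical sketch: the dictionary range per rank is the even split of
// the index space, but never smaller than a minimum grain size, so that
// only a few ranks hold dictionary entries when there are few indices.
#include <algorithm>
#include <cstdint>

using global_index = std::uint64_t;

global_index
dictionary_range_size(const global_index n_global_dofs,
                      const unsigned int n_ranks,
                      const global_index grain_size = 64) // assumed value
{
  // Even split of the global index space, rounded up.
  const global_index even_split = (n_global_dofs + n_ranks - 1) / n_ranks;
  // Enforce the minimum grain size.
  return std::max(even_split, grain_size);
}

// Example from the comment above: 35301 DoFs on 49k ranks.
// Without a grain size: 1 DoF per dictionary rank, ~35k ranks involved.
// With grain_size = 64: ceil(35301 / 64) = 552 ranks hold the dictionary.
```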

@Rombur merged commit 319db49 into dealii:master on Sep 28, 2019.
@kronbichler deleted the partitioner_use_ranges branch on March 16, 2020.
Successfully merging this pull request may close these issues:

  * Speed up ComputeIndexOwner::Dictionary
  * Code optimizations for >100k MPI ranks