more optimal distribution of tiles in cubed-sphere grid tile space #41

Merged: 7 commits into develop from feature/wjiang/optimal_cs, May 20, 2024

@weiyuan-jiang weiyuan-jiang commented May 3, 2024

This PR addresses the case where too many processors are requested for a simulation in cubed-sphere tile space. In that case, a process can end up with a domain consisting of a stripe that is only one grid cell wide, which causes the re-gridder to fail.
The PR ensures that each process gets a stripe that is at least 2 grid cells wide.

The PR further improves the distribution of tiles onto processors.
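
As a rough illustration of the idea (not the code in this PR, which is not shown in this conversation), the sketch below splits the columns of a grid into contiguous stripes, caps the number of stripes so that each one is at least 2 cells wide, and tries to balance the number of tiles per stripe. The function name `split_into_stripes`, the `tiles_per_column` input, and the greedy balancing rule are all assumptions made for the example.

```python
from typing import List, Tuple


def split_into_stripes(
    tiles_per_column: List[int], n_procs: int, min_width: int = 2
) -> List[Tuple[int, int]]:
    """Split columns into contiguous stripes, returned as half-open (start, end) ranges.

    Hypothetical helper: each stripe is at least `min_width` columns wide, and
    stripe boundaries are chosen greedily to roughly balance the tile counts.
    """
    n_cols = len(tiles_per_column)
    if n_procs < 1:
        raise ValueError("n_procs must be at least 1")
    if n_cols < min_width:
        raise ValueError("grid is narrower than the minimum stripe width")

    # Cap the stripe count so every stripe can be min_width wide.
    # If more processes were requested, the extras would get no stripe.
    n_stripes = min(n_procs, n_cols // min_width)
    total = sum(tiles_per_column)

    stripes = []
    start = 0
    assigned = 0  # tiles already handed to earlier stripes
    for k in range(n_stripes):
        remaining = n_stripes - k - 1
        if remaining == 0:
            end = n_cols  # last stripe takes whatever is left
        else:
            # Leave at least min_width columns for each remaining stripe.
            max_end = n_cols - remaining * min_width
            end = start + min_width
            target = total * (k + 1) / n_stripes  # ideal cumulative tile count
            running = assigned + sum(tiles_per_column[start:end])
            # Add a column while doing so moves the running total closer to the target.
            while end < max_end and running + tiles_per_column[end] / 2 <= target:
                running += tiles_per_column[end]
                end += 1
        assigned += sum(tiles_per_column[start:end])
        stripes.append((start, end))
        start = end
    return stripes
```

With 6 columns and 10 requested processes, this sketch produces only 3 stripes, each 2 columns wide, rather than stripes that are one cell wide.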

IMPORTANT: The PR is not zero-diff for data assimilation in cubed-sphere tile space. This is because the GLOBALCS/assim test case has never been zero-diff under layout changes.

Tests are also non-zero-diff for the AGGGLOBALCS/model case, but 0-diff is not required/expected with layout changes when using "aggressive" compilation.

@weiyuan-jiang weiyuan-jiang requested a review from a team as a code owner May 3, 2024 14:36
@gmao-rreichle gmao-rreichle added the performance Improve performance (speed-up, optimization, scaling) label May 6, 2024
@gmao-rreichle gmao-rreichle changed the title more optimal distribution of cubed-sphere grid more optimal distribution of tiles in cubed-sphere grid tile space May 6, 2024
@biljanaorescanin

Nightly tests summary:
globalcs: assim comparison failed.
aggglobalcs: both the model and assim run comparisons failed.
gnuglobalcs: assim comparison failed.

**These are the settings in the slurm files:**
conus.model.slurm:ntasks_model: 40
global.assim.slurm:ntasks_model: 120
globalcnclm45.model.slurm:ntasks_model: 120
globalcnclm4.model.slurm:ntasks_model: 120
globalcs.assim.slurm:ntasks_model: 120
globalcscn.model.slurm:ntasks_model: 120
globalcs.model.slurm:ntasks_model: 120
global.model.slurm:ntasks_model: 80

and ntasks-per-node is 40


@gmao-rreichle gmao-rreichle left a comment

@weiyuan-jiang: Just after approving the PR, I noticed one more thing: it looks like the elemental function rms_cs() isn't used. If so, we should delete it. Can you please take a quick look?

@gmao-rreichle gmao-rreichle merged commit 82b8ee7 into develop May 20, 2024
7 checks passed
@gmao-rreichle gmao-rreichle deleted the feature/wjiang/optimal_cs branch May 20, 2024 14:14
Labels: bug fix, Not 0-diff, performance