Fix load balancing with FEM #489

Draft
s-mayani wants to merge 5 commits into IPPL-framework:master from s-mayani:fix_loadbalancing_fem

@s-mayani
Collaborator

This closes #485

The issue is that after load balancing, the FEM space used by the solver does not repartition the element indices among ranks according to the new domain decomposition. The member variable resultField_m also needs to update its layout.

This issue was correctly identified by the AI tool Hugo in #487, but the fix there addressed the symptom rather than the cause.

To fix this properly, this PR introduces a new updateLayout member function in LagrangeSpace. The LoadBalancer in alpine calls it when the FEM solver is in use, in the same way that it already calls updateLayout on the fields. As a result, every time load balancing changes the domain decomposition, the LagrangeSpace takes the new decomposition into account and repartitions the elements too.
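
For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of the idea. It is not the IPPL code: Layout1D, makeLayout, ToyLagrangeSpace and repartition are hypothetical names invented for illustration, and the decomposition is reduced to contiguous 1D index ranges. It only shows why a space that caches its element partition needs an explicit update hook that the load balancer calls.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a domain decomposition: rank r owns the
// half-open range [begin[r], end[r]) of global element indices (1D only).
struct Layout1D {
    std::vector<std::size_t> begin;
    std::vector<std::size_t> end;
};

// Build a layout from per-rank element counts, e.g. as chosen by a load balancer.
Layout1D makeLayout(const std::vector<std::size_t>& countsPerRank) {
    Layout1D layout;
    std::size_t offset = 0;
    for (std::size_t count : countsPerRank) {
        layout.begin.push_back(offset);
        offset += count;
        layout.end.push_back(offset);
    }
    return layout;
}

// Toy finite element space (assumption: not the real LagrangeSpace).
// It caches the element indices owned by this rank, so the cache goes
// stale whenever the decomposition changes.
class ToyLagrangeSpace {
public:
    ToyLagrangeSpace(const Layout1D& layout, std::size_t rank)
        : rank_m(rank) {
        updateLayout(layout);
    }

    // Recompute the cached element partition from the new decomposition.
    void updateLayout(const Layout1D& layout) {
        assert(rank_m < layout.begin.size());
        elementIndices_m.clear();
        for (std::size_t e = layout.begin[rank_m]; e < layout.end[rank_m]; ++e) {
            elementIndices_m.push_back(e);
        }
    }

    const std::vector<std::size_t>& localElements() const {
        return elementIndices_m;
    }

private:
    std::size_t rank_m;
    std::vector<std::size_t> elementIndices_m;
};

// Toy load balancing step: build the new layout, then push it into the
// space, mirroring how the fields are given their new layout.
void repartition(ToyLagrangeSpace& space,
                 const std::vector<std::size_t>& newCounts) {
    const Layout1D newLayout = makeLayout(newCounts);
    space.updateLayout(newLayout);
}
```

In this sketch, a space built for rank 1 on an even 4/4 split owns elements 4 to 7. After repartition(space, {6, 2}) it owns elements 6 and 7. Without the updateLayout call it would keep iterating over the old range, which is the kind of stale partition described above.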

s-mayani requested a review from aaadelmann on March 25, 2026 08:19
s-mayani self-assigned this on Mar 25, 2026
@s-mayani
Collaborator Author

Draft PR: I am still waiting for some tests on Alps, which is under maintenance.

s-mayani marked this pull request as ready for review on March 25, 2026 10:27
s-mayani enabled auto-merge on March 25, 2026 10:28
@s-mayani
Collaborator Author

s-mayani commented Mar 25, 2026

Checked on Alps: the results are correct even when a load balancing call is forced after 5 time steps, on multiple GPUs (2 and 4 GPUs checked).

s-mayani disabled auto-merge on March 26, 2026 10:08
s-mayani marked this pull request as draft on March 26, 2026 10:08
@s-mayani
Collaborator Author

Not OK on LUMI: 8 GPUs give correct results, but 16 GPUs give wrong results. Looking into it.

Development

Successfully merging this pull request may close these issues.

Load balancing in Alpine not working for FEM solver on GPU

1 participant