Processor Distribution Heuristic Failure #323

Open
coreyostrove opened this issue May 31, 2023 · 0 comments
Labels: bug (A bug or regression)

@coreyostrove (Contributor)

When running GST under MPI with a very large number of cores, we hit what appears to be an edge case in the processor distribution heuristics: the resulting distribution of processors fails at the layout creation stage. Attached are a cleaned-up log, a script, and the related files needed to reproduce this error.

I was running on the feature-globally-germ-aware-fpr branch, but this should be reproducible on the tip of develop. Other relevant parameters:

20 nodes with 36 cores each, for a total of 720 processors.
Python 3.9.16

Manually specifying a 20x36 processor grid appears to work around this error.
proc_dist_heuristic_failure.zip
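
For anyone checking the arithmetic behind the workaround, here is a minimal sketch that enumerates the two-dimensional grids whose dimensions multiply to the total processor count, and confirms that the node-aligned 20x36 grid is one of them. This is not pyGSTi's heuristic and makes no pyGSTi calls; `grid_candidates` is a hypothetical helper written only for illustration, and it assumes the grid has exactly two dimensions.

```python
def grid_candidates(nprocs):
    """Return all (rows, cols) pairs with rows * cols == nprocs and rows <= cols.

    Hypothetical helper for illustration only; not part of pyGSTi.
    """
    pairs = []
    r = 1
    while r * r <= nprocs:
        if nprocs % r == 0:
            pairs.append((r, nprocs // r))
        r += 1
    return pairs


# Values taken from the report: 20 nodes with 36 cores each.
nodes, cores_per_node = 20, 36
nprocs = nodes * cores_per_node

candidates = grid_candidates(nprocs)
node_aligned = (nodes, cores_per_node)

# The manually specified grid uses every processor exactly once.
print(nprocs, len(candidates), node_aligned in candidates)
```

The point is only that 720 admits many factorizations, so a heuristic has several grids to choose from; the one that matches the physical node layout is the one that worked here.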

@coreyostrove coreyostrove added the bug A bug or regression label May 31, 2023
@sserita sserita added this to the 0.9.13 milestone Nov 29, 2023
@sserita sserita modified the milestones: 0.9.13, 0.9.14 Apr 23, 2024