Very long initialization during texascale run #1052
Comments
I forgot to mention that the initialization time increases with the number of ranks: the 320M mesh took less time to initialize on fewer ranks.
Does your SeisSol branch include the fix from:
Hi,
With stresses and nucleation traction as a constant map:
I remember we encountered a similar problem recently, because easi decided to read the ASAGI files several times.
20M element mesh on 8 nodes:
Constant Maps:
Hi, @Thomas-Ulrich yes, the branch already includes this fix. @sebwolf-de yes, the stress setup is quite complicated. I have logs from Texascale runs last year using the same stress setup, where a 620M element mesh was initialized within 15 min. But last year, I used only one rank per node. Is it expected that doubling the number of ranks will triple the initialization time?
I don't see two ranks per node increasing the initialization time. Using the 20M mesh, I find:
Describe the bug
Initialization during a Texascale run takes 40 min.
Expected behavior
Faster initialization
To Reproduce
Steps to reproduce the behavior:
nico/2nuc-latest-master; v1.1.3-109-g10ae92ee
SeisSol_Release_dskx_6_viscoelastic2; intel-19.1.1
Frontera
Currently Loaded Modules:
parameters.txt
6106531.txt
Additional context
Ridgecrest large-domain setup with a mesh of 320 million elements
Important:
The branch already contains: #970
The longest initialization step takes about 30 min and happens after printing:
"Initializing Fault, using a quadrature rule with 49 points."
This is odd because the setup contains relatively few dynamic rupture (DR) elements, and similar Ridgecrest setups with more DR elements do not have this issue.
The setup is special because it integrates a very large model domain with several low-velocity sediment basins. This means that the difference between the largest and smallest seismic velocity (and time step) is quite large. This is the main difference I can think of compared to other Ridgecrest setups.
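To illustrate why a wide velocity range implies a wide range of time steps, here is a minimal sketch. It assumes a generic CFL-type estimate (time step proportional to element size divided by P-wave speed) and rate-2 local time stepping clusters; the function names, the prefactor, and the example velocities are hypothetical and not taken from SeisSol or from this setup. Whether the time step spread is actually related to the slow initialization is not established here.

```python
def cfl_time_step(edge_length_m, p_wave_speed_m_s, cfl=0.5):
    """Illustrative CFL-type estimate: dt is proportional to h / v_p.

    The prefactor `cfl` is a placeholder, not SeisSol's actual constant.
    """
    return cfl * edge_length_m / p_wave_speed_m_s


def lts_cluster(dt, dt_min, rate=2):
    """Return the largest k with dt_min * rate**k <= dt.

    Assumes clustered local time stepping in which cluster k advances
    with a time step of dt_min * rate**k.
    """
    k = 0
    while dt_min * rate ** (k + 1) <= dt:
        k += 1
    return k


# Hypothetical values: same element size, hard rock vs. soft sediments.
dt_rock = cfl_time_step(100.0, 6000.0)
dt_sediment = cfl_time_step(100.0, 750.0)
print(dt_sediment / dt_rock)  # ratio of time steps for equal element size
```

For elements of equal size, the time step ratio equals the inverse velocity ratio, so low-velocity basins spread the elements over more time step clusters than a setup with a narrower velocity range.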
The slow initialization of the setup already becomes apparent with a small mesh (2 million elements) on 20 nodes (ignore the segfault):
6100231.txt