
Restarting from 0th checkpoint does not work #130

Open
harpolea opened this issue Dec 30, 2019 · 7 comments

@harpolea
Member

If a problem is restarted from a checkpoint written at the 0th timestep, the timestep evolution itself appears to run fine. However, the time at the end of the first timestep is 1e99, which causes the program to terminate. I suspect that something is not being initialized correctly.
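
A guard along the following lines could make this kind of failure surface before the run terminates. It is only a minimal sketch, assuming that 1e99 acts as an "unset" sentinel rather than a computed value; `is_plausible_time` and `kSentinel` are hypothetical names for illustration and are not part of the code base.

```cpp
#include <cmath>

// Hypothetical threshold (assumption): values at or above this magnitude are
// treated as an uninitialized sentinel rather than a physical time.
constexpr double kSentinel = 1.0e98;

// Returns true if t looks like a usable simulation time or time step:
// finite, non-negative, and below the sentinel threshold.
bool is_plausible_time(double t) {
    return std::isfinite(t) && t >= 0.0 && t < kSentinel;
}
```

Calling such a check on the time right after a restart would flag the 1e99 value at the point where it first appears.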

@doreenfan
Collaborator

This seems to be fixed by commit d5d5ef1fa1f9562b9ec4b4f306468e07c5c8b61a.

@ajnonaka
Contributor

When restarting from checkpoint 0, the chosen dt now looks correct, but there are other issues with restart in general. Here is a summary of what I found. These are reacting_bubble tests run with 4 MPI processes (0 OMP).

  1. inputs_3d_regression restarting from chk0000001 works fine.

  2. inputs_3d_regression restarting from chk0000000 dies during Step 3 (create MAC velocities) with an "Erroneous arithmetic operation" error.

  3. inputs_3d_amr_regression restarting from chk0000001 dies in the Step 3 MAC projection: MLMG fails to converge.

  4. inputs_3d_amr_regression restarting from chk0000000 dies in Step 2 (make w0) with an "Erroneous arithmetic operation" error.

@ajnonaka
Contributor

ajnonaka commented Feb 13, 2020

Edit: regarding (2.), the w0 output by make_w0 is complete garbage at r=1 and higher.

@ajnonaka
Contributor

If you add VisMF::Write(S_cc_new[0],"a_S_cc_new"); at the beginning of AdvanceTimeStep(), the output contains nonsensical values. The problem may be the way S_cc_new is initialized when restarting from checkpoint 0. This is with 1 MPI process.
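
To turn "nonsensical values" into something countable, a scan like the one below can help. It is a sketch under stated assumptions: it works on a flat `std::vector<double>` copy of the field data rather than on the actual multifab, and `scan_for_bad_values`, `BadValueReport`, and the `limit` parameter are hypothetical names, not part of AMReX or the code under discussion.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Counts of suspicious entries found in a field (hypothetical helper type).
struct BadValueReport {
    std::size_t n_nonfinite = 0;  // NaN or +/-Inf entries
    std::size_t n_huge = 0;       // finite entries with |x| > limit
};

// Scans a flat copy of the field data and counts entries that are either
// non-finite or larger in magnitude than a caller-chosen limit.
BadValueReport scan_for_bad_values(const std::vector<double>& data, double limit) {
    BadValueReport report;
    for (double x : data) {
        if (!std::isfinite(x)) {
            ++report.n_nonfinite;
        } else if (std::fabs(x) > limit) {
            ++report.n_huge;
        }
    }
    return report;
}
```

Running the scan on the state at the start of the first step after a restart from chk0000000 and from chk0000001 would show whether the bad values are already present before any advance is taken.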

@doreenfan
Collaborator

Commit cfa5ed7b seems to have resolved (2.). Commit d52443ea resolves (3.) and (4.).

@ajnonaka
Contributor

An update on the 4 reacting bubble test problems (4 MPI, 0 OMP):

  1. inputs_3d_regression restarting from chk0000001 works fine.

  2. inputs_3d_regression restarting from chk0000000 runs to completion, but the diffs are large.

  3. inputs_3d_amr_regression restarting from chk0000001 runs to completion, with small diffs (10^-9). Jury still out.

  4. inputs_3d_amr_regression restarting from chk0000000 runs to completion, but the diffs are large.
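
For judging whether a diff is "small" or "large", one common measure is the maximum absolute difference together with that difference scaled by the largest reference magnitude. The following is a minimal sketch of that measure and not necessarily how the regression diffs above were computed; `compare_fields` and `DiffNorms` are hypothetical names, and the fields are assumed to be available as flat arrays of equal length.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Max-norm differences between a reference field and a restarted-run field.
struct DiffNorms {
    double max_abs;  // max_i |ref[i] - test[i]|
    double max_rel;  // max_abs / max_i |ref[i]| (equals max_abs if ref is all zero)
};

DiffNorms compare_fields(const std::vector<double>& ref,
                         const std::vector<double>& test) {
    if (ref.size() != test.size()) {
        throw std::invalid_argument("compare_fields: size mismatch");
    }
    DiffNorms d{0.0, 0.0};
    double ref_max = 0.0;
    for (std::size_t i = 0; i < ref.size(); ++i) {
        d.max_abs = std::max(d.max_abs, std::fabs(ref[i] - test[i]));
        ref_max = std::max(ref_max, std::fabs(ref[i]));
    }
    // Scale by the reference magnitude when it is nonzero.
    d.max_rel = (ref_max > 0.0) ? d.max_abs / ref_max : d.max_abs;
    return d;
}
```

With a measure like this, a relative difference near 10^-9 and one of order unity can be separated by a single tolerance.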

@ajnonaka
Contributor

ajnonaka commented Sep 9, 2021

Restart works for inputs_2d_regression from chk0000000, so this appears to be a 3D-only issue.
