Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix iteration counts change all iter rebase #2882

Open
wants to merge 18 commits into
base: next
Choose a base branch
from

Conversation

dschwoerer
Copy link
Contributor

@dschwoerer dschwoerer commented Mar 14, 2024

Rebase of #2535

TODO:

  • mention breaking change in changelog
  • add test:
    • normal run
    • restart
    • without dump_on_restart
    • with smaller timestep monitor
    • with larger timestep monitor

I have some done some manual testing, but not yet the smaller timestep (and might be wrong)

Previously, `Solver::iteration` was updated in the implementations of
`Solver::run()`, but if there is any monitor that is called more
frequently than the output monitor, this means that `iteration` counts
the number of calls to the more frequent monitor, not the number of
output steps.

Now, have a `SolverMonitor` in the base `Solver` class, that is called
at the output frequency, and updates the iteration count.
Previously, the `Solver` implementations passed the loop counter to the
`iter` argument of `call_monitors`, but since the iteration has already
been completed when the monitors are called, this results in `iter`
always being one less than the number of completed monitor-steps at the
point when the moniters are called, which is confusing.
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

include/bout/solver.hxx Outdated Show resolved Hide resolved
src/solver/impls/arkode/arkode.cxx Outdated Show resolved Hide resolved
src/solver/impls/cvode/cvode.cxx Outdated Show resolved Hide resolved
src/solver/impls/euler/euler.cxx Outdated Show resolved Hide resolved
src/solver/impls/ida/ida.cxx Outdated Show resolved Hide resolved
src/solver/solver.cxx Outdated Show resolved Hide resolved
src/solver/solver.cxx Outdated Show resolved Hide resolved
src/solver/solver.cxx Outdated Show resolved Hide resolved
src/solver/solver.cxx Outdated Show resolved Hide resolved
src/solver/solver.cxx Outdated Show resolved Hide resolved
johnomotani
johnomotani previously approved these changes Mar 14, 2024
Copy link
Contributor

@johnomotani johnomotani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid I don't remember now exactly how this was meant to work. If the outputs are working and the iteration counts printed into the log files are correct, then I guess everything should be fine.

When I tried to remind myself what was going on, I came across this line:


With these changes, do we need the ++iter here?

@dschwoerer
Copy link
Contributor Author

Oh, indeed, they are broken.

How about this:

|  Step 0 of 3. Elapsed 0:00:00.0 ETA 0:00:00.1
/  Step 1 of 3. Elapsed 0:00:01.1 ETA 0:00:02.1
-  Step 2 of 3. Elapsed 0:00:01.5 ETA 0:00:00.4
\  Step 3 of 3. Elapsed 0:00:02.1 ETA 0:00:00.0

restart

|  Step 3 of 6. Elapsed 0:00:00.0 ETA 0:00:00.1
/  Step 4 of 6. Elapsed 0:00:01.1 ETA 0:00:02.2
-  Step 5 of 6. Elapsed 0:00:01.9 ETA 0:00:00.8
\  Step 6 of 6. Elapsed 0:00:03.4 ETA 0:00:00.0

@johnomotani
Copy link
Contributor

Sounds good to me @dschwoerer!
👍

This should not be needed anymore, now that we count from 0 to nout.
@dschwoerer
Copy link
Contributor Author

@ZedThree The unit tests are failing, but I did not figure out what is wrong. I am thus hesitant to just change the expected values. The real code seems to do the right thing, but the fake ones seem to be going wrong somewhere ...

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

include/bout/solver.hxx Outdated Show resolved Hide resolved
include/bout/solver.hxx Outdated Show resolved Hide resolved
include/bout/solver.hxx Outdated Show resolved Hide resolved
src/solver/solver.cxx Show resolved Hide resolved
@ZedThree
Copy link
Member

@dschwoerer This is on my to-do list to look at!

@ZedThree
Copy link
Member

I think there's a bug in the current implementation in master and next, or at least behaviour that I find surprising. I've modified the monitor example to print the name of the monitor for clarity, and got this output (on next):

Custom output monitor fast, time = 0.000000e+00, step -1 of 20

Custom output monitor default, time = 0.000000e+00, step -1 of 10
Sim Time  |  RHS evals  | Wall Time |  Calc    Inv   Comm    I/O   SOLVER

0.000e+00          1       4.33e-02    -1.3    0.0    1.6  147.2  -47.4
|  Step 1 of 10. Elapsed 0:00:00.0 ETA 0:00:00.3
Custom output monitor fast, time = 1.000000e+00, step 0 of 10

Custom output monitor fast, time = 2.000000e+00, step 1 of 10

Custom output monitor default, time = 2.000000e+00, step 0 of 5
2.000e+00         92       6.33e-02     5.8    0.0    0.0   50.6   43.6
/  Step 1 of 5. Elapsed 0:00:00.1 ETA 0:00:00.1
Custom output monitor fast, time = 3.000000e+00, step 2 of 10

Custom output monitor fast, time = 4.000000e+00, step 3 of 10

Custom output monitor default, time = 4.000000e+00, step 1 of 5
4.000e+00         41       5.49e-02     4.1    0.0    0.0   45.3   50.7
-  Step 2 of 5. Elapsed 0:00:00.2 ETA 0:00:00.1
Custom output monitor fast, time = 5.000000e+00, step 4 of 10

Custom output monitor fast, time = 6.000000e+00, step 5 of 10

Custom output monitor default, time = 6.000000e+00, step 2 of 5
6.000e+00         44       5.48e-02     3.5    0.0    0.0   46.1   50.3
\  Step 3 of 5. Elapsed 0:00:00.2 ETA 0:00:00.0
Custom output monitor fast, time = 7.000000e+00, step 6 of 10

Custom output monitor fast, time = 8.000000e+00, step 7 of 10

Custom output monitor default, time = 8.000000e+00, step 3 of 5
8.000e+00         41       5.24e-02     3.7    0.0    0.0   45.3   51.0
|  Step 4 of 5. Elapsed 0:00:00.3 ETA 0:00:-0.1
Custom output monitor fast, time = 9.000000e+00, step 8 of 10

Custom output monitor fast, time = 1.000000e+01, step 9 of 10

Custom output monitor default, time = 1.000000e+01, step 4 of 5
1.000e+01         44       5.80e-02     4.2    0.0    0.0   42.8   53.0
/  Step 5 of 5. Elapsed 0:00:00.3 ETA 0:00:-0.1

This is with nout = 10 and timestep = 1. Although the simulation runs to sim time nout * timestep and the relative frequencies of the monitors are all correct, notice that there are only 5 output steps that happen with timestep = 2.

I think the fix is:

modified   src/solver/solver.cxx
@@ -498,6 +498,9 @@ int Solver::solve(int nout, BoutReal timestep) {
 
   finaliseMonitorPeriods(nout, timestep);
 
+  number_output_steps = nout;
+  output_timestep = timestep;
+
   output_progress.write(
       _("Solver running for {:d} outputs with output timestep of {:e}\n"), nout,
       timestep);

This then runs for 10 output steps with timestep = 1, which is what I expect at least

@ZedThree
Copy link
Member

I've got my head around the unit tests too, I'll push a fix for them shortly, along with some improved documentation about how the monitor timesteps work

- Rule of 5
- dynamic_cast instead of static_cast
- multiple declarations on single line
This expresses the intentions much better and makes some issues much clearer
`-1` was required for implementation on `next`, current branch adjusts
the iteration number to start at zero
@ZedThree
Copy link
Member

I've fixed the unit tests by converting them to use a mock monitor to check exactly what's getting called and with what arguments, and writing the tests in terms of nout and nout - 1. I got this working on next and then when porting to this branch, I just checked that removing the -1 made things work.

There's just a few clang-tidy warnings in solver.hxx that need fixing, then I think this is good to go.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants