Skip to content

Fix number of stress period end states for multi-model simulations#73

Merged
jlarsen-usgs merged 5 commits into
MODFLOW-ORG:developfrom
hydrocomputing:stress_period_end
May 8, 2025
Merged

Fix number of stress period end states for multi-model simulations#73
jlarsen-usgs merged 5 commits into
MODFLOW-ORG:developfrom
hydrocomputing:stress_period_end

Conversation

@pya
Copy link
Copy Markdown
Contributor

@pya pya commented Apr 1, 2025

Problem

The number of start and end states such as Callback.timestep_start and Callback.timestep_end or Callback.stress_period_start and Callback.stress_period_end handed to the callback function should match in number over the simulation run time.
Furthermore, the states should be grouped in a parenthesis-like fashion with these levels:

initialize
    stress_period_start
        timestep_start
            iteration_start
            iteration_end
        timestep_end
    stress_period_end
finalize

This holds for the model ex-gwf-advtidal from the MODFLOW 6 examples:

counts
model state
ex-gwf-advtidal finalize 1
initialize 1
iteration_end 1099
iteration_start 1099
stress_period_end 4
stress_period_start 4
timestep_end 361
timestep_start 361

But it this does not apply to the model ex-gwe-vsc, which has several sub-models. The count of the state stress_period_end doesn't match the number of stress_period_start state expect for one sub-model:

counts
model state
ex-gwe-vsc-02 iteration_end 2483
iteration_start 2483
stress_period_end 2
stress_period_start 2
timestep_end 732
timestep_start 732
ex-gwf-vsc-01 iteration_end 733
iteration_start 733
stress_period_start 2
timestep_end 732
timestep_start 732
ex-gwf-vsc-02 iteration_end 1464
iteration_start 1464
stress_period_start 2
timestep_end 732
timestep_start 732
ex-gwt-vsc-01 iteration_end 2699
iteration_start 2699
stress_period_start 2
timestep_end 732
timestep_start 732
ex-gwt-vsc-02 finalize 1
initialize 1
iteration_end 2703
iteration_start 2703
stress_period_start 2
timestep_end 732
timestep_start 732

Solution

The problem is the location of the code

if sim_grp.nstp == sim_grp.kstp + 1:
    callback(sim_grp, Callbacks.stress_period_end)

in modflowapi.extension.runner.py after the solution loop. Moving this code inside this loop fixes the matching problem of the stress period states:

counts
model state
ex-gwe-vsc-02 iteration_end 2483
iteration_start 2483
stress_period_end 2
stress_period_start 2
timestep_end 732
timestep_start 732
ex-gwf-vsc-01 iteration_end 733
iteration_start 733
stress_period_end 2
stress_period_start 2
timestep_end 732
timestep_start 732
ex-gwf-vsc-02 iteration_end 1464
iteration_start 1464
stress_period_end 2
stress_period_start 2
timestep_end 732
timestep_start 732
ex-gwt-vsc-01 iteration_end 2699
iteration_start 2699
stress_period_end 2
stress_period_start 2
timestep_end 732
timestep_start 732
ex-gwt-vsc-02 finalize 1
initialize 1
iteration_end 2703
iteration_start 2703
stress_period_end 2
stress_period_start 2
timestep_end 732
timestep_start 732

Tools

The script manualtest.collect_states.py provides functions to run the model with a callback function that records the states and creates the above shown DataFrames:

from collect_states import count_states, run

state_sequence = run(dll, sim_path)
df = count_states(state_sequence)

The function visualize_states allows to show the nesting of the states.

from collect_states import visualize_states, run

state_sequence = run(dll, sim_path)
visualize_states(state_sequence)

Example output:

ex-gwf-advtidal
 initialize
     stress_period_start
         timestep_start
             iteration_start
             iteration_end
             iteration_start
             iteration_end
             iteration_start
             iteration_end
             iteration_start
...
ex-gwt-vsc-02
 initialize
     stress_period_start
         timestep_start
             iteration_start
             iteration_end
             iteration_start
             iteration_end
             iteration_start
             iteration_end
             iteration_start
ex-gwf-vsc-01
     stress_period_start
         timestep_start
             iteration_start
             iteration_end
             iteration_start
             iteration_end
         timestep_end
         timestep_start
             iteration_start
             iteration_end
ex-gwf-vsc-02
     stress_period_start
         timestep_start
             iteration_start
             iteration_end
             iteration_start
             iteration_end
         timestep_end
         timestep_start
             iteration_start
             iteration_end
ex-gwt-vsc-01
     stress_period_start
         timestep_start
             iteration_start
             iteration_end
             iteration_start
             iteration_end
             iteration_start
             iteration_end
             iteration_start
             iteration_end
ex-gwe-vsc-02
     stress_period_start
         timestep_start
             iteration_start
             iteration_end
             iteration_start
             iteration_end
             iteration_start
             iteration_end
         timestep_end
         timestep_start

....

pya added 4 commits April 1, 2025 17:39
The `Callbacks.stress_period_end` needs to be emmitted for each model.
If it it outside this loop it is only emmitted for the last model.
@wpbonelli
Copy link
Copy Markdown
Member

Thanks for this @pya. The state checker is nice. How difficult would it be to write an autotest around it?

@pya
Copy link
Copy Markdown
Contributor Author

pya commented Apr 3, 2025

@wpbonelli
Checking that the number of states match should be easy. The parenthesis-like-open-close matching of states should also be doable. But I'm not sure what to do to turn these into autotests that fit into the testing system. Is it plain pytest or do I need to work with more specific tooling?

@wpbonelli
Copy link
Copy Markdown
Member

It's mostly plain pytest, modulo the question of which models to test and how to get them. Maybe the simplest way is just to replace this line with collect_states.run(...) from your module? That will give us coverage on all the example models. Let me know if that seems reasonable.

Also fyi the tests currently assume libmf6 is in the autotest/ dir

@pya
Copy link
Copy Markdown
Contributor Author

pya commented Apr 3, 2025

Sounds easy enough. But I would need to write two tests and would need to add checking of the state sequences. Looks like I am going to copy this twice. So it could make sense to make it a helper, i.e. a fixture, that lives in conftest.py. What do you think?

@wpbonelli
Copy link
Copy Markdown
Member

If you mean a fixture for the state checker, yes, that sounds good- whatever you think makes the most sense.

@wpbonelli
Copy link
Copy Markdown
Member

What are you thinking re: adding two new tests? I figure we could just use your module in the existing test function and add a few assertions, but maybe I'm thinking about it wrong.

@pya
Copy link
Copy Markdown
Contributor Author

pya commented Apr 4, 2025

I though about adding a new (parameterized) test. But as long as the callback is read-only, i.e. doesn't change any MF6 variables, just modifying the existing test should work. I will do this.

In the long run I see these tests in test_mf6_examples.py:

  1. Run the standalone MF6 executable for each model as ground truth.
  2. Run modlfowapi with a noop callback and compare the results with the ground truth. Start with model_name.lst files. As long as the executable and modlfowapi runs use the same DLL, the model_name.lst should be identical minus some lines with unit numbers, date and times, memory usage etc.
  3. Modify the MF6 input files. E.g. set a well pumping rate to zero for one stress period. Run modlfowapi with a callback that changes the corresponding MF6 variable. E.g. set the well pumping back to the original value at run time. Compare the results. The ground truth and the modlfowapi should have the same output.

What do you think about this strategy? This would require to run the 150+ example multiple times. Is this a problem regarding CI resources? Since all model runs are independent, the problem is embarrassing parallel and parallelization should work automatically for parameterized tests with --dist=load .

@wpbonelli
Copy link
Copy Markdown
Member

I like that approach in the long run, some quick thoughts.

On 2), maybe simplest to start with comparing head/conc/etc. We have some comparison utilities in mf6/devtools we could reuse for that. Comparing log files might be tricky.

On 3), does changing a variable buy us anything new over running the original model and comparing results? I figure test_interface.py is where callbacks are exercised.

@pya
Copy link
Copy Markdown
Contributor Author

pya commented Apr 4, 2025

@wpbonelli

Thanks for your suggestions.

On 2) Yes comparing X results makes sense. Tools for these are certainly helpful. I will have a look at mf6/devtools .

On 3) test_interface.py has tests for specific, small test models. These are necessary and good but may not capture the full complexity of real-world models. mf6examples is a nice collection of real-world or near-real-world models. I am sure running callback-modification tests on them will surface problems that test_interface.py does not capture yet. ;)

@wpbonelli
Copy link
Copy Markdown
Member

Good points @pya. I am happy to help out with the comparison stuff and the expanded testing.

@jlarsen-usgs
Copy link
Copy Markdown
Contributor

@pya @wpbonelli
One question, in the example you provided are the models in a single solution group or in multiple solution groups? The runner groups models in the callback routine by solution group because they share a solution matrix. When working with multiple models that share a solution group, you'll need to check for multiple models in your callback routine's simulation object. We have a simple test for this in the test_two_models test. We could add a little more complexity to that test if indeed there is an issue here that needs to be fixed, or another test that catches the condition we're potentially missing. I'm not so sure/convinced that we need to develop another test to run all of the example problems if we are able to identify and isolate the problem.

The number of start and end states of stress period, time step, and
iteration needs to match for each model. This checks for the counts.
It does not look at the sequence. Furthermore, five models seem to have
other problems with the current and total number of time steps and the
condition `sim_grp.nstp == sim_grp.kstp + 1` in line 113 in `runner.py`
is not met as often as it should. This is a working hypothesis that
requires deeper investigation. Skip the count tests for these models for
now.
@pya
Copy link
Copy Markdown
Contributor Author

pya commented Apr 7, 2025

@jlarsen-usgs

I see your point. I think of the tests for the examples more like a scan for potential problems. Once a problem is identified, adding or modifying an interface test might be the next step. Keeping these scans around as tests applies them automatically to new example models. There might a model that has a new combination of features that the tests didn't consider yet. The scan tests might stumble over a potential problem much earlier than an interface test that needs to be designed to find a problem. This requires anticipating the problem in some way.

How detailed these scan tests should be is a different question. Ideally, the same test should work for all example models. Adding many different cases may make things a bit too complex.

@wpbonelli
Copy link
Copy Markdown
Member

@jlarsen-usgs

in the example you provided are the models in a single solution group or in multiple solution groups? The runner groups models in the callback routine by solution group because they share a solution matrix.

The viscosity example uses a separate solution for each model. I'm still new to the api, but the distribution of models into solution groups should be independent of the number of stress period start/end events raised, right?

I'm not so sure/convinced that we need to develop another test to run all of the example problems if we are able to identify and isolate the problem.

I think what is under consideration here is just extending the existing test function in test_mf6_examples.py with new assertions as needed. Agreed, definitely don't want to run the examples multiple times if we can help it.

@pya
Copy link
Copy Markdown
Contributor Author

pya commented Apr 15, 2025

@jlarsen-usgs

I think we are mixing two topics here:

  1. The content of this PR: The number of stress_period_start and stress_period_end should match for each solution (solution group?). The PR only adds assertions to check for this and applies a fix to the code to make it so. The number of tests, and therefore model runs, it not increased. The callback gathers information during model runs that can be used later to check if the runs worked correctly.
  2. A discussion about possible future additions of more tests. As always, there are advantages and disadvantage here.

Why not separate these two? Let's move the discussion about 2. to a "discussion issue" and make this PR only about 1. Are there any necessary changes to make the PR work? Since the stress period count should be per solution group, there should be no difference between one solution and multiple solutions groups in the tests as they are. Or, I am I mistaken here?

@pya
Copy link
Copy Markdown
Contributor Author

pya commented May 5, 2025

@jlarsen-usgs @wpbonelli
Is there anything I need to do?

@jlarsen-usgs
Copy link
Copy Markdown
Contributor

@pya, just wanted to check and test on my end. The PR looks good and is likely the simplest solution for this issue, merging it in.

@jlarsen-usgs jlarsen-usgs merged commit 5679349 into MODFLOW-ORG:develop May 8, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants