Fix number of stress period end states for multi-model simulations by pya · Pull Request #73 · MODFLOW-ORG/modflowapi

pya · 2025-04-01T17:19:05Z

Problem

The number of start and end states such as Callback.timestep_start and Callback.timestep_end or Callback.stress_period_start and Callback.stress_period_end handed to the callback function should match in number over the simulation run time.
Furthermore, the states should be grouped in a parenthesis-like fashion with these levels:

initialize
    stress_period_start
        timestep_start
            iteration_start
            iteration_end
        timestep_end
    stress_period_end
finalize

This holds for the model ex-gwf-advtidal from the MODFLOW 6 examples:

		counts
model	state
ex-gwf-advtidal	finalize	1
	initialize	1
	iteration_end	1099
	iteration_start	1099
	stress_period_end	4
	stress_period_start	4
	timestep_end	361
	timestep_start	361

But it this does not apply to the model ex-gwe-vsc, which has several sub-models. The count of the state stress_period_end doesn't match the number of stress_period_start state expect for one sub-model:

		counts
model	state
ex-gwe-vsc-02	iteration_end	2483
	iteration_start	2483
	stress_period_end	2
	stress_period_start	2
	timestep_end	732
	timestep_start	732
ex-gwf-vsc-01	iteration_end	733
	iteration_start	733
	stress_period_start	2
	timestep_end	732
	timestep_start	732
ex-gwf-vsc-02	iteration_end	1464
	iteration_start	1464
	stress_period_start	2
	timestep_end	732
	timestep_start	732
ex-gwt-vsc-01	iteration_end	2699
	iteration_start	2699
	stress_period_start	2
	timestep_end	732
	timestep_start	732
ex-gwt-vsc-02	finalize	1
	initialize	1
	iteration_end	2703
	iteration_start	2703
	stress_period_start	2
	timestep_end	732
	timestep_start	732

Solution

The problem is the location of the code

if sim_grp.nstp == sim_grp.kstp + 1:
    callback(sim_grp, Callbacks.stress_period_end)

in modflowapi.extension.runner.py after the solution loop. Moving this code inside this loop fixes the matching problem of the stress period states:

		counts
model	state
ex-gwe-vsc-02	iteration_end	2483
	iteration_start	2483
	stress_period_end	2
	stress_period_start	2
	timestep_end	732
	timestep_start	732
ex-gwf-vsc-01	iteration_end	733
	iteration_start	733
	stress_period_end	2
	stress_period_start	2
	timestep_end	732
	timestep_start	732
ex-gwf-vsc-02	iteration_end	1464
	iteration_start	1464
	stress_period_end	2
	stress_period_start	2
	timestep_end	732
	timestep_start	732
ex-gwt-vsc-01	iteration_end	2699
	iteration_start	2699
	stress_period_end	2
	stress_period_start	2
	timestep_end	732
	timestep_start	732
ex-gwt-vsc-02	finalize	1
	initialize	1
	iteration_end	2703
	iteration_start	2703
	stress_period_end	2
	stress_period_start	2
	timestep_end	732
	timestep_start	732

Tools

The script manualtest.collect_states.py provides functions to run the model with a callback function that records the states and creates the above shown DataFrames:

from collect_states import count_states, run

state_sequence = run(dll, sim_path)
df = count_states(state_sequence)

The function visualize_states allows to show the nesting of the states.

from collect_states import visualize_states, run

state_sequence = run(dll, sim_path)
visualize_states(state_sequence)

Example output:

ex-gwf-advtidal
 initialize
     stress_period_start
         timestep_start
             iteration_start
             iteration_end
             iteration_start
             iteration_end
             iteration_start
             iteration_end
             iteration_start
...

ex-gwt-vsc-02
 initialize
     stress_period_start
         timestep_start
             iteration_start
             iteration_end
             iteration_start
             iteration_end
             iteration_start
             iteration_end
             iteration_start
ex-gwf-vsc-01
     stress_period_start
         timestep_start
             iteration_start
             iteration_end
             iteration_start
             iteration_end
         timestep_end
         timestep_start
             iteration_start
             iteration_end
ex-gwf-vsc-02
     stress_period_start
         timestep_start
             iteration_start
             iteration_end
             iteration_start
             iteration_end
         timestep_end
         timestep_start
             iteration_start
             iteration_end
ex-gwt-vsc-01
     stress_period_start
         timestep_start
             iteration_start
             iteration_end
             iteration_start
             iteration_end
             iteration_start
             iteration_end
             iteration_start
             iteration_end
ex-gwe-vsc-02
     stress_period_start
         timestep_start
             iteration_start
             iteration_end
             iteration_start
             iteration_end
             iteration_start
             iteration_end
         timestep_end
         timestep_start

....

The `Callbacks.stress_period_end` needs to be emmitted for each model. If it it outside this loop it is only emmitted for the last model.

wpbonelli · 2025-04-03T15:35:51Z

Thanks for this @pya. The state checker is nice. How difficult would it be to write an autotest around it?

pya · 2025-04-03T18:18:41Z

@wpbonelli
Checking that the number of states match should be easy. The parenthesis-like-open-close matching of states should also be doable. But I'm not sure what to do to turn these into autotests that fit into the testing system. Is it plain pytest or do I need to work with more specific tooling?

wpbonelli · 2025-04-03T18:43:02Z

It's mostly plain pytest, modulo the question of which models to test and how to get them. Maybe the simplest way is just to replace this line with collect_states.run(...) from your module? That will give us coverage on all the example models. Let me know if that seems reasonable.

Also fyi the tests currently assume libmf6 is in the autotest/ dir

pya · 2025-04-03T19:10:41Z

Sounds easy enough. But I would need to write two tests and would need to add checking of the state sequences. Looks like I am going to copy this twice. So it could make sense to make it a helper, i.e. a fixture, that lives in conftest.py. What do you think?

wpbonelli · 2025-04-03T23:29:54Z

If you mean a fixture for the state checker, yes, that sounds good- whatever you think makes the most sense.

wpbonelli · 2025-04-03T23:33:37Z

What are you thinking re: adding two new tests? I figure we could just use your module in the existing test function and add a few assertions, but maybe I'm thinking about it wrong.

pya · 2025-04-04T05:43:26Z

I though about adding a new (parameterized) test. But as long as the callback is read-only, i.e. doesn't change any MF6 variables, just modifying the existing test should work. I will do this.

In the long run I see these tests in test_mf6_examples.py:

Run the standalone MF6 executable for each model as ground truth.
Run modlfowapi with a noop callback and compare the results with the ground truth. Start with model_name.lst files. As long as the executable and modlfowapi runs use the same DLL, the model_name.lst should be identical minus some lines with unit numbers, date and times, memory usage etc.
Modify the MF6 input files. E.g. set a well pumping rate to zero for one stress period. Run modlfowapi with a callback that changes the corresponding MF6 variable. E.g. set the well pumping back to the original value at run time. Compare the results. The ground truth and the modlfowapi should have the same output.

What do you think about this strategy? This would require to run the 150+ example multiple times. Is this a problem regarding CI resources? Since all model runs are independent, the problem is embarrassing parallel and parallelization should work automatically for parameterized tests with --dist=load .

wpbonelli · 2025-04-04T11:12:27Z

I like that approach in the long run, some quick thoughts.

On 2), maybe simplest to start with comparing head/conc/etc. We have some comparison utilities in mf6/devtools we could reuse for that. Comparing log files might be tricky.

On 3), does changing a variable buy us anything new over running the original model and comparing results? I figure test_interface.py is where callbacks are exercised.

pya · 2025-04-04T14:09:13Z

@wpbonelli

Thanks for your suggestions.

On 2) Yes comparing X results makes sense. Tools for these are certainly helpful. I will have a look at mf6/devtools .

On 3) test_interface.py has tests for specific, small test models. These are necessary and good but may not capture the full complexity of real-world models. mf6examples is a nice collection of real-world or near-real-world models. I am sure running callback-modification tests on them will surface problems that test_interface.py does not capture yet. ;)

wpbonelli · 2025-04-04T14:17:30Z

Good points @pya. I am happy to help out with the comparison stuff and the expanded testing.

jlarsen-usgs · 2025-04-04T15:22:18Z

@pya @wpbonelli
One question, in the example you provided are the models in a single solution group or in multiple solution groups? The runner groups models in the callback routine by solution group because they share a solution matrix. When working with multiple models that share a solution group, you'll need to check for multiple models in your callback routine's simulation object. We have a simple test for this in the test_two_models test. We could add a little more complexity to that test if indeed there is an issue here that needs to be fixed, or another test that catches the condition we're potentially missing. I'm not so sure/convinced that we need to develop another test to run all of the example problems if we are able to identify and isolate the problem.

The number of start and end states of stress period, time step, and iteration needs to match for each model. This checks for the counts. It does not look at the sequence. Furthermore, five models seem to have other problems with the current and total number of time steps and the condition `sim_grp.nstp == sim_grp.kstp + 1` in line 113 in `runner.py` is not met as often as it should. This is a working hypothesis that requires deeper investigation. Skip the count tests for these models for now.

pya · 2025-04-07T04:06:05Z

@jlarsen-usgs

I see your point. I think of the tests for the examples more like a scan for potential problems. Once a problem is identified, adding or modifying an interface test might be the next step. Keeping these scans around as tests applies them automatically to new example models. There might a model that has a new combination of features that the tests didn't consider yet. The scan tests might stumble over a potential problem much earlier than an interface test that needs to be designed to find a problem. This requires anticipating the problem in some way.

How detailed these scan tests should be is a different question. Ideally, the same test should work for all example models. Adding many different cases may make things a bit too complex.

wpbonelli · 2025-04-07T20:23:43Z

@jlarsen-usgs

in the example you provided are the models in a single solution group or in multiple solution groups? The runner groups models in the callback routine by solution group because they share a solution matrix.

The viscosity example uses a separate solution for each model. I'm still new to the api, but the distribution of models into solution groups should be independent of the number of stress period start/end events raised, right?

I'm not so sure/convinced that we need to develop another test to run all of the example problems if we are able to identify and isolate the problem.

I think what is under consideration here is just extending the existing test function in test_mf6_examples.py with new assertions as needed. Agreed, definitely don't want to run the examples multiple times if we can help it.

pya · 2025-04-15T10:08:22Z

@jlarsen-usgs

I think we are mixing two topics here:

The content of this PR: The number of stress_period_start and stress_period_end should match for each solution (solution group?). The PR only adds assertions to check for this and applies a fix to the code to make it so. The number of tests, and therefore model runs, it not increased. The callback gathers information during model runs that can be used later to check if the runs worked correctly.
A discussion about possible future additions of more tests. As always, there are advantages and disadvantage here.

Why not separate these two? Let's move the discussion about 2. to a "discussion issue" and make this PR only about 1. Are there any necessary changes to make the PR work? Since the stress period count should be per solution group, there should be no difference between one solution and multiple solutions groups in the tests as they are. Or, I am I mistaken here?

pya · 2025-05-05T08:20:06Z

@jlarsen-usgs @wpbonelli
Is there anything I need to do?

jlarsen-usgs · 2025-05-08T20:48:43Z

@pya, just wanted to check and test on my end. The PR looks good and is likely the simplest solution for this issue, merging it in.

pya added 4 commits April 1, 2025 17:39

Move Callbacks.stress_period_end into solution loop

2d4452c

The `Callbacks.stress_period_end` needs to be emmitted for each model. If it it outside this loop it is only emmitted for the last model.

Add manual test helper

46e6fb5

Formatting

bc8d94a

Ruff formatting

9cbdefa

jlarsen-usgs merged commit 5679349 into MODFLOW-ORG:develop May 8, 2025

Conversation

pya commented Apr 1, 2025

Problem

Solution

Tools

Uh oh!

wpbonelli commented Apr 3, 2025

Uh oh!

pya commented Apr 3, 2025

Uh oh!

wpbonelli commented Apr 3, 2025

Uh oh!

pya commented Apr 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

wpbonelli commented Apr 3, 2025

Uh oh!

wpbonelli commented Apr 3, 2025

Uh oh!

pya commented Apr 4, 2025

Uh oh!

wpbonelli commented Apr 4, 2025

Uh oh!

pya commented Apr 4, 2025

Uh oh!

wpbonelli commented Apr 4, 2025

Uh oh!

jlarsen-usgs commented Apr 4, 2025

Uh oh!

pya commented Apr 7, 2025

Uh oh!

wpbonelli commented Apr 7, 2025

Uh oh!

pya commented Apr 15, 2025

Uh oh!

pya commented May 5, 2025

Uh oh!

jlarsen-usgs commented May 8, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

pya commented Apr 3, 2025 •

edited

Loading