Plumber2 Implementation #2406

TeaganKing · 2024-03-07T17:10:41Z

This PR will use the TowerSite class (parent to both NEON and PLUMBER sites) to create a Plumber2Site class and implement capabilities for running single-point simulations at PLUMBER sites.

Contributors other than yourself, if any:
@ekluzek
@adrifoster

CTSM Issues Fixed (include github issue #):
Addresses part of #1487

Are answers expected to change (and if so in what way)?
No, this PR should be BFB. However, it should expand existing capabilities to allow users to run at PLUMBER sites.

Any User Interface Changes (namelist or namelist defaults changes)?
Additional flags will be implemented for run_neon

TeaganKing · 2024-03-08T20:32:07Z

This PR will also address #2186

TeaganKing · 2024-03-08T21:12:54Z

We'll also need to add the new CDEPS tag to this PR.

TeaganKing · 2024-03-08T22:24:19Z

Per conversation with Erik, we'll move this to bfb once the other Plumber-related PRs are in bfb next week.

TeaganKing · 2024-06-25T18:04:02Z

TeaganKing · 2024-08-06T20:38:07Z

We don't currently have restart files for PLUMBER2, which has brought up the question: to what extent do we want to provide support for PLUMBER tower sites? Two options that @wwieder and I discussed this morning are described below:

Users spin up their own AD & postAD cases and then can run a transient case (no restart files needed). This is somewhat contrary to the purpose of run_neon because it requires users to run case setup, build, submit, and archive.
My vote would be that we provide restart files (and update them when model versions change) and allow users to run a transient case out of the box. This is how NEON tower sites work, and consistency with run_tower would be nice, as would ease of use, but there is more overhead to maintain the feature. This would be a slightly easier implementation given that run_tower currently expects a transient case (and automatically submits case.run, which is not automatically created for the AD run but is automatically created for transient cases).

@danicalombardozzi @slevis-lmwg @olyson let me know if you have thoughts on this. We were planning to chat on Tuesday at 9am but it looks like there is a NCAR-NEON-Community partnership meeting that will be taking that time slot and I'd ideally like to make a decision before August 20th; also happy to tag up an additional time if that's easier than a GitHub discussion.

olyson · 2024-08-06T23:02:04Z

My guess is that most users will want to change the model and will likely have to do their own spinups anyway. So I might lean toward not providing initial files if it requires significant work to generate, keep track of, and update those files.

dlawrenncar · 2024-08-07T15:06:08Z

I would agree with this. These runs are cheap and the point is to have a set of tower sites that can be run quickly and easily with different model configurations and with new model versions. The ICs quickly become irrelevant as the model changes.

TeaganKing · 2024-08-07T15:19:47Z

Ok, thank you both for this feedback!

wwieder · 2024-08-07T15:26:06Z

Thanks for weighing in here. Teagan, since Dave and Keith came to the same conclusion we did yesterday, let's move forward with the plan to support 'spinup' and 'transient' PLUMBER2 cases without providing initial conditions.

It might be nice to differentiate how we do this for SP (fast spinup, ~20 or 40 years) vs. BGC (more involved, like NEON) cases. Would that be difficult to integrate into the run_tower workflow?

TeaganKing · 2024-08-07T17:14:59Z

That sounds good. I have successfully run the AD & postAD case (with a bit of hard-coding), so this will require the following updates, which should be feasible:

finidat file name updates for each site in user_nl_clm (the code expects <SITE>.ad.clm2.r.YYYY-MM-DD-00000.nc files, but we have, for instance, *50400.nc files)
Update run_type defaults in run_case in both tower_site.py and plumber_site.py such that the AD case is the default (instead of the transient default that's set in neon_site.py).
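
For the first item, one way to avoid depending on the seconds-of-day suffix is to accept any restart file for the site and take the most recent one. The sketch below is only an illustration: the helper name `find_latest_restart` and the flat directory layout are assumptions, not part of run_tower. Only the `<SITE>.ad.clm2.r.YYYY-MM-DD-SSSSS.nc` naming pattern comes from the note above.

```python
import re
from pathlib import Path
from typing import Optional

# Matches names shaped like "<SITE>.ad.clm2.r.YYYY-MM-DD-SSSSS.nc";
# the date and the seconds-of-day suffix are allowed to vary.
RESTART_RE = re.compile(
    r"^(?P<site>.+)\.ad\.clm2\.r\.(?P<date>\d{4}-\d{2}-\d{2})-(?P<tod>\d{5})\.nc$"
)


def find_latest_restart(rest_dir: Path, site: str) -> Optional[Path]:
    """Return the newest AD restart file for `site`, or None if none exist.

    Hypothetical helper: it accepts any seconds-of-day suffix instead of
    assuming "-00000".
    """
    candidates = []
    for path in rest_dir.glob(f"{site}.ad.clm2.r.*.nc"):
        match = RESTART_RE.match(path.name)
        if match and match.group("site") == site:
            candidates.append((match.group("date"), match.group("tod"), path))
    if not candidates:
        return None
    # Zero-padded date and seconds-of-day strings sort chronologically.
    return max(candidates, key=lambda item: (item[0], item[1]))[2]
```

The returned path could then be written into user_nl_clm as the finidat value, so the file name never has to be hard-coded per site.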

@wwieder To clarify, regarding the option for SP vs BGC, are you envisioning a flag that allows users to pick between these options?

wwieder · 2024-08-07T17:33:11Z

A few comments:

I don't know that we need to "automatically" run postAD and transient runs, as I think it's a good idea for users to have to check the status of their AD and postAD cases to check that they meet steady state criteria (e.g., Alaska NEON sites take significantly longer to spin up than CONUS NEON sites).
What I like about the current run_neon implementation is that running a postAD case automatically creates a branch that looks for a similarly named AD case, and using the --run from postAD flag also creates a transient case that uses the postAD ref case and initial conditions. Both of these options just take the last restart files that are available (regardless of simulation date).
Regarding SP vs. BGC cases, yes, a flag that allows users to choose between these two options would be excellent. I'm not sure about the best way to implement this and don't want to introduce scope creep or complexity to this work, but having the option to set up SP vs. BGC cases would be very useful for PLUMBER2 (and even NEON) cases.

Happy to have a quick chat if it's helpful.

TeaganKing · 2024-08-07T17:41:06Z

Ok, in that case, I'll just plan to change the default for PLUMBER to AD and add some documentation on this anticipated workflow for users. Thanks for clarifying!

Why don't I plan to get that first part implemented and then we can talk more about the BGC vs. SP options.

danicalombardozzi · 2024-08-07T17:49:57Z

It seems like perhaps the SP vs BGC option should come later. It will be useful for both PLUMBER and NEON and therefore might require more in-depth work to do it well. We should also create a separate tutorial for people to check the stability of their site spin up. We apply a standard number of years for the AD and post-AD NEON site simulations and it would be useful to help users evaluate the stability after making changes. I don't recall that we evaluated all sites for stability thoroughly, either, so perhaps it would also help us to know if the number of years we use is effective everywhere.
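
As a starting point for such a tutorial, a stability check could compare the mean of a slow-changing quantity over the last two passes through the tower forcing. This is a rough sketch under stated assumptions: the function name, the choice of metric, and the 1% tolerance are all illustrative and are not CTSM's steady-state criteria.

```python
from statistics import mean
from typing import Sequence


def is_spun_up(
    annual_values: Sequence[float],
    cycle_years: int,
    rel_tol: float = 0.01,
) -> bool:
    """Compare the means of the last two forcing cycles.

    `annual_values` is a yearly series of some slow pool (for example total
    ecosystem carbon) and `cycle_years` is the length of one pass through the
    tower forcing. The metric and `rel_tol` are assumptions for illustration.
    """
    if cycle_years <= 0:
        raise ValueError("cycle_years must be positive")
    if len(annual_values) < 2 * cycle_years:
        return False  # not enough data to compare two full cycles
    previous = mean(annual_values[-2 * cycle_years : -cycle_years])
    latest = mean(annual_values[-cycle_years:])
    if previous == 0:
        return latest == 0
    return abs(latest - previous) / abs(previous) <= rel_tol
```

Averaging over whole forcing cycles keeps interannual variability in the tower data from being mistaken for drift.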

TeaganKing · 2024-08-20T16:26:55Z

This is about ready to go, but once #2485 is in, I'll run the unit and system tests once more and perform a few final tests to make sure the various run_tower flags work as expected.

TeaganKing · 2024-08-23T00:29:02Z

I just wanted to note here a few things that I still need to fix:

Tests for test_sys_run_tower.py are currently failing. Fix and then run all ctsm py tests.
There seem to be some mix-ups when running with additional flags; e.g., plumber sites run with 'experiment' end up being run as a transient case instead of an ad case.
Need to test various argument options (note that some neon options are irrelevant for plumber).

TeaganKing · 2024-08-23T18:22:51Z

One limitation of this implementation is that if both neon and plumber cases are run simultaneously (not recommended), the default run type becomes transient. This could be changed by adding the following lines at line 197 of tower_arg_parse.py, implementing this as a valid parser option, and then setting run_type depending on the type of tower in run_tower.py. However, we then run into difficulties when the compset is changed, a step that seems reasonable not to repeat for every site, since most uses will be for either all-neon or all-plumber sites.

if args.plumber_sites:
    run_type = "default_neon_and_plumber"

Overall, I think it's most reasonable to expect users to run plumber and neon cases separately (although both with the run_tower command), but I'm open to others' thoughts.
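
The default behavior described in this comment can be summarized in a few lines. This is a hedged sketch, not the code in run_tower.py or tower_arg_parse.py: the function name `default_run_type` and its arguments are hypothetical, and only the resulting defaults (AD for PLUMBER-only requests, transient otherwise) come from the discussion.

```python
from typing import Optional, Sequence


def default_run_type(
    neon_sites: Optional[Sequence[str]],
    plumber_sites: Optional[Sequence[str]],
) -> str:
    """Pick a default run type from the kinds of tower sites requested.

    PLUMBER-only requests default to "ad"; NEON-only and mixed requests
    default to "transient", matching the limitation noted above.
    """
    if plumber_sites and not neon_sites:
        return "ad"
    return "transient"
```

Keeping the mixed case on the transient default matches the current limitation. Running the two tower types in separate invocations avoids it entirely.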

TeaganKing · 2024-08-23T19:48:50Z

Tests on derecho currently pass for make all and ./run_ctsm_py_tests --sys.

The test_sys_run_tower.py tests take about 30 minutes to run now that they involve setting up three cases; I can remove the ad neon case if this seems excessive.

wwieder

This is great. Thanks @TeaganKing. My only bigger comment is that by default PLUMBER is typically run in SP mode. I think it's out of scope in this PR to add this additional functionality, but it's something we may want to think about down the road to define compsets with an --SP or --BGC flag.

python/ctsm/site_and_regional/plumber_site.py

python/ctsm/site_and_regional/run_tower.py

wwieder · 2024-08-23T20:31:21Z

python/ctsm/site_and_regional/run_tower.py

+    for site_name in valid_plumber_sites:
+
+        # start_year and end_year are set in shell commands, so these get overwritten
+        start_year = 2018


Is this needed? I'll let you investigate.

I changed these start and end years to be very clearly dummy years. I think this is necessary because the PlumberSite class must be instantiated with a start/end time. However, this is changed later in the shell commands and is not a known parameter at the time of the object instantiation.

That said, if someone has a better suggestion for implementation, I'm all ears!
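
One possible alternative to dummy years is to make the years optional at construction and fill them in once they are known. The class below is a hypothetical stand-in written for illustration. It is not the CTSM TowerSite or Plumber2Site class, and whether the approach fits depends on how the parent class uses these attributes, which is not shown here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TowerSiteSketch:
    """Hypothetical stand-in for a tower site class (not the CTSM class)."""

    name: str
    start_year: Optional[int] = None  # unknown until set later in the workflow
    end_year: Optional[int] = None

    def set_years(self, start_year: int, end_year: int) -> None:
        """Record the real simulation years once they are available."""
        if end_year < start_year:
            raise ValueError("end_year must not precede start_year")
        self.start_year = start_year
        self.end_year = end_year

    def years_known(self) -> bool:
        """Report whether both years have been set."""
        return self.start_year is not None and self.end_year is not None
```

Using `None` makes the "not yet known" state explicit, so code that needs the years can check for it rather than silently using placeholder values.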

python/ctsm/site_and_regional/tower_site.py

wwieder · 2024-08-23T20:43:07Z

python/ctsm/test/test_sys_run_tower.py

+
+
+if __name__ == "__main__":
+    unit_testing.setup_for_tests()


Thanks for adding more testing. Will defer to Erik on extent and timing of tests.

Just to add some detail, this system test now takes 30 minutes (each case setup takes about ten minutes). We currently test:

setup of a NEON site (BART) with a specified experiment and output root

setup of a NEON site (ABBY) AD case and specified output root

setup of a PLUMBER site (AR-SLu) with a specified experiment and output root

If we were going to remove one of these for the sake of time, I would probably remove the ABBY case since it is a secondary NEON case and PLUMBER is also going to run AD as default behavior. This does however test that non-default run types are working, which is useful.

python/ctsm/test/test_unit_run_tower.py

TeaganKing · 2024-08-26T19:52:12Z

Hi @ekluzek ,

I have now addressed the comments from @wwieder 's review. Two outstanding issues that Will and I discussed are the following (more details in comments above):

The Plumber2Site class is called in a somewhat clunky way with dummy years that get overwritten. I'm not sure of a better way to do this, and it works, but let me know if you have alternative suggestions.
Running system tests takes a while. There are of course benefits to doing more testing (as I have done here), but the time the tests take may be on the excessive side. Your feedback on whether these tests are worth it would be helpful!

And a software-focused review of the code would be helpful now that Will reviewed some of the functionality pieces.

Before merging (and after @ekluzek 's review), I will be sure to run the following tests one last time (I ran them before Will's review but there are changes that we should test again before merging):

run ./run_tower --plumber-sites with various options as a final check
make all
run ctsm system tests with ./python/run_ctsm_py_tests -s

Lastly, note that there are some related outstanding issues that have been documented that are outside the scope of this PR.

TeaganKing added the PR status: work in progress PR: author feels this is NOT ready to merge to master label Mar 7, 2024

TeaganKing self-assigned this Mar 7, 2024

TeaganKing changed the base branch from master to b4b-dev March 20, 2024 16:04

This was referenced Mar 20, 2024

PLUMBER2 plumbing (csv file, wrapper script, usermods, and scripts). #2155

Merged

Issues with run_neon with the --experiment flag starting in ctsm5.1.dev172 #2433

Closed

TeaganKing mentioned this pull request Jun 25, 2024

Plumber updates ESCOMP/CDEPS#262

Merged

slevis-lmwg mentioned this pull request Jul 2, 2024

Plumber.5.2 #2485

Merged

TeaganKing mentioned this pull request Aug 7, 2024

Clean up run_tower.py #2671

Open

5 tasks

TeaganKing force-pushed the plumber branch from f7f0e9f to bda8f1a Compare August 16, 2024 22:34

TeaganKing added 7 commits August 22, 2024 11:01

preliminary plumber files

7d061f9

include plumber2site values in config_component.xml

a508497

general setup to run plumber

11cafe6

update cime config

fdbc70c

reformat

dd028a3

remove run_length since overwritten by usermods dirs

80e94fe

remove run_length from plumber_site and update args

83b3ed5

TeaganKing and others added 7 commits August 22, 2024 11:07

include plumber filesin gitignore

cc2da4d

update default run type to be ad for plumber and remain transient for neon

bc94326

resolve 'Do not know about batch job case.run' with updated run type defaults

1970863

Update plumber_site.py script documentation

df69f23

Update run_tower.py to do's

3e43097

update testing for neon and plumber

f2e03b2

black formatting for test

4e93b99

TeaganKing force-pushed the plumber branch from b6ccaca to 4e93b99 Compare August 22, 2024 17:13

TeaganKing added 2 commits August 22, 2024 11:29

remove run_lenght arg

5164bb7

testing updates-- test_sys_run_tower still fails

a7462b1

test updates

24897b3

TeaganKing requested review from ekluzek and wwieder August 23, 2024 19:48

TeaganKing added PR status: awaiting review Work on this PR is paused while waiting for review. and removed PR status: work in progress PR: author feels this is NOT ready to merge to master labels Aug 23, 2024

TeaganKing marked this pull request as ready for review August 23, 2024 19:49

Update plumber_site.py with minor comment

0e8375a

TeaganKing changed the title ~~[WIP] Plumber2 Implementation~~ Plumber2 Implementation Aug 23, 2024

wwieder reviewed Aug 23, 2024

address some of Will's comments

b98f8ee

samsrabin added enhancement new capability or improved behavior of existing capability science Enhancement to or bug impacting science bfb bit-for-bit labels Aug 26, 2024

TeaganKing added 3 commits August 26, 2024 10:42

add clarifying comments

52c9786

allow skipping of modify_user_nl and also add a few other clarification comments

aa06d7a

formatting

39b3023
