
Avoid generating dynamic landunit adjustment fluxes for glacier changes in the first timestep #340

Closed
billsacks opened this issue Apr 11, 2018 · 12 comments
Labels: type: bug (something is working incorrectly)

@billsacks (Member)

I have struggled with how to handle changes in glacier area in the first time step of a run. Specifically, I've struggled with this case: you're doing a hybrid run, or a startup run that points to a spun-up initial conditions file, and the CTSM initial file disagrees with the CISM initial file. In that case, CTSM will see a change in glacier area in the first time step.

Currently, I've been treating this the same as if a change in glacier area arises during the run loop, with the consequent dynamic landunit adjustment fluxes. I can still envision scenarios where this may be what's wanted. However, the recent experience with an inadvertent glacier area change resulting in huge dynamic landunit adjustment fluxes has led me to feel that, in general, we should let CTSM update to the initial CISM glacier areas in the first time step without generating any adjustment fluxes. That is, it should effectively sync up with CISM areas in initialization, even if the CISM areas differ from what's on the CTSM restart file.

Example of when you might want the current behavior (dynamic landunit adjustment fluxes generated in the first time step): you run a case with an evolving ice sheet, then run CISM offline for some time to evolve the ice sheet further, then plug the updated CISM initial conditions file into a hybrid case, expecting CLM to generate adjustment fluxes as if you had continued the first run with CISM coupled the whole time.

Example of when this causes problems, and the new behavior will be better: you run a case without an evolving ice sheet; then (unknown to you) CISM changes in a way that affects its initial glacier cover; then you do a hybrid run off of the original case with the new CISM code. With the current code, you'd get big adjustment fluxes in the first time step, which generally isn't what you want.

I'm becoming more concerned about supporting the second example than the first example.

Incidentally: It turns out that the big adjustment fluxes due to water and energy conservation currently don't affect the coupled system. This is because they are generated in time step -1 of the model (a pre-time step that is done before the real run loop gets underway), and apparently fluxes at that time don't affect the rest of the system. But this is still relevant for its impact on state variables that are adjusted more rigorously with dynamic landunits - currently, carbon and nitrogen states, but eventually some water and energy states as well. I feel that the new, proposed solution will more often do the right thing in this respect - keeping glacier state variables at the values from spunup glacier columns, rather than possibly adjusting them to be a blend of glacier and natural veg state variables in the first time step, if CTSM's glacier areas disagree with CISM glacier areas.

A final reason why I like the new, proposed solution is that I think it is consistent with what we'd get if we could update CTSM to match CISM's glacier areas in initialization, rather than needing to wait until the run loop. That's what I'd really like to do, but the current MCT-based driver doesn't allow that. I hope we'll be able to do that when we switch to a NUOPC-based driver/mediator. (Although if we really wanted to maintain the current behavior, we could probably come up with a way to do so even if we're generally getting CISM's glacier area in initialization.)

Note that, in all cases, changes to CISM's glacier areas in the first time step of a branch or restart run will result in adjustment fluxes (the same as if you had run straight through without stopping).
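The intended rule can be summarized in a short sketch (Python, with hypothetical names; this is an illustration of the behavior described above, not CTSM's actual API):

```python
# Hypothetical sketch of when a glacier area change seen by CTSM should
# generate dynamic landunit adjustment fluxes, under the proposed behavior.
# Names are illustrative, not CTSM's actual API.

def generate_adjustment_fluxes(run_type: str, is_first_step: bool) -> bool:
    """Return True if a glacier area change should produce adjustment fluxes.

    Proposed behavior: in the first time step of a startup or hybrid run,
    CTSM silently syncs to CISM's areas (no fluxes). In a branch or restart
    run, or after the first time step, changes produce fluxes as usual.
    """
    if is_first_step and run_type in ("startup", "hybrid"):
        return False
    return True
```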

@whlipscomb @lofverstrom @lvankampenhout do any of you have any thoughts on this? I realize these points are subtle, and I'm not sure if I'm explaining this very well. Don't feel the need to spend a lot of time thinking about this, but I wanted to give you the opportunity to weigh in if you want to.

@billsacks billsacks added the type: enhancement new capability or improved behavior of existing capability label Apr 11, 2018
@billsacks billsacks added this to the cesm2 milestone Apr 11, 2018
@billsacks billsacks self-assigned this Apr 11, 2018
@billsacks billsacks added this to To do in Upcoming tags via automation Apr 11, 2018
@billsacks (Member Author) commented Apr 11, 2018

If I make this change: In addition to running the standard test suite, I should also confirm that it doesn't change answers for a 3-day B case mimicking Cecile's setup (with a hybrid start)

Update (8-4-18): Based on revelations today (see #340 (comment)), this could change answers in any case where the finidat file was generated using a different PE layout from the current case. So I'm not going to do the test mentioned above: we just have to understand that this change can lead to roundoff-level changes in a couple of Greenland gridcells in many cases.

@whlipscomb

@billsacks, Thanks for the detailed explanation. I agree with your suggested change, for all the reasons you mentioned.

@lofverstrom commented Apr 16, 2018 via email

@billsacks (Member Author)

@mvertens confirmed that this change makes sense and is consistent with what we'll be able to do once we switch to NUOPC.

@billsacks (Member Author)

@ekluzek gave his okay here, too.

@billsacks billsacks moved this from To do to In progress in Upcoming tags Apr 16, 2018
@billsacks (Member Author)

I still feel like this is the right thing to do long-term. Specifically, this involves changing:
https://github.com/ESCOMP/ctsm/blob/3061dd933cd8989d75b2acf95f2a106eb61bc8ea/src/main/clm_driver.F90#L199-L200

to just consider is_first_step (at least, I think that's the right logic). I'm pretty sure that's all that's needed.
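To make the condition change concrete, here is a Python stand-in for the Fortran logic (the old condition is my reading of the later commit message, "not just for cold start or interpolated start"; this is a sketch, not the actual implementation):

```python
# Illustrative sketch of the change to need_glacier_initialization in
# clm_driver.F90 (Python stand-in; not the actual Fortran code).

def need_glacier_initialization_old(is_first_step: bool,
                                    is_cold_start: bool,
                                    is_interpolated_start: bool) -> bool:
    # Old behavior (as I read the commit message): sync glacier areas
    # without adjustment fluxes only for cold or interpolated starts.
    return is_first_step and (is_cold_start or is_interpolated_start)

def need_glacier_initialization_new(is_first_step: bool,
                                    is_cold_start: bool,
                                    is_interpolated_start: bool) -> bool:
    # Proposed behavior: sync in the first time step of any startup or
    # hybrid run, regardless of how the initial conditions were produced.
    return is_first_step
```

The key difference shows up for a spun-up (non-cold, non-interpolated) startup or hybrid run: the old condition is false there, so adjustment fluxes are generated; the new condition is true, so areas sync silently.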

However: I'm concerned that this will change answers for the ongoing CESM2 runs being done by Cecile and others, because of the buggy CISM initial conditions files (with the ice sheet accidentally evolved for a few years) that have resulted in incorrect glacier areas on the CLM initial conditions files in our current hybrid refcases. (This means that, in runs currently being done, CLM invokes the dynamic landunits adjustment code in the first time step of the run, as it adjusts CISM areas to the correct, observed areas. So changing this logic would change behavior in this respect.)

So I'd like to wait to bring this change in until CESM is no longer pointing to any refcases that have old CLM initial files that had these wrong Greenland glacier areas. This may mean timing this just right in the CESM2 release process, or (more likely) it may mean waiting until post-release.

@billsacks (Member Author) commented Aug 4, 2018

I'm not 100% sure, but I think this change could alter answers slightly in many cases. I did a set of runs (note to self: see "Plan for turning on Antarctica virtual landunits for cmip6 runs" (evernote:///view/4578570/s44/a65b9d2f-4570-43f9-b0b8-2858cbb3d172/a65b9d2f-4570-43f9-b0b8-2858cbb3d172/)) where I created an initial conditions file using an I compset and then ran it in a B compset, and I saw some dynbal fluxes in time step -1 as well as roundoff-level diffs in some C&N fields. My best guess is that these come from the use of a different PE layout in the case that generated the initial conditions file vs. the B case, leading to small differences in glacier cover, and thus dynamic landunit adjustments in time step -1. (PE layout differences can lead to differences in CISM and in the coupler remapping. In this case, I think the coupler remapping is the most likely culprit, but I'm not positive.)

This finding has two implications:

(1) Fixing this will change answers in many cases

(2) Fixing this seems more important than I originally thought, in order to avoid these oddities moving forward.

@billsacks (Member Author)

Due to the latest realization, I'm reclassifying this as a bug rather than an enhancement.

@billsacks billsacks added type: bug something is working incorrectly and removed type: enhancement new capability or improved behavior of existing capability labels Aug 5, 2018
@billsacks (Member Author)

I have done a few more tests that make me more confident in my feelings from #340 (comment).

billsacks added a commit to billsacks/ctsm that referenced this issue Aug 5, 2018
Always avoid generating dynamic landunit adjustments for glacier area
changes in the first timestep of a startup or hybrid run - not just for
cold start or interpolated start. See discussion in
ESCOMP#340 for rationale.

Fixes ESCOMP#340
@billsacks (Member Author)

Checking the refcases in the latest version of CESM:

allactive refcases are b.e20.B1850.f09_g17.pi_control.all.297, b.e20.B1850.f09_g17.pi_control.all.297_transient_v2 and b.e20.BW1850.f09_g17.298. According to http://www.cesm.ucar.edu/working_groups/Atmosphere/development/cesm1_5/, the new land ice initialization came in with run 291, so I think these refcases should have the fixed land ice, though I'm not positive about the BW 298 refcase.

CAM is still pointing to some old refcases, so changes could be bigger for some of those. I haven't looked closely at all of CAM's refcases.

@billsacks (Member Author)

Using an ERI case and this change:

--- ../../../src/main/clm_driver.F90	2018-08-04 21:17:30.498588654 -0600
+++ SourceMods/src.clm/clm_driver.F90	2018-08-04 21:19:50.164792009 -0600
@@ -204,6 +204,7 @@
     need_glacier_initialization = is_first_step()

     if (need_glacier_initialization) then
+       write(iulog,*) 'WJS: need_glacier_initialization'
        !$OMP PARALLEL DO PRIVATE (nc, bounds_clump)
        do nc = 1, nclumps
           call get_clump_bounds(nc, bounds_clump)

I have confirmed that is_first_step() is triggered in a startup (ERI ref1) and hybrid (ERI ref2) run, but not a branch or continue run (ERI main case) - as desired.

@billsacks (Member Author)

I ran the test suite last night on billsacks@6269379. I was surprised to get large differences (greater than roundoff, and in many grid cells in Greenland) for these two tests:

    FAIL ERI_N2_Ld9.f19_g17.I2000Clm50BgcCrop.cheyenne_intel.clm-default BASELINE ctsm1.0.dev006
    FAIL SMS_Lm13.f19_g17.I2000Clm50BgcCrop.cheyenne_intel.clm-cropMonthOutput BASELINE ctsm1.0.dev006

Digging into this: I see large differences in glacier area between the finidat file used in these tests and the restart file generated a couple of days into the ERI run. Looking at the metadata of the finidat file used here, my guess is that it came from the finidat_interp_dest.nc file from an interpolation (it was generated from case SMS_Ln1.f19_g17_gl4.I2000Clm50BgcCrop.cheyenne_intel.clm-default.C.ctsm10dev004chlist). That would explain why its glacier areas are far from what they should be.

This is yet another reason why we want to fix this issue: we shouldn't be distinguishing based on whether the current run is interpolated or not, because we don't want different answers in these two cases:

(1) A run with use_init_interp = .true.

(2) A previous run with use_init_interp = .true. generated a finidat_interp_dest.nc file; the current run points to that file with use_init_interp = .false.

(The old code gave different behaviors in those two cases; the new code gives the same behavior in both.)

Upcoming tags automation moved this from To do to Master Tags Done Aug 6, 2018
billsacks added a commit that referenced this issue Aug 6, 2018
Avoid glacier dynamic landunit adjustments in first time step

Always avoid generating dynamic landunit adjustments for glacier area
changes in the first timestep of a startup or hybrid run - not just for
cold start or interpolated start. See the detailed discussion in
#340 for rationale.

Merges PR #470

Fixes #340
@billsacks billsacks removed this from Master Tags Done in Upcoming tags Aug 6, 2018
billsacks added a commit to billsacks/ctsm that referenced this issue Feb 22, 2019
Always avoid generating dynamic landunit adjustments for glacier area
changes in the first timestep of a startup or hybrid run - not just for
cold start or interpolated start. See discussion in
ESCOMP#340 for rationale.

Fixes ESCOMP#340
billsacks added a commit to billsacks/ctsm that referenced this issue Feb 22, 2019
Avoid glacier dynamic landunit adjustments in first time step

Always avoid generating dynamic landunit adjustments for glacier area
changes in the first timestep of a startup or hybrid run - not just for
cold start or interpolated start. See the detailed discussion in
ESCOMP#340 for rationale.

Merges PR ESCOMP#470

Fixes ESCOMP#340
billsacks added a commit to billsacks/ctsm that referenced this issue Nov 1, 2020
We used to need a testmod that turned off GLC_TWO_WAY_COUPLING and
pointed to a compatible initial conditions file. However, this is no
longer needed, because:

- An LII test passes now even with GLC_TWO_WAY_COUPLING on (because
  ESCOMP#340 has been resolved)

- Even if that weren't the case, this compset now uses SGLC, which sets
  GLC_TWO_WAY_COUPLING=FALSE.

- The out-of-the-box initial conditions file is currently compatible
  with this configuration, so there's no need to point to an initial
  conditions file specifically for this test. (We could reintroduce a
  testmods later if we no longer have an out-of-the-box initial
  conditions file compatible with this test... though it may be better
  just to change the test so that it continues to have a compatible
  out-of-the-box initial conditions file, so that we don't need to
  maintain an initial conditions file just for this test.)
billsacks added a commit to billsacks/ctsm that referenced this issue Nov 1, 2020
I don't *think* it is the case any more that having
GLC_TWO_WAY_COUPLING=TRUE makes this test more powerful. (I'm thinking
that this stopped being true once we resolved
ESCOMP#340, which was also when the
standard LII test stopped needing GLC_TWO_WAY_COUPLING=FALSE. But I may
be forgetting some reason why GLC_TWO_WAY_COUPLING is still preferable
in this LII2FINIDATAREAS test.)