Skip to content

Meeting Notes 2017 Software

Bill Sacks edited this page Apr 30, 2018 · 1 revision

Agenda

  1. Git migration status: https://github.com/escomp/ctsm
  2. First step for bringing together different drainage methods: bringing in the hillslope version of LateralFlowPowerLaw
    • Is the code for rising water table (rsub_top positive) inherently linked to use_hillslope, or should that new code apply always in LateralFlowPowerLaw?
    • Sean: Can you describe a test case that I can run to verify that I'm not breaking anything?
  3. Beginning work on merging together different drainage methods
    • Should I modularize everything that is currently in Drainage / (PerchedLateralFlow + LateralFlowPowerLaw)?
    • Currently the choice of using Drainage vs. (PerchedLateralFlow + LateralFlowPowerLaw) is based on use_aquifer_layer().
      • Is that the logic we want to keep? Or should we make this separately namelist controlled?
      • Will it ALWAYS be the case that you would want to use (a) the stuff in Drainage or (b) the stuff in PerchedLateralFlow + LateralFlowPowerLaw? Or might you want to mix and match?
    • Should I also work on modularizing WaterTable / (PerchedWaterTable + ThetaBasedWaterTable + RenewCondensation)? It looks like these are tied somewhat to the above.
      • Again, what choices are linked vs. independent?
    • Overall question: What choices should be available, and how should they be structured?
      • For example, should we have a top-level choice of whether you're using an aquifer layer, and then, based on the answer to that, we'd have various other choices for different methods for computing fluxes?
      • I'd like to answer this question partly based on what Noah-MP will want to plug in.

LateralFlowPowerLaw

We could have different methods for computing rsub_top. So we have the LateralFlowPowerLaw method for computing rsub_top, and we have the hillslope method for computing rsub_top.

Once we remove all the rsub_top stuff, LateralFlow becomes about a state update.

Eventually (stage 2), we'd jetison the LateralFlow routine and move it to a more formal state update.

Note that rsub_top is essentially net lateral flow.

In previous methods, before hillslope, qin was 0 (so didn't appear at all).

So really what we have is: different methods have different ways of computing qout (i.e., qflx_latflow_out) from each column. Adding hillslopes means that columns also have a non-zero qin.

So we can have multiple methods for computing qflx_latflow_out. First we compute qflx_latflow_out for all columns, using whatever method is chosen. Then, if you have an upstream neighbor, set qflx_latflow_in.

Next steps for hillslope

Should we bring the whole hillslope implementation to master?

No. For now, Bill will reconcile LateralFlowHillslope and do testing both on hillslope and ctsm branch.

WaterTable

To some extent, this has already been modularized.

I could make separately-selectable routines for computing zwt and zwt_perched. But note that zwt_perched should actually be done the same right now for both clm45 and clm50. So the only thing that should differ is how zwt is computed in clm45 vs clm50.

LateralFlow

PerchedLateralFlow should be the same in clm45 and clm50 (but check this).

So then we'll have something like:

call PerchedLateralFlow

if (use_aquifer_layer) then
   call LateralFlowCLM45
else
   call LateralFlowPowerLaw

Possible needs for consistency with WRF boundary layer schemes

Some boundary layer schemes need temperature and exchange coefficients rather than just fluxes.

This is an issue when you have multiple tiles: do you make up a temperature and exchange coefficient that reproduces the average flux?

The way it was implemented in the old CLM-WRF seems wrong.

With CTSM we're not going to do anything with exchange coefficients that WRF may send it.

However, it's probably important that we deal with the need of sending exchange coefficients from CTSM to WRF. This will require some thought: There are a number of edge cases in how Noah-MP handles it that aren't ideal, and probably need to be rethought (particularly with respect to calculating the gridcell-average exchange coefficients).

We also may need to send emissivity and skin temperature rather than longwave radiation itself. It turns out this is actually challenging to do in CLM.

Dave: Let's not sweat this until we have a broader sense of needs from different atmosphere groups.

Martyn asks if we need to worry about this in the (re)design of vegetation fluxes, etc. We don't have a good sense of this, but feeling is: let's proceed without worrying about it, then circle back later if needed.

Other notes on LILAC

LILAC would also have standalone driver capability (through a data atmosphere). e.g., LIS.

git tutorial

Dave L: Should include some stuff about workflow, related to issues, PRs, etc. i.e., what's our new workflow with github.

Some things that could be done to improve ease of use

Mike: A big piece is the toolchain for generating the right input datasets

  • Also including more flexibility in input dataset generation - particularly in terms of subgrid configurations

Mike wrote a script to convert WRF input to a CTSM input file

Sean: Would like to give more flexibility in terms of number of PFTs. Dave agrees: Would like to make it easier to create new PFTs.

Dave: Feels LILAC should be responsible for this sort of spatial flexibility discussed above

Some other things we'd find useful

  • Pulling out more parameters to facilitate parameter estimation efforts
  • Better profiling to see where the code spends its time; accuracy-efficiency tradeoff

Priorities for the remainder of the year

For Bill:

  • Feeling is that highest next priority should be base flow, so we can come closer to wrapping up the soil hydrology work. (Work on this before getting into stuff more useful for isotopes.)
  • Next canopy hydrology (for the sake of isotopes)
    • Maybe in conjunction with Ben: let's talk about who will lead this when we get closer to starting it

Martyn: Will be able to return to some of the numerical methods implementations

Mike:

  • Soil hydrology
  • Snow

Better energy accounting

Sean asks: will CTSM work lend itself to better energy accounting, particularly of energy in water?

Feeling is that CTSM work will help somewhat, but there's a lot more work that would need to happen for that.

More on snow

Can we mimic what was done for SoilHydrology with snow hydrology?

Or would it make sense to just start with the Noah-MP snow model, bypassing the CLM implementation? General feeling is that that may not be ideal.

We could start by talking about snow structure in some ctsm meetings, to figure out what kind of design we want to shoot for.

Some long-term thoughts

For using CLM in weather modeling applications, runtime efficiency is important.

Also, coming up with a way to avoid needing to do a long spinup before doing a short weather modeling run.

  • Dave: A lot of the problem here is just that we don't know what to tell people: how long of a spinup is really needed for particular applications?

Coupling CTSM to WRF: There is a TAMU project to couple WRF with CLM via the CESM coupler. But feeling is that, long-term, we still want to go with the LILAC approach of coupling CLM/CTSM directly to WRF, so that land model call from WRF can occur where it currently does, rather than needing to be at the high level.

Ben: Question about LILAC: Given that, long-term, we would be using nuopc rather than mct for coupling, does it really make sense to plan to use mct for LILAC? Mariana feels yes: we definitely don't want nuopc (it's too heavy-weight), and it doesn't really make sense to have our own custom interface because we'd end up building our own mct-like capabilities.

Ice impedance for infiltration

CLM uses something like this:

qinmax_on_unsaturated_area(c) = 10._r8**(-e_ice*top_icefrac)*(qflx_in_soil(c) - rsurf_vic)

whereas the new noah-mp option uses something like this:

fcr = 1._r8 - exp (-acrt) * sumsoil

To what extent do we want to be able to mix & match different ice impedance options? Martyn points out that this is analogous to the impermeable area – i.e., different versions of parameterizations for impermeable area.

General feeling is that we shouldn't go down that rabbit hole for now.

Order of tackling modules for isotopes

Ben points out that, for the sake of isotopes, there'd be benefit to tackling them in the order in which they're called in a time step.

How to support hierarchies of options – for example, having a higher-level option that turns on vic?

We feel that we need an option somewhere that tells you to turn on all vic hydrology options by default. How should we do this?

  • Could use an xml variable in env_run.xml

  • Should we allow separate xml files for each component, so we'd have env_run_ctsm.xml, to avoid having one big cumbersome xml file?

    We generally like this idea.

  • Should we allow user_nl variables that set a suite, in addition to those corresponding to single namelist options?

For the time being, we'll hack something for vic. And we'll build up to having this for noah-mp.

Most of the discussion was around PRs #12 and #14 (update: these are now https://github.com/ESCOMP/ctsm/pull/191 and https://github.com/ESCOMP/ctsm/pull/192). Some relevant notes have been incorporated into the discussion on

https://github.com/NCAR/clm-ctsm/pull/14 (update: this is now https://github.com/ESCOMP/ctsm/pull/192)

One key point that emerged from the discussion was that Martyn feels that we can make general-purpose routines to apply an explicit Euler or implicit Euler solution method given just the getFlux routine. i.e., a given flux parameterization would just need to provide an implementation for getFlux. This would then allow moving the calls to implicitEuler (and similarly for explicitEuler, not yet implemented) to a shared location. Presumably we could then also move the select case statement (selecting between solution methods) to a shared location, because this would now look the same for any flux parameterization.

One important point that enables this is the realization that, even for fluxes where we currently have an analytical solution for implicitEuler (like QflxH2osfcSurf), we can instead use the general-purpose iterative procedure, and this will amount to the same thing, since the iterative procedure will converge after one iteration. The only downside is that it will be a little bit slower, because it needs to check for convergence, etc.

Martyn is going to take a stab at illustrating what this could look like.

What do we want to stick in the repository?

Suggestion of putting pdfs in github issues rather than in the repo itself.

rst is preferred, but open to anything – with binaries uploaded to issues / PRs rather than put in repo

  • Bill said: I think we need to decide if we're going to accept binary files like these pdfs in the repo. My preference would be to instead keep the original text-based versions in the repo whenever possible. As @mvertens suggested, we may want to standardize on a markup format for these text-based files. For example, if we use restructuredText (.rst), then we can use sphinx to auto-generate browsable documentation - and I think github will also render these rst files itself if you view the file via github. The two reasons for this preference are:
    • 1. Storing a text-based version in the repo makes it possible for you or others to edit the file later, with useful versioning.
    • 2. My impression is that storing too many (large-ish) binary files in a repo can lead to repo bloat, making it take longer to clone the repo, etc.
  • Martyn replied: I totally agree that text-based material (e.g., markdown) is best for documentation. I think that we could use gitHub issues to accomplish the intent of the notes directory. That is, if we are disciplined and open an issue for each of the code development thrusts, then we can upload documents/images etc. to the conversation on the issue. Then we can always come back to this conversation when we want to revisit what we did and write documentation/papers etc.

Meeting notes? On wiki.

So what goes in doc directory? Things for tech note. Maybe other supplementary stuff... with preference for rst, though we can revisit this later.

Presentations: probably also on wiki.

What do we want PRs to be?

Do we want to reserve PRs for something that has been tested? Feeling is that that's not necessary: can use PRs for discussion on in-progress things – as long as they're flagged with something that denotes that.

Feeling is that it's also reasonable to include prototypes in the main repo for the sake of getting feedback on the prototypes.

Where to put helpful github info?

Feeling is the wiki is good for that.

Wiki organization

We'll try to have subfolders for things like

  • workflow
  • meeting notes
  • technical documentation
  • presentations

Technical documentation

Ideally, we'll have a rolling tech note: As you add a new parameterization, also change the tech note

For CLM, we're only including the default parameterization. We need to figure out how to do that now that we have alternative parameterizations. e.g., maybe just have a table giving the different defaults.

Testing workflow

Dave: One good compromise is that scientists can run a short test suite

Generic implicit euler

We generally like Martyn's solution in pr #12

Bill and Ben wonder whether it's worth going to an object-oriented solution – either wrapping the function in a class that has the necessary parameters, or sticking with the function pointer idea, but having a "parameters" class as an input to this generic interface, that could contain any parameters you want (the base class would just be empty).

e.g., Bill likes the "Type­-Bound Procedures" implementation in section 5.3 of Modern Fortran in Practice. (Note afterwards: from a skim, this also looks similar to the solution presented in section 6.2.1 of "Scientific Software Design: The Object-Oriented Way".)

github workflow

Bill: What kind of review process do we want? (1) Should I just choose one person to review all changes? Or multiple people? Or everyone, but you can remove yourself if you're not interested? (2) Want to review trivial changes, too (like the re-parenthesization of the vic code)? (3) Because I have changes on top of changes on top of changes, incorporating PR feedback on an earlier PR is a bit challenging... not sure what to do about this.

Martyn: One solution to (3) could be to open issues for certain things rather than commenting on the PR. Bill agrees that could work well.

Ben: Another solution to (3): Maybe also submit PR at the same time that I kick off testing, to give people more time to review it.

Another solution to (3): Also, maybe could combine a few changes into a single bigger PR.

Reviewers: Request explicit review. Maybe pick one person for science and one for software.

  • e.g., Ben for software reviews

Ben also suggests coming up with review guidelines - e.g., want nit-picky stuff?

Design documentation

Martyn asks if there is a place for higher-level documentation.

Mariana suggests sphinx-based documentation in the main repo, in (say) a doc subdirectory.

Breadth vs. depth

Mariana feels we can make a stronger case for future funding if we go for breadth.

Bill could go for breadth while scientists dig in and rework some of the parameterizations, numerics, etc.

Handling fractions

Martyn and Sean can handle this.

How do we determine if answer changes are okay?

Context: changes to (say) the timing of state updates in the run loop – whether we do partial updates or wait until the end of the time step to do all updates of a given state variable.

Dave: typically do two 10-year runs with the diagnostics package and check visually. Somewhat subjective....

Modularization of base flow

LateralFlowPowerLaw is the main routine.

The problem is that right now it computes the central flux (rsub_top(c)) along with some other stuff, and also does state updates all mixed in. So one goal could be separating this stuff. This is probably worth doing.

I would do the same exercise for the other options (subroutine Drainage). One other target is to have a single routine with switches in the middle, rather than duplicated code.

I'll want to work with Ben to integrate the hillslope stuff in LateralFlow.

Ben's next steps: bringing in hillslope hydrology

Sean suggests having an eye on whether it will run correctly if you set up a surface dataset with only one hillslope column: does this revert to original behavior?

Ben requests a data set that can be used for exercising this.

Small ctsm test suite

I'll set up a small ctsm test suite that covers essential aspects of hydrology, including vic and oldhyd stuff (all on cheyenne... need to add a oldhyd test on cheyenne).

Ben: There's a little workflow for fates that we could borrow.

Handling truncation

Bill: How to handle truncation to zero? Should I put in the idea of truncating to zero if the new value is less than ~ 1e-14 times the old value?

For now use:

!   if (abs(h2osfc(c)) < 1.e-14_r8 * abs(h2osfc_orig)) h2osfc(c) = 0._r8

with 1e-14 a shared constant

qinmax for schemes that don't set it explicitly

Bill: Problem: h2osfc drainage uses qinmax. What to do when we're using a scheme that doesn't have infiltration excess runoff? Do we essentially assume qinmax is infinite in this case, and so all of h2osfc can drain into the soil? Or is this irrelevant, because no infiltration excess runoff will imply no h2osfc? In any case, we need to give some value to qinmax currently....

Conclusion: Set to some huge value like 1e100 (or reuse existing constant)

Vic infiltration excess runoff

Martyn's review of the vic infiltration excess runoff: Should everything that's currently in vic's infiltration excess runoff calculation actually go in its saturated surface runoff calculation, and thus its infiltration excess runoff should be 0? How does this fit with the current saturated excess runoff code that says that the only difference between schemes is in the calculation of fsat?

Change so that, when using use_vichydro, we use an enormous qinmax

We'll move the current vic infl excess into saturated surface excess runoff. It will no longer be the case that the only difference between schemes is in the calculation of fsat.

Update: After a lot of discussion, feeling is that we should (for now) just whack the vic code currently in ComputeQinmaxVic (setting qinmax to near-infinite) and don't bother moving it anywhere. Martyn explained that the current Vic fsat gives the first-order approximation, assuming that you have the equation qsatx = f(fsat). Then what's currently in ComputeQinmaxVic gives a better numerical approximation, given the assumption that there's only one flux operating.

Add a comment that for now we're jettisoning the exact analytical solution.

Handling hard-coded parameters

Feeling is that, as a first step, we'll at least pull parameters out to the top of the relevant module, to facilitate moving them to namelist or netcdf later.

Urban pervious road

Bill: It's more difficult to understand and refactor the code because urban pervious roads are sometimes treated like other hydrologically-active columns, but sometimes treated differently – e.g., at the end of subroutine Infiltration. (See email thread with subject, "Handling qflx_surf when h2osfcflag==0"). I feel that, at some point, this should be reconciled, so that urban pervious roads are treated like all other hydrologically-active columns.

Feeling is: let's have Sean, Keith & Bill talk about this to see how to reconcile these.

Base flow

This may be my next task, maybe working with Ben.

This is partly subroutine Drainage, and partly other routines in SoilHydrologyMod.

Martyn: eventually we may want to treat base flow as a sink term.

Noah-MP has one equation (from Gonzalo) that has grid-to-grid transport in the base flow. Even incorporating the vertical piece from that may be tricky, because there's a moving lower boundary, rather than treating base flow as a sink term.

Justin's hillslope hydrology

This may be another thing to get in sooner rather than later. This could be tied in with base flow stuff.

Update: see email (subject: "Hillslope hydrology", sent 8-16-17)

Naming methods

Naming methods: Always include name of first author and year. Can optionally include other things like equation number or generic name, like:

Beven1979_Topmodel

Though we may have some unpublished things.... e.g., SlaterUnpublished (or maybe just Slater)

Fsat / fff

We'll want FSAT_METHOD_NONE, which sets fsat to 0 everywhere

We should probably pull fff out to a namelist parameter

We might want something like this in the namelist_defaults:

<fff>0.5</fff> <fff sat_surf_runoff_method="noah_mp_opt1">0.75</fff>

Then need to make sure the order dependence is right in CLMBuildNamelist: As a general rule, set the string-based methods first, then set parameter values later, so that parameter defaults can depend on the method you're using.

We may also want to ensure that fff doesn't appear in the namelist if using vichydro, to avoid confusion (as we do for other namelist variables that are irrelevant in some cases).

Modularizing infiltration excess

Probably qinmax is a good thing to allow differences in (like we allow differences in fsat for the saturated surface runoff).

Try to reuse existing code for ice impedance:

10._r8**(-e_ice*top_icefrac)

10._r8**(-e_ice*(icefrac(c,1:3)))

Rename

Rename qflx_sat_surf to qflx_sat_excess_surf

State equations

We'd like to get to some clear state equations

e.g.

d(Sr)/dt = (pr + sm) * (1 - Fsurf) - qsx - qix - qd

may be the state equation for soil water

and

d(Ss)/dt = (pr + sm) * (Fsurf) + qix * (1 - Fsurf) - qd - qo

may be the state equation for surface water

Dealing with the partial update of h2osfc in subroutine Infiltration

(Bill's note to self: see also "Modularizing Infiltration" evernote)

call compute_qflx_in_h2osfc(h2osfc, ...)

! This could be removed if we wanted to do a straight forward Euler, and/or set things up for a more sophisticated solution method. In the latter, rather than having h2osfc_partial, we'd have some more sophisticated state estimate here h2osfc_partial(c) = h2osfc(c) + qflx_in_h2osfc(c) * dtime

call compute_qflx_h2osfc_drain(h2osfc_partial, ...)

Sometime later: Do the state update:

h2osfc = h2osfc + (qflx_in_h2osfc - qflx_h2osfc_drain) * dtime

One implication of this is that the inputs into the routines need to be bare arrays rather than derived types – because the derived types contain the pre-updated state, not the partially-updated state or some other state estimate.

Later follow-up notes

These are some follow-up notes made by Bill Aug 22, 2017

Separating answer-changes from bfb

Note: to separate answer changes from bfb, I may need to first do:

h2osfc = h2osfc_partial - qflx_h2osfc_drain * dtime

then refactor this to the above full update of h2osfc... latter will change answers.

But I may be able to get away with something like:

h2osfc = (h2osfc + qflx_in_h2osfc * dtime) - qflx_h2osfc_drain * dtime

and I may even be able to remove those parentheses

Virtual physics state

Mariana points out: CAM has a "virtual" physics state (or something like that), which stores the partially-updated version of the state as it evolves throughout the time step. But then (she thinks) it uses the real physics state as a starting point for applying the tendencies.

I realized afterwards that we could do something similar by making a copy of waterstate_inst at the beginning of the time step. Then, as the time step progresses, we could update this temporary copy; the temporary copy would be used as an input in some parameterizations. But when it comes time to actually apply the fluxes, we'd apply them to the original waterstate_inst from the beginning of the time step.

My inclination is to wait a bit before doing that, to see if it would be worthwhile. But let's keep it in mind as a possibility.

What I should do next

What I should do next: Move ahead with separating flux calculations from state updates for soil hydrology.

Infiltration makes sense as a next step here, including:

  • separating surface water stuff from infiltration itself
  • separating flux calculations from state updates, as noted above

Present: Martyn Clark, Mike Barlage, Dave Lawrence, Mariana Vertenstein, Bill Sacks

Isotopes

Is it useful to have the general capability for water tracers (ties in with isotopes)?

  • Martyn thinks it could be.

Having explicit fluxes computed as part of solving Richards equation would help with isotopes – i.e., separating flux calculation from state update here. Is this part of the plan? If not, how hard would this be?

  • It sounds like there is already something like this, but maybe not exactly what he needs???

Soil water in Noah-MP

Soil water in Noah-mp: SOILWATER

Do we want to support all of these options? Not necessarily – some are very similar to what's in CLM.

soilwater_moisture_form in CLM

Substepping in noah-mp: niter: 1 or 3, or 6 if precipitation. In clm, compute error, and keep going until you get down to a low enough error

Essentially, the solution of Richards equation (which is what we were looking at in soilwater) seems roughly the same in CLM and Noah-MP

Infiltration

Noah-mp: PDDUM

The different Noah-mp options are computing runsrf

CLM: input is precip minus ET. Things are more complex due to surface water store

CLM: SoilHydrologyMod

We may want a class that is SurfaceRunoff – with the terms needed to compute the surface runoff

2 surface runoff mechanisms:

  • infiltration excess is in Infiltration (but doesn't need to be there necessarily) - when precip exceeds infiltration capacity

First step: modularize SurfaceRunoff (and maybe Infiltration)

For SurfaceRunoff, there will likely be separate routines for:

  1. Calculations of things at the start of SurfaceRunoff
  2. Calculation of Saturated Fraction
  3. Calculate surface runoff trivially based on fcov
  4. Later stuff....

Update: Some of this may be inline rather than in separate routines. I still want to separate out (1), but 2 & 3, and maybe 4, can be combined as:

subroutine SurfaceRunoff

  ! Compute fsat / fcov
  select case (fsat_method)
  case (clm_orig)
     call fsat_clm_orig
  case (vic)
     call fsat_clm_vic
  case (noah_mp_opt1)
     call fsat_noah_mp_opt1
  end select

  do c
     qflx_surf(c) = fcov(c) * qflx_top_soil(c)
  end do

end subroutine SurfaceRunoff

Bill asked Mike: My understanding is that we have some linked processes for SurfaceRunoff and InfiltrationExcess, and that these need to be connected – e.g., if we use clm_default option for SurfaceRunoff we should also use clm_default option for InfiltrationExcess. But those two routines may need to be called at different times, so we can't just have SurfaceRunoffAndInfiltrationExcess_clm_default. If so, then that makes me think maybe we should do this via polymorphism, to make clear the connection between these different routines.

  • Mike says: No: We might want to mix & match – e.g., using the clm_default for surface runoff but the noah-mp option 3 for infiltration excess.

Consider making a new module (class) just for Surface Runoff, which encapsulates the variables that are needed for that process

Later conversation between Bill and Mariana

Mariana suggests: Try to remove as many 'use' statements as possible. At least remove 'use' statements for things like patch, col, landunit... maybe not constants for now.

Mariana feels Bill should NOT spend time on putting unit tests / functional unit tests around the refactored code for now.

Moving to git

Mariana feels that it should be a priority to get CLM to git for this

There are a couple of things that can make this more complicated:

  • Permissions
    • For the short-term, Mariana suggests that we could use a workflow like marbl - copying a git snapshot into svn periodically
  • Maintaining history
    • Workflow for testing & svn externals
      • Do we want to maintain the capability to have a standalone CLM checkout?
      • If so, how?
      • If not, how will we set up the testing workflow for CLM tags?

Code work: where to begin

Richards equation: Pretty similar already between CLM and Noah-MP. But this could still be a good first place to start. This is in SoilWaterMovementMod

Plan is to enumerate things that differ between CLM and Noah-MP

  • Dave: "A list is good. But even a list of 1 would be good to get going"

Probably start with things that will be easy to integrate in, controlled by switches.

Mike is also hoping to set up some timing benchmarks, and play with whether we can get the timings down to the Noah-MP timings by playing with pfts, number of soil levels, etc.

Photosynthesis: Some differences in Noah-MP are:

  • beta parameter for water stress(?)
  • switch between Ball-Berry and Jarvis

Some other differences are:

  • Radiation code
  • Below-canopy turbulence
Clone this wiki locally