Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

optimizing photosynthesis #443

Open
rgknox opened this issue Nov 12, 2018 · 12 comments
Open

optimizing photosynthesis #443

rgknox opened this issue Nov 12, 2018 · 12 comments

Comments

@rgknox
Copy link
Contributor

rgknox commented Nov 12, 2018

@walkeranthonyp, Nathan Collier and I met up at the DOE Modeling PI meeting and had a fun discussion about improving the photosynthesis code. @walkeranthonyp has some more fundamental ways to improve how we go about this process regarding which schemes we use and how the equations themselves are cast, and Nate was interested in how we actual perform out solve for C_i (interstitial CO2), which is an iterative loop converging on a solution to non-linear algebraic equations.

See here where we perform the iteration: https://github.com/NGEET/fates/blob/master/biogeophys/FatesPlantRespPhotosynthMod.F90#L992

An important note here, is that the photosynthesis scheme calculates stomatal conductance (among other things), which in-turn effects latent-heat transfer from the surface, which effects land-atmospheric stability, which should tightly feed-back on these fluxes again. Thus, photosynthesis is called inside this broader iteration loop (CLM/ELM side of code) that seeks to converge on a stability parameter and land-fluxes with a low residual (change since last iteration). Note: These coupled iteration loops make the search for C_i (and the quadratic root finders) the most often called subroutines in the model. We cap this loop iteration at 5 steps currently.

I wanted to diagnose how efficient our code is currently performing the solve for C_I, and see if there is any low-hanging fruit that can be picked to help speed things up and make it more efficient.

The first thing I did was do some counting statistics to see the distributions of how many iterations are performed. The following plot shows results from a 1 year simulation at BCI, using a forest inventory initialization.

Top Left Panel: Histogram of counts needed to converge on the outer loop (stability parameter)
Top Right Panel: Histogram of the final residuals in the C_i loop, using the existing formulation. The red-line is the residual error criteria for success
Bottom Left Panel: Histogram of an ALTERNATIVE way to formulate the residual in the C_i loop, assuming that the residual is the change in C_i from previous time-point, normalized by the canopy CO2 partial pressure.
Bottom Right Panel: Histogram of the number of inner loop counts for C_i (which is capped at 5. Spoiler alert, almost every single iteration attempt used up 5).

quad-bci-v1

So all iterations used up the max 5 attempts. Also, it seemed that the residual is strangely formed. Why is it the difference in C_i normalized by air-pressure and not CO2 partial pressure which should have the same units? I can't rectify the units on the current formulation, it is pascals of partial pressure divided by atmospheric pressure * 1 million? (maybe they thought that C_i was in units of PPM?) But it seems incredibly strict, the residuals at completion (due to step-count maxed) are all way above the error limit.

I was curious if there would be a significant shift in how many iterations were required, if I changed both the way the residual is formed, and gave it a lower success threshold of 1e-5. The residual that is evaluated here is the change in C_i divided by air CO2 partial pressure (which are the same units Pa, and similar magnitudes).

quad-bci-v2

My take home from this exercise is that by adjusting the form of the residual there is a very modest effect on how quickly it converges. Keep in mind here that the physics was not changed, nor was the algorithm changed that searches for a solution.

Nate Collier did some calculations, and thinks that with Newton-Raphson, a solution should be found within 2-3 steps (I think), so I think that would be a worthwhile undertaking, as this would speed up the code. I have to look at the code more thoroughly, to see what method we use now (bracketing?) but it is not Newton-Raphson.

@walkeranthonyp
Copy link

Nice one @rgknox , this is great! A few initial comments:

  • From talking to Nate, the solver in FATES is a fixed point iteration. The difference between the current guess and the old guess is evaluated against the convergence criteria. Once this difference is smaller than the convergence criteria the current guess is deemed the solution. A small note: apparently this is not a residual, residual in solver terminology refers to a function constructed for other numerical methods where the residual = f(x) - x or x - f(x) and x is what you are trying to solve.

  • The normalisation Ci / p * 1e6 is converting Ci in Pa to umol mol-1 (so solved values should be of the rough order 100-1000). The second normalisation Ci / Ca is the Ci:Ca ratio (so will give solved values of the rough order 0.5 - 0.9). So the convergence criteria of 1e-6 is extremely strict, especially when the difference is in units of umol mol-1.

  • Given that extremely strict criteria and the distribution of the difference I would say that the fact that the solution is not within the tolerance is probably not important. At least in this specific BCI test case. The size of this difference might change in other locations with different parameters or environmental variables.

  • Also I wonder what level of precision we need in the photosynthesis solve so that the canopy temperature solver will converge? Seems like that converges in 5 - 10ish loops. But again we have to be careful that this is at BCI, for this PFT, etc.

  • I'm working on a semi-analytical solver that doesn't need any iteration. It finds only an approximate solution but it's pretty damn close - see point above about what level of precision is needed. ELM and CLM4.5 and up use Brent's method, coded up by Jinyun I believe. We could use my method to come up with pretty tight brackets for the Brent method if people are not comfortable with an approximate solution (working on looking at how approximate right now). That would speed things up if Brent's method takes a substantial number of iterations to converge. Also Nate's Newton-Raphson method might work well. Or using previous solutions as the starting guess for the fixed-point iteration.

  • It would be interesting to know how many iterations Brent's method takes to converge in ELM or CLM and how many iterations the canopy temperature solve takes. Presumably about the same number as you show in your plots.

@rgknox
Copy link
Contributor Author

rgknox commented Nov 12, 2018

if the convergence tolerance is 2e-6 micromols, that is 2 picomols / mol. I haven't heard anyone throw picomoles around in a while.

C_i, which is proportional to canopy co2 in partial pressure (pascals), hovers around the 10-50 depending on the century.

@walkeranthonyp
Copy link

@rgknox Haha, every picomole counts :) ... that is an extremely strict criterion. Probably a nanomole is strict enough but we can have that discussion.

More importantly is that this scheme might not converge everywhere (see Sun et al. 2012 JGR-Biogeoscience) and it would be good to understand what precision is need in the photosynthesis solve for the canopy temperature solve to converge. I know we discussed this over email but adding a couple of key points here for the record.

Sun, Y., Gu, L., Dickinson, R.E., 2012. A numerical issue in calculating the coupled carbon and water fluxes in a climate model. J. Geophys. Res. 117, D22103. https://doi.org/10.1029/2012JD018059

@dlawrenncar
Copy link

dlawrenncar commented Nov 13, 2018 via email

@walkeranthonyp
Copy link

@dlawrenncar Right, so CLM4.5 and above don't have the problem of non-convergence. But not FATES as that uses mostly the CLM4 photosynthesis scheme.

The solver that we're working on should help to speed things up in FATES, CLM, ELM etc as it doesn't involve an iterative loop (for leaf scale photosynthesis at least). I'm still testing it to make sure it finds the correct solution under a wide range of conditions. So far, for Medlyn gs it seems very robust but with BB gs for some reason it fails to find an accurate solution in a few cases where vcmax is low and g0 is 0 (if I remember right) which is an unlikely situation. Will look at that again soon.

@dlawrenncar
Copy link

dlawrenncar commented Nov 13, 2018 via email

@walkeranthonyp
Copy link

@dlawrenncar - right with you. Work is progressing this end and I will be presenting the beginnings of this research at AGU in Trevor Keenan and Nick Smith's canopy processes session. I think we will be able to get some speed up compared with the fixed-point iteration in FATES and hopefully the Brent solve in CLM too.

Do you have some numbers on what fraction of time CLM4.5 / 5 spends on the photosynthesis solve (and/or @rgknox for FATES)? If you wouldn't mind I'd like to quote them in my presentation at AGU.

@dlawrenncar
Copy link

@walkeranthonyp - The fraction of time spent on photosynthesis depends, of course, on the configuration of the model. For CLM5, we have timing for the canopy fluxes routine (which includes photosynthesis solution as well as the stability iterations that loop (max iteration 40!) around the photosynthesis calc.; canopy fluxes is finest available timing level in default output)

Prescribed vegetation mode (CLM5SP) - 26%
Prognostic biogeochemistry (CLM5BGC-crop) - 8%
Prognostic BGC with isotopes (CLM5BGC-crop-iso) - 4%

In other testing, as we reduce # of PFTs and other complex complexity (e.g., turn off PHS, reduce soil levels) to try to get a faster model for NWP applications, the cost of canopy fluxes starts to become almost dominant, and therefore priority candidate for improvement to get a faster model.

@alistairrogers
Copy link

alistairrogers commented Nov 28, 2018 via email

@walkeranthonyp
Copy link

@dlawrenncar Thanks!

@rgknox
Copy link
Contributor Author

rgknox commented Nov 28, 2018

Note that FATES is very likely to spend a larger time-share on photosynthesis than non-fates CLM/ELM, due particularly to the fact that we run these calculations over a large number of leaf layers. I don't have timing statistics on me, but will likely start reporting them as we test out alternative solutions to whats been covered in this thread.

@dlawrenncar
Copy link

dlawrenncar commented Nov 28, 2018 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: ❕Todo
Development

No branches or pull requests

5 participants