Floating point exception in PHS's plc function #774

Open
billsacks opened this issue Aug 1, 2019 · 3 comments
Labels
bug something is working incorrectly

Comments

@billsacks
Member

billsacks commented Aug 1, 2019

Brief summary of bug

When running in DEBUG mode, PHS's plc function generates a floating point exception in some circumstances. I'm not sure what happens in non-DEBUG mode.

General bug information

CTSM version you are using: release-clm5.0.26

Does this bug cause significantly incorrect results in the model's science? No

Configurations affected: Potentially all configurations with PHS (Clm50 default)? Observed in compset IHistClm50Sp with a start year of 1969 (I'm not sure if we'd observe this if we started from 1850).

Details of bug

I set up a case as follows:

./create_newcase --case oob_0801b --res f19_g17 --compset IHistClm50Sp

./xmlchange RUN_STARTDATE=1969-01-01
./xmlchange STOP_OPTION=nyears,STOP_N=2
./xmlchange JOB_WALLCLOCK_TIME=1:00:00
./xmlchange DEBUG=TRUE

This ran for 5 months but died on June 1 (model date 19690601) with the error "638:MPT ERROR: Rank 638(g:638) received signal SIGFPE(8)." and the following stack trace:

638:MPT: #7  0x0000000001d50eb3 in photosynthesismod::plc (x=3077.1112811201278,
638:MPT:     p=49709, c=15947, level=4, plc_method=0)
638:MPT:     at /glade/work/sacks/ctsm_code/ctsm/src/biogeophys/PhotosynthesisMod.F90:4888
638:MPT: #8  0x0000000001d3fabb in photosynthesismod::spacf (p=49709, c=15947, x=...,
638:MPT:     f=..., qflx_sun=1.7423788723701508e-07, qflx_sha=3.3277860202329161e-09,
638:MPT:     atm2lnd_inst=..., canopystate_inst=..., waterstate_inst=...,
638:MPT:     soilstate_inst=..., temperature_inst=..., waterflux_inst=...)
638:MPT:     at /glade/work/sacks/ctsm_code/ctsm/src/biogeophys/PhotosynthesisMod.F90:4653
638:MPT: #9  0x0000000001d2d125 in photosynthesismod::calcstress (p=49709, c=15947,
638:MPT:     x=..., bsun=1, bsha=1, gb_mol=989500.02165873151,
638:MPT:     gs_mol_sun=155650.16359247209, gs_mol_sha=155134.75791357172,
638:MPT:     qsatl=0.0044985860266598997, qaf=0.0035459593784752008, atm2lnd_inst=...,
638:MPT:     canopystate_inst=..., waterstate_inst=..., soilstate_inst=...,
638:MPT:     temperature_inst=..., waterflux_inst=...)
638:MPT:     at /glade/work/sacks/ctsm_code/ctsm/src/biogeophys/PhotosynthesisMod.F90:4293
638:MPT: #10 0x0000000001d0c6c9 in photosynthesismod::ci_func_phs (x=...,
638:MPT:     cisun=29.108501370856779, cisha=29.10893561709543,
638:MPT:     fvalsun=-6.8697493455910248, fvalsha=-6.8701739416686838, p=49709, iv=1,
638:MPT:     c=15947, bsun=1, bsha=1, bflag=4294967295, gb_mol=989500.02165873151,
638:MPT:     gs0sun=155650.16359247209, gs0sha=155134.75791357172,
638:MPT:     gs_mol_sun=155650.16359247209, gs_mol_sha=155134.75791357172,
638:MPT:     jesun=17.169554658508506, jesha=17.061857082605716,
638:MPT:     cair=31.773491626981734, oair=20547.941476562406,
638:MPT:     lmr_z_sun=0.028769883436967889, lmr_z_sha=0.028769883436967889,
638:MPT:     par_z_sun=238.05876325366955, par_z_sha=180.08649238896118,
638:MPT:     rh_can=0.14863687114448806, qsatl=0.0044985860266598997,
638:MPT:     qaf=0.0035459593784752008, atm2lnd_inst=..., photosyns_inst=...,
638:MPT:     canopystate_inst=..., waterstate_inst=..., soilstate_inst=...,
638:MPT:     temperature_inst=..., waterflux_inst=...)
638:MPT:     at /glade/work/sacks/ctsm_code/ctsm/src/biogeophys/PhotosynthesisMod.F90:4036
638:MPT: #11 0x0000000001cfec86 in photosynthesismod::hybrid_phs (
638:MPT:     x0sun=29.108501370856779, x0sha=29.10893561709543, p=49709, iv=1, c=15947,
638:MPT:     gb_mol=989500.02165873151, bsun=1, bsha=1, jesun=17.169554658508506,
638:MPT:     jesha=17.061857082605716, cair=31.773491626981734,
638:MPT:     oair=20547.941476562406, lmr_z_sun=0.028769883436967889,
638:MPT:     lmr_z_sha=0.028769883436967889, par_z_sun=238.05876325366955,
638:MPT:     par_z_sha=180.08649238896118, rh_can=0.14863687114448806,
638:MPT:     gs_mol_sun=155650.16359247209, gs_mol_sha=155134.75791357172,
638:MPT:     qsatl=0.0044985860266598997, qaf=0.0035459593784752008, iter1=2, iter2=0,
638:MPT:     atm2lnd_inst=..., photosyns_inst=..., canopystate_inst=...,
638:MPT:     waterstate_inst=..., soilstate_inst=..., temperature_inst=...,
638:MPT:     waterflux_inst=...)
638:MPT:     at /glade/work/sacks/ctsm_code/ctsm/src/biogeophys/PhotosynthesisMod.F90:3661
638:MPT: #12 0x0000000001cce59c in photosynthesismod::photosynthesishydraulicstress (
638:MPT:     bounds=..., fn=17, filterp=..., esat_tv=..., eair=..., oair=..., cair=...,
638:MPT:     rb=..., bsun=..., bsha=..., btran=..., dayl_factor=..., leafn=...,
638:MPT:     qsatl=..., qaf=..., atm2lnd_inst=..., temperature_inst=...,
638:MPT:     soilstate_inst=..., waterstate_inst=..., surfalb_inst=...,
638:MPT:     solarabs_inst=..., canopystate_inst=..., ozone_inst=0x2b27ca63fd80,
638:MPT:     photosyns_inst=..., waterflux_inst=..., froot_carbon=..., croot_carbon=...)
638:MPT:     at /glade/work/sacks/ctsm_code/ctsm/src/biogeophys/PhotosynthesisMod.F90:3326
638:MPT: #13 0x00000000011c04eb in canopyfluxesmod::canopyfluxes (bounds=...,
638:MPT:     num_exposedvegp=17, filter_exposedvegp=..., clm_fates=..., nc=1,
638:MPT:     atm2lnd_inst=..., canopystate_inst=..., energyflux_inst=...,
638:MPT:     frictionvel_inst=..., soilstate_inst=..., solarabs_inst=...,
638:MPT:     surfalb_inst=..., temperature_inst=..., waterflux_inst=...,
638:MPT:     waterstate_inst=..., ch4_inst=..., ozone_inst=0x2b27ca63fd80,
638:MPT:     photosyns_inst=..., humanindex_inst=...,
638:MPT:     soil_water_retention_curve=0x2b27ca462080, downreg_patch=...,
638:MPT:     leafn_patch=..., froot_carbon=..., croot_carbon=...)
638:MPT:     at /glade/work/sacks/ctsm_code/ctsm/src/biogeophys/CanopyFluxesMod.F90:852
638:MPT: #14 0x00000000008848ed in clm_driver::clm_drv (doalb=4294967295,
638:MPT:     nextsw_cday=153, declinp1=0.38482293161813563, declin=0.38482293161813563,
638:MPT:     rstwr=.FALSE., nlend=.FALSE., rdate=..., rof_prognostic=4294967295,
638:MPT:     .tmp.RDATE.len_V$202d6=32)
638:MPT:     at /glade/work/sacks/ctsm_code/ctsm/src/main/clm_driver.F90:543
638:MPT: #15 0x00000000008470fe in lnd_comp_mct::lnd_run_mct (eclock=..., cdata_l=...,
638:MPT:     x2l_l=..., l2x_l=...)
638:MPT:     at /glade/work/sacks/ctsm_code/ctsm/src/cpl/lnd_comp_mct.F90:456
638:MPT: #16 0x0000000000461bc1 in component_mod::component_run (eclock=..., comp=...,
638:MPT:     infodata=..., seq_flds_x2c_fluxes=..., seq_flds_c2x_fluxes=...,
638:MPT:     comp_prognostic=4294967295, comp_num=2, timer_barrier=...,
638:MPT:     timer_comp_run=..., run_barriers=.FALSE., ymd=19690602, tod=0,
638:MPT:     comp_layout=..., .tmp.SEQ_FLDS_X2C_FLUXES.len_V$3526=4096,
638:MPT:     .tmp.SEQ_FLDS_C2X_FLUXES.len_V$3529=4096,
638:MPT:     .tmp.TIMER_BARRIER.len_V$352e=19, .tmp.TIMER_COMP_RUN.len_V$3531=11,
638:MPT:     .tmp.COMP_LAYOUT.len_V$3537=32)
638:MPT:     at /glade/work/sacks/ctsm_code/ctsm/cime/src/drivers/mct/main/component_mod.F90:728
638:MPT: #17 0x00000000004308d8 in cime_comp_mod::cime_run ()
638:MPT:     at /glade/work/sacks/ctsm_code/ctsm/cime/src/drivers/mct/main/cime_comp_mod.F90:2712
638:MPT: #18 0x0000000000449570 in cime_driver ()
638:MPT:     at /glade/work/sacks/ctsm_code/ctsm/cime/src/drivers/mct/main/cime_driver.F90:125

It looks like the issue may be that x is positive in the call to plc (x=3077.1112811201278 in frame #7 of the trace), so (x/psi50) is negative, and we get a floating point exception when trying to raise this negative value to a real power. Here x appears to be the root water potential.
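To illustrate the suspected failure mode, here is a minimal Python sketch. It assumes a Weibull-style curve of the form 2 ** (-(x / psi50) ** ck); that form and the names `psi50` and `ck` as used here are assumptions for illustration, and the numeric parameter values are hypothetical rather than taken from the CTSM parameter file. Only the positive x value comes from the stack trace.

```python
import math

# Hypothetical parameter values, chosen only for illustration.
PSI50 = -150000.0  # water potential at 50% loss of conductivity (negative)
CK = 3.95          # non-integer shape exponent


def plc(x, psi50=PSI50, ck=CK):
    """Assumed vulnerability curve: 2 ** (-(x / psi50) ** ck)."""
    ratio = x / psi50  # negative whenever x > 0, because psi50 < 0
    # math.pow raises ValueError for a negative base with a non-integer exponent,
    # which is the Python analogue of the invalid operation behind the SIGFPE.
    return 2.0 ** (-math.pow(ratio, ck))


print(plc(-150000.0))  # x equal to psi50 gives 0.5

try:
    plc(3077.1112811201278)  # the positive x reported in the stack trace
except ValueError as err:
    print("domain error:", err)
```

With a negative x the ratio is positive and the power is well defined; with the positive x from the trace the base is negative and the operation is invalid for a real exponent.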

This bug was originally found by @Ivanderkelen ; I have reproduced it.

@billsacks billsacks added the bug something is working incorrectly label Aug 1, 2019
@Ivanderkelen
Contributor

I found the same bug when starting the run from 1850, as well as from 1900 (with exactly the same settings as described by Bill).

@billsacks
Member Author

Given that we haven't seen problems in a production run, we're not going to make this super-high priority. @djk2120 we'll plan to talk to you about this once you're here - though, of course, feel free to look at it before then if you have time.

@djk2120
Contributor

djk2120 commented Apr 2, 2020

Indeed, you've diagnosed the issue correctly, in that the input to plc needs to be negative. The simplest fix would be to catch any positive inputs and set them to zero. However, I can't immediately conceive of how root water potential came to be positive, so it is probably worth my reproducing this error and figuring out how that happened. That way I can either fix the bug upstream (which seems more robust) or at least convince myself that enforcing a non-positive input is indeed suitable.
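A minimal sketch of the clamping option, in Python rather than the model's Fortran. It assumes the curve has the form 2 ** (-(x / psi50) ** ck); the function name `plc_clamped` and the parameter values are hypothetical. Any input above zero is treated as zero, where the assumed curve evaluates to 2 ** 0 = 1.

```python
import math

# Hypothetical parameter values, chosen only for illustration.
PSI50 = -150000.0  # water potential at 50% loss of conductivity (negative)
CK = 3.95          # non-integer shape exponent


def plc_clamped(x, psi50=PSI50, ck=CK):
    """Assumed curve 2 ** (-(x / psi50) ** ck), with x limited to values <= 0."""
    if x >= 0.0:
        # Treat a zero or positive potential as zero: (0 / psi50) ** ck == 0,
        # so the curve returns 2 ** 0 == 1 and the power is never evaluated
        # with a negative base.
        return 1.0
    return 2.0 ** (-math.pow(x / psi50, ck))


print(plc_clamped(3077.1112811201278))  # positive input no longer fails
print(plc_clamped(-150000.0))           # unchanged for valid inputs
```

This only guards the symptom. It does not explain why the water potential became positive in the first place, which is why tracing the cause upstream may still be the more robust route.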
