Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transient pft issue #538 #540

Merged
merged 7 commits into from
Oct 26, 2018
Merged

Conversation

slevis-lmwg
Copy link
Contributor

@slevis-lmwg slevis-lmwg commented Oct 19, 2018

Description of changes

Corrects the baseline for transient crops in transient pft runs

Specific notes

Contributors other than yourself, if any:

CTSM Issues Fixed (include github issue #):
#538

Are answers expected to change (and if so in what way)?
Yes, in transient pft runs

Any User Interface Changes (namelist or namelist defaults changes)?

Testing performed, if any:
Ran standard intel & gnu test-suites to generate baseline-13 from branch trans_pft_issue538 while comparing to baseline-12. Expect same or similar FAILures as between branch collapse2gencrop & baseline-12:
Gnu generated same DIFF FAILures + 2 new TPUTCOMP FAILures
Intel generated same DIFF FAILures, except 1 new RUN FAILure: corrected, retested, and now it’s among same DIFF FAILures

Ran same tests bet. branch collapse2gencrop & baseline-13 hoping for no FAILures:
nohup qcmd -l walltime=06:00:00 -- ./create_test --xml-category aux_clm --xml-machine cheyenne --xml-compiler gnu -r /glade/scratch/slevis/collapse_pfts_cheyenne_gnu --project P93300641 --compare /glade/scratch/slevis/cesm_baselines/ctsm1.0.dev013 --generate /glade/scratch/slevis/cesm_baselines/ctsm1.0.dev014 &
The expected FAILs
nohup qcmd -l walltime=12:00:00 -- ./create_test --xml-category aux_clm --xml-machine cheyenne --xml-compiler intel -r /glade/scratch/slevis/collapse_pfts_cheyenne_intel --project P93300641 --compare /glade/scratch/slevis/cesm_baselines/ctsm1.0.dev013 --generate /glade/scratch/slevis/cesm_baselines/ctsm1.0.dev014 &
The expected FAILs except the test that FAILed 2 steps up and I corrected but didn’t “generate” files when I corrected it… Still need to do this!

Fix associated with issue ESCOMP#538.
Minor code mods that allow %cft to advance beyond 1850 in transient pft
simulations.
@billsacks
Copy link
Member

@slevisconsulting thanks a lot for this. So am I understanding right that your collapse2gencrop branch is now bit-for-bit with this one, for all tests (other than the one test that needs baselines generated for it)? If so, I'd say we should move ahead with getting this fix onto master (after doing hobart testing, which it sounds like you haven't yet done). @ekluzek do you agree?

@billsacks billsacks self-requested a review October 19, 2018 20:20
@slevis-lmwg
Copy link
Contributor Author

Bill, yes, that's right. I have a busy few days, but I will try to run hobart testing by Saturday evening. Will let you know.

@billsacks
Copy link
Member

@slevisconsulting - I don't mind if you leave final testing up to @ekluzek or me. I'm about to suggest some additional changes to this branch, so given your time constraints it might make sense for one of us to take over final testing of this branch.

Copy link
Member

@billsacks billsacks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that we're likely to bring this to the release branch without the follow-up changes in #483 , I'm going to be a little more picky here. Specifically: It seems misleading if do_transient_crops is still present in the namelist but isn't actually used in the code. I'm wondering if it would be possible to change the default logic for do_transient_crops (in CLMBuildNamelist.pl) to achieve a similar result. (I haven't looked closely at this logic to determine this.)

@slevis-lmwg
Copy link
Contributor Author

slevis-lmwg commented Oct 21, 2018

Before I respond, here's progress on hobart standard testing (now complete):

First 3 tests are with the intermediate fix:
nohup ./create_test --xml-category aux_clm --xml-machine hobart --xml-compiler intel -r /fs/cgd/data0/slevis/collapse_pfts_hobart_intel --parallel-jobs 6 --compare /fs/cgd/csm/ccsm_baselines/ctsm1.0.dev012 --generate /fs/cgd/data0/slevis/cesm_baselines/ctsm1.0.dev013 & PASS
nohup ./create_test --xml-category aux_clm --xml-machine hobart --xml-compiler gnu -r /fs/cgd/data0/slevis/collapse_pfts_hobart_gnu --parallel-jobs 6 --compare /fs/cgd/csm/ccsm_baselines/ctsm1.0.dev012 --generate /fs/cgd/data0/slevis/cesm_baselines/ctsm1.0.dev013 & PASS
nohup ./create_test --xml-category aux_clm --xml-machine hobart --xml-compiler nag -r /fs/cgd/data0/slevis/collapse_pfts_hobart_nag --parallel-jobs 6 --compare /fs/cgd/csm/ccsm_baselines/ctsm1.0.dev012 --generate /fs/cgd/data0/slevis/cesm_baselines/ctsm1.0.dev013 & 3 FAILs, all DIFF and all IHist cases as expected

Next 3 tests compare the collapse2gencrop branch with the intermediate fix:
nohup ./create_test --xml-category aux_clm --xml-machine hobart --xml-compiler intel -r /fs/cgd/data0/slevis/collapse_pfts_hobart_intel --parallel-jobs 6 --compare /fs/cgd/data0/slevis/cesm_baselines/ctsm1.0.dev013 --generate /fs/cgd/data0/slevis/cesm_baselines/ctsm1.0.dev014 & PASS
nohup ./create_test --xml-category aux_clm --xml-machine hobart --xml-compiler gnu -r /fs/cgd/data0/slevis/collapse_pfts_hobart_gnu --parallel-jobs 6 --compare /fs/cgd/data0/slevis/cesm_baselines/ctsm1.0.dev013 --generate /fs/cgd/data0/slevis/cesm_baselines/ctsm1.0.dev014 & PASS
nohup ./create_test --xml-category aux_clm --xml-machine hobart --xml-compiler nag -r /fs/cgd/data0/slevis/collapse_pfts_hobart_nag --parallel-jobs 6 --compare /fs/cgd/data0/slevis/cesm_baselines/ctsm1.0.dev013 --generate /fs/cgd/data0/slevis/cesm_baselines/ctsm1.0.dev014 & all FAILs are NLCOMP & TPUTCOMP

@slevis-lmwg
Copy link
Contributor Author

I'm ok with changing the logic in CLMBuildNamelist.pm. As far as I can tell it's a one-line change.

I will commit, push, and you can conduct the testing.

Changing the default logic in CLMBuildNamelist.pm instead of hardwiring
do_transient_crops to always equal do_transient_pfts in the code.

This fix as is may be incomplete. It allows do_transient_crops = .true.
when use_crop = .false. but it does not make sure
do_transient_crops = do _transient_pfts.
@slevis-lmwg
Copy link
Contributor Author

@billsacks, pls read the message in this last commit. I am not sure that the change I made in CLMBuildNamelist.pm is sufficient. I also do not feel comfortable programming in (is it) perl. I would like you to add a clause that makes sure do_transient_crops = do_transient_pft and I will be interested in seeing how you did it for future reference.

@billsacks
Copy link
Member

Thanks, @slevisconsulting , both for the testing and for the alternative change. I'd like to defer to @ekluzek to give this a more careful look/think, because he's the most familiar with the ins and outs of the namelist logic.

@billsacks
Copy link
Member

I'm tentatively approving these changes, though would like to defer to @ekluzek , and then we should redo testing, ideally comparing against the test suite results from @slevisconsulting 's latest runs. (We can generate baselines after-the-fact from those runs if he didn't already do so.)

@slevis-lmwg
Copy link
Contributor Author

slevis-lmwg commented Oct 22, 2018

For my baselines, look here
hobart:/fs/cgd/data0/slevis/cesm_baselines
cheyenne:/glade/scratch/slevis/cesm_baselines

I named the intermediate fix "13" and the collapse branch "14"

@billsacks
Copy link
Member

Thanks, @slevisconsulting !

@ekluzek ekluzek self-assigned this Oct 23, 2018
@ekluzek ekluzek added type: bug - impacts science tag: bug - critical big problems in important configurations priority: high High priority to fix/merge soon, e.g., because it is a problem in important configurations labels Oct 23, 2018
@ekluzek ekluzek added this to the cmip6 milestone Oct 23, 2018
@ekluzek
Copy link
Contributor

ekluzek commented Oct 24, 2018

I pushed some changes, and now I'm testing and verifying answers are the same as @slevisconsulting case.

@ekluzek
Copy link
Contributor

ekluzek commented Oct 25, 2018

Testing shows my changes are bit-for-bit with @slevisconsulting, and only expected fails. So this looks good to go!

@ekluzek ekluzek added the PR status: ready PR: author feels this is ready to merge in label Oct 25, 2018
@ekluzek
Copy link
Contributor

ekluzek commented Oct 25, 2018

Testing is showing differences to baseline only as expected: (Hist SP or Bgc cases only Clm45/Clm50)

../Tools/compare_test_results -r . -t ctsm1d13naga -b ctsm1.0.dev012 | grep BASELINE | grep Hist
FAIL SMS_Ld5_D_P48x1.f10_f10_musgs.IHistClm50Bgc.hobart_nag.clm-monthly BASELINE DIFF
FAIL SMS_Ld5_D_P48x1.f10_f10_musgs.IHistClm50Bgc.hobart_nag.clm-decStart BASELINE DIFF
FAIL ERP_D_P48x1.f10_f10_musgs.IHistClm50Bgc.hobart_nag.clm-decStart BASELINE DIFF
cime/scripts> Tools/compare_test_results -r cases -t ctsm1d13chintela -b ctsm1.0.dev012 | grep BASELINE > list
cime/scripts> grep FAIL list
FAIL ERP_D_Ld5.f19_g17.IHistClm50Bgc.cheyenne_intel.clm-drydepnomegan BASELINE DIFF
FAIL ERP_P36x2_D_Ld5.f10_f10_musgs.IHistClm45BgcCruGs.cheyenne_intel.clm-decStart BASELINE DIFF
FAIL SMS_Ld5.f19_g17.IHistClm50Bgc.cheyenne_intel.clm-decStart BASELINE DIFF
FAIL ERS_D_Ld10.f10_f10_musgs.IHistClm50SpG.cheyenne_intel.clm-glcMEC_decrease BASELINE DIFF
FAIL SMS_D_Ly2.1x1_brazil.IHistClm50BgcQianGs.cheyenne_intel.clm-ciso_bombspike1963 BASELINE DIFF
FAIL ERP_D.f10_f10_musgs.IHistClm50Bgc.cheyenne_intel.clm-decStart BASELINE DIFF
FAIL ERP_P36x2_Lm13.f10_f10_musgs.IHistClm50Bgc.cheyenne_intel.clm-monthly BASELINE DIFF

@danicalombardozzi simulation that is showing correct results is:

dll/clm50_r267_1deg_GSWP3V1_iso_hist_nocrop_transientfix

@ekluzek ekluzek merged commit 8f5c422 into ESCOMP:master Oct 26, 2018
@slevis-lmwg
Copy link
Contributor Author

I see a message here that "the slevisconsulting:trans_pft_issue538 branch can be safely deleted." Do you guys recommend that I delete it or do you keep branches around?

Similar question about issue 538: should one of us hit the "close" button?

@billsacks
Copy link
Member

@slevisconsulting - I delete my branches as soon as they're merged. It's safe to do that because the full history of the branch is still maintained in git.

Thanks for pointing out that #538 wasn't closed. I just went ahead and closed it. For future reference, if you write something like "Fixes #538" in your initial PR comment, it will auto-close when the PR is merged. (There are certain special keywords that cause an issue to auto-close when a PR is merged or a commit comes to master: https://help.github.com/articles/closing-issues-using-keywords/.)

billsacks pushed a commit to billsacks/ctsm that referenced this pull request Feb 22, 2019
@slevis-lmwg slevis-lmwg deleted the trans_pft_issue538 branch February 5, 2021 20:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
PR status: ready PR: author feels this is ready to merge in priority: high High priority to fix/merge soon, e.g., because it is a problem in important configurations tag: bug - critical big problems in important configurations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants