short term archiving fails on maint-1.0 #2422

Closed
tangq opened this issue Jul 2, 2018 · 23 comments

Comments

@tangq
Contributor

tangq commented Jul 2, 2018

Short term archiving script fails to move the history files, but the restart files are moved correctly. I used the code of maint-1.0 for some additional DECK runs on Edison.

The $rundir is /global/cscratch1/sd/tang30/ACME_simulations/20180622.DECKv1b_A2.ne30_oEC.edison/run

$casedir: /global/cscratch1/sd/tang30/ACME_simulations/20180622.DECKv1b_A2.ne30_oEC.edison/case_scripts

@rljacob
Member

rljacob commented Jul 2, 2018

Can you give more information on how to reproduce the problem? Did you run for some number of years and then execute "case.st_archive" after the run was finished? Were there any error messages?

@tangq
Contributor Author

tangq commented Jul 2, 2018

I ran ./case.st_archive --last-date 1905-01-01 --force-move --no-incomplete-logs > st.txt & in the case directory when the model completed 1905-01-01.

The log file is at /global/cscratch1/sd/tang30/ACME_simulations/20180622.DECKv1b_A2.ne30_oEC.edison/case_scripts/st.txt. You can see some warnings and errors in the st.txt file.

@tangq
Contributor Author

tangq commented Jul 3, 2018

@rljacob , I need the short term archiving function to examine whether some ongoing runs are working as expected for the DECK overview paper @golaz is writing. Is there an alternative way to run the short term archiving script before it is fixed?

@rljacob
Member

rljacob commented Jul 3, 2018

Try waiting until the run is finished and just running "./case.st_archive" with no arguments.
If that doesn't work, run it again and add the --debug option. This will make a case.st_archive.log file with more info.

@tangq
Contributor Author

tangq commented Jul 3, 2018

By "finished" do you mean the completion of one submission or the whole simulation? The simulations are for a few hundred years, so I would prefer to run ./case.st_archive --debug after this submission is completed.

@rljacob
Member

rljacob commented Jul 3, 2018

completion of one submission.

@tangq
Contributor Author

tangq commented Jul 3, 2018

@rljacob , I created a 5-day test at Edison:/global/cscratch1/sd/tang30/ACME_simulations/20180622.DECKv1b_tst_st_archive.ne30_oEC.edison

I saved daily atmospheric output and daily restart files. After the 5-day run, I ran ./case.st_archive --debug > st.txt. It reproduces the issue: restart files were moved to archive directory, but the history files were left in the run directory.

The debug log file is at /global/cscratch1/sd/tang30/ACME_simulations/20180622.DECKv1b_tst_st_archive.ne30_oEC.edison/case_scripts/case.st_archive.log

Hopefully, this test log is helpful.

@jgfouca jgfouca self-assigned this Jul 5, 2018
@jgfouca
Member

jgfouca commented Jul 5, 2018

@tangq , could you describe how you created this case? I'd like to try to duplicate this myself.

@tangq
Contributor Author

tangq commented Jul 5, 2018

@jgfouca, You can find the run script I used to create this case at ~/tang30/ACME_scripts/tst_st_archive.csh

You should be able to duplicate it by modifying a few lines in the top section of this script. Let me know if you have any problems.

@jedwards4b
Contributor

If you are up to date with the cime version, you can run ./case.st_archive --test-case and it should show you all of the files it's trying to handle and the disposition of each.

@tangq
Contributor Author

tangq commented Jul 5, 2018

The simulation cannot use the current master, as it is an additional DECK run.

@jedwards4b
Contributor

I can't read any of your directories on edison. I didn't mean to imply that you should update cime, only if it was a new enough version to have that feature. I can't tell you if it is since I can't read the directory.

@tangq
Contributor Author

tangq commented Jul 5, 2018

@jedwards4b, are you in group acme on edison? My directory is group readable.

@tangq
Contributor Author

tangq commented Jul 5, 2018

I changed my test run's permission (/global/cscratch1/sd/tang30/ACME_simulations/20180622.DECKv1b_tst_st_archive.ne30_oEC.edison). You should be able to read it now.

@rljacob
Member

rljacob commented Jul 5, 2018

This is the maint-1.0 branch of E3SM, which is using cime5.5.0.

@jedwards4b
Contributor

So the entries in env_archive are wrong - can you try replacing the hist_file_extension entries for clm and cam with these and see if it solves the problem:
For cam:

    <hist_file_extension>h\d*.*\.nc$</hist_file_extension>
    <hist_file_extension>e</hist_file_extension>

for clm:

@jedwards4b
Contributor

For clm:

    <hist_file_extension>h\d*.*\.nc$</hist_file_extension>
    <hist_file_extension>e</hist_file_extension>

@jedwards4b
Contributor

I'm just using the values from the latest config_archive.xml file (but you can't just use the latest file, since it isn't backward compatible).

@tangq
Contributor Author

tangq commented Jul 5, 2018

It worked after I changed the env_archive.xml file as you suggested. I will use this manual fix for now.
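
For anyone who needs to apply the same manual change to several cases, the edit could be scripted roughly as follows. This is only a sketch: the `comp_archive_spec` element name and its `compname` attribute are assumptions about the layout of env_archive.xml that are not shown in this thread, the sample XML is invented for illustration, and `replace_hist_extensions` is a hypothetical helper, not part of CIME. Check your own file (for example, whether the land component is listed as clm or clm2) and back it up before overwriting anything.

```python
import xml.etree.ElementTree as ET

# The two entries suggested by jedwards4b earlier in this thread.
NEW_EXTENSIONS = [r"h\d*.*\.nc$", "e"]


def replace_hist_extensions(xml_text, compnames, new_extensions=NEW_EXTENSIONS):
    """Return xml_text with hist_file_extension entries replaced for compnames.

    ASSUMPTION: each component is described by a <comp_archive_spec> element
    carrying a compname attribute. Adjust if your env_archive.xml differs.
    """
    root = ET.fromstring(xml_text)
    for spec in root.iter("comp_archive_spec"):
        if spec.get("compname") not in compnames:
            continue
        # Drop the old entries, then append the replacements.
        for old in spec.findall("hist_file_extension"):
            spec.remove(old)
        for ext in new_extensions:
            ET.SubElement(spec, "hist_file_extension").text = ext
    return ET.tostring(root, encoding="unicode")


# Invented sample input, only to show the call.
SAMPLE = """<components>
  <comp_archive_spec compname="cam" compclass="atm">
    <rest_file_extension>[ri]</rest_file_extension>
    <hist_file_extension>[eh]</hist_file_extension>
  </comp_archive_spec>
  <comp_archive_spec compname="cice" compclass="ice">
    <hist_file_extension>[eh]</hist_file_extension>
  </comp_archive_spec>
</components>"""

updated = replace_hist_extensions(SAMPLE, {"cam", "clm", "clm2"})
print(updated)
```

Only the components named in the set are touched; every other entry is written back unchanged.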

@rljacob
Member

rljacob commented Jul 6, 2018

@jedwards4b thanks for finding the cause and the workaround. @golaz, didn't you use case.st_archive in your DECK runs without any problems?

jgfouca pushed a commit that referenced this issue Jul 6, 2018
Several fixes for problems found during cesm testing

Test suite: scripts_regression_tests.py, ERS_Ld5.f09_g17.B1850.cheyenne_intel.allactive-defaultio,
IRT_Ld7.f09_g17.BHIST.cheyenne_intel.allactive-defaultio, SMS_D.ne30_ne30_mg17.PC5.cheyenne_intel.cam-cam5_port_ne30,

Test baseline:
Test namelist changes:
Test status: bit for bit
Fixes

User interface changes?:

Update gh-pages html (Y/N)?:

Code review: fischer
@tangq
Contributor Author

tangq commented Jul 9, 2018

Can we have a PR to fix this issue on maint-1.0? There will be more DECK runs that will be done with maint-1.0.

I don't know which file to modify to change the settings in env_archive.xml. Otherwise, I will create the PR. Thanks.

@rljacob
Member

rljacob commented Jul 18, 2018

I'm working on this, but I'm also trying to figure out what happened.

The branch used for most of the v1.0 runs had this for hist_file_extension:
<hist_file_extension>\.h.*.nc$</hist_file_extension>

But right before actually making the v1.0 tag, we brought in cime5.5.0 which has this:
<hist_file_extension>[eh]</hist_file_extension>

And the correct version given by JimE above is a third version.
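
A quick way to sanity-check an extension pattern before running the archiver is to try it against a few file names. The sketch below tests the pattern in isolation with `re.search`. How CIME combines the pattern with the case and component names is not shown in this thread, so treat this as a rough check only (a case name containing the letter "h" would also match here). The file names and the `matching_files` helper are hypothetical.

```python
import re


def matching_files(filenames, extension_pattern):
    """Return the names that the extension pattern matches anywhere (re.search)."""
    regex = re.compile(extension_pattern)
    return [name for name in filenames if regex.search(name)]


# Hypothetical file names, for illustration only.
names = [
    "mycase.cam.h0.0001-01.nc",          # atmosphere history file
    "mycase.clm2.h0.0001-01.nc",         # land history file
    "mycase.cam.r.0001-01-06-00000.nc",  # restart file, should not match
]

# The history pattern suggested by jedwards4b above.
print(matching_files(names, r"h\d*.*\.nc$"))
```

In a real run directory, the list of names could come from `os.listdir()` on the run directory instead of the hard-coded list.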

@jedwards4b
Contributor

There was insufficient testing of the archiving capability, and changes in the xml file were made with little confirmation of their validity. The final version was added after testing for st_archive functionality was added to cime.

@rljacob rljacob closed this as completed in ccf6fa5 Aug 2, 2018
rljacob pushed a commit that referenced this issue May 6, 2021
xyuan pushed a commit that referenced this issue Oct 5, 2023
…_flux_2362

Automatically Merged using E3SM Pull Request AutoTester
PR Title: Add surface latent heat flux (upward) for evaporation
PR Author: pbosler