
Modified code to make st-archive handle DART files #4788

Merged
jedwards4b merged 1 commit into ESMCI:master from
kdraeder:cime_st-arch_dart
May 15, 2025

Conversation

@kdraeder
Collaborator

Doing data assimilation with CESM+DART generates many new types of output files,
which st-archive should archive, assuming they are named according to CESM conventions.
Many of the new files will be handled by the esp component code, but others are
more closely associated with the geophysical components and should be archived
with their files. We've chosen new file names in such a way that this can be done
with very small changes to
- $component/cime_config/config_archive.xml
- cime/CIME/XML/archive_base.py

A second small change is to make st-archive handle compressed files (*.gz).
This has no effect on model calculations or existing output.
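
As a rough illustration of the second change, a file-name check can tolerate an optional `.gz` suffix. This is a minimal sketch only: the regex and the helper name `is_archivable` are hypothetical and are not the code in archive_base.py.

```python
import re

# Hypothetical pattern: a name ending in .nc, optionally gzip-compressed.
_NC_OPTIONAL_GZ = re.compile(r"\.nc(\.gz)?$")


def is_archivable(filename):
    """Return True if filename ends in .nc or .nc.gz (illustrative only)."""
    return _NC_OPTIONAL_GZ.search(filename) is not None
```

With a check of this shape, compressed and uncompressed copies of the same file are treated alike, and names without the expected ending are still skipped.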

Test suite:
cesm3.0_alphabranch (cesm3_0_beta03-121-g0a50f32) in a B compset
(HISTC_CAM70%LT_CLM60%BGC-CROP_CICE_MOM6_MOSART_SGLC_SWAV,
a%ne30np4.pg3_l%ne30np4.pg3_oi%tx2_3v2_r%r05_g%null_w%null_z%null_m%tx2_3v2).
The test used the tags listed in .gitmodules (fxtag = cime6.1.87)
with no development branches checked out.
It was set up to create the greatest variety of files possible (I probably missed some).
The interface to DART is not complete, so files that will result from running DART
were created artificially, since the contents don't matter to st_archive (except for
the restart files which contain the names of restart history files). The script used to create
these files is attached.
dart_files2arch.csh.txt
The test consisted of running a 24 hour forecast, adding the DART files to RUNDIR,
running st_archive, and running a second day to check the restart capability.
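
The idea of populating RUNDIR with stand-in files can be sketched as follows. This is not the attached csh script; the helper name `make_placeholder_files` and the case name are assumptions, and the sketch creates empty files only, so it would not cover the restart files whose contents matter.

```python
import tempfile
from pathlib import Path


def make_placeholder_files(rundir, names):
    """Create empty files with the given names in rundir and return their paths."""
    rundir = Path(rundir)
    rundir.mkdir(parents=True, exist_ok=True)
    created = []
    for name in names:
        path = rundir / name
        path.touch()  # contents are irrelevant to the archiver for these files
        created.append(path)
    return created


# Usage: a temporary directory stands in for RUNDIR.
with tempfile.TemporaryDirectory() as tmp:
    files = make_placeholder_files(
        tmp, ["mycase.cam_0001_d01.e.analysis.1850-01-02-00000.nc"]
    )
    print([p.name for p in files])
```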

The changes were added to a feature branch based on this component's development branch.

Test namelist changes: None
Test status: bit for bit

Partially fixes [CIME Github issue #4451]

User interface changes?: None

Update gh-pages html (Y/N)?: None yet. I don't know if they're needed.
The capability is increased, but it requires nothing from users.

@jasonb5
Collaborator

jasonb5 commented Apr 23, 2025

@kdraeder @jedwards4b Would adding another/modifying hist_file_extension in https://github.com/ESMCI/ccs_config_cesm/blob/main/config_archive.xml support this?

@kdraeder
Collaborator Author

@jasonb5 Thanks for looking into this so quickly!
DART creates files with names like ${CASE}.cam_0001_d01.e.analysis.1850-01-02-00000.nc.
These were ignored by st_archive (archive_base.py) because the _d01
was not an acceptable part of the component name.
It looks to me like the xml <hist_file_extension> entries only describe
what st_archive should do after the name part has been found to be acceptable.
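
To make the naming problem concrete, here is a minimal sketch of a component-name match with and without an optional domain suffix. The function `component_pattern` and its regex are hypothetical stand-ins, not the actual expression in archive_base.py; the case name `mycase` is also assumed.

```python
import re


def component_pattern(casename, compname, allow_domain):
    """Build a regex for '<case>.<comp>[_NNNN][_dNN].' (illustrative only)."""
    instance = r"(_\d{4})?"  # optional multi-instance suffix, e.g. _0001
    domain = r"(_d\d{2})?" if allow_domain else ""  # optional domain, e.g. _d01
    return re.compile(
        r"^" + re.escape(casename) + r"\." + re.escape(compname)
        + instance + domain + r"\."
    )


name = "mycase.cam_0001_d01.e.analysis.1850-01-02-00000.nc"
print(bool(component_pattern("mycase", "cam", allow_domain=False).match(name)))  # False
print(bool(component_pattern("mycase", "cam", allow_domain=True).match(name)))   # True
```

In this sketch the file is rejected before any extension is examined unless the name part itself allows `_d01`, which is why an extension-only setting would not help.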

An alternative solution is to move the 'domain' description to a different part of the file name,
but the other places don't seem to be a good fit.

@jasonb5
Collaborator

jasonb5 commented Apr 24, 2025

@kdraeder Ah right, you are correct, that wouldn't work.

I'm in favor of adding a feature to make this customizable, rather than modifying it for this specific case. This way, if other components decide to use a different naming scheme, it will only require a config change and not another PR.

@jedwards4b @jgfouca We could add something like hist_unique_name and default it to the existing regex. We could also do this for restart files so we can remove this component specific code (

    if compname.find("mpas") == 0 or compname == "mali":
        pattern = (
            casename
            + r"\."
            + compname
            + r"\."
            + suffix
            + r"\."
            + "_".join(datename_str.rsplit("-", 1))
        )
        pfile = re.compile(pattern)
        restfiles = [f for f in os.listdir(rundir) if pfile.search(f)]
    elif compname == "nemo":
        pattern = r"_*_" + suffix + r"[0-9]*"
        pfile = re.compile(pattern)
        restfiles = [f for f in os.listdir(rundir) if pfile.search(f)]
    else:
).
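
One way such a configurable pattern could look is sketched below. Everything here is an assumption for illustration: the `pattern_template` argument, the default template, and the function name do not exist in CIME, and the real default would be whatever regex st_archive uses today.

```python
import re

# Hypothetical default template; placeholders are filled in per component.
DEFAULT_REST_PATTERN = r"{casename}\.{compname}\.{suffix}\."


def find_restart_files(files, casename, compname, suffix, pattern_template=None):
    """Filter file names with a per-component template, or the default if none is set."""
    template = pattern_template or DEFAULT_REST_PATTERN
    # str.format is used for brevity, so a template may not contain regex
    # quantifiers written with braces (e.g. \d{4}) unless they are doubled.
    pattern = template.format(
        casename=re.escape(casename),
        compname=re.escape(compname),
        suffix=suffix,  # left unescaped so it can itself be a regex
    )
    pfile = re.compile(pattern)
    return [f for f in files if pfile.search(f)]
```

Under this design the component-specific branches would become config entries: for example, a template of `_*_{suffix}[0-9]*` would reproduce the nemo pattern quoted above without a dedicated `elif`.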

@jedwards4b
Contributor

poor jedbrown gets hit again! I'm all for your suggested change, but I don't have time to address it right now - do you?

@kdraeder
Collaborator Author

The MOM_interface issue #238
might be related enough to include @alperaltuntas's view in this decision.

@jasonb5
Collaborator

jasonb5 commented Apr 25, 2025

@jedwards4b Oops. Yeah, I can work on this feature.

jedwards4b self-assigned this May 6, 2025
jedwards4b merged commit d4d9fd3 into ESMCI:master May 15, 2025
12 of 14 checks passed

Development

Successfully merging this pull request may close these issues.

Need a complete set of files to test st_archive

3 participants