put date in MOM output filenames #185

aekiss · 2020-01-21T03:53:07Z

At present all the MOM outputs have the same name for every run, e.g. ocean.nc.
I propose we include the run date in the filenames, e.g. ocean_1985_01_01, which is what we already have in the CICE outputs.

This gives many advantages:

- users can tell what dates are in which file (a common question, currently dealt with by using the cosima cookbook or run_summary, which is pretty awkward for such a basic thing)
- users can easily find files in a required date range via bash shell/script
- users can copy all run outputs into one directory without them clobbering each other
- importantly, this suits the bash-based workflow of users at BoM and CSIRO (I've had requests from Gary and Paul for this)
- this goes some way towards addressing issue #57 (Set netcdf global attributes to record origin of all published .nc files)
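
As a rough illustration of the date-range point, here is a minimal Python sketch (a shell glob would do much the same job). It assumes filenames of the proposed form ocean_YYYY_MM_DD.nc; the helper name files_in_range is made up for this example.

```python
import re
from datetime import date

# Assumed filename form: ocean_YYYY_MM_DD.nc (the form proposed in this issue).
DATE_RE = re.compile(r"_(\d{4})_(\d{2})_(\d{2})\.nc$")

def files_in_range(filenames, start, end):
    """Return filenames whose embedded date lies in [start, end], sorted by date.

    files_in_range is a hypothetical helper, not part of any existing tool.
    """
    selected = []
    for name in filenames:
        match = DATE_RE.search(name)
        if match is None:
            continue  # no date in the name, e.g. the current plain ocean.nc
        file_date = date(*map(int, match.groups()))
        if start <= file_date <= end:
            selected.append((file_date, name))
    return [name for _, name in sorted(selected)]

names = ["ocean_1985_04_01.nc", "ocean.nc", "ocean_1985_01_01.nc", "ocean_1986_01_01.nc"]
print(files_in_range(names, date(1985, 1, 1), date(1985, 12, 31)))
```

With undated names such as ocean.nc, the same selection would need the file contents or a separate index.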

This could be implemented as a post-processing step.

I'm not sure if the filename should include just the starting date or both starting and ending dates (the latter being confusing, as it is midnight of the day after).


aekiss · 2020-01-21T03:53:33Z

I'm not sure if this would require changes to the cosima cookbook, but if so they'd be small - the cookbook already supports dates in CICE output filenames. Probably only scripts would need to change, not the cookbook itself?

aekiss · 2020-01-21T03:56:28Z

If we decide to go ahead with this, should we apply it to all files already in /g/data/hh5/tmp/cosima/access-om2*, or just new runs? Applying to all would be neater but would require rebuilding the cookbook database and would probably break a lot of user cookbook scripts.

aekiss · 2020-01-21T03:58:38Z

Rather than moving the files to new filenames, we could create hard links or symlinks. This would preserve existing workflows while also supporting BoM and CSIRO.
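
A minimal sketch of the linking idea in Python, under some assumptions: the archive layout <archive>/outputNNN/ocean/ocean.nc, the function name link_with_dates and the start_dates mapping are all hypothetical, and how the start date of each run is obtained is left open.

```python
import os
from pathlib import Path

def link_with_dates(archive, start_dates, link_dir):
    """Create dated symlinks to each run's ocean.nc, leaving the originals untouched.

    start_dates maps an output directory name (e.g. "output000") to a date
    string such as "1985_01_01". Both the mapping and the directory layout
    are assumptions made for this sketch.
    """
    link_dir = Path(link_dir)
    link_dir.mkdir(parents=True, exist_ok=True)
    links = []
    for run, stamp in sorted(start_dates.items()):
        target = Path(archive) / run / "ocean" / "ocean.nc"
        link = link_dir / f"ocean_{stamp}.nc"
        if not link.is_symlink() and not link.exists():
            os.symlink(target.resolve(), link)  # os.link would make a hard link instead
        links.append(link)
    return links
```

Because the original files keep their names, existing scripts would keep working, while date-based tools could read from the link directory.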

russfiedler · 2020-01-21T04:00:26Z

I did this when creating ensembles of the IAF runs. I'll see if the script survived the transition to Gadi.

russfiedler · 2020-01-21T04:11:01Z

Looks like they went kaput 22 hours ago. The surviving json files I have indicate that I was using names like ocean_temp-%Y-%m.nc.

russfiedler · 2020-01-21T04:15:25Z

Note that the dates can be done automatically via the diag_table. There's no need for special treatment for each time.

aekiss · 2020-01-21T04:23:58Z

Ah, good to know diag_table can do it. I'm just looking it up here: https://github.com/mom-ocean/MOM5/blob/master/src/shared/diag_manager/diag_table.F90#L45

It would be good to include both starting and ending date, e.g. something like ocean_1985_12_01-1986_01_01.nc, since runs have different lengths and this will enable checking that all files over a given interval have been selected. But that doesn't seem possible, right?

In any case, we would need to use something else if we want to process the existing outputs.

russfiedler · 2020-01-21T04:49:53Z

You can dump files periodically, say monthly. This keeps file sizes under control and you can even exploit the parallelism when postprocessing if you want. You can run with variable numbers of months and across years seamlessly.

We use entries like

"ocean_ofam%4yr%2mo",1,"days",1,"days","Time",1,"months"
"ocean_model","eta_t","eta_t","ocean_ofam%4yr%2mo","all",.true.,"none",2
"ocean_model","temp","temp","ocean_ofam%4yr%2mo","all",.true.,"none",4
"ocean_model","salt","salt","ocean_ofam%4yr%2mo","all",.true.,"none",4
"ocean_model","u","u","ocean_ofam%4yr%2mo","all",.true.,"none",4
"ocean_model","v","v","ocean_ofam%4yr%2mo","all",.true.,"none",4

to dump monthly files full of daily averages.
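
Entries of this kind are repetitive, so they could be generated rather than typed. The following Python sketch simply reproduces the file line and field lines quoted in this comment; the function name monthly_daily_mean_entries is hypothetical and only the daily-mean, monthly-file case is covered.

```python
def monthly_daily_mean_entries(file_template, fields):
    """Build one diag_table file line plus one field line per (name, packing) pair.

    Hypothetical helper: it only covers daily averages written to monthly files,
    mirroring the entries quoted in this thread.
    """
    lines = [f'"{file_template}",1,"days",1,"days","Time",1,"months"']
    for name, packing in fields:
        lines.append(
            f'"ocean_model","{name}","{name}","{file_template}","all",.true.,"none",{packing}'
        )
    return lines

entries = monthly_daily_mean_entries(
    "ocean_ofam%4yr%2mo",
    [("eta_t", 2), ("temp", 4), ("salt", 4), ("u", 4), ("v", 4)],
)
print("\n".join(entries))
```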

aekiss · 2020-01-21T07:11:01Z

Thanks @russfiedler, that's a great tip

aekiss · 2020-01-21T07:12:44Z

@angus-g am I right in thinking no code changes would be needed in the cookbook?

angus-g · 2020-01-21T21:06:06Z

@aekiss that's right, it doesn't matter what the filename is. You can still use % as a wildcard if you need to use the filename for disambiguation too.

AndyHoggANU · 2020-01-21T22:37:45Z

I think this is a good idea.
Let's work out the details when we next start a new run, then think about migrating older data in the future. Based on our experience with saving data for publication (see /g/data/cj50/access-om2) this reprocessing won't be trivial ...

aidanheerdegen · 2020-01-21T22:48:38Z

I think @russfiedler is on the right track. Save each month's data in a separate file, uniquely named, using the diag_table naming capabilities.

This is part of the configuration, so test it and once happy roll it out to the published config. I'd be using dev branches for the configs, in the same way as proposed for code.

I would not support changing existing runs. Simply not worth the time/effort IMO.

AndyHoggANU · 2020-01-21T23:33:58Z

Each month?
Or each year/segment, whichever is smaller?

aidanheerdegen · 2020-01-21T23:39:32Z

For the tenth, monthly, as you never run for a year. For the quarter and one degree, probably yearly.

This has the benefit that whatever the run length the duration of output files would be consistent.

russfiedler · 2020-01-21T23:40:22Z

I think you want consistent sizes throughout the run. It makes checking things much simpler. I'd suggest monthly output and yearly for the others

@aidanheerdegen Great minds think alike!

AndyHoggANU · 2020-01-21T23:40:36Z

True, but we sometimes use 3-monthly output ...

aidanheerdegen · 2020-01-21T23:41:48Z

Now you're just being difficult.

russfiedler · 2020-01-21T23:43:49Z

Three monthly averaged output for the 0.1 or do you mean the others? You'd be putting those outputs in a separate file anyway.

russfiedler · 2020-01-21T23:50:56Z

The entry for the 1 and 0.25 models would be something like

"ocean_3mon%4yr",3,"months",1,"days","Time",1,"years"

so you would have 4 entries per file.
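
The arithmetic behind "4 entries per file" is just the file length divided by the output interval. A small Python sketch follows (records_per_file is a made-up name and both arguments are in months):

```python
def records_per_file(output_freq_months, new_file_freq_months):
    """Number of time records per file when both periods are whole months.

    Hypothetical helper for illustration; it is not part of MOM or diag_manager.
    """
    if new_file_freq_months % output_freq_months != 0:
        raise ValueError("file length should be a whole multiple of the output interval")
    return new_file_freq_months // output_freq_months

print(records_per_file(3, 12))  # 3-monthly output in yearly files
print(records_per_file(3, 3))   # 3-monthly output in 3-monthly files
```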

AndyHoggANU · 2020-01-21T23:51:03Z

Yep, I have only ever used 3-monthly for the 01deg case...

russfiedler · 2020-01-21T23:54:44Z

Ah, so that means you have to run for 3 or 6 month segments at the moment, right? You would have an entry like
"ocean_3mon%4yr%2mo",3,"months",1,"days","Time",3,"months"

AndyHoggANU · 2020-01-21T23:56:10Z

Yes, still running with 3-month segments... A 12-hour wall-time limit (or linear scaling up to 12,000 cores) would allow us to do a year at a time.

aidanheerdegen · 2020-01-24T03:02:13Z

> Looks like they went kaput 22 hours ago. The surviving json files I have indicate that I was using names like ocean_temp-%Y-%m.nc.

@russfiedler looks like short is still here until the 28th if you wanted to grab something from it

russfiedler · 2020-01-24T04:14:16Z

@aidanheerdegen Ta. I've popped the scripts to do the linking in /scratch/v45/raf599/assim if anybody wants to use them as a starting point. There were only 2 months per segment for runs 96-197 that I was interested in, so I didn't have to do anything tricky like parsing an ncdump of the files. It does make loading up an individual month easy, and a single year can be loaded by just getting the odd (or even) months.

start from /home/157/amh157/payu/01deg_jra55v13_ryf9091/archive/restart371 using the same config, but use IAF forcing, copied as needed from https://github.com/COSIMA/01deg_jra55_iaf/tree/3411eed79b5b55d8db7b5ddfcbfc111bc9e40abf for
- accessom2.nml
- atmosphere/forcing.json
- config.yaml

other changes:
- disable all cice output
- set up mom outputs
- output scalars and 2d surface_temp and eta_t only
- 4 hourly
- use snapshots
- include model date in file - see COSIMA/access-om2#185
- use openMPI4.0.2 executables

/g/data/ik11/inputs/access-om2/bin/yatm_575fb04.exe
/g/data/ik11/inputs/access-om2/bin/fms_ACCESS-OM_4a2f211_libaccessom2_575fb04.x
/g/data/ik11/inputs/access-om2/bin/cice_auscom_3600x2700_722p_365bdc1_libaccessom2_575fb04.exe

instead of the openMPI4.0.1 versions

/g/data/ik11/inputs/access-om2/bin/yatm_1bb8904.exe
/g/data/ik11/inputs/access-om2/bin/fms_ACCESS-OM_97e3429_libaccessom2_1bb8904.x
/g/data/ik11/inputs/access-om2/bin/cice_auscom_3600x2700_722p_d3e8bdf_libaccessom2_1bb8904.exe

The code differences shouldn't make any scientific difference

https://github.com/COSIMA/libaccessom2/compare/1bb8904..575fb04
https://github.com/mom-ocean/MOM5/compare/97e3429..4a2f211
https://github.com/COSIMA/cice5/compare/d3e8bdf..365bdc1
https://github.com/COSIMA/oasis3-mct/compare/d02cc8d896..87a873aa7

aekiss · 2020-03-17T05:02:46Z

I'm trying to come up with a consistent file naming convention for all MOM output at all resolutions in the new configurations. A key objective is to improve data accessibility by making it possible to determine what data is available (variables, temporal sampling, dates) by simply using ls. This would be a big improvement over the current opaque file-naming approach which has hindered uptake of model outputs by others.

Here's a proposed convention:

a separate file for each variable, also disambiguated by sampling frequency (this means lots of files, and a fiddly diag_table, but I think it's worth it for clarity)
filenames consisting of these components:
- ocean_
- data spatial dimensionality (1d-, 2d- or 3d-) - this is technically redundant but very useful for users unfamiliar with MOM diagnostic names
- netcdf variable name (NB: separated from previous and next components by - instead of _ to facilitate parsing, since CF-compliant variable names can contain _ but not -)
- and time info for non-static data:
  - sampling period within file, e.g. -4hourly, -daily, -5daily, -monthly, -3monthly, -yearly
  - reduction method: whether each sample is a time-mean over the sampling period (_mean) or a _snapshot at the end of the sampling period (corresponding to the 6th item in the diag_table field line being .true. or .false., respectively), or something else (e.g. _rms, _pow02, _min, _max, etc - see https://github.com/mom-ocean/MOM5/blob/master/src/shared/diag_manager/diag_table.F90#L159)
  - _<year>[_<month>[_<day>]] for the start of the first sampling interval in file - only as many as needed for disambiguation. 4 digits for years, 2 for month, 2 for day, with leading zeros as needed (to ensure sensible alphabetic sorting), achieved by %4yr, %4yr%2mo or %4yr%2mo%2dy in the diag_table entry
  - finally the temporal length of the file (based on new_file_freq, new_file_freq_units in file line of diag_table) e.g. _1month, _3months, _5years just to make it clear how much data is in the file (since this can vary independently of the sampling period)

This order of components is designed to sort alphabetically in a helpful way.
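
To make the component order concrete, here is a minimal Python sketch that assembles a name from the pieces listed above. The function proposed_name and its argument names are hypothetical; the start-date and file-length strings are passed in ready-made rather than derived from a diag_table.

```python
def proposed_name(ndims, variable, sampling=None, reduction=None, start=None, length=None):
    """Assemble a filename following the convention proposed in this comment.

    Hypothetical helper. Static data gets no time info; otherwise the name is
    ocean_<N>d-<variable>-<sampling>_<reduction>_<start>_<length>.nc
    """
    name = f"ocean_{ndims}d-{variable}"
    if sampling is not None:  # non-static data carries sampling and date info
        name += f"-{sampling}_{reduction}_{start}_{length}"
    return name + ".nc"

print(proposed_name(2, "geolon_t"))
print(proposed_name(1, "ke_tot", "monthly", "mean", "1990", "1year"))
print(proposed_name(3, "salt", "daily", "snapshot", "1990_04_01", "1day"))
```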

Examples:

ocean_2d-geolon_t.nc                                  # static grid data: no sampling or date info
ocean_1d-ke_tot-monthly_mean_1990_1year.nc            # 12 monthly means in one file
ocean_2d-sea_level-monthly_mean_1990_04_1month.nc     # a single 1-month mean
ocean_3d-temp-monthly_mean_1990_04_3months.nc         # three 1-month means
ocean_3d-temp-3monthly_mean_1990_04_3months.nc        # a single 3-month mean
ocean_3d-salt-daily_snapshot_1990_04_1month.nc        # a month of daily snapshots
ocean_3d-salt-daily_snapshot_1990_04_01_1day.nc       # daily snapshots, one file per day

achieved by these diag_table specifications

"ocean_2d-geolon_t", -1, "months", 1, "days", "time"
"ocean_model","geolon_t","geolon_t","ocean_2d-geolon_t","all",.false.,"none",2

"ocean_1d-ke_tot-monthly_mean%4yr_1year", 1,  "months", 1, "days", "time", 12, "months"
"ocean_model","ke_tot","ke_tot", "ocean_1d-ke_tot-monthly_mean%4yr_1year","all",.true.,"none",1

"ocean_2d-sea_level-monthly_mean%4yr%2mo_1month", 1,  "months", 1, "days", "time", 1, "months"
"ocean_model","sea_level","sea_level", "ocean_2d-sea_level-monthly_mean%4yr%2mo_1month","all",.true.,"none",2

...

"ocean_3d-salt-daily_snapshot%4yr%2mo%2dy_1day", 1,  "days", 1, "days", "time", 1, "days"
"ocean_model","salt","salt", "ocean_3d-salt-daily_snapshot%4yr%2mo%2dy_1day","all",.false.,"none",2

Does that seem OK to people? (ping @AndyHoggANU, @aidanheerdegen, @russfiedler)
It might look like overkill but I think it will be helpful in the long run to have a systematic approach that will cover all current and likely future needs.

I'm not sure whether we should do something like this for CICE output too. Each file includes lots of static grid data, so that's an argument to retain our current approach of saving many CICE variables per file.

aekiss · 2020-03-17T05:05:35Z

@angus-g would a large increase in the number of output files cause problems for the COSIMA Cookbook? And will this file naming convention suit the way the cookbook concatenates files on the time axis (e.g. if the final filename component varies during a run)?

angus-g · 2020-03-17T05:12:21Z

The database part of the cookbook shouldn't have any issues with more files. There would probably be a small increase in the size of the database itself, but I can't see queries getting noticeably slower. For concatenation, the filenames themselves don't matter: the files are sorted by the start time obtained from the time dimension data.

The only thing I could see causing a change from the current behaviour is that we can't quite rely on the same form of filename-based disambiguation. It would be harder for the cookbook to suggest that a query is erroneous, but we can still pass patterns (like ocean_3d_%) to select only a subset of filenames.
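
For readers unfamiliar with this kind of pattern, here is a small self-contained sketch using Python's sqlite3 module. It assumes the % wildcard follows SQL LIKE semantics; the table name ncfiles and its single column are invented for the example and are not the cookbook's schema. Note that in LIKE the underscore is itself a single-character wildcard, so ocean_3d_% also matches names that have a dash after 3d.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical one-column table for illustration only.
conn.execute("CREATE TABLE ncfiles (ncfile TEXT)")
conn.executemany(
    "INSERT INTO ncfiles VALUES (?)",
    [
        ("ocean_3d-temp-monthly_mean_1990_04_3months.nc",),
        ("ocean_3d-salt-daily_snapshot_1990_04_1month.nc",),
        ("ocean_2d-sea_level-monthly_mean_1990_04_1month.nc",),
    ],
)
rows = conn.execute(
    "SELECT ncfile FROM ncfiles WHERE ncfile LIKE ? ORDER BY ncfile",
    ("ocean_3d_%",),
).fetchall()
matches = [row[0] for row in rows]  # only the two 3d files are selected
print(matches)
```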

AndyHoggANU · 2020-03-17T09:41:56Z

Wow, OK, I think I like it.
It would certainly make it easier to publish the data, because these filenames will mostly satisfy the requirements of published data. A few points:

- I am worried about whether we will blow our inode quota on NCI to smithereens. We should be able to calculate it?
- It is worth socialising the idea at this week's MOM meeting. I would like to know from others who are not using the cookbook whether this will meet requirements.
- I will defer to @angus-g on cookbook matters, so that sounds OK.
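
A back-of-envelope estimate of the file count could look like the Python sketch below. The function name and every input number are placeholders chosen for illustration, not measured values from any configuration.

```python
def estimate_file_count(n_varying, files_per_field, n_static, n_segments):
    """Rough count of MOM output files under a one-file-per-variable scheme.

    Hypothetical helper: each run segment writes files_per_field files for each
    time-varying field, plus one file per static field.
    """
    per_segment = n_varying * files_per_field + n_static
    return per_segment * n_segments

# Placeholder inputs only: 40 time-varying fields, 3 monthly files each per
# segment, 15 static files, 100 run segments.
total = estimate_file_count(40, 3, 15, 100)
print(total)
```

Comparing a figure like this against the project's inode quota would show whether the one-file-per-variable approach is affordable.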

aekiss · 2020-05-21T11:25:45Z

I've written a script to automatically generate diag_table with this new output file format - see https://github.com/COSIMA/make_diag_table.

With this, users will only need to modify a very clean and non-repetitive diag_table_source.yaml file, which make_diag_table.py will read to generate diag_table. It's general enough that it should be unnecessary to hand-edit the diag_table file.

aekiss · 2020-05-22T01:20:10Z

Unfortunately MOM insists on putting _ (underscore) before the date, so we get filenames like
ocean-2d-tx_trans_int_z-1-monthly-_1958_01_16.nc.

The -_ looks odd but is probably preferable to
ocean-2d-tx_trans_int_z-1-monthly_1958_01_16.nc
because it allows "fields" to be consistently split by - (dash).

Eliminating the leading underscore would be neater:
ocean-2d-tx_trans_int_z-1-monthly-1958_01_16.nc
but would require a code change in MOM, with a namelist variable to retain the current behaviour by default. Not sure if that's worth the bother.

Also notice that I've decided to retain the 1- in 1-monthly so that the number of "fields" is consistent.
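
A short Python sketch of why the dash matters for parsing (split_fields is a made-up helper): with the -_ form the date stays in its own field, whereas without the dash it is fused with the sampling field.

```python
def split_fields(filename):
    """Split a filename into dash-separated fields, ignoring a trailing .nc.

    Hypothetical helper for illustration only.
    """
    stem = filename[:-3] if filename.endswith(".nc") else filename
    return stem.split("-")

with_dash = split_fields("ocean-2d-tx_trans_int_z-1-monthly-_1958_01_16.nc")
without_dash = split_fields("ocean-2d-tx_trans_int_z-1-monthly_1958_01_16.nc")
print(with_dash)     # six fields, the last one is the date
print(without_dash)  # five fields, date fused with "monthly"
```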

aekiss · 2020-05-22T01:27:29Z

Example output files are here (this is a 3-month test run with monthly outputs):
/scratch/v45/aek156/access-om2/archive/1deg_jra55_iaf_v2.0.0rc1/output000/ocean/
Notice that the day field in the date is the middle of the averaging period.
Comments welcome. It is super easy to change the file name convention with the make_diag_table.py script, so if you don't like it, let me know!

ocean-2d-area_t.nc
ocean-2d-area_u.nc
ocean-2d-bmf_u-1-monthly-_1958_01_16.nc
ocean-2d-bmf_u-1-monthly-_1958_02_15.nc
ocean-2d-bmf_u-1-monthly-_1958_03_16.nc
ocean-2d-bmf_v-1-monthly-_1958_01_16.nc
ocean-2d-bmf_v-1-monthly-_1958_02_15.nc
ocean-2d-bmf_v-1-monthly-_1958_03_16.nc
ocean-2d-drag_coeff.nc
ocean-2d-dxt.nc
ocean-2d-dxu.nc
ocean-2d-dyt.nc
ocean-2d-dyu.nc
ocean-2d-eta_t-1-monthly-_1958_01_16.nc
ocean-2d-eta_t-1-monthly-_1958_02_15.nc
ocean-2d-eta_t-1-monthly-_1958_03_16.nc
ocean-2d-evap-1-monthly-_1958_01_16.nc
ocean-2d-evap-1-monthly-_1958_02_15.nc
ocean-2d-evap-1-monthly-_1958_03_16.nc
ocean-2d-frazil_3d_int_z-1-monthly-_1958_01_16.nc
ocean-2d-frazil_3d_int_z-1-monthly-_1958_02_15.nc
ocean-2d-frazil_3d_int_z-1-monthly-_1958_03_16.nc
ocean-2d-geolat_c.nc
ocean-2d-geolat_t.nc
ocean-2d-geolon_c.nc
ocean-2d-geolon_t.nc
ocean-2d-ht.nc
ocean-2d-hu.nc
ocean-2d-ice_calving-1-monthly-_1958_01_16.nc
ocean-2d-ice_calving-1-monthly-_1958_02_15.nc
ocean-2d-ice_calving-1-monthly-_1958_03_16.nc
ocean-2d-kmt.nc
ocean-2d-kmu.nc
ocean-2d-melt-1-monthly-_1958_01_16.nc
ocean-2d-melt-1-monthly-_1958_02_15.nc
ocean-2d-melt-1-monthly-_1958_03_16.nc
ocean-2d-mld-1-monthly-_1958_01_16.nc
ocean-2d-mld-1-monthly-_1958_02_15.nc
ocean-2d-mld-1-monthly-_1958_03_16.nc
ocean-2d-pbot_t-1-monthly-_1958_01_16.nc
ocean-2d-pbot_t-1-monthly-_1958_02_15.nc
ocean-2d-pbot_t-1-monthly-_1958_03_16.nc
ocean-2d-pme_river-1-monthly-_1958_01_16.nc
ocean-2d-pme_river-1-monthly-_1958_02_15.nc
ocean-2d-pme_river-1-monthly-_1958_03_16.nc
ocean-2d-river-1-monthly-_1958_01_16.nc
ocean-2d-river-1-monthly-_1958_02_15.nc
ocean-2d-river-1-monthly-_1958_03_16.nc
ocean-2d-runoff-1-monthly-_1958_01_16.nc
ocean-2d-runoff-1-monthly-_1958_02_15.nc
ocean-2d-runoff-1-monthly-_1958_03_16.nc
ocean-2d-sea_level-1-monthly-_1958_01_16.nc
ocean-2d-sea_level-1-monthly-_1958_02_15.nc
ocean-2d-sea_level-1-monthly-_1958_03_16.nc
ocean-2d-sea_level_sq-1-monthly-_1958_01_16.nc
ocean-2d-sea_level_sq-1-monthly-_1958_02_15.nc
ocean-2d-sea_level_sq-1-monthly-_1958_03_16.nc
ocean-2d-sfc_salt_flux_coupler-1-monthly-_1958_01_16.nc
ocean-2d-sfc_salt_flux_coupler-1-monthly-_1958_02_15.nc
ocean-2d-sfc_salt_flux_coupler-1-monthly-_1958_03_16.nc
ocean-2d-sfc_salt_flux_ice-1-monthly-_1958_01_16.nc
ocean-2d-sfc_salt_flux_ice-1-monthly-_1958_02_15.nc
ocean-2d-sfc_salt_flux_ice-1-monthly-_1958_03_16.nc
ocean-2d-sfc_salt_flux_restore-1-monthly-_1958_01_16.nc
ocean-2d-sfc_salt_flux_restore-1-monthly-_1958_02_15.nc
ocean-2d-sfc_salt_flux_restore-1-monthly-_1958_03_16.nc
ocean-2d-tau_x-1-monthly-_1958_01_16.nc
ocean-2d-tau_x-1-monthly-_1958_02_15.nc
ocean-2d-tau_x-1-monthly-_1958_03_16.nc
ocean-2d-tau_y-1-monthly-_1958_01_16.nc
ocean-2d-tau_y-1-monthly-_1958_02_15.nc
ocean-2d-tau_y-1-monthly-_1958_03_16.nc
ocean-2d-tx_trans_int_z-1-monthly-_1958_01_16.nc
ocean-2d-tx_trans_int_z-1-monthly-_1958_02_15.nc
ocean-2d-tx_trans_int_z-1-monthly-_1958_03_16.nc
ocean-2d-ty_trans_int_z-1-monthly-_1958_01_16.nc
ocean-2d-ty_trans_int_z-1-monthly-_1958_02_15.nc
ocean-2d-ty_trans_int_z-1-monthly-_1958_03_16.nc
ocean-3d-age_global-1-monthly-_1958_01_16.nc
ocean-3d-age_global-1-monthly-_1958_02_15.nc
ocean-3d-age_global-1-monthly-_1958_03_16.nc
ocean-3d-diff_cbt_t-1-monthly-_1958_01_16.nc
ocean-3d-diff_cbt_t-1-monthly-_1958_02_15.nc
ocean-3d-diff_cbt_t-1-monthly-_1958_03_16.nc
ocean-3d-dzt-1-monthly-_1958_01_16.nc
ocean-3d-dzt-1-monthly-_1958_02_15.nc
ocean-3d-dzt-1-monthly-_1958_03_16.nc
ocean-3d-pot_rho_0-1-monthly-_1958_01_16.nc
ocean-3d-pot_rho_0-1-monthly-_1958_02_15.nc
ocean-3d-pot_rho_0-1-monthly-_1958_03_16.nc
ocean-3d-pot_rho_2-1-monthly-_1958_01_16.nc
ocean-3d-pot_rho_2-1-monthly-_1958_02_15.nc
ocean-3d-pot_rho_2-1-monthly-_1958_03_16.nc
ocean-3d-pot_temp-1-monthly-_1958_01_16.nc
ocean-3d-pot_temp-1-monthly-_1958_02_15.nc
ocean-3d-pot_temp-1-monthly-_1958_03_16.nc
ocean-3d-salt-1-monthly-_1958_01_16.nc
ocean-3d-salt-1-monthly-_1958_02_15.nc
ocean-3d-salt-1-monthly-_1958_03_16.nc
ocean-3d-temp-1-monthly-_1958_01_16.nc
ocean-3d-temp-1-monthly-_1958_02_15.nc
ocean-3d-temp-1-monthly-_1958_03_16.nc
ocean-3d-temp_xflux_adv-1-monthly-_1958_01_16.nc
ocean-3d-temp_xflux_adv-1-monthly-_1958_02_15.nc
ocean-3d-temp_xflux_adv-1-monthly-_1958_03_16.nc
ocean-3d-temp_yflux_adv-1-monthly-_1958_01_16.nc
ocean-3d-temp_yflux_adv-1-monthly-_1958_02_15.nc
ocean-3d-temp_yflux_adv-1-monthly-_1958_03_16.nc
ocean-3d-tx_trans-1-monthly-_1958_01_16.nc
ocean-3d-tx_trans-1-monthly-_1958_02_15.nc
ocean-3d-tx_trans-1-monthly-_1958_03_16.nc
ocean-3d-tx_trans_rho-1-monthly-_1958_01_16.nc
ocean-3d-tx_trans_rho-1-monthly-_1958_02_15.nc
ocean-3d-tx_trans_rho-1-monthly-_1958_03_16.nc
ocean-3d-ty_trans-1-monthly-_1958_01_16.nc
ocean-3d-ty_trans-1-monthly-_1958_02_15.nc
ocean-3d-ty_trans-1-monthly-_1958_03_16.nc
ocean-3d-ty_trans_rho-1-monthly-_1958_01_16.nc
ocean-3d-ty_trans_rho-1-monthly-_1958_02_15.nc
ocean-3d-ty_trans_rho-1-monthly-_1958_03_16.nc
ocean-3d-ty_trans_rho_gm-1-monthly-_1958_01_16.nc
ocean-3d-ty_trans_rho_gm-1-monthly-_1958_02_15.nc
ocean-3d-ty_trans_rho_gm-1-monthly-_1958_03_16.nc
ocean-3d-u-1-monthly-_1958_01_16.nc
ocean-3d-u-1-monthly-_1958_02_15.nc
ocean-3d-u-1-monthly-_1958_03_16.nc
ocean-3d-v-1-monthly-_1958_01_16.nc
ocean-3d-v-1-monthly-_1958_02_15.nc
ocean-3d-v-1-monthly-_1958_03_16.nc
ocean-3d-wt-1-monthly-_1958_01_16.nc
ocean-3d-wt-1-monthly-_1958_02_15.nc
ocean-3d-wt-1-monthly-_1958_03_16.nc
ocean-scalar-_1958_01_16.nc
ocean-scalar-_1958_02_15.nc
ocean-scalar-_1958_03_16.nc

aidanheerdegen · 2020-05-22T01:28:31Z

-_ is pretty horrible.

You could add a prefix to the date field. date_1958_01_16 is quite long, so if there was something else that would be good.

russfiedler · 2020-05-22T01:28:37Z

That first form looks atrocious.

aekiss · 2020-05-22T01:34:25Z

maybe ymd_1958_01_16 to specify the date order explicitly? Only one char shorter...

aidanheerdegen · 2020-05-22T01:40:03Z

Yeah all I could come up with was dd. ymd at least has the advantage of some informational value.

russfiedler · 2020-05-22T01:46:12Z

I think judicious use of sed and mv postprocessing can get rid of the offending -_ abomination...
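
The same postprocessing idea, sketched in Python rather than sed and mv. The function name strip_dash_underscore is hypothetical, and the sketch assumes the -_ before the date is the last -_ in the filename.

```python
from pathlib import Path

def strip_dash_underscore(directory):
    """Rename files like x-_1958_01_16.nc to x-1958_01_16.nc; return the new names.

    Hypothetical helper. Only the last "-_" in each name is changed, and an
    existing target is never overwritten.
    """
    renamed = []
    for path in sorted(Path(directory).glob("*-_*.nc")):
        head, _, tail = path.name.rpartition("-_")
        target = path.with_name(f"{head}-{tail}")
        if target.exists():
            raise FileExistsError(target)
        path.rename(target)
        renamed.append(target.name)
    return renamed
```

Running this on a copy of an output directory first would be a sensible precaution, since renaming is not easily undone.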

aekiss · 2020-05-22T01:49:27Z

Also we may be unable to put anything after the date part of the filename.

"ocean-3d-temp-1-monthly-%4yr%2mo%2dy-snap", 1, "months", 1, "days", "time", 1, "months"
"ocean_model", "temp", "temp", "ocean-3d-temp-1-monthly-%4yr%2mo%2dy-snap", "all", "none", "none", 2

produces files like
ocean-3d-temp-1-monthly-_1958_05_01.nc
rather than the expected
ocean-3d-temp-1-monthly-_1958_05_01-snap.nc

So we may need to include the reduction method before the date, ie
ocean-3d-temp-1-monthly-snap-_1958_05_01.nc
(in which case we should not hide average so that the number of "fields" is consistent)

aekiss · 2020-05-22T01:51:12Z

We could kill 2 birds with 1 stone by omitting the dash between reduction method and date, e.g.

ocean-3d-temp-1-monthly-average_1958_05_01.nc
ocean-3d-temp-1-monthly-snap_1958_05_01.nc

ie consider the reduction method to be part of the date "field"

aidanheerdegen · 2020-05-22T01:51:40Z

So you could have the reduction method and date in the same field separated by an underscore ... or you could have all the date related stuff in a single field, like 1_monthly_snap_1958_05_01.

It is all completely arbitrary

aidanheerdegen · 2020-05-22T01:51:50Z

snap

aidanheerdegen · 2020-05-22T01:52:39Z

I prefer the latter, including all the date stuff in a single field. Seems consistent and neater.

aekiss · 2020-05-22T11:53:28Z

These approaches don't work for the scalar files which lack a lot of the date stuff, so I think the ymd approach is best.

Examples:

1 file per field for 2d and 3d

ocean-2d-mld-1-monthly-mean-ymd_1958_10_16.nc
ocean-2d-mld-1-monthly-mean-ymd_1958_11_16.nc
ocean-2d-mld-1-monthly-mean-ymd_1958_12_16.nc
ocean-3d-temp-1-monthly-mean-ymd_1958_10_16.nc
ocean-3d-temp-1-monthly-mean-ymd_1958_11_16.nc
ocean-3d-temp-1-monthly-mean-ymd_1958_12_16.nc
ocean-3d-temp-1-monthly-snap-ymd_1958_11_01.nc  # NB: snap
ocean-3d-temp-1-monthly-snap-ymd_1958_12_01.nc
ocean-3d-temp-1-monthly-snap-ymd_1959_01_01.nc

all scalars in one file: edit: see below

ocean-scalar-ymd_1958_10_16.nc
ocean-scalar-ymd_1958_11_16.nc
ocean-scalar-ymd_1958_12_16.nc

static grid data in one file per field, with no date info:

ocean-2d-geolat_c.nc
ocean-2d-geolat_t.nc
ocean-2d-geolon_c.nc
ocean-2d-geolon_t.nc

russfiedler · 2020-05-25T00:52:55Z

Having the day in those monthly files seems completely unintuitive, unnecessary and ugly to me. You've got a 16 there for every file except Feb and that doesn't change in leap years so it serves no purpose.

aekiss · 2020-05-25T02:06:19Z

I agree - I was thinking the same thing.

For snapshots that would mean the month will be at the end of the sampling period (ie the month after the sampling period), and other reduction methods will be in the middle (rounded down).

so monthly sampling over 3 months (Jan-March) looks like this

ocean-3d-temp-1-monthly-mean-ymd_1959_01.nc
ocean-3d-temp-1-monthly-mean-ymd_1959_02.nc
ocean-3d-temp-1-monthly-mean-ymd_1959_03.nc
ocean-3d-temp-1-monthly-snap-ymd_1959_02.nc  # snap offset by 1
ocean-3d-temp-1-monthly-snap-ymd_1959_03.nc
ocean-3d-temp-1-monthly-snap-ymd_1959_04.nc  # after the final month (March)

and 3-monthly is like this

ocean-3d-temp-3-monthly-mean-ymd_1959_02.nc  # in middle month
ocean-3d-temp-3-monthly-snap-ymd_1959_04.nc  # after final month

I guess that's not too confusing.
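
The labelling rule described here can be sketched in Python as follows. This is an assumption-laden illustration, not MOM's code: it assumes snapshots are labelled with the end of the sampling period and other reductions with its midpoint, and it uses Python's Gregorian datetime, which may differ from the model calendar. The names add_months and label_date are made up.

```python
from datetime import datetime

def add_months(start, months):
    """First instant of the month `months` after `start` (start must be day 1)."""
    index = start.year * 12 + (start.month - 1) + months
    return datetime(index // 12, index % 12 + 1, 1)

def label_date(start, months, reduction):
    """Date used to label one sample spanning `months` months from `start`.

    Assumed rule (taken from the behaviour described in this thread):
    "snap" -> end of the period, anything else -> midpoint of the period.
    """
    end = add_months(start, months)
    if reduction == "snap":
        return end
    return start + (end - start) / 2

print(label_date(datetime(1958, 2, 1), 1, "mean").strftime("%Y_%m_%d"))
print(label_date(datetime(1959, 1, 1), 3, "mean").strftime("%Y_%m"))
print(label_date(datetime(1959, 1, 1), 3, "snap").strftime("%Y_%m"))
```

Under these assumptions the monthly means for January to March 1958 land on days 16, 15 and 16, consistent with the example file listing earlier in the thread.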

aidanheerdegen · 2020-05-25T02:09:34Z

At the risk of pedantry, should it be ym for the monthly ones?

russfiedler · 2020-05-25T02:21:05Z

Yes, Maybe, Deprecated.

I think the starting month is preferable for multimonth files since you have a far simpler relationship with the beginning and end of the time period. Besides, what if you have a 2 or 4 month file?

Edit: Ah, this is for a 3 monthly mean not the individual months. The middle does make a lot of sense in that case as it coincides with the time in the file.

aekiss · 2020-05-25T02:39:08Z

@aidanheerdegen agreed - 1 char saved!

@russfiedler this is just the standard behaviour of MOM with %4yr%2mo and both output_freq and new_file_freq = 3 months

russfiedler · 2020-05-25T02:49:24Z

@aekiss Yes, before my edit I was thinking your example was the case output_freq=1 and new_file_freq=3.

aekiss · 2020-05-25T03:13:12Z

with output_freq=1 and new_file_freq=3 we'd get

ocean-3d-temp-1-monthly-mean-ym_1959_01.nc
ocean-3d-temp-1-monthly-snap-ym_1959_02.nc

aekiss · 2020-05-25T03:19:23Z

I think shared scalars file should also include the output frequency (as it's a per-file setting), but omit the reduction method (as it's per-field), eg

ocean-scalar-1-monthly-ym_1959_01.nc
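
For anyone scripting against these names, here is a minimal parsing sketch in Python. It is based only on the example filenames in this thread; parse_name and the dictionary keys are hypothetical, and it assumes no variable name begins with ym_ or ymd_.

```python
def parse_name(filename):
    """Split a filename of the form discussed in this thread into labelled parts.

    Hypothetical helper covering three shapes: static fields, per-variable
    files, and the shared scalar file.
    """
    stem = filename[:-3] if filename.endswith(".nc") else filename
    fields = stem.split("-")
    info = {"model": fields[0], "kind": fields[1]}
    rest = fields[2:]
    if rest and rest[-1].startswith(("ym_", "ymd_")):
        _, _, stamp = rest.pop().partition("_")
        info["date"] = tuple(int(part) for part in stamp.split("_"))
    if info["kind"] != "scalar":
        info["variable"] = rest.pop(0)
    if rest:  # time-varying data: sampling count and unit, then optional reduction
        info["interval"] = (int(rest[0]), rest[1])
        if len(rest) > 2:
            info["reduction"] = rest[2]
    return info

print(parse_name("ocean-3d-temp-1-monthly-mean-ym_1959_01.nc"))
print(parse_name("ocean-scalar-1-monthly-ym_1959_01.nc"))
print(parse_name("ocean-2d-geolat_c.nc"))
```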

aekiss · 2020-06-10T11:08:25Z

closing - this has now been implemented in the ak-dev branch in all 6 configurations

aekiss added a commit to COSIMA/make_diag_table that referenced this issue May 22, 2020

align diag_table more closely with COSIMA/access-om2#185

11e71d1

aekiss mentioned this issue May 26, 2020

new standard diag_tables for each resolution #203

Open

aekiss closed this as completed Jun 10, 2020

aekiss mentioned this issue Jul 18, 2024

Include frequency in history/diagnostics output filenames COSIMA/access-om3#191

Open

aekiss mentioned this issue Jul 26, 2024

Essential diagnostics COSIMA/access-om3#190

Open

minghangli-uni mentioned this issue Jul 29, 2024

Support multiple separators COSIMA/make_diag_table#7

Closed

aekiss added a commit to COSIMA/make_diag_table that referenced this issue Jul 31, 2024

fix comments - see COSIMA/access-om2#185 (comment)

f6e5606
