implement lead time coordinate #634

piotr-florek-mohc · 2021-10-06T13:22:10Z

This is my first go at issue #623. Adding scalar reference time could be achieved just by specifying a value of reftime1 in mip_tables, which can be later overridden during runtime (please note that reftime is a restricted keyword in the cfunits namespace).

Adding leadtime has been achieved by implementing a simple function which creates an auxiliary lead time coordinate, and populates it with differences between time coordinate values and the reftime. Calendars and units should be the same, and the reftime should be outside the bounds of the time coordinate (i.e. must be earlier than the opening bound of the first time step).

I've also added a small tweak to cmor_write() function, to allow me to retrieve the ncid after the write is complete.

cc @matthew-mizielinski

taylor13 · 2021-10-06T18:47:55Z

I noticed in the new coordinates table, the standard_name given for reftime1 is "leadtime", but according to @martinjuckes at #623 (comment) it should be "forecast_period".

"reftime1" in the coordinates table should have "must_have_bounds = "no" and bounds_values should be set to the null string. I think it's also probably a mistake to define a default value for "reftime1", since this will probably vary from one forecast to the next. [Also, although reftime may be a restricted key word in cftime, that shouldn't preclude its use as the identifier for the reftime coordinate variable in CMOR should it? If it shouldn't, then what is the reason for naming it "reftime1" instead of simply "reftime"? (Of course this makes no difference to the output variable name which is outname="reftime".)]

Normally a forecast field will be a synoptically sampled ("instantaneous") meteorological field, in which case "must_have_bounds" should be set to "no" for leadtime. If some of the forecast fields should be accumulated or "time-means" and others will be instantaneous snapshots, then 2 different "leadtime" coordinates should be defined (e.g. analogous to "time" and "time1").

Does the presence of "reftime1" in the list of "coordinates" appearing with a variable in the amon.json file trigger CMOR to create leadtime and store both leadtime and reftime in the file (and include both of these variables in the netcdf "coordinates" attribute list?

Where is the test code that is used to check whether the changes produce correct output (sorry, my familiarity with pull requests is still weak). I'd like to review it. Also, could you post somewhere an ncdump of the first part of the file produced by the test code?

piotr-florek-mohc · 2021-10-07T15:33:46Z

Hi Karl,
I have updated the pull request with small tweaks to the test file and mip tables. My (very ridimentary) test script is in Test/test_leadtime_coord.py, and it produces the following output

netcdf tas_Amon_HadGEM3-GC31-LL_amip_r1i1p1f3_gn_200002-200003 {
dimensions:
	time = UNLIMITED ; // (2 currently)
	lat = 5 ;
	lon = 5 ;
	bnds = 2 ;
variables:
	double time(time) ;
		time:bounds = "time_bnds" ;
		time:units = "days since 2000-01-01" ;
		time:calendar = "360_day" ;
		time:axis = "T" ;
		time:long_name = "time" ;
		time:standard_name = "time" ;
	double time_bnds(time, bnds) ;
	double lat(lat) ;
		lat:bounds = "lat_bnds" ;
		lat:units = "degrees_north" ;
		lat:axis = "Y" ;
		lat:long_name = "Latitude" ;
		lat:standard_name = "latitude" ;
	double lat_bnds(lat, bnds) ;
	double lon(lon) ;
		lon:bounds = "lon_bnds" ;
		lon:units = "degrees_east" ;
		lon:axis = "X" ;
		lon:long_name = "Longitude" ;
		lon:standard_name = "longitude" ;
	double lon_bnds(lon, bnds) ;
	double reftime ;
		reftime:units = "days since 2000-01-01" ;
		reftime:calendar = "360_day" ;
		reftime:long_name = "Start date of the forecast" ;
		reftime:standard_name = "forecast_reference_time" ;
	double height ;
		height:units = "m" ;
		height:axis = "Z" ;
		height:positive = "up" ;
		height:long_name = "height" ;
		height:standard_name = "height" ;
	float tas(time, lat, lon) ;
		tas:standard_name = "air_temperature" ;
		tas:long_name = "Near-Surface Air Temperature" ;
		tas:comment = "near-surface (usually, 2 meter) air temperature" ;
		tas:units = "K" ;
		tas:cell_methods = "area: time: mean" ;
		tas:cell_measures = "area: areacella" ;
		tas:history = "2021-10-07T15:15:04Z altered by CMOR: Treated scalar dimension: \'reftime\'. 2021-10-07T15:15:04Z altered by CMOR: Treated scalar dimension: \'height\'. 2021-10-07T15:15:04Z altered by CMOR: Reordered dimensions, original order: lon lat time. 2021-10-07T15:15:04Z altered by CMOR: Converted type from \'d\' to \'f\'." ;
		tas:missing_value = 1.e+20f ;
		tas:_FillValue = 1.e+20f ;
		tas:coordinates = "reftime height leadtime" ;
	double leadtime(time) ;
		leadtime:axis = "T" ;
		leadtime:units = "days" ;
		leadtime:long_name = "Time elapsed since the start of the forecast" ;
		leadtime:standard_name = "forecast_period" ;
		leadtime:bounds = "leadtime_bnds" ;
	double leadtime_bnds(time, bnds) ;

piotr-florek-mohc · 2021-10-08T09:35:05Z

Hi Karl,
apologies, I was meant to answer other questions yesterday but got distracted...

"reftime1" in the coordinates table should have "must_have_bounds = "no" and bounds_values should be set to the null string. I think it's also probably a mistake to define a default value for "reftime1", since this will probably vary from one forecast to the next.

This is just a placeholder value, as it's what triggers coordinate's interpretation as a scalar in CMOR's code. I think it would require much more work to change this behaviour, and make it backwards compatible at the same time, although I could probably add both reftime and leadtime in one go, as a separate API call.

I have removed bounds from the CV_coordinates file (sorry, missed that one yesterday).

[Also, although reftime may be a restricted key word in cftime, that shouldn't preclude its use as the identifier for the reftime coordinate variable in CMOR should it? If it shouldn't, then what is the reason for naming it "reftime1" instead of simply "reftime"?

The cftime issue was a bit of a conjecture on my part (I investigated it early on, when I wasn't familiar with CMOR's code at all, and I vaguely remember the namespace conflict was present in Python netCDF4). The fact is that CMOR segfaults during writing of the variable to the file (in cmor_write_var_to_file()) when 'reftime' is used, but does not when a different name is provided in mip tables instead. I've tried to debug this behaviour but I still have no clue why this happens.

Normally a forecast field will be a synoptically sampled ("instantaneous") meteorological field, in which case "must_have_bounds" should be set to "no" for leadtime. If some of the forecast fields should be accumulated or "time-means" and others will be instantaneous snapshots, then 2 different "leadtime" coordinates should be defined (e.g. analogous to "time" and "time1").

Ahh I see. Okay, I will need to have a think how to resolve this.

Does the presence of "reftime1" in the list of "coordinates" appearing with a variable in the amon.json file trigger CMOR to create leadtime and store both leadtime and reftime in the file (and include both of these variables in the netcdf "coordinates" attribute list?

Only if the new function (calculate_leadtime_coord) is explicitly called after reftime was created.

…te mip table

…heck

mauzey1 · 2021-10-29T21:18:38Z

@piotr-florek-mohc You will want to merge the latest from master to remove the Python 3.6 tests since we are no longer supporting Python 3.6 in an upcoming release.

…_leadtime_coord

piotr-florek-mohc · 2021-11-02T13:37:43Z

thanks @mauzey1, that has been done

mauzey1 · 2021-11-03T02:33:31Z

@piotr-florek-mohc I noticed that you included a Python test but it is not part of the continuous integration test suite. Please add the following line to the file Makefile.in under the test_python: python section:

test_python: python
	env TEST_NAME=Test/test_python_forecast_coordinates.py make test_a_python

I tried running the test but it experienced an error.

Traceback (most recent call last):
  File "Test/test_python_forecast_coordinates.py", line 64, in setUp
    table_id = cmor.load_table(mip_table)
  File "/Users/mauzey1/opt/anaconda3/envs/cmor_dev/lib/python3.8/site-packages/CMOR-3.6.1-py3.8-macosx-10.9-x86_64.egg/cmor/pywrapper.py", line 916, in load_table
    return _cmor.load_table(table)
_cmor.CMORError: Problem with 'cmor.load_table'. Please check the logfile (if defined).

piotr-florek-mohc · 2021-11-03T12:38:47Z

Thanks @mauzey1, I missed the fact that I hadn't put my test CV under version control...

piotr-florek-mohc · 2021-11-12T13:35:50Z

Hi @mauzey1 would you be able to briefly take a look and check how sensible the solution is, and if it requires major reworking? Many thanks!

mauzey1 · 2021-11-15T18:38:37Z

@piotr-florek-mohc

I think these changes are reasonable. However, I would like to have a new entry in the documentation for the calculate_leadtime_coord function. Please create a documentation entry similar to the ones on the CMOR API doc page. If you could, please create a pull request to add the change using the doc's repo. Thank you.

@durack1 @taylor13 Are you fine with these changes? Do you want me to perform any tests before merging?

durack1 · 2021-11-15T20:49:56Z

@taylor13 over to you..

taylor13 · 2021-11-16T19:19:26Z

@piotr-florek-mohc - I wonder if you have tested this version as part of your regular job stream in post-processing CMIP6 output (including cases when the new parts of the code are not needed). If not, do you think such a test is needed to be safe, or are you confident that the coding you've included can't disrupt the older code. (I know you settled on an implementation strategy that would minimize possible impact, so perhaps the answer is yes, but please confirm.) Thanks.

mauzey1 · 2021-11-16T21:38:20Z

@taylor13 All of the current tests are passing, and the leadtime and reftime axes only appear in the file for the newly-added test. I think it's safe to merge.

taylor13 · 2021-11-16T22:05:55Z

That's great. I'm happy.

leadtime implementation

108a854

updated tests

71977ea

remove bounds from reftime1

24c7ff6

truncate test mip tables

80e2afd

piotrflorek added 3 commits October 29, 2021 10:36

internalize leadtime creation, implemented forecast flags in coordina…

1de6cd9

…te mip table

compatibility fixes, call leadtime function after vars have been created

f03fb70

integrate python test with make test_python, reimplement dimensions c…

6fcb12d

…heck

Merge remote-tracking branch 'origin/master' into issue_623_implement…

775d9cb

…_leadtime_coord

fix failing python tests

847179f

mauzey1 approved these changes Nov 12, 2021

View reviewed changes

mauzey1 merged commit 0748f2c into master Nov 16, 2021

mauzey1 mentioned this pull request Nov 30, 2021

enhancement for forecast data #623

Closed

mauzey1 mentioned this pull request Aug 18, 2022

CMOR 3.7.0 #668

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

implement lead time coordinate #634

implement lead time coordinate #634

piotr-florek-mohc commented Oct 6, 2021

taylor13 commented Oct 6, 2021

piotr-florek-mohc commented Oct 7, 2021

piotr-florek-mohc commented Oct 8, 2021 •

edited

Loading

mauzey1 commented Oct 29, 2021

piotr-florek-mohc commented Nov 2, 2021

mauzey1 commented Nov 3, 2021

piotr-florek-mohc commented Nov 3, 2021

piotr-florek-mohc commented Nov 12, 2021

mauzey1 commented Nov 15, 2021

durack1 commented Nov 15, 2021

taylor13 commented Nov 16, 2021

mauzey1 commented Nov 16, 2021

taylor13 commented Nov 16, 2021

implement lead time coordinate #634

implement lead time coordinate #634

Conversation

piotr-florek-mohc commented Oct 6, 2021

taylor13 commented Oct 6, 2021

piotr-florek-mohc commented Oct 7, 2021

piotr-florek-mohc commented Oct 8, 2021 • edited Loading

mauzey1 commented Oct 29, 2021

piotr-florek-mohc commented Nov 2, 2021

mauzey1 commented Nov 3, 2021

piotr-florek-mohc commented Nov 3, 2021

piotr-florek-mohc commented Nov 12, 2021

mauzey1 commented Nov 15, 2021

durack1 commented Nov 15, 2021

taylor13 commented Nov 16, 2021

mauzey1 commented Nov 16, 2021

taylor13 commented Nov 16, 2021

piotr-florek-mohc commented Oct 8, 2021 •

edited

Loading