Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement lead time coordinate #634

Merged
merged 9 commits into from
Nov 16, 2021
Merged

Conversation

piotr-florek-mohc
Copy link
Collaborator

This is my first go at issue #623. Adding scalar reference time could be achieved just by specifying a value of reftime1 in mip_tables, which can be later overridden during runtime (please note that reftime is a restricted keyword in the cfunits namespace).

Adding leadtime has been achieved by implementing a simple function which creates an auxiliary lead time coordinate, and populates it with differences between time coordinate values and the reftime. Calendars and units should be the same, and the reftime should be outside the bounds of the time coordinate (i.e. must be earlier than the opening bound of the first time step).

I've also added a small tweak to cmor_write() function, to allow me to retrieve the ncid after the write is complete.

cc @matthew-mizielinski

@taylor13
Copy link
Collaborator

taylor13 commented Oct 6, 2021

I noticed in the new coordinates table, the standard_name given for reftime1 is "leadtime", but according to @martinjuckes at #623 (comment) it should be "forecast_period".

"reftime1" in the coordinates table should have "must_have_bounds = "no" and bounds_values should be set to the null string. I think it's also probably a mistake to define a default value for "reftime1", since this will probably vary from one forecast to the next. [Also, although reftime may be a restricted key word in cftime, that shouldn't preclude its use as the identifier for the reftime coordinate variable in CMOR should it? If it shouldn't, then what is the reason for naming it "reftime1" instead of simply "reftime"? (Of course this makes no difference to the output variable name which is outname="reftime".)]

Normally a forecast field will be a synoptically sampled ("instantaneous") meteorological field, in which case "must_have_bounds" should be set to "no" for leadtime. If some of the forecast fields should be accumulated or "time-means" and others will be instantaneous snapshots, then 2 different "leadtime" coordinates should be defined (e.g. analogous to "time" and "time1").

Does the presence of "reftime1" in the list of "coordinates" appearing with a variable in the amon.json file trigger CMOR to create leadtime and store both leadtime and reftime in the file (and include both of these variables in the netcdf "coordinates" attribute list?

Where is the test code that is used to check whether the changes produce correct output (sorry, my familiarity with pull requests is still weak). I'd like to review it. Also, could you post somewhere an ncdump of the first part of the file produced by the test code?

@piotr-florek-mohc
Copy link
Collaborator Author

Hi Karl,
I have updated the pull request with small tweaks to the test file and mip tables. My (very ridimentary) test script is in Test/test_leadtime_coord.py, and it produces the following output

netcdf tas_Amon_HadGEM3-GC31-LL_amip_r1i1p1f3_gn_200002-200003 {
dimensions:
	time = UNLIMITED ; // (2 currently)
	lat = 5 ;
	lon = 5 ;
	bnds = 2 ;
variables:
	double time(time) ;
		time:bounds = "time_bnds" ;
		time:units = "days since 2000-01-01" ;
		time:calendar = "360_day" ;
		time:axis = "T" ;
		time:long_name = "time" ;
		time:standard_name = "time" ;
	double time_bnds(time, bnds) ;
	double lat(lat) ;
		lat:bounds = "lat_bnds" ;
		lat:units = "degrees_north" ;
		lat:axis = "Y" ;
		lat:long_name = "Latitude" ;
		lat:standard_name = "latitude" ;
	double lat_bnds(lat, bnds) ;
	double lon(lon) ;
		lon:bounds = "lon_bnds" ;
		lon:units = "degrees_east" ;
		lon:axis = "X" ;
		lon:long_name = "Longitude" ;
		lon:standard_name = "longitude" ;
	double lon_bnds(lon, bnds) ;
	double reftime ;
		reftime:units = "days since 2000-01-01" ;
		reftime:calendar = "360_day" ;
		reftime:long_name = "Start date of the forecast" ;
		reftime:standard_name = "forecast_reference_time" ;
	double height ;
		height:units = "m" ;
		height:axis = "Z" ;
		height:positive = "up" ;
		height:long_name = "height" ;
		height:standard_name = "height" ;
	float tas(time, lat, lon) ;
		tas:standard_name = "air_temperature" ;
		tas:long_name = "Near-Surface Air Temperature" ;
		tas:comment = "near-surface (usually, 2 meter) air temperature" ;
		tas:units = "K" ;
		tas:cell_methods = "area: time: mean" ;
		tas:cell_measures = "area: areacella" ;
		tas:history = "2021-10-07T15:15:04Z altered by CMOR: Treated scalar dimension: \'reftime\'. 2021-10-07T15:15:04Z altered by CMOR: Treated scalar dimension: \'height\'. 2021-10-07T15:15:04Z altered by CMOR: Reordered dimensions, original order: lon lat time. 2021-10-07T15:15:04Z altered by CMOR: Converted type from \'d\' to \'f\'." ;
		tas:missing_value = 1.e+20f ;
		tas:_FillValue = 1.e+20f ;
		tas:coordinates = "reftime height leadtime" ;
	double leadtime(time) ;
		leadtime:axis = "T" ;
		leadtime:units = "days" ;
		leadtime:long_name = "Time elapsed since the start of the forecast" ;
		leadtime:standard_name = "forecast_period" ;
		leadtime:bounds = "leadtime_bnds" ;
	double leadtime_bnds(time, bnds) ;

@piotr-florek-mohc
Copy link
Collaborator Author

piotr-florek-mohc commented Oct 8, 2021

Hi Karl,
apologies, I was meant to answer other questions yesterday but got distracted...

"reftime1" in the coordinates table should have "must_have_bounds = "no" and bounds_values should be set to the null string. I think it's also probably a mistake to define a default value for "reftime1", since this will probably vary from one forecast to the next.

This is just a placeholder value, as it's what triggers coordinate's interpretation as a scalar in CMOR's code. I think it would require much more work to change this behaviour, and make it backwards compatible at the same time, although I could probably add both reftime and leadtime in one go, as a separate API call.

I have removed bounds from the CV_coordinates file (sorry, missed that one yesterday).

[Also, although reftime may be a restricted key word in cftime, that shouldn't preclude its use as the identifier for the reftime coordinate variable in CMOR should it? If it shouldn't, then what is the reason for naming it "reftime1" instead of simply "reftime"?

The cftime issue was a bit of a conjecture on my part (I investigated it early on, when I wasn't familiar with CMOR's code at all, and I vaguely remember the namespace conflict was present in Python netCDF4). The fact is that CMOR segfaults during writing of the variable to the file (in cmor_write_var_to_file()) when 'reftime' is used, but does not when a different name is provided in mip tables instead. I've tried to debug this behaviour but I still have no clue why this happens.

Normally a forecast field will be a synoptically sampled ("instantaneous") meteorological field, in which case "must_have_bounds" should be set to "no" for leadtime. If some of the forecast fields should be accumulated or "time-means" and others will be instantaneous snapshots, then 2 different "leadtime" coordinates should be defined (e.g. analogous to "time" and "time1").

Ahh I see. Okay, I will need to have a think how to resolve this.

Does the presence of "reftime1" in the list of "coordinates" appearing with a variable in the amon.json file trigger CMOR to create leadtime and store both leadtime and reftime in the file (and include both of these variables in the netcdf "coordinates" attribute list?

Only if the new function (calculate_leadtime_coord) is explicitly called after reftime was created.

@mauzey1
Copy link
Collaborator

mauzey1 commented Oct 29, 2021

@piotr-florek-mohc You will want to merge the latest from master to remove the Python 3.6 tests since we are no longer supporting Python 3.6 in an upcoming release.

@piotr-florek-mohc
Copy link
Collaborator Author

thanks @mauzey1, that has been done

@mauzey1
Copy link
Collaborator

mauzey1 commented Nov 3, 2021

@piotr-florek-mohc I noticed that you included a Python test but it is not part of the continuous integration test suite. Please add the following line to the file Makefile.in under the test_python: python section:

test_python: python
	env TEST_NAME=Test/test_python_forecast_coordinates.py make test_a_python

I tried running the test but it experienced an error.

Traceback (most recent call last):
  File "Test/test_python_forecast_coordinates.py", line 64, in setUp
    table_id = cmor.load_table(mip_table)
  File "/Users/mauzey1/opt/anaconda3/envs/cmor_dev/lib/python3.8/site-packages/CMOR-3.6.1-py3.8-macosx-10.9-x86_64.egg/cmor/pywrapper.py", line 916, in load_table
    return _cmor.load_table(table)
_cmor.CMORError: Problem with 'cmor.load_table'. Please check the logfile (if defined).

@piotr-florek-mohc
Copy link
Collaborator Author

Thanks @mauzey1, I missed the fact that I hadn't put my test CV under version control...

@piotr-florek-mohc
Copy link
Collaborator Author

Hi @mauzey1 would you be able to briefly take a look and check how sensible the solution is, and if it requires major reworking? Many thanks!

@mauzey1
Copy link
Collaborator

mauzey1 commented Nov 15, 2021

@piotr-florek-mohc

I think these changes are reasonable. However, I would like to have a new entry in the documentation for the calculate_leadtime_coord function. Please create a documentation entry similar to the ones on the CMOR API doc page. If you could, please create a pull request to add the change using the doc's repo. Thank you.

@durack1 @taylor13 Are you fine with these changes? Do you want me to perform any tests before merging?

@durack1
Copy link
Contributor

durack1 commented Nov 15, 2021

@taylor13 over to you..

@taylor13
Copy link
Collaborator

@piotr-florek-mohc - I wonder if you have tested this version as part of your regular job stream in post-processing CMIP6 output (including cases when the new parts of the code are not needed). If not, do you think such a test is needed to be safe, or are you confident that the coding you've included can't disrupt the older code. (I know you settled on an implementation strategy that would minimize possible impact, so perhaps the answer is yes, but please confirm.) Thanks.

@mauzey1
Copy link
Collaborator

mauzey1 commented Nov 16, 2021

@taylor13 All of the current tests are passing, and the leadtime and reftime axes only appear in the file for the newly-added test. I think it's safe to merge.

@taylor13
Copy link
Collaborator

That's great. I'm happy.

@mauzey1 mauzey1 merged commit 0748f2c into master Nov 16, 2021
@mauzey1 mauzey1 mentioned this pull request Aug 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants