loading cubes with differing units for time and time_bounds #1801

Closed
andreas-h opened this issue Sep 29, 2015 · 7 comments · Fixed by #5746

@andreas-h
Contributor

I have a .nc file where the time variable has different units from time_bnds:

ncdump -h 2008001.nc
netcdf \2008001 {
dimensions:
    bnds = 2 ;
    time = 1 ;
    longitude = 7200 ;
    latitude = 3600 ;
    wavelength = 1 ;
variables:
    int bnds(bnds) ;
    double time_bnds(bnds) ;
        time_bnds:units = "days since 2008-01-01 00:00:00" ;
        time_bnds:calendar = "proleptic_gregorian" ;
    double time(time) ;
        time:bounds = "time_bnds" ;
        time:standard_name = "time" ;
        time:units = "days since 2008-01-09 00:00:00" ;
        time:calendar = "proleptic_gregorian" ;

When I load this cube in Iris 1.8.1, the time coord is wrong:

In [3]: c = iris.load('2008001.nc')[0]

In [4]: c.coord('time')
Out[4]: DimCoord(array([ 0.]), bounds=array([[  0.,  16.]]), standard_name=u'time', units=Unit('days since 2008-01-09 00:00:00', calendar='proleptic_gregorian'), var_name='time')

In [5]: print c.coord('time')
DimCoord([2008-01-09 00:00:00], bounds=[[2008-01-09 00:00:00, 2008-01-25 00:00:00]], standard_name=u'time', calendar=u'proleptic_gregorian', var_name='time')

It seems like the units of the time variable are applied to both the time points and the time_bnds values: the bounds [0, 16] are read as days since 2008-01-09 rather than days since 2008-01-01.
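
To make the discrepancy concrete, here is a minimal standard-library sketch of the arithmetic. It reads the raw bounds values against both reference dates from the ncdump header and then re-expresses them relative to the reference date of time. The helper name rebase_days is hypothetical, and the sketch assumes whole-day offsets in the proleptic Gregorian calendar (which is what Python's datetime uses).

```python
from datetime import datetime, timedelta

# Reference dates taken from the ncdump header above.
BNDS_EPOCH = datetime(2008, 1, 1)  # time_bnds: "days since 2008-01-01 00:00:00"
TIME_EPOCH = datetime(2008, 1, 9)  # time:      "days since 2008-01-09 00:00:00"


def rebase_days(values, from_epoch, to_epoch):
    """Re-express day offsets from one reference date relative to another.

    Hypothetical helper for illustration only.
    """
    shift = (from_epoch - to_epoch) / timedelta(days=1)
    return [v + shift for v in values]


raw_bounds = [0.0, 16.0]

# What the file intends: bounds measured from 2008-01-01.
intended = [BNDS_EPOCH + timedelta(days=v) for v in raw_bounds]

# What was loaded: the same numbers read against the reference date of time.
as_loaded = [TIME_EPOCH + timedelta(days=v) for v in raw_bounds]

# Bounds values that would be correct in the units of time.
rebased = rebase_days(raw_bounds, BNDS_EPOCH, TIME_EPOCH)

print(intended)
print(as_loaded)
print(rebased)
```

The as_loaded dates match the printed coordinate above, while intended shows the interval the file actually describes.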

@pelson
Member

pelson commented Sep 29, 2015

Hi @andreas-h. Interesting. The CF spec states:

7.1. Cell Boundaries
To represent cells we add the attribute bounds to the appropriate coordinate variable(s). The value of bounds is the name of the variable that contains the vertices of the cell boundaries. We refer to this type of variable as a "boundary variable." A boundary variable will have one more dimension than its associated coordinate or auxiliary coordinate variable. The additional dimension should be the most rapidly varying one, and its size is the maximum number of cell vertices. Since a boundary variable is considered to be part of a coordinate variable's metadata, it is not necessary to provide it with attributes such as long_name and units .

So whilst it doesn't explicitly rule them out (the word "necessary" is ambiguous), it is not a use case we've needed to handle before. Is this a one-off? If so, I think you should be able to use a callback to convert the bound units. Something like:

import iris

def normalise_time_bnds_units(cube, variable, fname):
    # Look up the bounds variable attached to the file's 'time' variable.
    time_bnds = variable.cf_group['time'].cf_group['time_bnds']

    bnd_units = iris.unit.Unit(time_bnds.units)
    t = cube.coord('time')
    # Re-express the bounds values in the units of the time coordinate.
    t.bounds = t.units.date2num(bnd_units.num2date(t.bounds))

fname = '2008001.nc'
cubes = iris.load(fname, callback=normalise_time_bnds_units)

@andreas-h
Contributor Author

thanks, @pelson

Yes, it's a one-off thing, and actually, I had forgotten how to prevent this issue from occurring in the first place (in case you're interested, see pydata/xarray#540).

@andreas-h
Contributor Author

Maybe raising a warning or an exception in Iris when the units of a coord and its bounds don't match would be useful, though.

@rcomer
Member

rcomer commented Jul 16, 2021

This situation does now seem to have been ruled out in the CF conventions:

Boundary variable attributes which determine the coordinate type (units, standard_name, axis and positive) or those which affect the interpretation of the array values (units, calendar, leap_month, leap_year and month_lengths) must always agree exactly with the same attributes of its associated coordinate, scalar coordinate or auxiliary coordinate variable. To avoid duplication, however, it is recommended that these are not provided to a boundary variable.
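
As a rough illustration of the rule quoted above, the following sketch compares the attributes of a coordinate variable and its boundary variable and reports those that disagree. The function name and the plain-dict representation of the attributes are assumptions made for this example; this is not how Iris performs the check. Attributes absent from the boundary variable are accepted, since the quoted text recommends omitting them.

```python
# Attributes that the quoted CF text says must agree between a coordinate
# variable and its boundary variable.
MUST_AGREE = (
    "units", "standard_name", "axis", "positive",
    "calendar", "leap_month", "leap_year", "month_lengths",
)


def mismatched_bounds_attrs(coord_attrs, bounds_attrs):
    """Return {name: (coord_value, bounds_value)} for disagreeing attributes.

    Hypothetical helper: only attributes actually present on the boundary
    variable are compared.
    """
    mismatches = {}
    for name in MUST_AGREE:
        if name in bounds_attrs and bounds_attrs[name] != coord_attrs.get(name):
            mismatches[name] = (coord_attrs.get(name), bounds_attrs[name])
    return mismatches


# Attributes copied from the ncdump header earlier in this issue.
time_attrs = {
    "standard_name": "time",
    "units": "days since 2008-01-09 00:00:00",
    "calendar": "proleptic_gregorian",
}
time_bnds_attrs = {
    "units": "days since 2008-01-01 00:00:00",
    "calendar": "proleptic_gregorian",
}

problems = mismatched_bounds_attrs(time_attrs, time_bnds_attrs)
print(problems)
```

For the file in this issue, only units is reported; calendar agrees and the other attributes are not set on time_bnds.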

@trexfeathers
Contributor

trexfeathers commented Aug 24, 2022

This situation does now seem to have been ruled out in the CF conventions:

Boundary variable attributes which determine the coordinate type (units, standard_name, axis and positive) or those which affect the interpretation of the array values (units, calendar, leap_month, leap_year and month_lengths) must always agree exactly with the same attributes of its associated coordinate, scalar coordinate or auxiliary coordinate variable. To avoid duplication, however, it is recommended that these are not provided to a boundary variable.

Thanks @rcomer! Based on this, @SciTools/peloton have agreed that we should raise an Exception during loading if the bounds units differ from the coordinate units.

Perhaps ditto with other bounds attributes, too.

@pp-mo
Member

pp-mo commented Nov 23, 2022

@SciTools/peloton think this could be more tolerant.
If the bounds can be converted to the coordinate units with convert_units(), then do that; otherwise it really would be an error.

Similar idea mentioned in #5020

@stephenworsley
Contributor

We'll aim to get this into 3.8 with the approach suggested by @pp-mo along with a warning.
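
For readers who want to see the shape of that behaviour, here is a small standard-library sketch of "convert if possible and warn, otherwise raise". It is not the Iris implementation: reconcile_bounds and _parse are hypothetical names, and the sketch assumes both unit strings have the form "<interval> since YYYY-MM-DD HH:MM:SS" with the same calendar and one of a handful of interval names.

```python
import warnings
from datetime import datetime

# Seconds per supported interval name (a deliberately small, assumed subset).
_SECONDS = {"seconds": 1.0, "minutes": 60.0, "hours": 3600.0, "days": 86400.0}


def _parse(units):
    """Split '<interval> since <date>' into (seconds_per_step, reference date)."""
    interval, sep, ref = units.partition(" since ")
    if not sep or interval not in _SECONDS:
        raise ValueError(f"unsupported units: {units!r}")
    return _SECONDS[interval], datetime.strptime(ref, "%Y-%m-%d %H:%M:%S")


def reconcile_bounds(bounds, bounds_units, coord_units):
    """Return bounds expressed in coord_units, warning if conversion was needed."""
    if bounds_units == coord_units:
        return list(bounds)
    try:
        b_step, b_epoch = _parse(bounds_units)
        c_step, c_epoch = _parse(coord_units)
    except ValueError as err:
        # Not convertible under the assumptions above: treat as a hard error.
        raise ValueError(
            f"bounds units {bounds_units!r} cannot be converted to {coord_units!r}"
        ) from err
    warnings.warn(
        f"bounds units {bounds_units!r} differ from coordinate units "
        f"{coord_units!r}; converting bounds"
    )
    offset = (b_epoch - c_epoch).total_seconds()
    return [(value * b_step + offset) / c_step for value in bounds]


# Usage with the values from this issue; the warning is captured for inspection.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    converted = reconcile_bounds(
        [0.0, 16.0],
        "days since 2008-01-01 00:00:00",
        "days since 2008-01-09 00:00:00",
    )
print(converted, len(caught))
```

Keeping the equal-units case as a silent fast path means files that already follow the CF rule load without any warning.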
