This repository has been archived by the owner on Apr 30, 2021. It is now read-only.

encoding of time and time_bounds differs in compute_ann_mean results for decode_times=True #111

Closed
klindsay28 opened this issue Apr 8, 2019 · 10 comments

Comments

@klindsay28

If I use xr.open_dataset with decode_times=True (the default) to open a dataset ds, then the values of both ds.time and ds[tb_name] are converted to cftime objects (tb_name=ds.time.attrs['bounds']). If I execute
ds_ann = esmlab.climatology.compute_ann_mean(ds),
then the values of ds_ann.time are also cftime objects, but the values of ds_ann[tb_name] are not.
Is this difference intended? I find it confusing.

@andersy005
Contributor

@klindsay28, we've made a couple of changes in the last two weeks, and as far as I can tell, the issue you pointed out is among the inconsistencies we addressed.

  • From this line, ds_ann = esmlab.climatology.compute_ann_mean(ds), I can tell that you are not using the master branch of esmlab. If you are using Python 3+, would you mind installing esmlab from the master branch with

    pip install git+https://github.com/NCAR/esmlab.git

    or by cloning the repo and using pip install -e .?

  • To avoid confusion, we also renamed the functions that used to be in the climatology.py module. There's an ongoing discussion about this nomenclature in "time operations where time_bounds span multiple averaging periods" #55 (comment). Feel free to chime in.

To compute the annual mean, you will need to use:

ds_ann = esmlab.resample(ds, freq='ann')

instead of

ds_ann = esmlab.climatology.compute_ann_mean(ds)

Sorry for the inconvenience and confusion!

@klindsay28
Author

@andersy005, in the thread at #55 (comment), you describe using esmlab.compute_ann_mean(ds), but here you are suggesting esmlab.resample(ds, freq='ann'). From an esmlab point of view, is there a reason to prefer one of these over the other?

@andersy005
Contributor

That was before we reached a final decision last Thursday. When @kmpaul, @matt-long, @jukent and I met, we recognized that, to avoid unnecessary code duplication and confusion for users, the functions could be grouped into three main categories:

  • resample
  • climatology
  • anomaly

https://github.com/NCAR/esmlab/blob/master/esmlab/core.py

@klindsay28
Author

When I update to the latest esmlab and replace esmlab.climatology.compute_ann_mean(ds) with esmlab.resample(ds, freq='ann'), I now get the following error message from xarray:

NotImplementedError: Resample is currently not supported along a dimension indexed by a CFTimeIndex. For certain kinds of downsampling it may be possible to work around this by converting your time index to a DatetimeIndex using CFTimeIndex.to_datetimeindex. Use caution when doing this however, because switching to a DatetimeIndex from a CFTimeIndex with a non-standard calendar entails a change in the calendar type, which could lead to subtle and silent errors.

@andersy005
Contributor

The error is due to an old version of xarray. It seems like your xarray version is <0.12. You will need to upgrade to the latest version of xarray with:

pip install xarray --upgrade
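
Before upgrading, it can help to confirm which xarray version is actually installed. The following is a minimal sketch using only the standard library; the 0.12 threshold is taken from the comment above, and the helper names (version_tuple, supports_cftime_resample, installed_xarray_version) are hypothetical, not part of xarray or esmlab.

```python
from importlib import metadata


def version_tuple(version):
    """Parse leading numeric components, e.g. '0.11.3' -> (0, 11, 3)."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component (e.g. 'dev0')
        parts.append(int(digits))
    return tuple(parts)


def supports_cftime_resample(version):
    """Assumption from this thread: resampling along a CFTimeIndex needs xarray >= 0.12."""
    return version_tuple(version) >= (0, 12)


def installed_xarray_version():
    """Return the installed xarray version string, or None if xarray is not installed."""
    try:
        return metadata.version("xarray")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_xarray_version()
    if found is None:
        print("xarray is not installed")
    else:
        print(found, "ok" if supports_cftime_resample(found) else "too old for CFTimeIndex resample")
```

With this check, 0.11.3 is reported as too old and 0.12.1 as sufficient.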

@klindsay28
Author

esmlab's requirements.txt contains the line

xarray>=0.11.2

Is this incorrect?

I have 0.11.3, the most recent version of xarray that is available in conda.

@klindsay28
Author

Okay, I updated xarray to 0.12.1 using pip and am now past the NotImplementedError listed above. I am also on the latest esmlab, and have changed ds_ann=esmlab.climatology.compute_ann_mean(ds) to ds_ann=esmlab.resample(ds, freq='ann').

However, I still see the behavior described in this issue: the values of ds_ann.time are cftime objects, while the values of ds_ann[tb_name] are not. It doesn't look like the recent developments address this.

@andersy005
Contributor

@klindsay28, when I execute the code below, I can confirm that my time and time_bounds are both decoded:

In [1]: import esmlab

In [2]: ds = esmlab.datasets.open_dataset('cesm_cice_daily')

In [3]: ds
Out[3]:
<xarray.Dataset>
Dimensions:      (d2: 2, nc: 5, ni: 6, nj: 6, nkbio: 5, nkice: 8, nksnow: 3, nvertices: 4, time: 365)
Coordinates:
    TLON         (nj, ni) float32 ...
    TLAT         (nj, ni) float32 ...
    ULON         (nj, ni) float32 ...
    ULAT         (nj, ni) float32 ...
    NCAT         (nc) float32 ...
  * time         (time) object 0061-01-02 00:00:00 ... 0062-01-01 00:00:00
Dimensions without coordinates: d2, nc, ni, nj, nkbio, nkice, nksnow, nvertices
Data variables:
    VGRDi        (nkice) float32 ...
    VGRDs        (nksnow) float32 ...
    VGRDb        (nkbio) float32 ...
    tmask        (nj, ni) float32 ...
    tarea        (nj, ni) float32 ...
    uarea        (nj, ni) float32 ...
    dxt          (nj, ni) float32 ...
    dyt          (nj, ni) float32 ...
    dxu          (nj, ni) float32 ...
    dyu          (nj, ni) float32 ...
    HTN          (nj, ni) float32 ...
    HTE          (nj, ni) float32 ...
    ANGLE        (nj, ni) float32 ...
    ANGLET       (nj, ni) float32 ...
    lont_bounds  (nj, ni, nvertices) float32 ...
    latt_bounds  (nj, ni, nvertices) float32 ...
    lonu_bounds  (nj, ni, nvertices) float32 ...
    latu_bounds  (nj, ni, nvertices) float32 ...
    time_bounds  (time, d2) object ...
    aicen_d      (time, nc, nj, ni) float32 ...
Attributes:
    title:             b.e21.B1850.f09_g17.CMIP6-piControl.001
    contents:          Diagnostic and Prognostic Variables
    source:            Los Alamos Sea Ice Model (CICE) Version 5
    time_period_freq:  day_1
    model_doi_url:     https://doi.org/10.5065/D67H1H0V
    comment:           All years have exactly 365 days
    comment2:          File written on model date 00610102
    comment3:          seconds elapsed into model date:      0
    conventions:       CF-1.0
    history:           This dataset was created on 2018-08-12 at 13:23
    io_flavor:         io_pio
    Extracted_from:    /gpfs/fs1/p/cesm/pcwg/timeseries-cmip6/b.e21.B1850.f09...

In [4]: ds.time_bounds
Out[4]:
<xarray.DataArray 'time_bounds' (time: 365, d2: 2)>
array([[cftime.DatetimeNoLeap(61, 1, 1, 0, 0, 0, 0, 5, 1),
        cftime.DatetimeNoLeap(61, 1, 2, 0, 0, 0, 0, 6, 2)],
       [cftime.DatetimeNoLeap(61, 1, 2, 0, 0, 0, 0, 6, 2),
        cftime.DatetimeNoLeap(61, 1, 3, 0, 0, 0, 0, 0, 3)],
       [cftime.DatetimeNoLeap(61, 1, 3, 0, 0, 0, 0, 0, 3),
        cftime.DatetimeNoLeap(61, 1, 4, 0, 0, 0, 0, 1, 4)],
       ...,
       [cftime.DatetimeNoLeap(61, 12, 29, 0, 0, 0, 0, 3, 363),
        cftime.DatetimeNoLeap(61, 12, 30, 0, 0, 0, 0, 4, 364)],
       [cftime.DatetimeNoLeap(61, 12, 30, 0, 0, 0, 0, 4, 364),
        cftime.DatetimeNoLeap(61, 12, 31, 0, 0, 0, 0, 5, 365)],
       [cftime.DatetimeNoLeap(61, 12, 31, 0, 0, 0, 0, 5, 365),
        cftime.DatetimeNoLeap(62, 1, 1, 0, 0, 0, 0, 6, 1)]], dtype=object)
Coordinates:
  * time     (time) object 0061-01-02 00:00:00 ... 0062-01-01 00:00:00
Dimensions without coordinates: d2
Attributes:
    long_name:  boundaries for time-averaging interval

In [5]: ds.time
Out[5]:
<xarray.DataArray 'time' (time: 365)>
array([cftime.DatetimeNoLeap(61, 1, 2, 0, 0, 0, 0, 6, 2),
       cftime.DatetimeNoLeap(61, 1, 3, 0, 0, 0, 0, 0, 3),
       cftime.DatetimeNoLeap(61, 1, 4, 0, 0, 0, 0, 1, 4), ...,
       cftime.DatetimeNoLeap(61, 12, 30, 0, 0, 0, 0, 4, 364),
       cftime.DatetimeNoLeap(61, 12, 31, 0, 0, 0, 0, 5, 365),
       cftime.DatetimeNoLeap(62, 1, 1, 0, 0, 0, 0, 6, 1)], dtype=object)
Coordinates:
  * time     (time) object 0061-01-02 00:00:00 ... 0062-01-01 00:00:00
Attributes:
    long_name:  model time
    bounds:     time_bounds

In [6]: ds_ann = esmlab.resample(ds, freq='ann')
/Users/abanihi/opt/miniconda3/envs/dev/lib/python3.6/site-packages/xarray/core/nanops.py:159: RuntimeWarning: Mean of empty slice
  return np.nanmean(a, axis=axis, dtype=dtype)

In [7]: ds_ann
Out[7]:
<xarray.Dataset>
Dimensions:      (d2: 2, nc: 5, ni: 6, nj: 6, nkbio: 5, nkice: 8, nksnow: 3, nvertices: 4, time: 1)
Coordinates:
  * time         (time) object 0061-07-02 12:00:00
    TLON         (nj, ni) float32 320.5625 321.6875 ... 325.0625 326.1875
    TLAT         (nj, ni) float32 -79.22052 -79.22052 ... -76.54944 -76.54944
    ULON         (nj, ni) float32 321.125 322.25 323.375 ... 325.625 326.75
    ULAT         (nj, ni) float32 -78.952896 -78.952896 ... -76.28169 -76.28169
    NCAT         (nc) float32 0.6445072 1.3914335 2.4701793 4.567288 100000000.0
Dimensions without coordinates: d2, nc, ni, nj, nkbio, nkice, nksnow, nvertices
Data variables:
    time_bounds  (time, d2) float64 2.19e+04 2.226e+04
    aicen_d      (time, nc, nj, ni) float64 nan nan nan ... 7.933e-05 8.486e-05
    VGRDi        (nkice) float32 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0
    VGRDs        (nksnow) float32 1.0 2.0 3.0
    VGRDb        (nkbio) float32 1.0 2.0 3.0 4.0 5.0
    tmask        (nj, ni) float32 0.0 0.0 0.0 0.0 0.0 ... 1.0 1.0 1.0 1.0 1.0
    tarea        (nj, ni) float32 1423619100.0 1423619100.0 ... 1728060700.0
    uarea        (nj, ni) float32 1423489400.0 1423489400.0 ... 1761744800.0
    dxt          (nj, ni) float32 23968.484 23968.484 ... 29094.156 29094.156
    dyt          (nj, ni) float32 59395.453 59395.453 ... 59395.453 59395.453
    dxu          (nj, ni) float32 23966.3 23966.3 ... 29661.271 29661.271
    dyu          (nj, ni) float32 59395.453 59395.453 ... 59395.453 59395.453
    HTN          (nj, ni) float32 23966.3 23966.3 ... 29661.271 29661.271
    HTE          (nj, ni) float32 59395.453 59395.453 ... 59395.453 59395.453
    ANGLE        (nj, ni) float32 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0
    ANGLET       (nj, ni) float32 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0
    lont_bounds  (nj, ni, nvertices) float32 320.0 321.125 ... 326.75 325.625
    latt_bounds  (nj, ni, nvertices) float32 -79.48714 -79.48714 ... -76.28169
    lonu_bounds  (nj, ni, nvertices) float32 320.5625 321.6875 ... 326.1875
    latu_bounds  (nj, ni, nvertices) float32 -79.22052 -79.22052 ... -76.01522
Attributes:
    history:  \n2019-04-08 16:48:00.562852 esmlab.resample(<DATASET>, freq="a...

In [8]: ds_ann.time
Out[8]:
<xarray.DataArray 'time' (time: 1)>
array([cftime.DatetimeNoLeap(61, 7, 2, 12, 0, 0, 0, 5, 183)], dtype=object)
Coordinates:
  * time     (time) object 0061-07-02 12:00:00
Attributes:
    long_name:  model time
    bounds:     time_bounds

In [9]: ds_ann.time_bounds
Out[9]:
<xarray.DataArray 'time_bounds' (time: 1, d2: 2)>
array([[21900., 22265.]])
Coordinates:
  * time     (time) object 0061-07-02 12:00:00
Dimensions without coordinates: d2
Attributes:
    long_name:  boundaries for time-averaging interval

Can you post a small snippet of your computation here for debugging purposes?

@andersy005
Contributor

@klindsay28, never mind. I was too quick to jump to a conclusion. You are absolutely right:

In [4]: ds.time_bounds
Out[4]:
<xarray.DataArray 'time_bounds' (time: 365, d2: 2)>
array([[cftime.DatetimeNoLeap(61, 1, 1, 0, 0, 0, 0, 5, 1),
        cftime.DatetimeNoLeap(61, 1, 2, 0, 0, 0, 0, 6, 2)],
       [cftime.DatetimeNoLeap(61, 1, 2, 0, 0, 0, 0, 6, 2),
        cftime.DatetimeNoLeap(61, 1, 3, 0, 0, 0, 0, 0, 3)],
       [cftime.DatetimeNoLeap(61, 1, 3, 0, 0, 0, 0, 0, 3),
        cftime.DatetimeNoLeap(61, 1, 4, 0, 0, 0, 0, 1, 4)],
       ...,
       [cftime.DatetimeNoLeap(61, 12, 29, 0, 0, 0, 0, 3, 363),
        cftime.DatetimeNoLeap(61, 12, 30, 0, 0, 0, 0, 4, 364)],
       [cftime.DatetimeNoLeap(61, 12, 30, 0, 0, 0, 0, 4, 364),
        cftime.DatetimeNoLeap(61, 12, 31, 0, 0, 0, 0, 5, 365)],
       [cftime.DatetimeNoLeap(61, 12, 31, 0, 0, 0, 0, 5, 365),
        cftime.DatetimeNoLeap(62, 1, 1, 0, 0, 0, 0, 6, 1)]], dtype=object)
Coordinates:
  * time     (time) object 0061-01-02 00:00:00 ... 0062-01-01 00:00:00
Dimensions without coordinates: d2
Attributes:
    long_name:  boundaries for time-averaging interval

In [9]: ds_ann.time_bounds
Out[9]:
<xarray.DataArray 'time_bounds' (time: 1, d2: 2)>
array([[21900., 22265.]])
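
The float values above look like the same annual bounds, just left undecoded. As a rough check, here is a pure-Python sketch that converts day counts to dates in a 365-day (noleap) calendar. It assumes the units are "days since 0001-01-01 00:00:00", which is not shown anywhere in this thread, and the helper noleap_days_to_date is hypothetical; in practice cftime.num2date would perform this decoding once the correct units and calendar are known.

```python
# Days in each month of the CF "noleap" (365-day) calendar.
DAYS_PER_MONTH = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)


def noleap_days_to_date(days, ref_year=1):
    """Convert days since <ref_year>-01-01 00:00 (noleap) to (year, month, day, hour).

    The reference year is an assumption; adjust it to match the file's units attribute.
    """
    whole_days, frac = divmod(days, 1.0)
    years, day_of_year = divmod(int(whole_days), 365)
    month = 1
    for length in DAYS_PER_MONTH:
        if day_of_year < length:
            break
        day_of_year -= length
        month += 1
    return ref_year + years, month, day_of_year + 1, frac * 24.0


if __name__ == "__main__":
    lower, upper = 21900.0, 22265.0  # values of ds_ann.time_bounds shown above
    print(noleap_days_to_date(lower))                  # lower bound
    print(noleap_days_to_date(upper))                  # upper bound
    print(noleap_days_to_date((lower + upper) / 2.0))  # midpoint of the interval
```

Under that assumption the bounds decode to 0061-01-01 and 0062-01-01, and their midpoint to 0061-07-02 12:00, which matches the value of ds_ann.time.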

@andersy005
Contributor

I will fix this by tomorrow

@andersy005 andersy005 added this to To do in To Do List via automation Apr 10, 2019
@andersy005 andersy005 added this to the sprint-apr1-apr14 milestone Apr 10, 2019
To Do List automation moved this from To do to Done Apr 10, 2019