
Failing test_calc_basic tests after updating xarray from 10.2 to 10.3 #268

Closed
spencerahill opened this issue Apr 25, 2018 · 3 comments

@spencerahill
Owner

Originally noted here

All of the failures are the same error. Note that this one doesn't involve regional averaging, so it's unrelated to what I'm implementing in #266. Here's the traceback of one:

$ py.test test_calc_basic.py::TestCalc3D::test_monthly_ts
================================================================== test session starts ==================================================================
platform darwin -- Python 3.6.3, pytest-3.2.3, py-1.5.1, pluggy-0.4.0
rootdir: /Users/shill/Dropbox/py/aospy, inifile: setup.cfg
plugins: catchlog-1.2.2, hypothesis-3.50.2
collected 1 item

test_calc_basic.py F

======================================================================= FAILURES ========================================================================
______________________________________________________________ TestCalc3D.test_monthly_ts _______________________________________________________________

self = <aospy.test.test_calc_basic.TestCalc3D testMethod=test_monthly_ts>

    def test_monthly_ts(self):
        calc = Calc(intvl_out=1, dtype_out_time='ts', **self.test_params)
>       calc.compute()

calc       = <aospy.Calc instance: sphum, example_proj, example_model, example_run>
self       = <aospy.test.test_calc_basic.TestCalc3D testMethod=test_monthly_ts>

test_calc_basic.py:88:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../calc.py:569: in compute
    self.end_date),
../calc.py:415: in _get_all_data
    for n, var in enumerate(self.variables)]
../calc.py:415: in <listcomp>
    for n, var in enumerate(self.variables)]
../calc.py:367: in _get_input_data
    **self.data_loader_attrs)
../data_loader.py:278: in load_variable
    ds, min_year, max_year = _prep_time_data(ds)
../data_loader.py:180: in _prep_time_data
    ds = times.ensure_time_avg_has_cf_metadata(ds)
../utils/times.py:417: in ensure_time_avg_has_cf_metadata
    raw_start_date = ds[TIME_BOUNDS_STR].isel(**{TIME_STR: 0, BOUNDS_STR: 0})
../../../../miniconda3/envs/py36/lib/python3.6/site-packages/xarray/core/dataarray.py:754: in isel
    ds = self._to_temp_dataset().isel(drop=drop, **indexers)
../../../../miniconda3/envs/py36/lib/python3.6/site-packages/xarray/core/dataset.py:1391: in isel
    indexers_list = self._validate_indexers(indexers)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <xarray.Dataset>
Dimensions:       (bounds: 2)
Coordinates:
    time_bounds   (bounds) float64 dask.array<shape=(2,), ...
    time_weights  float64 ...
Data variables:
    <this-array>  (bounds) float64 dask.array<shape=(2,), chunksize=(2,)>
indexers = {'bounds': 0, 'time': 0}

    def _validate_indexers(self, indexers):
        """ Here we make sure
            + indexer has a valid keys
            + indexer is in a valid data type
            """
        from .dataarray import DataArray

        invalid = [k for k in indexers if k not in self.dims]
        if invalid:
>           raise ValueError("dimensions %r do not exist" % invalid)
E           ValueError: dimensions ['time'] do not exist

DataArray  = <class 'xarray.core.dataarray.DataArray'>
indexers   = {'bounds': 0, 'time': 0}
invalid    = ['time']
self       = <xarray.Dataset>
Dimensions:       (bounds: 2)
Coordinates:
    time_bounds   (bounds) float64 dask.array<shape=(2,), ...
    time_weights  float64 ...
Data variables:
    <this-array>  (bounds) float64 dask.array<shape=(2,), chunksize=(2,)>

@spencerahill
Owner Author

spencerahill commented Apr 25, 2018

Basically, in the failing tests the loaded dataset's time_bounds array is no longer indexed by the time dimension:

pp ds
<xarray.Dataset>
Dimensions:       (bounds: 2, lat: 64, lat_bounds: 65, lon: 128, lon_bounds: 129, pfull: 30, phalf: 31, time: 1)
Coordinates:
  * lon_bounds    (lon_bounds) float64 -1.406 1.406 4.219 7.031 9.844 12.66 ...
  * lon           (lon) float64 0.0 2.812 5.625 8.438 11.25 14.06 16.88 ...
  * lat_bounds    (lat_bounds) float64 -90.0 -86.58 -83.76 -80.96 -78.16 ...
  * lat           (lat) float64 -87.86 -85.1 -82.31 -79.53 -76.74 -73.95 ...
  * phalf         (phalf) float64 0.0 9.202 12.44 16.66 22.07 28.97 37.63 ...
    bk            (phalf) float32 dask.array<shape=(31,), chunksize=(31,)>
    pk            (phalf) float32 dask.array<shape=(31,), chunksize=(31,)>
  * pfull         (pfull) float64 3.385 10.78 14.5 19.3 25.44 33.2 42.9 ...
    time_bounds   (bounds) float64 dask.array<shape=(2,), chunksize=(2,)>
  * bounds        (bounds) float64 1.0 2.0
    time_weights  float64 ...
  * time          (time) float64 1.841e+03
Data variables:
    ps            (lat, lon) float32 dask.array<shape=(64, 128), chunksize=(64, 128)>
    sphum         (pfull, lat, lon) float32 dask.array<shape=(30, 64, 128), chunksize=(30, 64, 128)>
Attributes:
    coordinates:  time

This is probably related to the fact that the time array is length-1: all of the failures are for the TestCalc3D class, which has only one year of data.
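
To make the symptom concrete, here's a minimal sketch (hypothetical data, not aospy code) of what the traceback boils down to: once time_bounds carries only the bounds dimension, indexing it along time raises the ValueError above.

import numpy as np
import xarray as xr

# time_bounds as it should look: indexed by both 'time' and 'bounds'.
time_bounds = xr.DataArray(
    np.array([[1825.0, 1856.0]]),
    dims=('time', 'bounds'),
    coords={'time': [1841.0], 'bounds': [1.0, 2.0]},
    name='time_bounds',
)

# Squeezing out the length-1 'time' dim leaves only ('bounds',), the shape
# seen in the failing tests.
squeezed = time_bounds.squeeze('time')
print(squeezed.dims)  # ('bounds',)

# Indexing along a dimension that no longer exists reproduces the error.
try:
    squeezed.isel(time=0, bounds=0)
except ValueError as err:
    print(err)  # e.g. "dimensions ['time'] do not exist" in xarray 0.10.x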

Looking through the xarray 10.3 what's-new entries, here's a potential culprit: pydata/xarray#2048

@spencerahill
Owner Author

I have to run for now, but it's something in our preprocess func that's causing the problem:

(Pdb) xr.open_mfdataset(file_set, concat_dim='time')['time_bounds']
<xarray.DataArray 'time_bounds' (time: 1, nv: 2)>
dask.array<shape=(1, 2), dtype=timedelta64[ns], chunksize=(1, 2)>
Coordinates:
  * nv       (nv) float64 1.0 2.0
  * time     (time) object    6-01-17 00:00:00
Attributes:
    long_name:  time axis boundaries
(Pdb) xr.open_mfdataset(file_set, preprocess=func, concat_dim='time')['time_bounds']
<xarray.DataArray 'time_bounds' (bounds: 2)>
array([157680000000000000, 160358400000000000], dtype='timedelta64[ns]')
Coordinates:
    time_bounds   (bounds) timedelta64[ns] 1825 days 1856 days
  * bounds        (bounds) float64 1.0 2.0
    time_weights  timedelta64[ns] 31 days
Attributes:
    long_name:  time axis boundaries
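
One quick way to confirm it's the preprocess step, reusing the func and file_set names from the pdb session above (this wrapper is just a debugging sketch, not aospy API): check whether time is still a dimension of time_bounds after preprocessing.

def checked_preprocess(ds):
    # 'func' is the preprocess function from the pdb session above.
    out = func(ds)
    if 'time' not in out['time_bounds'].dims:
        print("preprocess dropped the 'time' dim from time_bounds")
    return out

xr.open_mfdataset(file_set, preprocess=checked_preprocess, concat_dim='time')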

@spencerkclark
Collaborator

spencerkclark commented Apr 26, 2018

Upon looking at things more closely, this is failing in _prep_time_data at the times.ensure_time_avg_has_cf_metadata(ds) step. The existing line above this step is a workaround meant to address this very issue:

aospy/aospy/data_loader.py, lines 177 to 178 in f240c72:

ds = times.ensure_time_as_dim(ds)
ds, min_year, max_year = times.numpy_datetime_workaround_encode_cf(ds)

I think times.ensure_time_as_dim was written before expand_dims existed in xarray, so the logic there might be overdue for a cleanup. Could that be where things are going wrong? Maybe focus your attention there in #269?
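
For reference, here's a rough sketch (my guess, not aospy's actual implementation) of what an expand_dims-based version of that logic could look like: promote time back to a length-1 dimension wherever it has been squeezed out.

import numpy as np
import xarray as xr

TIME_STR = 'time'  # local stand-in for aospy's internal TIME_STR constant

def ensure_time_as_dim_sketch(arr):
    """Return arr with a length-1 TIME_STR dimension if it lacks one."""
    if TIME_STR not in arr.dims:
        arr = arr.expand_dims(TIME_STR)
    return arr

# Example: a time_bounds array that lost its 'time' dim, as in the pdb output.
tb = xr.DataArray(np.array([1825.0, 1856.0]), dims=['bounds'],
                  coords={'bounds': [1.0, 2.0]}, name='time_bounds')
print(ensure_time_as_dim_sketch(tb).dims)  # ('time', 'bounds')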
