TypeError for time_bnds variable when calling Dataset.to_netcdf #7794

Open
4 tasks done
sol1105 opened this issue Apr 28, 2023 · 7 comments · May be fixed by #8821

Comments

@sol1105

sol1105 commented Apr 28, 2023

What happened?

In my workflow I load the dataset (clisops), remap it (xesmf), chunk it according to the available memory, then write it to disk (clisops). Since xarray 2023.3.0, however, I encounter a TypeError when writing the Dataset to disk.

What did you expect to happen?

The Dataset to be written to disk.

Minimal Complete Verifiable Example

import xarray as xr
import os

ds_url="https://github.com/roocs/mini-esgf-data/raw/master/test_data/badc/cmip5/data/cmip5/output1/MOHC/HadGEM2-ES/rcp85/day/land/day/r1i1p1/latest/mrsos/mrsos_day_HadGEM2-ES_rcp85_r1i1p1_20051201.nc"
ds_path="mrsos_day_HadGEM2-ES_rcp85_r1i1p1_20051201.nc"
if not os.path.isfile(ds_path): os.system(f"wget {ds_url}")

ds=xr.open_dataset(ds_path)

# Printing the values of the time_bnds variable makes the later to_netcdf call fail.
# This stands in for the actual processing of the data.
print(ds["time_bnds"].values)

chunked_ds_in = ds.chunk({"time":1})

chunked_ds_in.to_netcdf(path="input.nc", compute=True)

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

> TypeError: Invalid array type: <class 'cftime._cftime.Datetime360Day'>

I think the problem comes mainly from the change in
> xarray/core/common.py", line 1811, in _contains_cftime_datetimes
with xarray 2023.3.0

At least, the problem disappears when I use the old implementation of `_contains_cftime_datetimes`.

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS

commit: None
python: 3.10.10 | packaged by conda-forge | (main, Mar 24 2023, 20:08:06) [GCC 11.3.0]
python-bits: 64
OS: Linux
OS-release: 5.4.0-147-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.14.0
libnetcdf: 4.9.2

xarray: 2023.4.2
pandas: 1.5.3
numpy: 1.23.5
scipy: 1.10.1
netCDF4: 1.6.3
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
iris: None
bottleneck: 1.3.7
dask: 2023.3.2
distributed: 2023.3.2
matplotlib: 3.7.1
cartopy: None
seaborn: None
numbagg: None
fsspec: 2023.3.0
cupy: None
pint: None
sparse: 0.14.0
flox: None
numpy_groupies: None
setuptools: 67.6.1
pip: 23.0.1
conda: None
pytest: 7.2.2
mypy: None
IPython: 8.12.0
sphinx: 6.1.3

sol1105 added the bug and needs triage labels Apr 28, 2023

@welcome

welcome bot commented Apr 28, 2023

Thanks for opening your first issue here at xarray! Be sure to follow the issue template!
If you have an idea for a solution, we would really welcome a Pull Request with proposed changes.
See the Contributing Guide for more.
It may take us a while to respond here, but we really value your contribution. Contributors like you help make xarray better.
Thank you!

@Illviljan
Contributor

Why/where do you get a cftime._cftime.Datetime360Day? According to cftime, it is deprecated.

@spencerkclark
Member

spencerkclark commented Apr 29, 2023

The cftime.Datetime360Day objects are still expected here. We have not switched over to the universal cftime.datetime class yet within xarray, though my sense is this would likely still be an issue regardless (the traceback indicates the error comes up in xarray's indexing logic). cftime.Datetime360Day objects are instances of cftime.datetime:

>>> import cftime
>>> isinstance(cftime.Datetime360Day(2000, 1, 1), cftime.datetime)
True

This is the full traceback for reference:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/spencer/software/xarray/xarray/core/dataset.py", line 1917, in to_netcdf
    return to_netcdf(  # type: ignore  # mypy cannot resolve the overloads:(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/spencer/software/xarray/xarray/backends/api.py", line 1216, in to_netcdf
    dump_to_store(
  File "/Users/spencer/software/xarray/xarray/backends/api.py", line 1263, in dump_to_store
    store.store(variables, attrs, check_encoding, writer, unlimited_dims=unlimited_dims)
  File "/Users/spencer/software/xarray/xarray/backends/common.py", line 269, in store
    variables, attributes = self.encode(variables, attributes)
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/spencer/software/xarray/xarray/backends/common.py", line 358, in encode
    variables, attributes = cf_encoder(variables, attributes)
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/spencer/software/xarray/xarray/conventions.py", line 773, in cf_encoder
    _update_bounds_encoding(variables)
  File "/Users/spencer/software/xarray/xarray/conventions.py", line 347, in _update_bounds_encoding
    ) or contains_cftime_datetimes(v)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/spencer/software/xarray/xarray/core/common.py", line 1818, in contains_cftime_datetimes
    return _contains_cftime_datetimes(var._data)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/spencer/software/xarray/xarray/core/common.py", line 1811, in _contains_cftime_datetimes
    return isinstance(np.asarray(sample).item(), cftime.datetime)
                      ^^^^^^^^^^^^^^^^^^
  File "/Users/spencer/Software/miniconda3/envs/xarray-tests-py311/lib/python3.11/site-packages/dask/array/core.py", line 1700, in __array__
    x = self.compute()
        ^^^^^^^^^^^^^^
  File "/Users/spencer/Software/miniconda3/envs/xarray-tests-py311/lib/python3.11/site-packages/dask/base.py", line 314, in compute
    (result,) = compute(self, traverse=False, **kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/spencer/Software/miniconda3/envs/xarray-tests-py311/lib/python3.11/site-packages/dask/base.py", line 599, in compute
    results = schedule(dsk, keys, **kwargs)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/spencer/Software/miniconda3/envs/xarray-tests-py311/lib/python3.11/site-packages/dask/threaded.py", line 89, in get
    results = get_async(
              ^^^^^^^^^^
  File "/Users/spencer/Software/miniconda3/envs/xarray-tests-py311/lib/python3.11/site-packages/dask/local.py", line 511, in get_async
    raise_exception(exc, tb)
  File "/Users/spencer/Software/miniconda3/envs/xarray-tests-py311/lib/python3.11/site-packages/dask/local.py", line 319, in reraise
    raise exc
  File "/Users/spencer/Software/miniconda3/envs/xarray-tests-py311/lib/python3.11/site-packages/dask/local.py", line 224, in execute_task
    result = _execute_task(task, data)
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/spencer/Software/miniconda3/envs/xarray-tests-py311/lib/python3.11/site-packages/dask/core.py", line 119, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/spencer/Software/miniconda3/envs/xarray-tests-py311/lib/python3.11/site-packages/dask/array/core.py", line 120, in getter
    c = a[b]
        ~^^^
  File "/Users/spencer/software/xarray/xarray/core/indexing.py", line 490, in __getitem__
    result = self.array[self.indexer_cls(key)]
             ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/spencer/software/xarray/xarray/core/indexing.py", line 699, in __getitem__
    return type(self)(_wrap_numpy_scalars(self.array[key]))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/spencer/software/xarray/xarray/core/indexing.py", line 686, in __init__
    self.array = _wrap_numpy_scalars(as_indexable(array))
                                     ^^^^^^^^^^^^^^^^^^^
  File "/Users/spencer/software/xarray/xarray/core/indexing.py", line 727, in as_indexable
    raise TypeError(f"Invalid array type: {type(array)}")
TypeError: Invalid array type: <class 'cftime._cftime.Datetime360Day'>

@aulemahal
Contributor

aulemahal commented Mar 11, 2024

Changing the _contains_cftime_datetimes function to the following fixes the issue (the change is in the first line of the second if block):

def _contains_cftime_datetimes(array: Any) -> bool:
    """Check if a array inside a Variable contains cftime.datetime objects"""
    if cftime is None:
        return False

    if array.dtype == np.dtype("O") and array.size > 0:
        first_idx = (slice(0, 1),) * array.ndim  # instead of (0,)
        if isinstance(array, ExplicitlyIndexed):
            first_idx = BasicIndexer(first_idx)
        sample = array[first_idx]
        return isinstance(np.asarray(sample).item(), cftime.datetime)

    return False

The only difference is that the index is not a scalar, but a slice of a single element.

The error above shows that the as_indexable function receives a scalar instead of an array of size 1.
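
The difference between the two index forms can be seen with plain NumPy. This is only a minimal sketch: it uses datetime.datetime as a stand-in for a cftime object (an assumption, chosen so that the example runs without cftime), and the names stamp, arr, by_scalar and by_slice are illustrative.

```python
import datetime

import numpy as np

# Stand-in for a cftime object (assumption): a Python object that NumPy
# stores in an object-dtype array and does not recognise as a scalar.
stamp = datetime.datetime(2000, 1, 1)
arr = np.array([stamp, stamp], dtype=object).reshape(2, 1)

scalar_idx = (0,) * arr.ndim             # old style: (0, 0)
slice_idx = (slice(0, 1),) * arr.ndim    # proposed style: (0:1, 0:1)

by_scalar = arr[scalar_idx]  # the bare Python object, not an array
by_slice = arr[slice_idx]    # still an ndarray, with shape (1, 1)

print(type(by_scalar), np.isscalar(by_scalar))
print(type(by_slice), by_slice.shape)

# The single element can still be extracted from the size-1 array.
print(isinstance(np.asarray(by_slice).item(), datetime.datetime))
```

With the scalar index, the object itself comes back and has to be wrapped into an array again later; with the slice, the result stays an array throughout.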

@aulemahal
Contributor

There it is.

def _wrap_numpy_scalars(array):
    """Wrap NumPy scalars in 0d arrays."""
    if np.isscalar(array):
        return np.array(array)
    else:
        return array

import numpy as np

np.isscalar(1)  # True, as expected

import cftime
c = cftime.datetime(2000, 1, 1, calendar='360_day')  # Calendar doesn't matter, this is an example.
np.isscalar(c)  # False, which is unexpected

We need to either fix numpy or catch this case in xarray. I guess the error was raised here because cftime objects are fairly common in xarray (and unknown to numpy), but this could happen for other special types too!

@huard
Contributor

huard commented Mar 11, 2024

I don't think the fix would be in numpy.
My reading is that np.isscalar is written for Python built-in types and NumPy's types, not generic objects. Its docstring suggests using np.ndim instead.

>>> np.ndim(c)
0
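
A wrapper based on np.ndim could look roughly like the sketch below. It is a hypothetical variant for illustration, not necessarily the change made in the linked pull request; the name wrap_scalars is made up, and datetime.datetime again stands in for a cftime object so that the example runs without cftime.

```python
import datetime

import numpy as np


def wrap_scalars(array):
    """Wrap zero-dimensional values in 0d arrays (hypothetical variant)."""
    # np.ndim returns 0 for Python scalars, NumPy scalars and arbitrary
    # objects, whereas np.isscalar returns False for arbitrary objects.
    if np.ndim(array) == 0:
        return np.array(array)
    return array


stamp = datetime.datetime(2000, 1, 1)  # stand-in for a cftime object

wrapped = wrap_scalars(stamp)            # 0d object array holding stamp
untouched = wrap_scalars(np.arange(3))   # 1d input is returned unchanged

print(type(wrapped), wrapped.shape, wrapped.dtype)
print(untouched.shape)
```

Whether this check is strict enough for every array-like that xarray wraps is not addressed here; the sketch only shows that np.ndim treats an unknown object as zero-dimensional.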

huard added a commit to Ouranosinc/xarray that referenced this issue Mar 11, 2024
@sol1105
Author

sol1105 commented Mar 12, 2024

@aulemahal @huard Thank you very much for addressing this issue!
