'ScipyArrayWrapper' object has no attribute 'oindex' #8909

Open
5 tasks done
ocraft opened this issue Apr 4, 2024 · 6 comments · May be fixed by #8921
Labels
bug, needs triage (issue that has not been reviewed by an xarray team member), topic-backends

Comments

ocraft commented Apr 4, 2024

What happened?

An AttributeError, 'ScipyArrayWrapper' object has no attribute 'oindex', is raised when saving a dataset to a netCDF file after selecting a subset of a dataset previously loaded from another netCDF file.

What did you expect to happen?

No response

Minimal Complete Verifiable Example

import xarray as xr

ds = xr.Dataset()
ds['A'] = xr.DataArray([[1, 'a'], [2,'b']],dims=['x','y'])
ds.to_netcdf('test.nc')
ds2 = xr.open_dataset('test.nc')
ds2.sel(y=[1]).to_netcdf('test.nc')

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.
  • Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

File ~/Workspace/phd/.venv/lib/python3.10/site-packages/xarray/core/indexing.py:342, in IndexCallable.__getitem__(self, key)
    341 def __getitem__(self, key: Any) -> Any:
--> 342     return self.getter(key)

File ~/Workspace/phd/.venv/lib/python3.10/site-packages/xarray/coding/variables.py:72, in _ElementwiseFunctionArray._oindex_get(self, key)
     71 def _oindex_get(self, key):
---> 72     return type(self)(self.array.oindex[key], self.func, self.dtype)

File ~/Workspace/phd/.venv/lib/python3.10/site-packages/xarray/core/indexing.py:342, in IndexCallable.__getitem__(self, key)
    341 def __getitem__(self, key: Any) -> Any:
--> 342     return self.getter(key)

File ~/Workspace/phd/.venv/lib/python3.10/site-packages/xarray/coding/strings.py:256, in StackedBytesArray._oindex_get(self, key)
    255 def _oindex_get(self, key):
--> 256     return _numpy_char_to_bytes(self.array.oindex[key])

AttributeError: 'ScipyArrayWrapper' object has no attribute 'oindex'
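
The last two frames suggest that a coding-layer array forwards `.oindex` to the array it wraps, and that the wrapped backend array only supports plain `__getitem__`. The sketch below imitates that pattern without importing xarray. All class names here (`BackendArray`, `CodingWrapper`, `IndexCallable`) are hypothetical stand-ins chosen for illustration, not xarray's actual implementation.

```python
class BackendArray:
    """Hypothetical stand-in for a backend wrapper such as ScipyArrayWrapper.

    It supports plain __getitem__ but defines no .oindex attribute.
    """

    def __init__(self, data):
        self.data = data

    def __getitem__(self, key):
        return self.data[key]


class IndexCallable:
    """Lets obj.oindex[key] call a getter function with the key."""

    def __init__(self, getter):
        self.getter = getter

    def __getitem__(self, key):
        return self.getter(key)


class CodingWrapper:
    """Hypothetical stand-in for a coding-layer array such as StackedBytesArray."""

    def __init__(self, array):
        self.array = array

    @property
    def oindex(self):
        return IndexCallable(self._oindex_get)

    def _oindex_get(self, key):
        # Assumes the wrapped array also exposes .oindex; this is the failing step.
        return self.array.oindex[key]


def try_outer_index(wrapper, key):
    """Return the indexed value, or the error message if .oindex is missing."""
    try:
        return wrapper.oindex[key]
    except AttributeError as err:
        return str(err)


message = try_outer_index(CodingWrapper(BackendArray([10, 20, 30])), 1)
print(message)
```

In this toy version the failure comes from the outer wrapper assuming an interface that the inner object never declared, which matches the shape of the traceback above.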

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS
------------------
commit: None
python: 3.10.8 | packaged by conda-forge | (main, Nov 22 2022, 08:26:04) [GCC 10.4.0]
python-bits: 64
OS: Linux
OS-release: 6.5.0-26-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.14.2
libnetcdf: None

xarray: 2024.3.0
pandas: 2.2.1
numpy: 1.26.4
scipy: 1.13.0
netCDF4: None
pydap: None
h5netcdf: None
h5py: 3.10.0
Nio: None
zarr: None
cftime: None
nc_time_axis: None
iris: None
bottleneck: 1.3.8
dask: 2024.4.0
distributed: 2024.4.0
matplotlib: 3.8.4
cartopy: None
seaborn: 0.13.2
numbagg: None
fsspec: 2024.3.1
cupy: None
pint: None
sparse: None
flox: 0.9.6
numpy_groupies: 0.10.2
setuptools: 63.2.0
pip: 24.0
conda: None
pytest: 8.1.1
mypy: None
IPython: 8.23.0
sphinx: None

ocraft added the bug and needs triage labels on Apr 4, 2024

ocraft commented Apr 4, 2024

Note that the example works on xarray==2024.2.0; the problem exists in 2024.3.0.
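
A quick way to check whether an environment has the release reported as failing is to compare the installed version string. This is a minimal sketch: `AFFECTED_RELEASE` and both helper names are hypothetical, and the only facts taken from this thread are that 2024.2.0 works and 2024.3.0 fails. Other releases are not covered by the check.

```python
from importlib.metadata import PackageNotFoundError, version

# The only release this thread reports as failing (assumption: no others checked).
AFFECTED_RELEASE = "2024.3.0"


def is_reported_affected(installed: str) -> bool:
    """Return True if the version string matches the release reported as failing."""
    return installed == AFFECTED_RELEASE


def installed_xarray_version():
    """Return the installed xarray version string, or None if xarray is absent."""
    try:
        return version("xarray")
    except PackageNotFoundError:
        return None


current = installed_xarray_version()
if current is None:
    print("xarray is not installed in this environment")
else:
    print(f"xarray {current}: reported affected = {is_reported_affected(current)}")
```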

dcherian commented Apr 4, 2024

Yes, sorry about that.

@andersy005 we should probably roll back the changes to coding/*.py and bundle them in the backends feature branch.

dcherian commented Apr 4, 2024

FWIW I can't reproduce even when forcing it to write a netcdf3 file with engine="scipy"

andersy005 commented

I was able to reproduce the issue from a fresh environment:

mamba create -n test 'python=3.12' xarray scipy ipython distributed 
In [6]: ds2 = xr.open_dataset('/tmp/test.nc')

In [7]: ds2.sel(y=[1])
Out[7]: 
<xarray.Dataset> Size: 16B
Dimensions:  (x: 2, y: 1)
Dimensions without coordinates: x, y
Data variables:
    A        (x, y) object 16B ...

In [8]: ds2.sel(y=[1]).to_netcdf('/tmp/ttest.nc')
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[8], line 1
----> 1 ds2.sel(y=[1]).to_netcdf('/tmp/ttest.nc')

File ~/mambaforge/envs/test/lib/python3.12/site-packages/xarray/core/dataset.py:2298, in Dataset.to_netcdf(self, path, mode, format, group, engine, encoding, unlimited_dims, compute, invalid_netcdf)
   2295     encoding = {}
   2296 from xarray.backends.api import to_netcdf
-> 2298 return to_netcdf(  # type: ignore  # mypy cannot resolve the overloads:(
   2299     self,
   2300     path,
   2301     mode=mode,
   2302     format=format,
   2303     group=group,
   2304     engine=engine,
   2305     encoding=encoding,
   2306     unlimited_dims=unlimited_dims,
   2307     compute=compute,
   2308     multifile=False,
   2309     invalid_netcdf=invalid_netcdf,
   2310 )

File ~/mambaforge/envs/test/lib/python3.12/site-packages/xarray/backends/api.py:1339, in to_netcdf(dataset, path_or_file, mode, format, group, engine, encoding, unlimited_dims, compute, multifile, invalid_netcdf)
   1334 # TODO: figure out how to refactor this logic (here and in save_mfdataset)
   1335 # to avoid this mess of conditionals
   1336 try:
   1337     # TODO: allow this work (setting up the file for writing array data)
   1338     # to be parallelized with dask
-> 1339     dump_to_store(
   1340         dataset, store, writer, encoding=encoding, unlimited_dims=unlimited_dims
   1341     )
   1342     if autoclose:
   1343         store.close()

File ~/mambaforge/envs/test/lib/python3.12/site-packages/xarray/backends/api.py:1386, in dump_to_store(dataset, store, writer, encoder, encoding, unlimited_dims)
   1383 if encoder:
   1384     variables, attrs = encoder(variables, attrs)
-> 1386 store.store(variables, attrs, check_encoding, writer, unlimited_dims=unlimited_dims)

File ~/mambaforge/envs/test/lib/python3.12/site-packages/xarray/backends/common.py:393, in AbstractWritableDataStore.store(self, variables, attributes, check_encoding_set, writer, unlimited_dims)
    390 if writer is None:
    391     writer = ArrayWriter()
--> 393 variables, attributes = self.encode(variables, attributes)
    395 self.set_attributes(attributes)
    396 self.set_dimensions(variables, unlimited_dims=unlimited_dims)

File ~/mambaforge/envs/test/lib/python3.12/site-packages/xarray/backends/common.py:482, in WritableCFDataStore.encode(self, variables, attributes)
    479 def encode(self, variables, attributes):
    480     # All NetCDF files get CF encoded by default, without this attempting
    481     # to write times, for example, would fail.
--> 482     variables, attributes = cf_encoder(variables, attributes)
    483     variables = {k: self.encode_variable(v) for k, v in variables.items()}
    484     attributes = {k: self.encode_attribute(v) for k, v in attributes.items()}

File ~/mambaforge/envs/test/lib/python3.12/site-packages/xarray/conventions.py:795, in cf_encoder(variables, attributes)
    792 # add encoding for time bounds variables if present.
    793 _update_bounds_encoding(variables)
--> 795 new_vars = {k: encode_cf_variable(v, name=k) for k, v in variables.items()}
    797 # Remove attrs from bounds variables (issue #2921)
    798 for var in new_vars.values():

File ~/mambaforge/envs/test/lib/python3.12/site-packages/xarray/conventions.py:196, in encode_cf_variable(var, needs_copy, name)
    183 ensure_not_multiindex(var, name=name)
    185 for coder in [
    186     times.CFDatetimeCoder(),
    187     times.CFTimedeltaCoder(),
   (...)
    194     variables.BooleanCoder(),
    195 ]:
--> 196     var = coder.encode(var, name=name)
    198 # TODO(kmuehlbauer): check if ensure_dtype_not_object can be moved to backends:
    199 var = ensure_dtype_not_object(var, name=name)

File ~/mambaforge/envs/test/lib/python3.12/site-packages/xarray/coding/times.py:972, in CFDatetimeCoder.encode(self, variable, name)
    970 def encode(self, variable: Variable, name: T_Name = None) -> Variable:
    971     if np.issubdtype(
--> 972         variable.data.dtype, np.datetime64
    973     ) or contains_cftime_datetimes(variable):
    974         dims, data, attrs, encoding = unpack_for_encoding(variable)
    976         units = encoding.pop("units", None)

File ~/mambaforge/envs/test/lib/python3.12/site-packages/xarray/core/variable.py:433, in Variable.data(self)
    431     return self._data
    432 elif isinstance(self._data, indexing.ExplicitlyIndexed):
--> 433     return self._data.get_duck_array()
    434 else:
    435     return self.values

File ~/mambaforge/envs/test/lib/python3.12/site-packages/xarray/core/indexing.py:809, in MemoryCachedArray.get_duck_array(self)
    808 def get_duck_array(self):
--> 809     self._ensure_cached()
    810     return self.array.get_duck_array()

File ~/mambaforge/envs/test/lib/python3.12/site-packages/xarray/core/indexing.py:803, in MemoryCachedArray._ensure_cached(self)
    802 def _ensure_cached(self):
--> 803     self.array = as_indexable(self.array.get_duck_array())

File ~/mambaforge/envs/test/lib/python3.12/site-packages/xarray/core/indexing.py:760, in CopyOnWriteArray.get_duck_array(self)
    759 def get_duck_array(self):
--> 760     return self.array.get_duck_array()

File ~/mambaforge/envs/test/lib/python3.12/site-packages/xarray/core/indexing.py:619, in LazilyIndexedArray.get_duck_array(self)
    617 def get_duck_array(self):
    618     if isinstance(self.array, ExplicitlyIndexedNDArrayMixin):
--> 619         array = apply_indexer(self.array, self.key)
    620     else:
    621         # If the array is not an ExplicitlyIndexedNDArrayMixin,
    622         # it may wrap a BackendArray so use its __getitem__
    623         array = self.array[self.key]

File ~/mambaforge/envs/test/lib/python3.12/site-packages/xarray/core/indexing.py:1000, in apply_indexer(indexable, indexer)
    998     return indexable.vindex[indexer]
    999 elif isinstance(indexer, OuterIndexer):
-> 1000     return indexable.oindex[indexer]
   1001 else:
   1002     return indexable[indexer]

File ~/mambaforge/envs/test/lib/python3.12/site-packages/xarray/core/indexing.py:342, in IndexCallable.__getitem__(self, key)
    341 def __getitem__(self, key: Any) -> Any:
--> 342     return self.getter(key)

File ~/mambaforge/envs/test/lib/python3.12/site-packages/xarray/coding/variables.py:72, in _ElementwiseFunctionArray._oindex_get(self, key)
     71 def _oindex_get(self, key):
---> 72     return type(self)(self.array.oindex[key], self.func, self.dtype)

File ~/mambaforge/envs/test/lib/python3.12/site-packages/xarray/core/indexing.py:342, in IndexCallable.__getitem__(self, key)
    341 def __getitem__(self, key: Any) -> Any:
--> 342     return self.getter(key)

File ~/mambaforge/envs/test/lib/python3.12/site-packages/xarray/coding/strings.py:256, in StackedBytesArray._oindex_get(self, key)
    255 def _oindex_get(self, key):
--> 256     return _numpy_char_to_bytes(self.array.oindex[key])

AttributeError: 'ScipyArrayWrapper' object has no attribute 'oindex'

In [9]: xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.12.2 | packaged by conda-forge | (main, Feb 16 2024, 20:54:21) [Clang 16.0.6 ]
python-bits: 64
OS: Darwin
OS-release: 23.4.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None

xarray: 2024.3.0
pandas: 2.2.1
numpy: 1.26.4
scipy: 1.13.0
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
iris: None
bottleneck: None
dask: 2024.4.1
distributed: 2024.4.1
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: 2024.3.1
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 69.2.0
pip: 24.0
conda: None
pytest: None
mypy: None
IPython: 8.22.2
sphinx: None

dbaston added a commit to dbaston/exactextract that referenced this issue Apr 12, 2024
gmaze added a commit to euroargodev/argopy that referenced this issue Apr 18, 2024
FiND-Tao commented

I tried "pip install xarray==0.20.1 scipy==1.7.1" and it removed the error.
