Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

In-place addition of arrays with the same coords but in a different order #3910

Closed
mancellin opened this issue Mar 28, 2020 · 7 comments · Fixed by #3976
Closed

In-place addition of arrays with the same coords but in a different order #3910

mancellin opened this issue Mar 28, 2020 · 7 comments · Fixed by #3976

Comments

@mancellin
Copy link
Contributor

I have two DataArrays with the same dimension, but the index is in a different order.
Adding them with A + B works fine, but the in-place addition fails.

MCVE Code Sample

import numpy as np
import xarray as xr

n = 5

d1 = np.arange(n)
np.random.shuffle(d1)
A = xr.DataArray(np.ones(n), coords=[('dim', d1)])

d2 = np.arange(n)
np.random.shuffle(d2)
B = xr.DataArray(np.ones(n), coords=[('dim', d2)])

print(A + B)
A += B

Expected Output

A = A + B is working fine. I would expect A += B to do the same.

Problem Description

The in-place addition A += B fails:

Traceback (most recent call last):
  File "/home/matthieu/xarray-test.py", line 15, in <module>
    A += B
  File "/opt/anaconda3/envs/xarray-tests/lib/python3.8/site-packages/xarray/core/dataarray.py", line 2619, in func
    with self.coords._merge_inplace(other_coords):
  File "/opt/anaconda3/envs/xarray-tests/lib/python3.8/contextlib.py", line 113, in __enter__
    return next(self.gen)
  File "/opt/anaconda3/envs/xarray-tests/lib/python3.8/site-packages/xarray/core/coordinates.py", line 140, in _merge_inplace
    variables, indexes = merge_coordinates_without_align(
  File "/opt/anaconda3/envs/xarray-tests/lib/python3.8/site-packages/xarray/core/merge.py", line 328, in merge_coordinates_withou
t_align
    return merge_collected(filtered, prioritized)
  File "/opt/anaconda3/envs/xarray-tests/lib/python3.8/site-packages/xarray/core/merge.py", line 210, in merge_collected
    raise MergeError(
xarray.core.merge.MergeError: conflicting values for index 'dim' on objects to be combined:
first value: Int64Index([1, 2, 0, 3, 4], dtype='int64', name='dim')
second value: Int64Index([4, 2, 3, 1, 0], dtype='int64', name='dim')

Versions

Output of `xr.show_versions()` INSTALLED VERSIONS ------------------ commit: None python: 3.8.2 (default, Mar 26 2020, 15:53:00) [GCC 7.3.0] python-bits: 64 OS: Linux OS-release: 4.19.112-1-MANJARO machine: x86_64 processor: byteorder: little LC_ALL: None LANG: fr_FR.UTF-8 LOCALE: fr_FR.UTF-8 libhdf5: None libnetcdf: None

xarray: 0.15.0
pandas: 1.0.3
numpy: 1.18.1
scipy: None
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
setuptools: 46.1.1.post20200323
pip: 20.0.2
conda: None
pytest: None
IPython: None
sphinx: None

@max-sixty
Copy link
Collaborator

max-sixty commented Mar 28, 2020

Yes, this is unfortunate. The reasoning:

# n.b. we can't align other to self (with other.reindex_like(self))
# because `other` may be converted into floats, which would cause
# in-place arithmetic to fail unpredictably. Instead, we simply
# don't support automatic alignment with in-place arithmetic.

It may be possible to at align in some cases (e.g. if the indexes are bijective / one-to-one, or the values are already floats). Or a better error message; even one containing that comment would be better.

@dcherian
Copy link
Contributor

@mancellin Are you up for sending in a PR with a better error message?

@mancellin
Copy link
Contributor Author

I can submit a PR. But the comment cited above is not totally clear to me.

The purpose of the conversion to floats is to have NaNs in case the shapes do not match. So the core of the issue is that A + B might not have the same shape as A, and thus in general A + B cannot replace A in-place.
Is that right?

@max-sixty
Copy link
Collaborator

Thanks in advance @mancellin

Your comment is almost exactly right. It's that they might not align fully, rather than the shape; i.e. if your example had range(1,5) rather than range(0,4), then the array would need to be converted to float to add a NaN. Does that make sense?

@mancellin
Copy link
Contributor Author

Yes. But the not-in-place addition A+B works fine without conversion to float because it uses basically xr.align(A, B, join='inner'). If the in-place addition did the same, there would be no risk of type conversion. But I guess the in-place version would rather use something like xr.align(A, B, join='left') to guarantee that the shape and index of A does not change. Am I right?

@max-sixty
Copy link
Collaborator

Yes exactly! And I think in-place might be surprising if it changed the indexes of the left item; i.e. if you got this result from A += B:

In [17]:
    ...:
    ...: import numpy as np
    ...: import xarray as xr
    ...:
    ...: n = 5
    ...:
    ...: d1 = np.arange(1, n+1)
    ...: np.random.shuffle(d1)
    ...: A = xr.DataArray(np.ones(n), coords=[('dim', d1)])
    ...:
    ...: d2 = np.arange(n)
    ...: np.random.shuffle(d2)
    ...: B = xr.DataArray(np.ones(n), coords=[('dim', d2)])
    ...:
    ...: A + B
Out[17]:
<xarray.DataArray (dim: 4)>
array([2., 2., 2., 2.])
Coordinates:
  * dim      (dim) int64 3 2 1 4

@mancellin
Copy link
Contributor Author

Well actually, I would be at least as surprised if A += B returned a different result than A = A + B.

So now I understand why this is not supported. I'll submit the PR soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants