Behavior of dataarray with no dimensions #1726

Closed
0x0L opened this issue Nov 17, 2017 · 3 comments

Comments

@0x0L
Contributor

0x0L commented Nov 17, 2017

Consider:

import numpy as np
import pandas as pd
import xarray as xr

type(np.array([1.0]).mean())
# -> numpy.float64

type(pd.Series([1.0]).mean())
# -> float

type(xr.DataArray([1.0]).mean())
# -> xarray.core.dataarray.DataArray

The issue is that this zero-dimensional DataArray won't be cast to a float by numpy/pandas when constructing a new ndarray/DataFrame; you have to convert it explicitly. Not a big deal, but it feels weird.

I'm sure there's a real technical reason (keeping metadata?) behind this behavior, but I couldn't find any discussion about it.
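
For anyone reading along, here is a minimal sketch of what "converting explicitly" can look like. It assumes a reasonably recent xarray release, and the variable names are illustrative rather than taken from the report above.

```python
import numpy as np
import xarray as xr

# Reducing over every dimension yields a zero-dimensional DataArray.
m = xr.DataArray([1.0, 3.0]).mean()

as_float = float(m)   # Python float, via DataArray.__float__
as_item = m.item()    # Python scalar pulled out of the underlying array
as_array = m.values   # zero-dimensional numpy.ndarray

print(type(m).__name__, type(as_float).__name__, as_array.ndim)
```

Any of the three unwraps the value; which one fits best likely depends on whether the consumer expects a Python scalar or a numpy object.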

0x0L changed the title from "Behaviour of dataarray with no dimensions" to "Behavior of dataarray with no dimensions" on Nov 17, 2017

@shoyer
Member

shoyer commented Nov 18, 2017

Yes, this was an intentional choice, though not without occasional regrets!

This is nice from a consistency perspective: every operation on an xarray object returns another xarray object. Also, as you point out, this allows us to preserve metadata such as attributes or scalar coordinates.

I'm going to mark this as a documentation issue: this would be a great thing to add to our docs somewhere (maybe the FAQ page?).
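
A small sketch of the metadata point above, assuming a recent xarray release. The data, the coordinate names (`time`, `station`) and the `units` attribute are hypothetical and chosen only for illustration.

```python
import xarray as xr

# Hypothetical data: names, coordinates and attrs are illustrative only.
da = xr.DataArray(
    [1.0, 2.0, 3.0],
    dims=["time"],
    coords={"time": [10, 20, 30], "station": "A"},
    attrs={"units": "m"},
)

# Indexing down to a single element still returns a DataArray,
# now zero-dimensional, with "time" kept as a scalar coordinate.
point = da.isel(time=1)

# A full reduction also returns a DataArray; keep_attrs=True asks
# the reduction to carry the attributes over to the result.
avg = da.mean(keep_attrs=True)

print(point.coords["time"].item(), point.attrs)
print(avg.coords["station"].item(), avg.attrs)
```

Had either call returned a bare float, the `station` label and the `units` attribute would have had nowhere to live.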

@0x0L
Contributor Author

0x0L commented Nov 20, 2017

Thanks for your answer.
I'll try to find a good spot in the docs to write a line or two about this.

@0x0L
Contributor Author

0x0L commented Nov 21, 2017

I dug into it a bit further. Contrary to what I said, numpy is fine. If x is a DataArray:

np.array(x.mean())
# -> array(2.0)
np.array([x.mean()])
# -> array([2.0])
np.array(x)
# -> ok

pandas behavior is more surprising:

pd.Series(2)
# ok
pd.Series(x.mean())
# -> TypeError: len() of unsized object
pd.Series([x.mean()])
# -> dtype=object
# however for 1D x
pd.Series(x)
# dtype is fine

Something looks a bit odd to me, but it looks more like a pandas issue.
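
Until that is sorted out, one possible workaround is to unwrap the zero-dimensional DataArray before handing it to pandas. This is only a sketch under the assumption that a plain float is acceptable downstream; the input array is illustrative.

```python
import numpy as np
import pandas as pd
import xarray as xr

x = xr.DataArray([1.0, 3.0])  # illustrative 1D input
m = x.mean()                  # zero-dimensional DataArray

# Unwrap to a plain Python float before passing the value to pandas.
s_scalar = pd.Series(m.item())
s_list = pd.Series([float(m)])

print(s_scalar.dtype, s_list.dtype)
```

Both constructions give pandas an ordinary float, so the resulting Series should get a numeric dtype instead of `object`.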

@0x0L 0x0L mentioned this issue Nov 21, 2017