DataFrame.to_xarray produces FutureWarning for DatetimeTZ data #24716

Closed
TomAugspurger opened this issue Jan 11, 2019 · 3 comments · Fixed by #30516
Labels
Timeseries Timezones Timezone data dtype
Comments

@TomAugspurger
Contributor

In [16]: df = pd.DataFrame({"A": pd.date_range('2000', periods=12, tz='US/Central')})

In [17]: df.to_xarray()
/Users/taugspurger/Envs/pandas-dev/lib/python3.7/site-packages/xarray/core/dataset.py:3111: FutureWarning: Converting timezone-aware DatetimeArray to timezone-naive ndarray with 'datetime64[ns]' dtype. In the future, this will return an ndarray with 'object' dtype where each element is a 'pandas.Timestamp' with the correct 'tz'.
        To accept the future behavior, pass 'dtype=object'.
        To keep the old behavior, pass 'dtype="datetime64[ns]"'.
  data = np.asarray(series).reshape(shape)
Out[17]:
<xarray.Dataset>
Dimensions:  (index: 12)
Coordinates:
  * index    (index) int64 0 1 2 3 4 5 6 7 8 9 10 11
Data variables:
    A        (index) datetime64[ns] 2000-01-01T06:00:00 ... 2000-01-12T06:00:00

@shoyer thoughts on how to resolve this? We can continue dropping the timezone and passing datetime64[ns], break API and return an object-dtype array of timestamps, or add a parameter so that the user can control this.
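The first two options can be compared directly using the dtype hints from the warning text. This is a minimal sketch, assuming a pandas version that accepts both conversions; the variable names `naive` and `aware` are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": pd.date_range("2000", periods=12, tz="US/Central")})

# Option 1: keep the old behavior. The timezone is dropped and the result
# holds the UTC instants as a timezone-naive datetime64 array.
naive = np.asarray(df["A"], dtype="datetime64[ns]")

# Option 2: accept the future behavior. Each element is a pandas.Timestamp
# that still carries its timezone.
aware = np.asarray(df["A"], dtype=object)

print(naive.dtype, naive[0])
print(aware.dtype, aware[0])
```

The third option (a user-facing parameter) would amount to letting the caller pick between these two dtypes.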

@shoyer
Member

shoyer commented Jan 11, 2019

This is a little tough: xarray doesn't know what users would prefer in general, either.

In principle, the old behavior was a bug, since we were silently dropping timezones -- though I'm sure some users may have been relying on this. Perhaps the cleanest solution would be to add a dtypes argument on the xarray side to allow users to silence the warning.

Either way, this is definitely an xarray bug (pydata/xarray#2666), because xarray handles pandas's to_xarray() method, which forwards to xarray.Dataset.from_dataframe.

For pandas, I would suggest that to_xarray() be augmented to pass on **kwargs to the xarray from_dataframe and from_series constructors. That would give us the flexibility to fix this on the xarray side.
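A rough sketch of that forwarding idea follows. Everything in it is hypothetical: `from_dataframe_sketch` is a stand-in for an xarray-side constructor that accepts a `dtypes` mapping, `to_xarray_sketch` is a stand-in for the pandas method, and a plain dict of arrays stands in for a Dataset. Neither function is an existing pandas or xarray API.

```python
import numpy as np
import pandas as pd


def from_dataframe_sketch(df, dtypes=None):
    """Hypothetical xarray-side constructor with a per-column `dtypes` option."""
    dtypes = dtypes or {}
    # A dict of NumPy arrays stands in for an xarray.Dataset here.
    return {name: np.asarray(col, dtype=dtypes.get(name)) for name, col in df.items()}


def to_xarray_sketch(df, **kwargs):
    """Hypothetical pandas-side wrapper: forwards **kwargs unchanged."""
    return from_dataframe_sketch(df, **kwargs)


df = pd.DataFrame({"A": pd.date_range("2000", periods=12, tz="US/Central")})

# The caller, not pandas or xarray, decides how timezone-aware data is converted.
as_naive = to_xarray_sketch(df, dtypes={"A": "datetime64[ns]"})
as_object = to_xarray_sketch(df, dtypes={"A": object})
```

With this shape, pandas would not need to know which options xarray supports; new keywords could be added on the xarray side alone.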

@jbrockmendel
Member

@shoyer I’ve been agitating to let ExtensionArrays (EAs) support non-1D arrays, partly so they would be portable enough to be useful directly in e.g. xarray. If we were to put that in place, would just reshaping the EA be a viable option here?

@shoyer
Member

shoyer commented Jan 11, 2019

I don’t know if it’s worth the trouble of ensuring that all extension arrays can handle n-dimensions. My thinking was that the best way to handle this would be to write an adapter that adds shape information to flat pandas objects (either Series or ExtensionArray) that conforms to numpy’s new __array_function__ protocol. That’s basically what we would want for xarray.
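To make the adapter idea concrete, here is a minimal sketch, assuming a NumPy version with `__array_function__` dispatch enabled. The class name `ShapedFlat`, the `HANDLED` registry, and the `implements` decorator are all hypothetical, and only `np.reshape` is handled; a real adapter would need to cover far more of the NumPy API.

```python
import math

import numpy as np
import pandas as pd

HANDLED = {}  # maps NumPy functions to adapter-aware implementations


def implements(np_func):
    """Register an implementation of a NumPy function for ShapedFlat."""
    def decorator(func):
        HANDLED[np_func] = func
        return func
    return decorator


class ShapedFlat:
    """Hypothetical adapter: a flat pandas Series plus an n-dimensional shape."""

    def __init__(self, flat, shape):
        shape = tuple(shape)
        if math.prod(shape) != len(flat):
            raise ValueError("shape does not match the number of elements")
        self.flat = flat    # the data stays 1-D, so the tz-aware dtype is kept
        self.shape = shape  # shape is metadata only

    def __array_function__(self, func, types, args, kwargs):
        if func not in HANDLED:
            return NotImplemented
        return HANDLED[func](*args, **kwargs)


@implements(np.reshape)
def _reshape(a, newshape):
    # Reshaping only swaps the shape metadata; no conversion to ndarray occurs.
    return ShapedFlat(a.flat, newshape)


s = pd.Series(pd.date_range("2000", periods=12, tz="US/Central"))
reshaped = np.reshape(ShapedFlat(s, (12,)), (3, 4))
print(reshaped.shape, reshaped.flat.dtype)
```

Because the underlying Series is never converted, the timezone survives the reshape, which is the property the `np.asarray(series).reshape(shape)` call in the traceback lacks.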
