Resample not working when time coordinate is timezone aware #1490
Comments
Did some digging. Note the dtype of the timezone-aware index here:
>>> time2
DatetimeIndex(['2000-01-01 00:00:00-05:00', '2000-01-01 01:00:00-05:00',
'2000-01-01 02:00:00-05:00', '2000-01-01 03:00:00-05:00',
'2000-01-01 04:00:00-05:00', '2000-01-01 05:00:00-05:00',
'2000-01-01 06:00:00-05:00', '2000-01-01 07:00:00-05:00',
'2000-01-01 08:00:00-05:00', '2000-01-01 09:00:00-05:00',
...
'2000-12-30 14:00:00-05:00', '2000-12-30 15:00:00-05:00',
'2000-12-30 16:00:00-05:00', '2000-12-30 17:00:00-05:00',
'2000-12-30 18:00:00-05:00', '2000-12-30 19:00:00-05:00',
'2000-12-30 20:00:00-05:00', '2000-12-30 21:00:00-05:00',
'2000-12-30 22:00:00-05:00', '2000-12-30 23:00:00-05:00'],
dtype='datetime64[ns, EST]', length=8760, freq='H')
But if we directly print its values, we get something slightly different:
>>> time2.values
array(['2000-01-01T05:00:00.000000000', '2000-01-01T06:00:00.000000000',
'2000-01-01T07:00:00.000000000', ...,
'2000-12-31T02:00:00.000000000', '2000-12-31T03:00:00.000000000',
'2000-12-31T04:00:00.000000000'], dtype='datetime64[ns]')
The difference is that the timezone offset has been automatically added, in hours, to each value.
import numpy as np
import pandas as pd
import xarray as xr
time1 = pd.date_range('2000-01-01', freq='H', periods=365 * 24) #timezone naïve
time2 = pd.date_range('2000-01-01', freq='H', periods=365 * 24, tz='UTC') #timezone aware
ds1 = xr.Dataset({'foo': ('time', np.arange(365 * 24)), 'time': time1.values})
ds2 = xr.Dataset({'foo': ('time', np.arange(365 * 24)), 'time': time2.values})
ds1.resample('3H', 'time', how='mean') # works fine
ds2.resample('3H', 'time', how='mean') # works fine
NumPy can construct the timezone-naive dtype:
>>> np.dtype('datetime64[ns]')
dtype('<M8[ns]')
But this won't:
>>> np.dtype('datetime64[ns, UTC]')
TypeError: Invalid datetime unit in metadata string "[ns, UTC]"
But also, the type of [...] So what happens is that the resulting [...] One solution would be to catch this potential glitch in either [...]
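To make the dtype mismatch above concrete, here is a minimal sketch using only pandas and NumPy. The variable names (`aware`, `naive_utc`, `numpy_rejects_tz`) are illustrative and not from the original comment, and a short 3-element index stands in for the full year of data.

```python
import numpy as np
import pandas as pd

# A small timezone-aware index (hypothetical stand-in for time2 above).
aware = pd.date_range(
    "2000-01-01", periods=3, freq=pd.Timedelta(hours=1), tz="EST"
)

# pandas represents the timezone with its own extension dtype,
# which has no NumPy equivalent.
print(type(aware.dtype).__name__)

# .values hands back a plain NumPy datetime64 array: the timezone is
# dropped and each instant is expressed in UTC (midnight EST is 05:00 UTC).
naive_utc = aware.values
print(naive_utc.dtype, naive_utc[0])

# NumPy itself cannot parse a timezone inside a datetime64 dtype string.
try:
    np.dtype("datetime64[ns, UTC]")
    numpy_rejects_tz = False
except TypeError as err:
    numpy_rejects_tz = True
    print("TypeError:", err)
```

Any code path that tries to round-trip the pandas dtype through `np.dtype` would hit the same `TypeError`, which is consistent with the failure described in this thread.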
NumPy doesn't support timezones, but pandas does. This puts things in a slightly tricky position for xarray. We do manage to get things to work for pandas dtypes stored in indexes, in most cases. Given that our resampling behavior also relies on pandas, I think we should be able to get this to work, probably by tweaking our PandasIndexAdapter, as @darothen notes. It's borderline whether this is a new bug or feature, but this would certainly be nice to fix if possible, so I'm marking this as "Contributions welcome".
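Until this is handled inside xarray, one possible user-side workaround is to do the timezone handling on the pandas side. The sketch below is an assumption about how that could look, not the fix proposed for PandasIndexAdapter: it shows that pandas resamples a timezone-aware index directly, and that the index can be converted to timezone-naive UTC (which fits in a NumPy `datetime64` array) and restored afterwards. All variable names are illustrative.

```python
import numpy as np
import pandas as pd

aware = pd.date_range(
    "2000-01-01", periods=6, freq=pd.Timedelta(hours=1), tz="EST"
)
series = pd.Series(np.arange(6.0), index=aware)

# pandas resamples timezone-aware data directly and keeps the timezone.
resampled = series.resample(pd.Timedelta(hours=3)).mean()
print(resampled)

# Workaround sketch: convert to timezone-naive UTC before handing the
# coordinate to code that only understands NumPy datetime64 ...
naive_utc = aware.tz_convert(None)

# ... and re-attach the timezone once the data is back in pandas.
restored = naive_utc.tz_localize("UTC").tz_convert("EST")
print(restored)
```

Note that resampling on the UTC-naive values places bin edges on UTC boundaries rather than local ones, so daily or coarser bins can differ from a resample done in local time.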
In order to maintain a list of currently relevant issues, we mark issues as stale after a period of inactivity. If this issue remains relevant, please comment here or remove the [...]
Progress to end of 2022: Trying to incorporate temporal filtering into exposure tidal monitoring. The prototype workflow used pandas; the current approach uses xarray. See the following issue for discussion of timezone-aware datetimes in xarray: pydata/xarray#1490
hi all,
here is the code to reproduce the bug
This last line returns the following error:
My config:
xarray==0.9.6
pandas==0.20.3
numpy==1.13.1
python-dateutil==2.6.1
six==1.10.0
pytz==2017.2
Tested on Python 2.7 and Python 3.5.2