Resample bug after converting tz #1459

Closed
petergx opened this issue Jun 13, 2012 · 4 comments

petergx commented Jun 13, 2012

from pandas import Series, date_range

dr = date_range(start='2012-4-13', end='2012-5-1')
ts = Series(range(len(dr)), dr)
ts_utc = ts.tz_convert('UTC')
ts_local = ts_utc.tz_convert('America/Los_Angeles')
ts_local.resample('W')

yields

~/env/lib/python2.7/site-packages/pandas/tseries/resample.pyc in _get_time_bins(self, axis)
    107 
    108         # general version, knowing nothing about relative frequencies
--> 109         bins = lib.generate_bins_dt64(axis.asi8, binner.asi8, self.closed)
    110 
    111         if self.label == 'right':

~/env/lib/python2.7/site-packages/pandas/lib.so in pandas.lib.generate_bins_dt64 (pandas/src/tseries.c:60938)()

ValueError: Values falls before first bin

wesm commented Jun 13, 2012

Good point. There's no handling of time zones in resampling, which needs to get fixed ASAP!

wesm added a commit that referenced this issue Jun 14, 2012

wesm commented Jun 14, 2012

Hey, actually this appears to have been fixed at some point in the last few weeks; it works fine in 0.8.0b2. I added a unit test. Note, though, that you now have to use tz_localize to convert a naive time series to a localized one:

ts_utc = ts.tz_localize('utc')
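
For readers on a recent pandas release, the original reproduction can be adapted as in the sketch below. It assumes a current pandas version, where `resample` returns a resampler that needs an explicit aggregation; `.sum()` is chosen here only for illustration and is not part of the original report. The flag `naive_convert_ok` is a hypothetical name used to show that `tz_convert` is not accepted on a naive index.

```python
import pandas as pd

dr = pd.date_range(start="2012-04-13", end="2012-05-01")
ts = pd.Series(range(len(dr)), index=dr)  # tz-naive daily series

# Recent pandas versions reject tz_convert on a naive index with a TypeError;
# a naive series has to be localized first.
try:
    ts.tz_convert("UTC")
    naive_convert_ok = True
except TypeError:
    naive_convert_ok = False

# Attach UTC to the naive timestamps, then translate to local time.
ts_utc = ts.tz_localize("UTC")
ts_local = ts_utc.tz_convert("America/Los_Angeles")

# Weekly resample of the tz-aware series (aggregation chosen for illustration).
weekly = ts_local.resample("W").sum()
print(weekly)
```

`tz_localize` declares which zone the naive timestamps are in, while `tz_convert` only translates timestamps that already carry a zone, which is why the two calls appear in this order.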

wesm closed this as completed Jun 14, 2012
@JackKelly

This bug (resampling a TZ-aware Series fails if rule='W') appears to be back in Pandas 0.15.2!

In [1]: import pandas as pd

In [2]: rng = pd.date_range("2013-04-01", "2013-05-01", tz='Europe/London', freq='H')

In [3]: rng
Out[3]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-04-01 00:00:00+01:00, ..., 2013-05-01 00:00:00+01:00]
Length: 721, Freq: H, Timezone: Europe/London

In [4]: series = pd.Series(index=rng)

In [5]: series
Out[5]: 
2013-04-01 00:00:00+01:00   NaN
2013-04-01 01:00:00+01:00   NaN
2013-04-01 02:00:00+01:00   NaN
2013-04-01 03:00:00+01:00   NaN
2013-04-01 04:00:00+01:00   NaN
2013-04-01 05:00:00+01:00   NaN
2013-04-01 06:00:00+01:00   NaN
2013-04-01 07:00:00+01:00   NaN
2013-04-01 08:00:00+01:00   NaN
2013-04-01 09:00:00+01:00   NaN
2013-04-01 10:00:00+01:00   NaN
2013-04-01 11:00:00+01:00   NaN
2013-04-01 12:00:00+01:00   NaN
2013-04-01 13:00:00+01:00   NaN
2013-04-01 14:00:00+01:00   NaN
...
2013-04-30 10:00:00+01:00   NaN
2013-04-30 11:00:00+01:00   NaN
2013-04-30 12:00:00+01:00   NaN
2013-04-30 13:00:00+01:00   NaN
2013-04-30 14:00:00+01:00   NaN
2013-04-30 15:00:00+01:00   NaN
2013-04-30 16:00:00+01:00   NaN
2013-04-30 17:00:00+01:00   NaN
2013-04-30 18:00:00+01:00   NaN
2013-04-30 19:00:00+01:00   NaN
2013-04-30 20:00:00+01:00   NaN
2013-04-30 21:00:00+01:00   NaN
2013-04-30 22:00:00+01:00   NaN
2013-04-30 23:00:00+01:00   NaN
2013-05-01 00:00:00+01:00   NaN
Freq: H, Length: 721

In [6]: series.resample('W')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-6-906374246edd> in <module>()
----> 1 series.resample('W')

/usr/local/lib/python2.7/dist-packages/pandas/core/generic.pyc in resample(self, rule, how, axis, fill_method, closed, label, convention, kind, loffset, limit, base)
   3003                               fill_method=fill_method, convention=convention,
   3004                               limit=limit, base=base)
-> 3005         return sampler.resample(self).__finalize__(self)
   3006 
   3007     def first(self, offset):

/usr/local/lib/python2.7/dist-packages/pandas/tseries/resample.pyc in resample(self, obj)
     83 
     84         if isinstance(ax, DatetimeIndex):
---> 85             rs = self._resample_timestamps()
     86         elif isinstance(ax, PeriodIndex):
     87             offset = to_offset(self.freq)

/usr/local/lib/python2.7/dist-packages/pandas/tseries/resample.pyc in _resample_timestamps(self, kind)
    273         axlabels = self.ax
    274 
--> 275         self._get_binner_for_resample(kind=kind)
    276         grouper = self.grouper
    277         binner = self.binner

/usr/local/lib/python2.7/dist-packages/pandas/tseries/resample.pyc in _get_binner_for_resample(self, kind)
    121             kind = self.kind
    122         if kind is None or kind == 'timestamp':
--> 123             self.binner, bins, binlabels = self._get_time_bins(ax)
    124         elif kind == 'timedelta':
    125             self.binner, bins, binlabels = self._get_time_delta_bins(ax)

/usr/local/lib/python2.7/dist-packages/pandas/tseries/resample.pyc in _get_time_bins(self, ax)
    182 
    183         # general version, knowing nothing about relative frequencies
--> 184         bins = lib.generate_bins_dt64(ax_values, bin_edges, self.closed, hasnans=ax.hasnans)
    185 
    186         if self.closed == 'right':

/usr/local/lib/python2.7/dist-packages/pandas/lib.so in pandas.lib.generate_bins_dt64 (pandas/lib.c:17825)()

ValueError: Values falls before first bin

In [7]: series.tz_localize(None).resample('W')
Out[7]: 
2013-04-07   NaN
2013-04-14   NaN
2013-04-21   NaN
2013-04-28   NaN
2013-05-05   NaN
Freq: W-SUN, dtype: float64

But resampling the TZ-aware Series to daily works fine:

In [8]: series.resample('D')
Out[8]: 
2013-04-01 00:00:00+01:00   NaN
2013-04-02 00:00:00+01:00   NaN
2013-04-03 00:00:00+01:00   NaN
2013-04-04 00:00:00+01:00   NaN
2013-04-05 00:00:00+01:00   NaN
2013-04-06 00:00:00+01:00   NaN
2013-04-07 00:00:00+01:00   NaN
2013-04-08 00:00:00+01:00   NaN
2013-04-09 00:00:00+01:00   NaN
2013-04-10 00:00:00+01:00   NaN
2013-04-11 00:00:00+01:00   NaN
2013-04-12 00:00:00+01:00   NaN
2013-04-13 00:00:00+01:00   NaN
2013-04-14 00:00:00+01:00   NaN
2013-04-15 00:00:00+01:00   NaN
2013-04-16 00:00:00+01:00   NaN
2013-04-17 00:00:00+01:00   NaN
2013-04-18 00:00:00+01:00   NaN
2013-04-19 00:00:00+01:00   NaN
2013-04-20 00:00:00+01:00   NaN
2013-04-21 00:00:00+01:00   NaN
2013-04-22 00:00:00+01:00   NaN
2013-04-23 00:00:00+01:00   NaN
2013-04-24 00:00:00+01:00   NaN
2013-04-25 00:00:00+01:00   NaN
2013-04-26 00:00:00+01:00   NaN
2013-04-27 00:00:00+01:00   NaN
2013-04-28 00:00:00+01:00   NaN
2013-04-29 00:00:00+01:00   NaN
2013-04-30 00:00:00+01:00   NaN
2013-05-01 00:00:00+01:00   NaN
Freq: D, dtype: float64
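
The comparison above can be turned into a small check that runs on whichever pandas version is installed. This is a sketch under a few assumptions: it uses the newer `resample(...).mean()` spelling rather than the 0.15-era call, it fills the series with made-up values instead of NaN, and `weekly_mean` is a hypothetical helper that falls back to the tz-naive workaround from `In [7]` if the tz-aware weekly resample raises `ValueError`.

```python
import numpy as np
import pandas as pd

# Same range as the report; a Timedelta is used for the hourly step so the
# snippet does not depend on a particular frequency-alias spelling.
rng = pd.date_range("2013-04-01", "2013-05-01", tz="Europe/London",
                    freq=pd.Timedelta(hours=1))
series = pd.Series(np.arange(len(rng), dtype="float64"), index=rng)


def weekly_mean(s):
    """Weekly mean; falls back to a tz-naive resample if the aware one raises."""
    try:
        return s.resample("W").mean()
    except ValueError:
        # Workaround from the report: drop the tz, resample, re-attach the tz.
        naive = s.tz_localize(None).resample("W").mean()
        return naive.tz_localize(s.index.tz)


weekly = weekly_mean(series)
daily = series.resample("D").mean()
print(len(weekly), len(daily))
```

The fallback keeps wall-clock times when it drops the zone, so re-attaching the zone may need extra care (for example the `ambiguous` and `nonexistent` arguments of `tz_localize`) if a bin label lands on a daylight-saving transition. That does not happen for this date range.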


jreback commented Dec 20, 2014

@JackKelly you are showing a slightly different issue (the test that covers this one passes). Please open another issue.

Cross-reference #8941 in the new issue (and this issue as well).
