Description
When I call resample to downsample a certain time series and aggregate the in-between values, sometimes I want both borders included in the aggregation bin (repeating the borders in two neighbouring bins).
This might be offered as a closed='both' option for resample.
One case where this is needed is numerical integration:
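import numpy as np
import pandas as pd
from scipy import integrate

def ts_integrate(ts):
    print(ts)  # show which samples fall into this bin
    # integrate over time expressed in seconds since the epoch
    return integrate.simpson(ts, x=ts.index.astype(np.int64) / 10**9)

ts2 = pd.Series(1, pd.date_range(start='2012-01-23', periods=12, freq='1s'))
ts3 = ts2.resample('3s', closed='left').apply(ts_integrate)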
Elements that enter the 1st bin are
2012-01-23 00:00:00 1
2012-01-23 00:00:01 1
2012-01-23 00:00:02 1
and 2012-01-23 00:00:03 ends up only in the 2nd bin. As a result, the integral for the first bin equals 2, whereas it should be 3 (the 00:00:02-00:00:03 interval is not covered by any bin).
With the closed='both' option, bins would be
2012-01-23 00:00:00 1
2012-01-23 00:00:01 1
2012-01-23 00:00:02 1
2012-01-23 00:00:03 1
2012-01-23 00:00:03 1
2012-01-23 00:00:04 1
2012-01-23 00:00:05 1
2012-01-23 00:00:06 1
resulting in the wanted [3, 3, ...] integral values.
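Until such an option exists, the behaviour can be approximated by hand. The following is only a sketch of the idea, not a proposed implementation: `resample_closed_both` and `trapezoid_seconds` are hypothetical helper names, the bins are assumed to start at the first timestamp of the series, and a plain trapezoidal rule is used in place of Simpson's rule so that the example depends only on NumPy and pandas.

```python
import numpy as np
import pandas as pd


def resample_closed_both(ts, freq, func):
    """Hypothetical helper: apply func to bins that include both edges."""
    step = pd.Timedelta(freq)
    # Left edges of the bins, aligned to the first timestamp of the series
    edges = pd.date_range(start=ts.index[0], end=ts.index[-1], freq=freq)
    out = {}
    for left in edges:
        # Label-based slicing on a sorted DatetimeIndex includes both endpoints
        chunk = ts.loc[left:left + step]
        if len(chunk) > 1:  # a single sample spans no interval, so skip it
            out[left] = func(chunk)
    return pd.Series(out)


def trapezoid_seconds(chunk):
    # Trapezoidal rule over elapsed time in seconds (stand-in for Simpson's rule)
    t = (chunk.index - chunk.index[0]).total_seconds().to_numpy()
    y = chunk.to_numpy(dtype=float)
    return float(np.sum((y[1:] + y[:-1]) * np.diff(t) / 2.0))


ts2 = pd.Series(1, pd.date_range(start='2012-01-23', periods=12, freq='1s'))
result = resample_closed_both(ts2, '3s', trapezoid_seconds)
print(result)
```

With the 12-sample series from the example, the first three bins each integrate to 3. The last bin only contains the samples up to 00:00:11, so it covers two seconds rather than three.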
If the patch is not too difficult and someone points me in the right direction, I might try to do it myself.