closed='both' option for resample #2704

Open
@metakermit

When I call resample to downsample a time series and aggregate the values in each bin, I sometimes want both bin edges included in the aggregation, so that each edge sample is repeated in the two neighbouring bins.

This might be offered as a closed='both' option for resample.

One case where this is needed is numerical integration:

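import numpy as np
import pandas as pd
from scipy import integrate

def ts_integrate(ts):
    # Show which samples landed in this bin.
    print(ts)
    # Integrate over time in seconds (assumes a nanosecond-resolution index).
    seconds = ts.index.astype(np.int64) / 10**9
    return integrate.simpson(ts.to_numpy(), x=seconds.to_numpy())

ts2 = pd.Series(1, pd.date_range(start='2012-01-23', periods=12, freq='1s'))
# Older pandas spelled this resample('3s', how=ts_integrate, closed='left').
ts3 = ts2.resample('3s', closed='left').apply(ts_integrate)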

Elements that enter the 1st bin are

2012-01-23 00:00:00    1
2012-01-23 00:00:01    1
2012-01-23 00:00:02    1

and 2012-01-23 00:00:03 ends up only in the 2nd bin. As a result, the integral for the first bin equals 2, whereas it should be 3: the interval from 00:00:02 to 00:00:03 is not covered by any bin.
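
The shortfall can be checked without pandas. The sketch below is a simplified stand-in, not the code from this issue: it uses plain second offsets instead of timestamps and the trapezoidal rule instead of Simpson's rule (both rules are exact for a constant series). `trapezoid` and `left_closed_bins` are hypothetical helper names introduced only for illustration.

```python
def trapezoid(xs, ys):
    """Trapezoidal-rule integral of samples ys taken at positions xs."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]))


def left_closed_bins(xs, width):
    """Group positions into [k*width, (k+1)*width) bins, like closed='left'."""
    bins = {}
    for x in xs:
        bins.setdefault(x // width, []).append(x)
    return [bins[k] for k in sorted(bins)]


xs = list(range(12))   # 12 samples, one per second
ys = [1] * 12          # constant series, as in ts2

per_bin = [trapezoid(b, [ys[x] for x in b]) for b in left_closed_bins(xs, 3)]
total = trapezoid(xs, ys)

print(per_bin)       # integral of each 3-sample bin
print(sum(per_bin))  # what the bins add up to
print(total)         # integral over the whole series
```

Each bin integrates to 2 and the four bins add up to 8, while the whole series integrates to 11. The missing 3 units are the three one-second intervals that straddle a bin boundary.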

With the closed='both' option, the first two bins would be

2012-01-23 00:00:00    1
2012-01-23 00:00:01    1
2012-01-23 00:00:02    1
2012-01-23 00:00:03    1

2012-01-23 00:00:03    1
2012-01-23 00:00:04    1
2012-01-23 00:00:05    1
2012-01-23 00:00:06    1

resulting in the wanted [3, 3, ...] integral values.
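
Until something like closed='both' exists, the overlapping bins can be approximated by slicing manually. This is only a sketch under assumptions: `resample_closed_both` and `integrate_seconds` are hypothetical helpers, not pandas API, and the trapezoidal rule is swapped in for Simpson's rule to avoid the SciPy dependency. It relies on label-based slicing of a sorted DatetimeIndex including both endpoints.

```python
import numpy as np
import pandas as pd


def integrate_seconds(ts):
    """Trapezoidal integral of ts over its index, measured in seconds."""
    t = (ts.index - ts.index[0]).total_seconds().to_numpy()
    y = ts.to_numpy(dtype=float)
    return float(np.sum(np.diff(t) * (y[:-1] + y[1:]) / 2.0))


def resample_closed_both(ts, freq, func):
    """Apply func to bins that include both edges (hypothetical helper)."""
    width = pd.Timedelta(freq)
    # Left edge of every bin, starting at the first sample.
    edges = pd.date_range(ts.index[0], ts.index[-1], freq=freq)
    # .loc[a:b] on a sorted DatetimeIndex keeps both a and b.
    values = [func(ts.loc[left:left + width]) for left in edges]
    return pd.Series(values, index=edges)


ts2 = pd.Series(1, pd.date_range(start='2012-01-23', periods=12, freq='1s'))
ts3 = resample_closed_both(ts2, '3s', integrate_seconds)
print(ts3)
```

The first three bins each integrate to 3. The last bin holds only three samples because the series ends at 00:00:11, so its integral is 2, and the bins together add up to the full integral of 11.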

If the patch is not too difficult and someone points me in the right direction, I might try to do it myself.
