loffset doesn't work in resample if used with agg() #13218

Closed
marcomayer opened this Issue May 18, 2016 · 5 comments

marcomayer commented May 18, 2016 edited

Code Sample, a copy-pastable example if possible

import numpy as np
import pandas as pd

# Create DF
df = pd.DataFrame(np.random.rand(5,2), columns=list('AB'), index=pd.date_range('2010-01-01 09:00:00', periods=5, freq='s'))

print(df)
                                       A                    B
2010-01-01 09:00:00             0.283113             0.559642
2010-01-01 09:00:01             0.754942             0.621557
2010-01-01 09:00:02             0.102002             0.892100
2010-01-01 09:00:03             0.885400             0.524359
2010-01-01 09:00:04             0.324761             0.706758

# Resample: loffset works fine with mean(); the index is shifted by 2 hours.
print(df.resample('2s', loffset='2h').mean())
                                       A                    B
2010-01-01 11:00:00             0.519028             0.590599
2010-01-01 11:00:02             0.493701             0.708229
2010-01-01 11:00:04             0.324761             0.706758


# Resample with agg(): loffset is ignored and the index is not shifted.
print(df.resample('2s', loffset='2h').agg(dict(A='sum', B='mean')))
                                      A                    B
2010-01-01 09:00:00             1.038055             0.590599
2010-01-01 09:00:02             0.987402             0.708229
2010-01-01 09:00:04             0.324761             0.706758

Expected Output

I'd expect the same index as with mean(). This is how resample worked in the past with resample(how=...).

But maybe I misunderstood something about the change. If so, please enlighten me.

Since I need this to keep things going, I'm using the following workaround for now. Please let me know whether this is the way to go or whether there's a more efficient way:

print(df.resample('2s', loffset='2h').agg(dict(A='sum', B='mean')).tshift(2, 'h'))
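
For comparison, here is a minimal sketch of the same idea that does not rely on `loffset` or `tshift` at all: aggregate first, then shift the resulting index labels by adding a `pd.Timedelta`. The seeded random generator and the variable name `result` are assumptions made for illustration and are not part of the original report.

```python
import numpy as np
import pandas as pd

# Same shape of data as in the report: five rows at one-second spacing.
# A seeded generator is used here only to make the sketch reproducible.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.random((5, 2)),
    columns=list("AB"),
    index=pd.date_range("2010-01-01 09:00:00", periods=5, freq="s"),
)

# Aggregate first, without passing loffset to resample().
result = df.resample("2s").agg({"A": "sum", "B": "mean"})

# Then move the bin labels forward by two hours explicitly.
result.index = result.index + pd.Timedelta(hours=2)

print(result)
```

Because the shift is applied to the finished result, it behaves the same whether the aggregation is `mean()` or a per-column `agg()` spec.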

output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.1.final.0
python-bits: 64
OS: Darwin
OS-release: 15.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: de_DE.UTF-8

pandas: 0.18.1
nose: 1.3.7
pip: 8.1.2
setuptools: 20.3
Cython: 0.23.4
numpy: 1.10.4
scipy: 0.17.1
statsmodels: 0.6.1
xarray: None
IPython: 4.2.0
sphinx: 1.3.5
patsy: 0.4.0
dateutil: 2.5.1
pytz: 2016.2
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.5.2
matplotlib: 1.5.1
openpyxl: 2.3.2
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.8.4
lxml: 3.6.0
bs4: 4.4.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.12
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.39.0
pandas_datareader: 0.2.1

Contributor

jreback commented May 18, 2016

hmm this was fixed for defined functions (e.g. .count/mean) in pydata#12757

prob not tested for aggregations; should be straightforward though.

want to do a pull-request?

jreback added this to the 0.18.2 milestone May 18, 2016

marcomayer commented May 18, 2016

I'd love to, Jeff, but I'm already back on schedule on my current project and will be busy for at least another half day getting all my stuff updated to work with 0.18.x. I'm a heavy resample() and rolling() user :/

Contributor

jreback commented May 18, 2016

@marcomayer no hurry :)

marcomayer commented May 18, 2016

Okay, I've put it on my todo-list ;)

Contributor

jreback commented May 18, 2016

gr8!

jreback modified the milestone: Next Major Release, 0.19.0 Aug 13, 2016

jreback modified the milestone: 0.20.0, Next Major Release Dec 31, 2016

jreback closed this in b2cdc02 Dec 31, 2016
