API: groupby.resample *maybe* can return a deferred operation #12486

Closed
jreback opened this issue Feb 27, 2016 · 0 comments

xref #12448 / #12449, and a related question on Stack Overflow.

In [1]: df = pd.DataFrame({'date': pd.date_range(start='2016-01-01',
   ...:                                          periods=4,
   ...:                                          freq='W'),
   ...:                    'group': [1, 1, 2, 2],
   ...:                    'val': [5, 6, 7, 8]}).set_index('date')

In [2]: df
Out[2]: 
            group  val
date                  
2016-01-03      1    5
2016-01-10      1    6
2016-01-17      2    7
2016-01-24      2    8

This replicates the 0.17.1 behavior (something is slightly off, though: the result includes the grouper column, hence the `[['val']]` selection).

In [3]: df.groupby('group').apply(lambda x: x.resample('1D').ffill())[['val']]
Out[3]: 
                  val
group date           
1     2016-01-03    5
      2016-01-04    5
      2016-01-05    5
      2016-01-06    5
      2016-01-07    5
      2016-01-08    5
      2016-01-09    5
      2016-01-10    6
2     2016-01-17    7
      2016-01-18    7
      2016-01-19    7
      2016-01-20    7
      2016-01-21    7
      2016-01-22    7
      2016-01-23    7
      2016-01-24    8

# Ideally this would work. It's possible, but requires filling intelligently according to each group level.
In [4]: df.groupby('group').resample('1D').ffill()
Out[4]: 
            group  val
date                  
2016-01-03      1    5
2016-01-10      1    6
2016-01-17      2    7
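
For comparison, here is a minimal sketch of what the desired call might look like, assuming a later pandas release in which `groupby(...).resample(...)` returns a deferred resampler (the commit referenced at the bottom of this issue points in that direction). Selecting `val` before resampling is a choice made here to keep the grouper column out of the result; the explicit per-group loop is only included as a cross-check and is not taken from the issue.

```python
import pandas as pd

df = pd.DataFrame({
    "date": pd.date_range(start="2016-01-01", periods=4, freq="W"),
    "group": [1, 1, 2, 2],
    "val": [5, 6, 7, 8],
}).set_index("date")

# Deferred form: resample within each group, then forward-fill.
# Selecting "val" first keeps the grouper column out of the result.
deferred = df.groupby("group")["val"].resample("1D").ffill()

# Explicit per-group equivalent, built by hand for comparison.
explicit = pd.concat(
    {key: grp["val"].resample("1D").ffill() for key, grp in df.groupby("group")},
    names=["group"],
)

print(deferred)
```

Both forms should produce one daily row per group between that group's first and last date, indexed by `(group, date)`.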

A pure asfreq operation

data = [['2010-01-01', 'A', 2], ['2010-01-02', 'A', 3], ['2010-01-05', 'A', 8], 
        ['2010-01-10', 'A', 7], ['2010-01-13', 'A', 3], ['2010-01-01', 'B', 5], 
        ['2010-01-03', 'B', 2], ['2010-01-04', 'B', 1], ['2010-01-11', 'B', 7], 
        ['2010-01-14', 'B', 3]]

df = pd.DataFrame(data, columns=['Date', 'ID', 'Score'])
df.Date = pd.to_datetime(df.Date)

In [27]: df.groupby('ID').apply(lambda x: x.set_index('Date').Score.resample('D').asfreq())
Out[27]: 
ID  Date      
A   2010-01-01    2.0
    2010-01-02    3.0
    2010-01-03    NaN
    2010-01-04    NaN
    2010-01-05    8.0
    2010-01-06    NaN
    2010-01-07    NaN
    2010-01-08    NaN
    2010-01-09    NaN
    2010-01-10    7.0
    2010-01-11    NaN
    2010-01-12    NaN
    2010-01-13    3.0
B   2010-01-01    5.0
    2010-01-02    NaN
    2010-01-03    2.0
    2010-01-04    1.0
    2010-01-05    NaN
    2010-01-06    NaN
    2010-01-07    NaN
    2010-01-08    NaN
    2010-01-09    NaN
    2010-01-10    NaN
    2010-01-11    7.0
    2010-01-12    NaN
    2010-01-13    NaN
    2010-01-14    3.0
Name: Score, dtype: float64

It would be nice for this to work:

df.groupby(['ID',pd.Grouper(key='Date',freq='D')]).asfreq()
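
As a hedged sketch of a close alternative, and again assuming a pandas release where `groupby(...).resample(...)` is supported, the same per-ID `asfreq` result can be requested without `apply`. This uses `set_index('Date')` followed by `groupby('ID')` rather than the `pd.Grouper` spelling proposed above, so it illustrates the intent rather than that exact API.

```python
import pandas as pd

data = [["2010-01-01", "A", 2], ["2010-01-02", "A", 3], ["2010-01-05", "A", 8],
        ["2010-01-10", "A", 7], ["2010-01-13", "A", 3], ["2010-01-01", "B", 5],
        ["2010-01-03", "B", 2], ["2010-01-04", "B", 1], ["2010-01-11", "B", 7],
        ["2010-01-14", "B", 3]]

df = pd.DataFrame(data, columns=["Date", "ID", "Score"])
df["Date"] = pd.to_datetime(df["Date"])

# Resample each ID to daily frequency; days with no observation become NaN.
filled = df.set_index("Date").groupby("ID")["Score"].resample("D").asfreq()

print(filled)
```

Each ID is expanded only over its own date range, so `A` ends on 2010-01-13 while `B` runs through 2010-01-14.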
@jreback jreback added this to the 0.18.1 milestone Feb 27, 2016
@jreback jreback changed the title BUG: groupby.resample *maybe* can return a deferred operation API: groupby.resample *maybe* can return a deferred operation Feb 27, 2016
jreback added a commit to jreback/pandas that referenced this issue Apr 21, 2016
closes pandas-dev#12738

BUG: allow df.groupby(...).resample(...) to return a Resampler groupby object

closes pandas-dev#12486

BUG: consistency of name of returned groupby

closes pandas-dev#12363