Breaking examples due to resample refactor #12448

Closed
jorisvandenbossche opened this issue Feb 25, 2016 · 2 comments
Labels
API Design Resample resample method
Milestone

Comments

@jorisvandenbossche
Member

While using master a bit, I discovered some more cases where the new resample API breaks things:

  • Plotting. .plot is now a dedicated groupby/resample method (it adds each group to the plot individually), but I think it is a very common idiom to quickly resample a time series and plot it with (old API) e.g. s.resample('D').plot().
    Example with master:

    In [1]: s = pd.Series(np.random.randn(60), index=pd.date_range('2016-01-01', periods=60, freq='1min'))
    
    In [3]: s.resample('15min').plot()
    Out[3]:
    2016-01-01 00:00:00    Axes(0.125,0.1;0.775x0.8)
    2016-01-01 00:15:00    Axes(0.125,0.1;0.775x0.8)
    2016-01-01 00:30:00    Axes(0.125,0.1;0.775x0.8)
    2016-01-01 00:45:00    Axes(0.125,0.1;0.775x0.8)
    Freq: 15T, dtype: object
    

    [image: figure_1]

    while previously it would just have given you one continuous line.
    This one can be solved, I think, by special-casing plot for resample (not making it a special groupby-like method, but letting it warn and pass the resample().mean() result to Series.plot(), like the 'deprecated_valids').

  • Calling a method on the resample result that is now also a valid Resampler method. E.g. s.resample(freq).min() would previously have given you the "minimum daily average", while now it gives you the "minimum per day".
    This one is more difficult, or even impossible, to solve, I think. You could detect that case if you knew it was old code, but you cannot distinguish it from perfectly valid code written for the new API. If we can't solve it, I think it deserves a mention in the whatsnew explanation.

  • Using resample on a groupby object (xref #12202, "Resampling converts int to float, but only in group by"). Using the example from that issue, with 0.17.1 you get:

    In [1]: df = pd.DataFrame({'date': pd.date_range(start='2016-01-01', periods=4, freq='W'),
       ...:                    'group': [1, 1, 2, 2],
       ...:                    'val': [5, 6, 7, 8]})
    
    In [2]: df.set_index('date', inplace=True)
    
    In [3]: df
    Out[3]:
                group  val
    date
    2016-01-03      1    5
    2016-01-10      1    6
    2016-01-17      2    7
    2016-01-24      2    8
    
    In [4]: df.groupby('group').resample('1D', fill_method='ffill')
    Out[4]:
                      val
    group date
    1     2016-01-03    5
          2016-01-04    5
          2016-01-05    5
          2016-01-06    5
          2016-01-07    5
          2016-01-08    5
          2016-01-09    5
          2016-01-10    6
    2     2016-01-17    7
          2016-01-18    7
          2016-01-19    7
          2016-01-20    7
          2016-01-21    7
          2016-01-22    7
          2016-01-23    7
          2016-01-24    8
    
    In [5]: pd.__version__
    Out[5]: u'0.17.1'
    

    while with master you get:

    In [29]: df.groupby('group').resample('1D', fill_method='ffill')
    Out[29]: <pandas.core.groupby.DataFrameGroupBy object at 0x0000000009BA73C8>
    

    which will give different results or errors in further operations on it. Also, this case does not raise a FutureWarning, which it should, as the user needs to adapt the code to groupby().resample('D').ffill().
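
For reference, the groupby case ported to the new API might look like the sketch below. It assumes a pandas version with the deferred Resampler API, and it selects the `val` column explicitly (a choice made here, not part of the original example) so that the result does not depend on how the grouping column is handled.

```python
import pandas as pd

# Same data as in the 0.17.1 example above
df = pd.DataFrame(
    {
        "date": pd.date_range(start="2016-01-01", periods=4, freq="W"),
        "group": [1, 1, 2, 2],
        "val": [5, 6, 7, 8],
    }
).set_index("date")

# New-style spelling: resample within each group, then forward-fill explicitly
# instead of passing fill_method='ffill' to resample()
filled = df.groupby("group")["val"].resample("D").ffill()
print(filled)
```

The result is a Series indexed by (group, date) rather than the lazy groupby object shown above.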

@jreback jreback added API Design Resample resample method labels Feb 25, 2016
@jreback jreback added this to the 0.18.0 milestone Feb 25, 2016
@jreback
Contributor

jreback commented Feb 25, 2016

ok

  1. just need to define .plot(...) on the Resampler to actually call .mean().plot(...) (and have a nice warning message)

similar to (or better than) this:

In [5]: s.resample('15min',how='sum').plot()
/Users/jreback/miniconda/bin/ipython:1: FutureWarning: how in .resample() is deprecated
the new syntax is .resample(...).sum()
  #!/bin/bash /Users/jreback/miniconda/bin/python.app
Out[5]: <matplotlib.axes._subplots.AxesSubplot at 0x11bd82450>

  3. need to provide warnings here as well (this is handled by .groupby(...).resample(...), which actually calls things but hits a different path)
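
A rough sketch of the fallback idea for plotting, outside of pandas itself: warn, then hand the old implicit mean to the plotting call. The helper name `mean_for_plot` and the warning text are hypothetical and only illustrate the behaviour; they are not the actual pandas implementation.

```python
import warnings

import numpy as np
import pandas as pd


def mean_for_plot(resampler):
    """Hypothetical helper: warn, then fall back to the old implicit mean."""
    warnings.warn(
        ".resample(...).plot() is deprecated, "
        "use .resample(...).mean().plot() instead",  # illustrative wording
        FutureWarning,
        stacklevel=2,
    )
    return resampler.mean()


s = pd.Series(
    np.random.randn(60),
    index=pd.date_range("2016-01-01", periods=60, freq="1min"),
)

# Record the warning so the example does not depend on the warning filters
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    resampled = mean_for_plot(s.resample("15min"))

# resampled.plot() would then draw one continuous line (requires matplotlib)
```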

@jreback
Contributor

jreback commented Feb 26, 2016

  2. I think you are right, we have to document this. It should break code loudly though, as the previous API would return a scalar, whereas this will return a Series.
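
To illustrate the scalar-versus-Series difference, a small sketch assuming the new API (the variable names are made up for the example):

```python
import numpy as np
import pandas as pd

s = pd.Series(
    np.random.randn(60),
    index=pd.date_range("2016-01-01", periods=60, freq="1min"),
)

# Old meaning of s.resample('15min').min(): the minimum of the per-bin means,
# which is a single scalar
min_of_means = s.resample("15min").mean().min()

# New meaning: the minimum within each 15-minute bin, which is a Series
min_per_bin = s.resample("15min").min()

print(type(min_of_means).__name__, type(min_per_bin).__name__)
```

Old code that treated the result as a number will therefore get a Series instead, which is why most downstream uses should fail visibly rather than silently.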

jreback added a commit to jreback/pandas that referenced this issue Mar 8, 2016
make sure .resample(...).plot() warns and returns a correct plotting object
make sure that .groupby(...).resample(....) is hitting warnings when appropriate

closes pandas-dev#12448
@jreback jreback closed this as completed in 14cf67f Mar 8, 2016