API: implement __finalize__ for resample et al. #5862

Closed
JackKelly opened this issue Jan 6, 2014 · 7 comments · Fixed by #5942

@JackKelly
Contributor

I'm really salivating at the chance to use the ._metadata and __finalize__ mechanisms in pandas-0.13-dev to create my own subclass of DataFrame and to have metadata propagate after calling functions inherited from DataFrame like dropna(), resample() etc.

Using the latest 0.13-dev version of Pandas, I think I might have bumped into a small bug (although I'm not sure if this is a bug or not??):

._metadata propagates after calling .resample(rule='D') and dropna() but not after calling .resample(rule='D', how='max').

More details, including the full code of my subclass, are given under the "experiments" heading of this issue: nilmtk/nilmtk#83

@jreback
Contributor

jreback commented Jan 6, 2014

Can you put up a small reproducible example?

Actually, I'm surprised it works at all with resample (__finalize__ has to be explicitly called, and I don't think that was put in).
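
To illustrate what "has to be explicitly called" means, here is a minimal pure-Python sketch. It is a toy model, not pandas code: the `Frame` class, its `units` attribute, and the `drop_none` and `maximum` methods are hypothetical names made up for illustration. It assumes only the general idea discussed in this thread, namely that `__finalize__` copies the attributes named in `_metadata` from the source object onto the result.

```python
class Frame:
    """Toy stand-in for a DataFrame; not the pandas implementation."""

    # Names of instance attributes that __finalize__ should carry over.
    _metadata = ["units"]

    def __init__(self, values, units=None):
        self.values = list(values)
        self.units = units

    def __finalize__(self, other):
        # Copy each attribute listed in _metadata from `other` onto self.
        for name in self._metadata:
            setattr(self, name, getattr(other, name, None))
        return self

    def drop_none(self):
        # This method opts in: it finalizes the new object against self.
        kept = [v for v in self.values if v is not None]
        return Frame(kept).__finalize__(self)

    def maximum(self):
        # This method never calls __finalize__, so the metadata is lost.
        return Frame([max(v for v in self.values if v is not None)])


frame = Frame([1, None, 3], units="watts")
cleaned = frame.drop_none()
peak = frame.maximum()

print(cleaned.units)  # metadata carried over by the explicit call
print(peak.units)     # falls back to the constructor default
```

In this sketch, whether metadata survives depends entirely on whether the individual method remembers to call `__finalize__`, which is the situation described for resample above.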

@ghost ghost assigned jreback Jan 6, 2014
@JackKelly
Contributor Author

Sure, here's the small example:

import pandas as pd

class ElectricDataFrame(pd.DataFrame):
    def __init__(self, *args, **kwargs):
        super(ElectricDataFrame, self).__init__(*args, **kwargs)
        self._metadata = ['test']

    @property
    def _constructor(self):
        print("_constructor called")
        return ElectricDataFrame

edf = ElectricDataFrame([1,2,3,4,5], pd.date_range('2010', freq='D', periods=5))

Interactive:

In [5]: pd.__version__
Out[5]: '0.13.0-75-g7d9e9fa'

In [6]: edf._metadata
Out[6]: ['test']

In [7]: edf.dropna()._metadata
_constructor called
Out[7]: ['test']

In [8]: edf.resample(rule='D')._metadata
_constructor called
Out[8]: ['test']

In [9]: edf.resample(rule='D', how='max')._metadata
Out[9]: []

In [10]: edf.resample(rule='D', how='min')._metadata
Out[10]: []

In [11]: edf.resample(rule='D', how='mean')._metadata
Out[11]: []

@jreback
Contributor

jreback commented Jan 6, 2014

Thanks. I think that since it's already daily data and how is defaulted, it's actually returning the same frame (that's why it works). This is not implemented (though trivial to do); will do for 0.13.1.
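
A rough pure-Python sketch of that explanation, for readers following along. Everything here is a toy model and an assumption: `Frame`, `TaggedFrame`, `resample_like`, and `REDUCERS` are hypothetical names, and the short-circuit is only a guess at the behaviour described, not the actual pandas resample code.

```python
# Hypothetical reducers standing in for the `how` argument.
REDUCERS = {
    "max": max,
    "min": min,
    "mean": lambda xs: sum(xs) / len(xs),
}


class Frame:
    """Toy base class; its default _metadata is empty."""

    _metadata = []

    def __init__(self, values, freq):
        self.values = list(values)
        self.freq = freq

    def resample_like(self, rule, how=None):
        # Assumed short-circuit: same frequency and no aggregation
        # requested, so hand back the very same object.
        if how is None and rule == self.freq:
            return self
        # Otherwise build a brand-new base-class object. Nothing is
        # finalized here, so subclass metadata is not carried over.
        reducer = REDUCERS[how or "mean"]
        return Frame([reducer(self.values)], rule)


class TaggedFrame(Frame):
    """Toy subclass that sets _metadata per instance, as in the example."""

    def __init__(self, values, freq):
        super().__init__(values, freq)
        self._metadata = ["test"]


tf = TaggedFrame([1, 2, 3, 4, 5], "D")
same = tf.resample_like("D")
agg = tf.resample_like("D", how="max")

print(same is tf)       # the short-circuit returned the input itself
print(same._metadata)   # so the instance attribute is still there
print(agg._metadata)    # new base-class object, class default applies
```

Under these assumptions the metadata only appears to propagate in the first call because no new object is created at all.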

@JackKelly
Contributor Author

since it's already daily data and how is defaulted, it's actually returning the same frame (that's why it works)

ah, interesting!

This is not implemented (though trivial to do); will do for 0.13.1

wonderful, thanks loads ;)

@jreback
Contributor

jreback commented Jan 6, 2014

you could easily do it in your sub though...

class ElectricDataFrame(pd.DataFrame):

    def resample(self, *args, **kwargs):
        # Resample as usual, then explicitly propagate metadata from self.
        return super(ElectricDataFrame, self).resample(*args, **kwargs).__finalize__(self)

@JackKelly
Contributor Author

you could easily do it in your sub though...

Good point, thank you

@JackKelly
Contributor Author

awesome; thank you!
