API: implement __finalize__ for resample et al. #5862

Closed
JackKelly opened this issue Jan 6, 2014 · 7 comments · Fixed by #5942

Comments

@JackKelly

commented Jan 6, 2014

I'm really salivating at the chance to use the ._metadata and __finalize__ mechanisms in pandas-0.13-dev to create my own subclass of DataFrame and to have metadata propagate after calling functions inherited from DataFrame like dropna(), resample() etc.

Using the latest 0.13-dev version of pandas, I think I may have hit a small bug (although I'm not sure whether it is one):

._metadata propagates after calling .resample(rule='D') and dropna() but not after calling .resample(rule='D', how='max').

More details, including the full code of my subclass, are given under the "experiments" heading of this issue: nilmtk/nilmtk#83

@jreback

commented Jan 6, 2014

Can you put up a small reproducible example?

Actually, I'm surprised it works at all with resample (__finalize__ has to be called explicitly, and I don't think that was put in).
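
To illustrate the point about `__finalize__` needing an explicit call, here is a minimal pure-Python sketch. It is a toy model, not pandas code: `ToyFrame`, `units`, and `resample_max` are hypothetical names invented for illustration, and the `__finalize__` body only approximates the idea of copying every attribute listed in `_metadata` from the source object to the result.

```python
class ToyFrame:
    """Toy stand-in for a DataFrame; NOT pandas code."""

    # Names of the attributes that __finalize__ should carry over.
    _metadata = ["units"]

    def __init__(self, values, units=None):
        self.values = list(values)
        self.units = units

    def __finalize__(self, other):
        # Copy each attribute named in _metadata from `other` onto self.
        for name in self._metadata:
            setattr(self, name, getattr(other, name, None))
        return self

    def dropna(self):
        # This method calls __finalize__ on its result, so metadata survives.
        kept = [v for v in self.values if v is not None]
        return ToyFrame(kept).__finalize__(self)

    def resample_max(self):
        # This method builds a fresh result and never calls __finalize__,
        # so the result keeps the default (empty) metadata.
        present = [v for v in self.values if v is not None]
        return ToyFrame([max(present)])


frame = ToyFrame([1, None, 3], units="watts")
print(frame.dropna().units)        # watts
print(frame.resample_max().units)  # None
```

In this toy, whether metadata propagates depends only on whether the method that builds the result remembers to call `__finalize__`, which is the kind of gap described above for `resample`.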

@ghost ghost assigned jreback Jan 6, 2014

@JackKelly

commented Jan 6, 2014

Sure, here's the small example:

import pandas as pd

class ElectricDataFrame(pd.DataFrame):
    def __init__(self, *args, **kwargs):
        super(ElectricDataFrame, self).__init__(*args, **kwargs)
        self._metadata = ['test']

    @property
    def _constructor(self):
        print("_constructor called")
        return ElectricDataFrame

edf = ElectricDataFrame([1,2,3,4,5], pd.date_range('2010', freq='D', periods=5))

Interactive:

In [5]: pd.__version__
Out[5]: '0.13.0-75-g7d9e9fa'

In [6]: edf._metadata
Out[6]: ['test']

In [7]: edf.dropna()._metadata
_constructor called
Out[7]: ['test']

In [8]: edf.resample(rule='D')._metadata
_constructor called
Out[8]: ['test']

In [9]: edf.resample(rule='D', how='max')._metadata
Out[9]: []

In [10]: edf.resample(rule='D', how='min')._metadata
Out[10]: []

In [11]: edf.resample(rule='D', how='mean')._metadata
Out[11]: []
@jreback

commented Jan 6, 2014

Thanks. I think that since it's daily data already and how is defaulted, it's actually returning the same frame (that's why it works). Propagating metadata through resample is not implemented (though trivial to do); will do for 0.13.1.

@JackKelly

commented Jan 6, 2014

> since it's daily data already and how is defaulted, it's actually returning the same frame (that's why it works)

ah, interesting!

> This is not implemented (though trivial to do); will do for 0.13.1

wonderful, thanks loads ;)

@jreback

commented Jan 6, 2014

You could easily do it in your subclass, though:

class ElectricDataFrame(pd.DataFrame):
    def resample(self, *args, **kwargs):
        return super(ElectricDataFrame, self).resample(*args, **kwargs).__finalize__(self)
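
The override pattern above can be sketched without pandas as follows. Everything here is a toy under stated assumptions: `ToyFrame`, `ElectricToyFrame`, and `meter_id` are hypothetical names, and the toy `resample` simply returns a plain base-class object to mimic the behaviour reported in this issue. Because the toy `__finalize__` reads the list of names from the result object, the sketch re-wraps the result in the subclass before finalizing; whether real pandas needs that extra step is not confirmed here.

```python
class ToyFrame:
    """Toy base class standing in for pd.DataFrame; NOT pandas code."""

    _metadata = []  # the base class carries no metadata

    def __init__(self, values):
        self.values = list(values)

    def __finalize__(self, other):
        # Copy every attribute named in self._metadata from `other`.
        for name in self._metadata:
            setattr(self, name, getattr(other, name, None))
        return self

    def resample(self, how="mean"):
        # Mimics the reported behaviour: the result is a plain ToyFrame
        # and __finalize__ is never called on it.
        if how == "max":
            return ToyFrame([max(self.values)])
        return ToyFrame([sum(self.values) / len(self.values)])


class ElectricToyFrame(ToyFrame):
    _metadata = ["meter_id"]  # hypothetical metadata attribute

    def __init__(self, values, meter_id=None):
        super().__init__(values)
        self.meter_id = meter_id

    def resample(self, *args, **kwargs):
        result = super().resample(*args, **kwargs)
        # Re-wrap in the subclass, then copy metadata across from self.
        return ElectricToyFrame(result.values).__finalize__(self)


edf = ElectricToyFrame([1, 5, 3], meter_id="kitchen")
out = edf.resample(how="max")
print(type(out).__name__, out.values, out.meter_id)  # ElectricToyFrame [5] kitchen
```

The design choice is to leave the inherited method untouched and patch only its return value, so the subclass keeps working if the parent later starts calling `__finalize__` itself.
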
@JackKelly

commented Jan 6, 2014

> you could easily do it in your subclass though...

Good point, thank you

@JackKelly

commented Jan 15, 2014

awesome; thank you!
