select() within a function closure not working as agg function #1423

Closed
dalejung opened this issue Jun 7, 2012 · 2 comments

@dalejung
commented Jun 7, 2012

I'm running into a weird issue with groupby and a function closure. For some reason, an agg function built as a closure doesn't work unless it accesses the grouped series. In agg_before below, I added a fix flag that does nothing except index into the data var.

from datetime import datetime

from pandas import *
import numpy as np

# 1000 five-minute bars starting 2012-01-01; both columns count 0..999
periods = 1000
ind = DatetimeIndex(start='2012/1/1', freq='5min', periods=periods)
df = DataFrame({'high': np.arange(periods), 'low': np.arange(periods)}, index=ind)

def agg_before(hour, func, fix=False):
    """
        Run an aggregate func on the subset of data.
    """
    # NOTE: the cutoff below is hard-coded to 11; the hour argument is not used.
    def _func(data):
        d = data.select(lambda x: x.hour < 11).dropna()
        if fix:
            # Workaround: merely indexing into the group changes the result
            data[data.index[0]]
        if len(d) == 0:
            return None
        return func(d)
    return _func

# Same logic as a plain module-level function (no closure)
def afunc(data):
    d = data.select(lambda x: x.hour < 11).dropna()
    return np.max(d)

# One group per calendar day
grouped = df.groupby(lambda x: datetime(x.year, x.month, x.day))

closure_bad = grouped.agg({'high': agg_before(11, np.max)})
closure_good = grouped.agg({'high': agg_before(11, np.max, True)})
lambda_good = grouped.agg({'high': afunc})

In [33]: np.__version__
Out[39]: '1.6.2'

In [34]: pandas.__version__
Out[34]: '0.8.0.dev-dc6ce90'

In [35]: closure_bad
Out[35]: 
            high
2012-01-01   131
2012-01-02   NaN
2012-01-03   NaN
2012-01-04   NaN

In [36]: closure_good
Out[36]: 
            high
2012-01-01   131
2012-01-02   419
2012-01-03   707
2012-01-04   995

In [37]: lambda_good
Out[37]: 
            high
2012-01-01   131
2012-01-02   419
2012-01-03   707
2012-01-04   995

Running an agg function that isn't a closure works fine. Any ideas on this?

@wesm

commented Jun 11, 2012

Hey @dalejung, thanks for tracking this down and for the test case. I found the issue and it's been fixed; the fix will be in 0.8.0.

@dalejung

commented Jun 11, 2012

@wesm np. Was definitely a fun one to stumble across.
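
For anyone re-running this report on a current pandas: the `Series.select` method and the `DatetimeIndex(start=..., periods=...)` constructor used in the original test case were removed in later pandas releases. The sketch below is one possible modern equivalent, not the code from the thread. It assumes `pd.date_range` for the index, a boolean mask on `data.index.hour` in place of `select`, and grouping on `df.index.normalize()` for the per-day key; it also passes `hour` through to the filter, which the original did not do. It only shows the closure-based aggregation itself and says nothing about what the 0.8.0 fix changed.

```python
import numpy as np
import pandas as pd

# Same data as the report: 1000 five-minute bars, values 0..999
periods = 1000
ind = pd.date_range(start="2012-01-01", freq="5min", periods=periods)
df = pd.DataFrame({"high": np.arange(periods), "low": np.arange(periods)}, index=ind)


def agg_before(hour, func):
    """Return an aggregator that applies func to the rows before `hour`."""
    def _func(data):
        # Boolean mask on the index replaces the removed Series.select
        d = data[data.index.hour < hour].dropna()
        if len(d) == 0:
            return np.nan
        return func(d)
    return _func


# Group by calendar day (midnight-normalized timestamps)
grouped = df.groupby(df.index.normalize())
result = grouped.agg({"high": agg_before(11, np.max)})
print(result)
```

With this data, every day should give the same per-day maxima as the closure_good and lambda_good outputs in the report (131, 419, 707, 995).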
