groupby().apply() returns different result depends on the first result is None or not. #12824

Closed
ruoyu0088 opened this issue Apr 8, 2016 · 1 comment
Labels
Bug Groupby Reshaping Concat, Merge/Join, Stack/Unstack, Explode

Comments

@ruoyu0088

The documentation for apply says:

apply can act as a reducer, transformer, or filter function, depending on exactly what is passed to apply. So depending on the path taken, and exactly what you are grouping. Thus the grouped columns(s) may be included in the output as well as set the indices.

So I expected it to discard a group when the callback function returns None. Here are two examples; the first works and the second does not:

import pandas as pd
import numpy as np

# Case 1: the first group (B == 1) has three rows, so the first result is a DataFrame.
df = pd.DataFrame({"A": np.arange(10), "B": [1, 1, 1, 2, 3, 3, 3, 4, 4, 4]})
print(df.groupby("B").apply(lambda x: None if x.shape[0] <= 2 else x.iloc[[0, -1]]))

# Case 2: the first group (B == 1) has one row, so the first result is None.
df = pd.DataFrame({"A": np.arange(10), "B": [1, 2, 2, 2, 3, 3, 3, 4, 4, 4]})
print(df.groupby("B").apply(lambda x: None if x.shape[0] <= 2 else x.iloc[[0, -1]]))

The output is as follows. The first call returns a DataFrame; the second returns a Series with DataFrames inside:

     A  B
B        
1 0  0  1
  2  2  1
3 4  4  3
  6  6  3
4 7  7  4
  9  9  4


B
1         A    B
1  NaN  NaN
3  NaN  NaN
2                   A  B
1  1  2
3  3  2
3                   A  B
4  4  3
6  6  3
4                   A  B
7  7  4
9  9  4
dtype: object

The problem is that _wrap_applied_output() in groupby.py checks the first element to determine the concat method:

    if isinstance(values[0], DataFrame):
        return self._concat_objects(keys, values,
                                    not_indexed_same=not_indexed_same)

I want to know, does groupby().apply() support discarding groups?
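
In the meantime, one way to sidestep the None path entirely is to drop the small groups with filter() before taking rows from each group. This is only a sketch of a workaround, not how apply itself behaves; the helper name first_and_last_of_large_groups and its min_rows parameter are made up for illustration.

```python
import numpy as np
import pandas as pd


def first_and_last_of_large_groups(df, key="B", min_rows=3):
    """Hypothetical helper: first and last row of every group with >= min_rows rows."""
    # filter() removes whole groups, so no callback ever has to return None.
    kept = df.groupby(key).filter(lambda g: len(g) >= min_rows)
    if kept.empty:
        # pd.concat raises on an empty list, so return the empty frame as-is.
        return kept
    # Iterate over the remaining groups and stack the selected rows.
    return pd.concat([g.iloc[[0, -1]] for _, g in kept.groupby(key)])


df_a = pd.DataFrame({"A": np.arange(10), "B": [1, 1, 1, 2, 3, 3, 3, 4, 4, 4]})
df_b = pd.DataFrame({"A": np.arange(10), "B": [1, 2, 2, 2, 3, 3, 3, 4, 4, 4]})

result_a = first_and_last_of_large_groups(df_a)
result_b = first_and_last_of_large_groups(df_b)
print(result_a)
print(result_b)
```

Both calls return a DataFrame regardless of which group comes first. Unlike the apply output above, the result keeps the original row index instead of a (B, row) MultiIndex.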

@jreback
Contributor

jreback commented Apr 8, 2016

Yeah, that probably should look at the non-None values to infer what is going on. This is already done in the other _wrap_applied_output for NDFrame, so you can copy the same pattern.

Want to do a pull request?
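
For illustration, the "look at the non-Nones" idea could be sketched roughly as below. This is a standalone toy, not the pandas internals: first_non_none and combine_results are hypothetical names, and the real _wrap_applied_output handles many more cases.

```python
import pandas as pd


def first_non_none(values):
    """Return the first element that is not None, or None if there is none."""
    return next((v for v in values if v is not None), None)


def combine_results(keys, values):
    """Hypothetical stand-in for the dispatch step: infer the type from non-None results."""
    probe = first_non_none(values)
    if probe is None:
        # Every group was discarded.
        return pd.DataFrame()
    if isinstance(probe, pd.DataFrame):
        # Drop the None results together with their keys, then concatenate.
        pairs = [(k, v) for k, v in zip(keys, values) if v is not None]
        kept_keys = [k for k, _ in pairs]
        frames = [v for _, v in pairs]
        return pd.concat(frames, keys=kept_keys)
    # Fallback for non-DataFrame results: keep them as objects in a Series.
    return pd.Series(values, index=keys)


frame_2 = pd.DataFrame({"A": [1, 3], "B": [2, 2]}, index=[1, 3])
frame_3 = pd.DataFrame({"A": [4, 6], "B": [3, 3]}, index=[4, 6])

# The first result is None, yet the output is still a concatenated DataFrame.
combined = combine_results([1, 2, 3], [None, frame_2, frame_3])
print(combined)
```

The point is only that the isinstance check runs against the first non-None value rather than values[0], so the order of the groups no longer changes the shape of the result.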
