groupby.Transform Aggregation into Collections returns individual elements (unless added or string concatenated) #18084

Open
yvanaquino opened this issue Nov 2, 2017 · 3 comments
Labels: Apply (Apply, Aggregate, Transform); Bug; Groupby; Nested Data (data where the values are collections: lists, sets, dicts, objects, etc.)

Comments

@yvanaquino

import pandas as pd

dat = {
    'user': ['Aaron', 'Aaron', 'Bob', 'Bob', 'Cindy', 'Cindy'],
    'likes': ['Apples', 'Bananas', 'Cherries', 'Dates', 'eggs', 'fruit'],
    'vals': [1, 2, 3, 4, 5, 6]
}
df = pd.DataFrame(dat)

# These don't work: each returns a like-indexed result holding the individual elements rather than the per-group list.
df.groupby('user')['likes'].transform(list) 
df.groupby('user')['likes'].transform(lambda s: list(s))
df.groupby('user')['likes'].transform(pd.Series.tolist)
df.groupby('user')['likes'].transform(lambda s: pd.Series.tolist(s))


# This works, but ideally it shouldn't require this much effort. There seems to be a problem with transform and iterables in the new API (I think I remember this working in 0.18...)

from ast import literal_eval

def pd_xform_list(_iter):
    # Serialize the group's values to a string so transform broadcasts it as a scalar.
    return str(list(_iter))

df['groups'] = df.groupby('user')['likes'].transform(pd_xform_list)

# Parse each string back into a real list.
df['groups'] = df['groups'].apply(literal_eval)
df

When using groupby.transform to gather each group's values into a collection, the like-indexed result contains the individual elements instead of the collection, so a workaround that converts the list to a string and back with literal evaluation is needed. Collecting into a list works as expected with custom aggregation (groupby.agg), but not with groupby.transform.

Regards,

gfyoung added the Groupby label Nov 3, 2017
@gfyoung
Member

gfyoung commented Nov 3, 2017

@yvanaquino: Thanks for reporting this! I'm a little unclear about your issue. Are you just requesting that we return a collection instead of individual elements when calling .transform? I ask because I think using .agg would easily solve your problem, as you noted above.
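
For readers following along, the .agg route mentioned here can be sketched roughly as below. This is only an illustration, not something posted in the thread: the name `likes_per_user` is made up, and the `map` step is one assumed way to broadcast the per-group lists back onto the original rows.

```python
import pandas as pd

df = pd.DataFrame({
    'user': ['Aaron', 'Aaron', 'Bob', 'Bob', 'Cindy', 'Cindy'],
    'likes': ['Apples', 'Bananas', 'Cherries', 'Dates', 'eggs', 'fruit'],
})

# Aggregate once per group: one list per user, indexed by user.
# (likes_per_user is a hypothetical name used only for this sketch.)
likes_per_user = df.groupby('user')['likes'].agg(list)

# Broadcast back to the original rows by looking up each row's user.
df['groups'] = df['user'].map(likes_per_user)
```

Rows that share a user end up referencing the same list object, so copy the lists first if they will be mutated per row.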

@yvanaquino
Author

Maybe I'm misunderstanding how to use DataFrame.transform() - shouldn't it return a broadcast version of your aggregation as a like-indexed NDFrame?

In this example, I have to convert my list to a string and then apply literal evaluation from the ast module to convert it back into an object.

Is there a way to return a list from transform without resorting to numerical aggregation or string concatenation? I.e., can it handle returning collections?
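
The asymmetry described in this thread can be reproduced with a small sketch like the one below. It only illustrates the reported behavior and is not an explanation confirmed by the maintainers; the variable names `joined` and `spread` are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({
    'user': ['Aaron', 'Aaron', 'Bob', 'Bob', 'Cindy', 'Cindy'],
    'likes': ['Apples', 'Bananas', 'Cherries', 'Dates', 'eggs', 'fruit'],
})
grouped = df.groupby('user')['likes']

# A scalar return value (here a single concatenated string) is
# broadcast to every row of its group.
joined = grouped.transform(lambda s: ', '.join(s))

# A list with one entry per row of the group is read as per-row
# values, so the elements are spread back out instead of being
# kept together as one collection.
spread = grouped.transform(list)
```

Here `joined` repeats the concatenated string on every row of a group, while `spread` simply reproduces the original `likes` column, which matches the behavior reported above.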

jbrockmendel added the Apply label Oct 22, 2019
mroeschke added the Bug label Jun 28, 2020
jbrockmendel added the Nested Data label Sep 22, 2020
@javierquincke