BUG: DataFrameGroupBy.transform with axis=1 fails #36308

Closed
rhshadrach opened this issue Sep 12, 2020 · 8 comments · Fixed by #36350

@rhshadrach
Member

The following seems to fail for all groupby transform kernels:

from pandas import DataFrame

df = DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
df.groupby([1, 1], axis=1).transform("shift")

results in ValueError: Length mismatch: Expected axis has 2 elements, new values have 3 elements.

I think the issue is in pandas.core.groupby.generic._wrap_transformed_output; this method does not take self.axis into account when wrapping the output. When axis=1, all that needs to happen there is to transpose the result (result = result.T) and to use the index labels rather than the column labels for the columns.
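The transposition idea can be illustrated from the user side. This is only a minimal sketch of the concept, not the patch that went into #36350: it transposes the frame, groups along the default axis, applies the shift, and transposes back. The key name "g" is an arbitrary placeholder, and the keys are passed as a NumPy array as a precaution so they cannot be mistaken for column labels of the transposed frame.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# One group key per original column; "g" is an arbitrary placeholder label.
keys = np.array(["g", "g"])

# Transpose so the original columns become rows, group along the default
# axis, shift within each group, then transpose back.
result = df.T.groupby(keys).shift().T

print(result)
```

Because both columns fall into the same group, column A has nothing to its left and becomes NaN, while column B receives the values that were in A.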

rhshadrach added this to the Contributions Welcome milestone Sep 12, 2020
@ethanator
Contributor

Could I work on this issue?

@xxsacxx

xxsacxx commented Sep 12, 2020

I tried the same example with:
df.groupby(['val'] * df.shape[1], axis=1).transform('shift')
which failed with
Shape of passed values is (2, 3), indices imply (3, 3)

while
df.groupby(['val'] * df.shape[1], axis=1).transform(lambda x: x.shift())

gives:
     A    B
0  NaN  1.0
1  NaN  2.0
2  NaN  3.0

What output are we expecting here?

@rhshadrach
Member Author

@ethanator Absolutely! Simply comment "take" here and GitHub will assign the issue to you. If you encounter any difficulties, feel free to reach out here.

@xxsacxx Your second result looks correct to me; the values are shifted one column to the right, and since there are no values to the left of the first column, you get NaN. The presence of NaN then coerces the dtype to float.
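As a rough illustration of that expected behavior, the same frame can be shifted across columns without any grouping. With a single group spanning every column, this should correspond to the output described above. It is a sketch only; it assumes the df from the original report.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# Shift values one position to the right across columns.
shifted = df.shift(axis=1)

print(shifted)
# The first column has no left neighbor, so it is filled with NaN,
# which cannot be stored in an integer column.
print(shifted.dtypes)
```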

@ethanator
Contributor

take

@purusharthmalik

take

@aks861999

take

@tannu-singh

take

@adhilcodes

take
