DOC: Example of Dataframe given in SeriesGroupBy.transform #49901

Open
1 task done
ramvikrams opened this issue Nov 24, 2022 · 12 comments

@ramvikrams
Contributor

ramvikrams commented Nov 24, 2022

Pandas version checks

  • I have checked that the issue still exists on the latest versions of the docs on main here

Location of the documentation

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.core.groupby.SeriesGroupBy.transform.html

Documentation problem

>>> df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar',
...                           'foo', 'bar'],
...                    'B' : ['one', 'one', 'two', 'three',
...                           'two', 'two'],
...                    'C' : [1, 5, 5, 2, 5, 5],
...                    'D' : [2.0, 5., 8., 1., 2., 9.]})
>>> grouped = df.groupby('A')[['C', 'D']]
>>> grouped.transform(lambda x: (x - x.mean()) / x.std())
          C         D
0 -1.154701 -0.577350
1  0.577350  0.000000
2  0.577350  1.154701
3 -1.154701 -1.000000
4  0.577350 -0.577350
5  0.577350  1.000000

This is the current example in SeriesGroupBy.transform.

Suggested fix for documentation

It should be changed to
grouped: pd.Series = df.groupby('A')[['C', 'D']]
just to make it clearer; as written, the example is hard for first-timers to follow.

@ramvikrams added the Docs and Needs Triage labels Nov 24, 2022
@bang128
Contributor

bang128 commented Nov 29, 2022

@ramvikrams Do you mean we should change >>> grouped = df.groupby('A')[['C', 'D']]
to grouped: pd.Series=df.groupby('A')[['C', 'D']]?
But I think it should be something like grouped = pd.Series(df.groupby('A')[['C', 'D']])

@ramvikrams
Contributor Author

Actually, the doc is correct; we just need to make it more explicit.
By the way, I don't think grouped = pd.Series(df.groupby('A')[['C', 'D']]) is the right syntax.

@bang128
Contributor

bang128 commented Nov 29, 2022

@ramvikrams So can I comment out the suggested fix?
Something like: >>> grouped = df.groupby('A')[['C', 'D']] // grouped: pd.Series = df.groupby('A')[['C', 'D']]

@ramvikrams
Contributor Author

Yes, that seems right, but I'd like a pandas member to confirm first; after that you can continue.

@bang128
Contributor

bang128 commented Nov 29, 2022

Thank you for your clarification!

@MarcoGorelli
Member

grouped wouldn't be of type Series in this example though

In [6]: df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar',
   ...:                           'foo', 'bar'],
   ...:                    'B' : ['one', 'one', 'two', 'three',
   ...:                           'two', 'two'],
   ...:                    'C' : [1, 5, 5, 2, 5, 5],
   ...:                    'D' : [2.0, 5., 8., 1., 2., 9.]})
   ...: grouped = df.groupby('A')[['C', 'D']]

In [7]: grouped
Out[7]: <pandas.core.groupby.generic.DataFrameGroupBy object at 0x7f334eb0aa60>
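
The difference comes from how the columns are selected after groupby. A minimal sketch, assuming a trimmed copy of the frame above (the variable names frame_grouped and series_grouped are illustrative, not from the docs): selecting with a list of columns gives a DataFrameGroupBy, while selecting a single column label gives a SeriesGroupBy.

```python
import pandas as pd

# Trimmed copy of the example frame; column 'B' is not needed here.
df = pd.DataFrame({
    "A": ["foo", "bar", "foo", "bar", "foo", "bar"],
    "C": [1, 5, 5, 2, 5, 5],
    "D": [2.0, 5.0, 8.0, 1.0, 2.0, 9.0],
})

# A list of column labels keeps the selection two-dimensional.
frame_grouped = df.groupby("A")[["C", "D"]]

# A single column label selects one column of each group.
series_grouped = df.groupby("A")["C"]

frame_kind = type(frame_grouped).__name__
series_kind = type(series_grouped).__name__
print(frame_kind, series_kind)
```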

@ramvikrams
Contributor Author

grouped wouldn't be of type Series in this example though

Yes, I understand that, but I feel this example needs to be clearer for a beginner reading it.

@bang128
Contributor

bang128 commented Nov 29, 2022

I will take this issue

@ramvikrams
Contributor Author

ramvikrams commented Nov 29, 2022

But I have a question: should the return type be Series, or is DataFrame fine? In the stubs, the return type of SeriesGroupBy.transform() is Series.
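
One way to check the return types at runtime is sketched below. It assumes a trimmed copy of the example frame and a simple demeaning function chosen only for illustration; the names demean, from_frame_groupby, and from_series_groupby are hypothetical. With this input, transform on the two-column selection returns a DataFrame and transform on the single-column selection returns a Series.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["foo", "bar", "foo", "bar", "foo", "bar"],
    "C": [1, 5, 5, 2, 5, 5],
    "D": [2.0, 5.0, 8.0, 1.0, 2.0, 9.0],
})


def demean(x):
    """Subtract the group mean from every value in the group."""
    return x - x.mean()


# transform on a DataFrameGroupBy (two selected columns)
from_frame_groupby = df.groupby("A")[["C", "D"]].transform(demean)

# transform on a SeriesGroupBy (one selected column)
from_series_groupby = df.groupby("A")["C"].transform(demean)

print(type(from_frame_groupby).__name__)
print(type(from_series_groupby).__name__)
```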

@bang128
Contributor

bang128 commented Nov 29, 2022

I've just looked at the documentation for DataFrame.groupby() and DataFrame.transform(), and I feel that grouped is actually unrelated to pd.Series in this case.

@MarcoGorelli
Member

In that example, grouped is a DataFrameGroupBy. The docs would probably be improved by having an example with SeriesGroupBy instead.
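
A SeriesGroupBy version of the example could look roughly like the sketch below. This is only one possible shape, not the wording adopted in the pandas docs; it reuses the same data and the same standardizing lambda, and the name standardized is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["foo", "bar", "foo", "bar", "foo", "bar"],
    "B": ["one", "one", "two", "three", "two", "two"],
    "C": [1, 5, 5, 2, 5, 5],
    "D": [2.0, 5.0, 8.0, 1.0, 2.0, 9.0],
})

# Selecting the single column 'C' yields a SeriesGroupBy.
grouped = df.groupby("A")["C"]

# Standardize 'C' within each group of 'A'; the result is a Series
# aligned to the original index.
standardized = grouped.transform(lambda x: (x - x.mean()) / x.std())
print(standardized)
```

The values match the C column of the DataFrame output quoted at the top of the issue, since the per-group computation is the same.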

@ramvikrams
Contributor Author

ramvikrams commented Nov 29, 2022

yeah

@MarcoGorelli removed the Needs Triage label Nov 29, 2022
3 participants