DOC: update the pandas.core.groupby.GroupBy.max docstring #20073
…endible dictionary to do the same for other generic numeric operations in module pandas.core.groupby.
pandas/core/groupby.py
Outdated
| """ | ||
|
|
||
| _numeric_operations_examples = dict( | ||
| max="""Examples |
Reviewers: If you think this design pattern is okay, please let me know if I can add examples for the other generic numeric operators in this module (i.e. sum, prod, min, first, last) in the same PR
To make this easier to edit, create _numeric_operations_examples = {} and then add each entry like:
_numeric_operations_examples['max'] = dedent(
.....
)
For groupby examples, the See Also section should reference the Series (or DataFrame) method of the same name. @jorisvandenbossche @datapythonista: this is a generic point.
jorisvandenbossche
left a comment
Nice docstring, thanks!
Added extendible dictionary to do the same for other generic numeric operations in module pandas.core.groupby.
That seems a good idea
If you think this design pattern is okay, please let me know if I can add examples for the other generic numeric operators in this module
I would leave that for a separate PR.
pandas/core/groupby.py
Outdated
| """) | ||
|
|
||
| _numeric_operations_doc_template = """ | ||
| Compute %(f)s of group values. |
We could maybe make a dict with "full" names (max -> the maximum, prod -> the product, ..), to give the first sentence a nicer read
Also wondering, would "of each group" be clearer than "of group values" ?
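The two suggestions above could be combined roughly as follows. This is only a sketch: the names `_full_names`, `_summary_template`, and `summary_line` are hypothetical and do not appear in the PR.

```python
# Hypothetical mapping from method name to a readable phrase.
_full_names = {
    'max': 'the maximum',
    'min': 'the minimum',
    'sum': 'the sum',
    'prod': 'the product',
}

# First sentence of the shared template, using the "of each group" wording.
_summary_template = "Compute %(desc)s of each group."


def summary_line(name):
    """Return the first docstring sentence for the operation `name`."""
    return _summary_template % {'desc': _full_names[name]}


print(summary_line('max'))   # Compute the maximum of each group.
print(summary_line('prod'))  # Compute the product of each group.
```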
| pandas.Series.%(name)s: groupby method of Series | ||
| pandas.DataFrame.%(name)s: groupby method of DataFrame | ||
| pandas.Panel.%(name)s: groupby method of Panel""" | ||
|
|
as @jreback mentioned, can you add here pandas.Series/DataFrame.max as well? (I think it will be using %(f)s)
Maybe put those first
| Parameters | ||
| ---------- | ||
| kwargs : dict | ||
| Optional keyword arguments to pass to `%(f)s`. |
Ideally we would list the valid keywords here. Looking at where min/max/sum etc. are created, the only ones are min_count and numeric_only.
Codecov Report
@@ Coverage Diff @@
## master #20073 +/- ##
==========================================
+ Coverage 91.79% 91.8% +<.01%
==========================================
Files 152 152
Lines 49205 49208 +3
==========================================
+ Hits 45169 45174 +5
+ Misses 4036 4034 -2
pandas/core/groupby.py
Outdated
| _numeric_operations_doc_template = """ | ||
| Compute %(f)s of group values. | ||
| For multiple groupings, the result index will be a |
groupings -> groupers
pandas/core/groupby.py
Outdated
| -------- | ||
| Grouping by one column. | ||
| >>> df = pd.DataFrame({'A': 'a b b b'.split(), 'B': [1,2,2,3], 'C': [4,5,6,7]}) |
The conventions for examples in the guidelines state:
For more complex examples (grouping, for example), avoid using data without interpretation, like a matrix of random numbers with columns A, B, C, D… Instead, use a meaningful example.
Therefore, this example should be improved.
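As an illustration of what a more meaningful example might look like, here is a small sketch. The data (bird names and speeds) is made up for this purpose and is not taken from the PR.

```python
import pandas as pd

# Hypothetical data: recorded top speeds (km/h) for two kinds of birds.
df = pd.DataFrame({
    'animal': ['falcon', 'falcon', 'parrot', 'parrot'],
    'max_speed': [380.0, 370.0, 24.0, 26.0],
})

# Maximum recorded speed within each animal group.
fastest = df.groupby('animal')['max_speed'].max()
print(fastest)
```

Column names that carry meaning let the reader check the result by eye: each group's value is the largest speed recorded for that animal.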
@shivam6294 do you have time to update the PR based on the feedback?
Hey @jorisvandenbossche. I'm working on this right now.
…scription more readable. Used dedent where applicable.
| kwargs : dict | ||
| Optional keyword arguments to pass to `%(f)s`. | ||
| `numeric_only` : bool |
Couldn't find an example in the docs where kwargs were listed out, so I've used the same convention used for listing out normal parameters.
jorisvandenbossche
left a comment
Nice updates! Thanks
Added a few more comments.
| Optional keyword arguments to pass to `%(f)s`. | ||
| `numeric_only` : bool | ||
| Include only float, int, boolean columns. |
I think it would be fine to list them as normal parameters (not inside the description of 'kwargs').
I am just not sure whether min_count is used for all of them. I think it is only used for sum and prod.
So I am not sure what the best approach is here. Ideally this would also be substituted into the template, but that might get complicated.
(given that this can get complicated, it is also fine to leave this for a separate issue/PR, and not solve it here directly)
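One possible shape for the per-operation substitution discussed above, as a rough sketch only. The names `_param_docs`, `_accepted_params`, and `parameters_section` are hypothetical, and the table of which operation accepts which keyword simply follows the reviewer's guess that `min_count` applies only to sum and prod.

```python
# Parameter descriptions keyed by parameter name (wording is illustrative).
_param_docs = {
    'numeric_only': (
        "numeric_only : bool\n"
        "    Include only float, int, boolean columns."
    ),
    'min_count': (
        "min_count : int\n"
        "    The required number of valid values to perform the operation."
    ),
}

# Assumption taken from the review comment: only sum and prod use min_count.
_accepted_params = {
    'max': ['numeric_only'],
    'min': ['numeric_only'],
    'sum': ['numeric_only', 'min_count'],
    'prod': ['numeric_only', 'min_count'],
}


def parameters_section(name):
    """Build the Parameters section for the operation `name`."""
    entries = "\n".join(_param_docs[p] for p in _accepted_params[name])
    return "Parameters\n----------\n" + entries


print(parameters_section('sum'))
```

The resulting string could then be substituted into the shared template next to the operation name, so each method documents only the keywords it accepts.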
| ) | ||
|
|
||
| _numeric_operations_see_also = dedent( | ||
| """See Also |
You need to have the "See Also" on the next line, otherwise dedent does not work.
If that leaves one blank line too many, I think you can do:
"""\
See Also
...
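The behaviour described above can be checked with a short sketch using `textwrap.dedent`. The variable names and the See Also entry are chosen only for illustration.

```python
from textwrap import dedent

# Text starts on the same line as the opening quotes: that first line has no
# indentation, so the common margin is empty and nothing is stripped.
broken = dedent("""See Also
    --------
    pandas.Series.max : compute max of values
    """)

# A backslash after the opening quotes suppresses the initial newline while
# letting the text start on the next, indented line, so dedent works.
fixed = dedent("""\
    See Also
    --------
    pandas.Series.max : compute max of values
    """)

print(repr(broken.splitlines()[1]))  # '    --------'
print(repr(fixed.splitlines()[1]))   # '--------'
```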
| pandas.DataFrame.%(f)s: compute %(f)s of values | ||
| pandas.Series.%(name)s: groupby method of Series | ||
| pandas.DataFrame.%(name)s: groupby method of DataFrame | ||
| pandas.Panel.%(name)s: groupby method of Panel""" |
you can leave out Panel (it is deprecated)
Thanks for the review @jorisvandenbossche 👍 I'll create a new issue for the complicated template substitution. I had a question about the Parameters - listing the parameters like this: results in errors when I run the Is this alright? Also, should I still leave
Closing as stale. If you'd like to continue, please ping.
Added extendible dictionary to do the same for other generic numeric operations in module pandas.core.groupby.
Checklist for the pandas documentation sprint (ignore this if you are doing
an unrelated PR):
scripts/validate_docstrings.py pandas.core.groupby.GroupBy.max
git diff upstream/master -u -- "*.py" | flake8 --diff
python doc/make.py --single pandas.core.groupby.GroupBy.max
Please include the output of the validation script below between the "```" ticks:
If the validation script still gives errors, but you think there is a good reason
to deviate in this case (and there are certainly such cases), please state this
explicitly.