Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DOC Fix inconsistencies #26301

Merged

Conversation

Scowley4
Copy link
Contributor

@Scowley4 Scowley4 commented May 6, 2019

Fix a few inconsistencies in the docstrings.

Not sure how pandas community feels about these kinds of fixes (and this is my first PR to pandas). I just notice things that are different from how they occur most of the time and decided to fix them up.

Examples

Add 's' to headers

Example

to

Examples

Correct number of hyphens under headers

Yields
-------

to

Yields
------

Rewrite testcase in standard format

In [1]: c = pd.Categorical(list('aabca'))

to

>>> c = pd.Categorical(list('aabca'))

@codecov
Copy link

codecov bot commented May 7, 2019

Codecov Report

Merging #26301 into master will decrease coverage by <.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #26301      +/-   ##
==========================================
- Coverage   91.98%   91.98%   -0.01%     
==========================================
  Files         175      175              
  Lines       52374    52374              
==========================================
- Hits        48178    48174       -4     
- Misses       4196     4200       +4
Flag Coverage Δ
#multiple 90.53% <ø> (ø) ⬆️
#single 40.72% <ø> (-0.15%) ⬇️
Impacted Files Coverage Δ
pandas/core/arrays/base.py 99.43% <ø> (ø) ⬆️
pandas/core/indexes/multi.py 95.62% <ø> (ø) ⬆️
pandas/core/indexes/category.py 98.61% <ø> (ø) ⬆️
pandas/core/arrays/datetimes.py 97.79% <ø> (ø) ⬆️
pandas/core/ops.py 91.82% <ø> (ø) ⬆️
pandas/core/indexes/period.py 92.1% <ø> (ø) ⬆️
pandas/core/arrays/categorical.py 95.96% <ø> (ø) ⬆️
pandas/io/excel/_util.py 87.32% <ø> (ø) ⬆️
pandas/core/strings.py 98.86% <ø> (ø) ⬆️
pandas/core/internals/blocks.py 94.07% <ø> (ø) ⬆️
... and 14 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 94c8c94...dad78f2. Read the comment docs.

@codecov
Copy link

codecov bot commented May 7, 2019

Codecov Report

Merging #26301 into master will decrease coverage by <.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #26301      +/-   ##
==========================================
- Coverage   92.04%   92.03%   -0.01%     
==========================================
  Files         175      175              
  Lines       52301    52301              
==========================================
- Hits        48141    48137       -4     
- Misses       4160     4164       +4
Flag Coverage Δ
#multiple 90.59% <ø> (ø) ⬆️
#single 40.73% <ø> (-0.15%) ⬇️
Impacted Files Coverage Δ
pandas/core/arrays/base.py 99.43% <ø> (ø) ⬆️
pandas/core/indexes/multi.py 95.62% <ø> (ø) ⬆️
pandas/core/indexes/category.py 98.61% <ø> (ø) ⬆️
pandas/core/arrays/datetimes.py 97.79% <ø> (ø) ⬆️
pandas/core/ops.py 94.69% <ø> (ø) ⬆️
pandas/core/indexes/period.py 92.1% <ø> (ø) ⬆️
pandas/core/groupby/groupby.py 97.24% <ø> (ø) ⬆️
pandas/core/arrays/categorical.py 95.96% <ø> (ø) ⬆️
pandas/io/excel/_util.py 87.32% <ø> (ø) ⬆️
pandas/core/strings.py 98.86% <ø> (ø) ⬆️
... and 19 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7bd7fab...77c5a3e. Read the comment docs.

@jreback jreback added the Docs label May 7, 2019
@jreback
Copy link
Contributor

jreback commented May 7, 2019

@WillAyd @datapythonista doesn't the doc-string validator catch the invalid underlines?

@WillAyd
Copy link
Member

WillAyd commented May 7, 2019

I don't see anything currently for underlines. Worth taking a look at; not sure how numpydoc would even parse that

I am surprised some of these weren't raising GL06 for "unknown section". This is definitely part of the CI and should be raised when the section is called say "Example" instead of "Examples"

@Scowley4
Copy link
Contributor Author

Scowley4 commented May 7, 2019

@jreback @WillAyd

I'm not sure why these weren't caught before but it looks like these fixes allowed the Docstring validation test to see that the docstring sections were in the wrong order. Maybe the ones that were out of order and had incorrect number of underlines were not be considered sections before, so they were not considered out of order.

I'm also not sure where I would look at the Docstring validation to confirm this is what is happening... but should I go ahead and reorder them as part of this pr as well?

@WillAyd
Copy link
Member

WillAyd commented May 7, 2019

Yea might as well reorder to get rid of any and all failures. FWIW the validator is located in scripts/validate_docstrings.py if you wanted to see how it works

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thank for taking care of all those @Scowley4

Not sure if there is something wrong with the validation of these. GL06 is being checked in the CI, and should report unknown headers like Example, Return.... But I think all the fixed ones are private, and I think we don't run the validation over them.

Having too many hyphens it does happen in public docstrings. In this case I think our validation gets the docstring already parsed from numpydoc, and I'm not sure if numpydoc is happy when there are more or less hyphens.

@Scowley4 do you want to do some research, and open an issue proposing what you think it's best?

@datapythonista
Copy link
Member

Just saw that the CI is red, the problem is that the fixed sections are now detected, and they are in the wrong position with respect to the other sections. See:

pandas/core/indexes/multi.py(122,): error GL07: pandas.MultiIndex: Sections are in the wrong order. Correct order is: Parameters, Attributes, Methods, See Also, Notes, Examples
pandas/core/indexes/multi.py(2347,): error GL07: pandas.MultiIndex.get_loc: Sections are in the wrong order. Correct order is: Parameters, Returns, See Also, Notes, Examples
pandas/core/indexes/multi.py(2447,): error GL07: pandas.MultiIndex.get_loc_level: Sections are in the wrong order. Correct order is: Parameters, Returns, See Also, Examples
pandas/core/indexes/datetimes.py(95,): error GL07: pandas.DatetimeIndex: Sections are in the wrong order. Correct order is: Parameters, Attributes, Methods, See Also, Notes
pandas/core/indexes/period.py(73,): error GL07: pandas.PeriodIndex: Sections are in the wrong order. Correct order is: Parameters, Attributes, Methods, See Also, Notes, Examples
Bash exited with code '5'.

So, looks like numpydoc is "ignoring" (not considering header sections) the ones with the wrong hyphens, which I think makes sense. I don't think we can detect wrong section names with wrong hyphens automatically, but we could add a validation that looks for the correct section names in the docstring, and checks if the section exists (if the section has been detected by numpydoc). I think if we check that the name of the section exists in the docstring in a single line, we shouldn't have false positives.

For example, we can have in the notes Parameters will be added in the future... and we don't want to fail because we found the word Parameters without the section Parameters. But if in a line just Parameters is found, we probably should fail.

@Scowley4 if you want to work on it, can you open an issue please? Otherwise I'll create it myself so someone else can work on it.

@Scowley4
Copy link
Contributor Author

Scowley4 commented May 7, 2019

@WillAyd Thanks for pointing me to the testing code.

@datapythonista Totally agree with the "ignoring section header" thing and have opened an issue about it that I would like to work on. #26307

@datapythonista
Copy link
Member

Changes look good, but CI is red. Not sure what's the problem, may be you just need to merge master, the error seems unrelated to the docstring changes.

@Scowley4
Copy link
Contributor Author

Scowley4 commented May 7, 2019

@datapythonista Was trying to figure out what the error was. Wondering if I accidentally started down a rabbit hole (One fix exposes more problems and more and more...).

Can you be a little more explicit about what you suggest I do? Did you mean for me to merge into my master branch?
(Sorry, still pretty new to working in opensource stuff)

@datapythonista
Copy link
Member

in you branch just do git fetch upstream && git merge upstream/master

this will let in your branch your changes, but applied to the last version of pandas, not the version when you created the branch

@Scowley4
Copy link
Contributor Author

Scowley4 commented May 7, 2019

Same error. Maybe now that the sections are actually sections and in the correct order, there is something wrong inside one and that is throwing the error. I haven't found the fix yet, though...

@Scowley4
Copy link
Contributor Author

Scowley4 commented May 7, 2019

As far as I can tell, the only place to_panel appears is here:

@property
def _constructor_expanddim(self):
"""Used when a manipulation result has one higher dimension as the
original, such as Series.to_frame() and DataFrame.to_panel()
"""
raise NotImplementedError

@Scowley4
Copy link
Contributor Author

Scowley4 commented May 7, 2019

I guess the above is just in pandas/pandas.

In the full repository:

generated/pandas.DataFrame.to_panel,../reference/api/pandas.DataFrame.to_panel

https://github.com/pandas-dev/pandas/blob/master/doc/source/reference/frame.rst

And two places in doc/source/whatsnew (v0.7.0 and v0.19.0)


It looks like this method was removed recently: #25047

@WillAyd
Copy link
Member

WillAyd commented May 7, 2019

@Scowley4 try merging master issue should have been fixed in #26308

Copy link
Member

@gfyoung gfyoung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@datapythonista @WillAyd : if you'd like to double check before merging

@WillAyd
Copy link
Member

WillAyd commented May 8, 2019

@Scowley4 just had another PR go in which conflicts here - could you fix that up? Otherwise lgtm

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping on green

@datapythonista datapythonista merged commit 23b6115 into pandas-dev:master May 8, 2019
@datapythonista
Copy link
Member

thanks @Scowley4, nice clean up

@Scowley4 Scowley4 deleted the scowley4-fixInconsistencies branch May 8, 2019 16:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants