Missing GL06 and GL07 errors for invalid docstrings #26307

Closed
1 of 4 tasks
Scowley4 opened this issue May 7, 2019 · 5 comments
Labels
Code Style (code style, linting, code_checks), Docs

Comments

@Scowley4
Contributor

Scowley4 commented May 7, 2019

As discovered in the process of working on #26301, there are invalid docstrings that should throw either GL06 ("Found unknown section") or GL07 ("Sections are in the wrong order") errors, but currently do not.

Without too much investigation, it's not clear why GL06 is not being thrown. Currently unsure if this is even a problem because, as @datapythonista mentioned in comments on #26301, we may not even run these checks on the private methods.

The GL07 errors are missing because a line was only treated as a section header if the following line was an underline made of the same number of hyphens (-----).

This code is found here (where content[0] is the section header):

if (len(content) > 1
        and len(content[0]) == len(content[1])
        and set(content[1]) == {'-'}):
    sections.append(content[0])
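
To make the effect of that condition concrete, here is a minimal, self-contained sketch. The function name `detected_headers` and the way the docstring is split into blank-line-separated blocks are assumptions made purely for illustration; this is not the actual validation code.

```python
def detected_headers(docstring):
    """Return the header lines that pass the check quoted above.

    Hypothetical helper: blocks are separated by blank lines, and
    content[0] is the candidate section header of each block.
    """
    sections = []
    for block in docstring.strip().split("\n\n"):
        content = [line.strip() for line in block.splitlines()]
        if (len(content) > 1
                and len(content[0]) == len(content[1])
                and set(content[1]) == {"-"}):
            sections.append(content[0])
    return sections


doc = """
Parameters
----------
x : int

Returns
-----
int
"""

# "Returns" has 7 characters but only 5 hyphens under it,
# so it is silently skipped rather than reported.
print(detected_headers(doc))  # ['Parameters']
```

Because "Returns" never makes it into `sections`, any later check on section order has nothing to compare, which is why GL07 stays silent.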

  • Reorder the sections that are failing in #26301 (DOC Fix inconsistencies) and merge
  • Fix GL07 to correctly throw errors with invalid underlines
  • Investigate GL06
  • Add error for valid section title words with invalid number of underlines

If it's okay with others, I'd like to take a crack at this issue.

@datapythonista
Member

Not sure if I understand correctly, but to be clear on what I meant:

  • I think it's correct that GL06 unknown section is not raised when the docstrings are private. Great to fix them, but I don't think we want to validate private docstrings at all.
  • The GL07 wrong section order check is working as expected. The sections were not detected before, so the order check wasn't failing because some sections weren't there. Now that the section underlines are correct, the CI reports the error correctly. We just need to fix the order of the sections.

The only problem in the validation, as far as I know, is that we don't validate sections whose underline has the wrong number of hyphens. We can do that by adding a new error, GL10 - Section header not detected, and raising it if a line of the docstring contains only a section name (Examples, Parameters...) but that section is not present in the list of sections returned by numpydoc.

@Scowley4
Contributor Author

Scowley4 commented May 7, 2019

GL06 - I won't worry about validation not catching these errors then, though I've fixed several of them while I was at it.

GL07 - This makes sense. Because they weren't detected as sections, they weren't detected as being in the wrong order.

I can definitely add something that checks to make sure we have the right number of hyphens in the case that the words are valid section title words.

On the other hand, I wonder if we should be checking for words that are close to the valid section words. For example, something like this:

Return
------

would be caught because it's not a valid word and it has the correct number of hyphens.

But something like

Return
----

would not be caught: the word isn't valid and the underline has the wrong number of hyphens, so (per the code above) it wouldn't even be identified as a section title.

Maybe we still throw an error if the section title is within some edit distance of a correct one and it has a line with only hyphens under it? Other ideas?
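
As a rough sketch of that idea, the snippet below flags a title that sits above a hyphen-only line and is close to, but not equal to, a known section name. It uses the standard library's `difflib.get_close_matches`, which is based on a similarity ratio rather than a true edit distance. The function name `near_miss_sections`, the list of section names, and the `cutoff` value are all assumptions for illustration.

```python
import difflib

# Assumed subset of section titles, for illustration only.
KNOWN_SECTIONS = ["Parameters", "Returns", "Yields", "Raises",
                  "See Also", "Notes", "Examples"]


def near_miss_sections(docstring, cutoff=0.8):
    """Return (title, suggestion) pairs for likely misspelled headers.

    Hypothetical helper: a line counts as a candidate header if the
    next line consists only of hyphens, whatever its length.
    """
    lines = [line.strip() for line in docstring.splitlines()]
    found = []
    for title, underline in zip(lines, lines[1:]):
        if not underline or set(underline) != {"-"}:
            continue  # next line is not a hyphen-only underline
        if title in KNOWN_SECTIONS:
            continue  # exact names are handled by other checks
        matches = difflib.get_close_matches(
            title, KNOWN_SECTIONS, n=1, cutoff=cutoff)
        if matches:
            found.append((title, matches[0]))
    return found


print(near_miss_sections("Return\n----\nint\n"))  # [('Return', 'Returns')]
```

A similarity cutoff like this can produce false positives on ordinary prose that happens to sit above a row of hyphens, so the threshold would need tuning against real docstrings.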

@datapythonista
Member

I agree with what you say. I'd ignore your second example; the solution would overcomplicate things. Your first example is already being checked in public docstrings, and we don't want to validate private ones at this point (it would be easy, and it's good to have everything neat, but given the amount of work left on the public docs it's better to stay focused).

So, the only case I'd add is:

Returns
-----

I'd just go through the docstring line by line, check whether each line is a section name, and then check whether numpydoc detected it as a section. That's quite simple, and once it's implemented we can easily see in the CI whether there are false positives, or what errors this simple approach detects.
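
A minimal sketch of that line-by-line check might look like the following. To stay self-contained it does not call numpydoc: the sections the parser detected are passed in as a plain list. The function name `undetected_section_headers`, the `SECTION_NAMES` set, and the shape of the returned tuples are assumptions for illustration; only the GL10 code and its message come from the proposal above.

```python
# Section titles assumed for illustration; the real check would use
# whatever list the validator already treats as allowed sections.
SECTION_NAMES = {
    "Parameters", "Returns", "Yields", "Other Parameters", "Raises",
    "Warns", "Warnings", "See Also", "Notes", "References", "Examples",
    "Attributes", "Methods",
}


def undetected_section_headers(docstring, detected_sections):
    """Report section names that appear alone on a line but were not
    detected as sections by the parser.

    detected_sections stands in for the list of section titles
    returned by numpydoc (hypothetical input for this sketch).
    """
    errors = []
    for lineno, line in enumerate(docstring.splitlines(), start=1):
        name = line.strip()
        if name in SECTION_NAMES and name not in detected_sections:
            errors.append(
                (lineno, "GL10", f"Section header not detected: {name}"))
    return errors


doc = """Summary.

Parameters
----------
x : int

Returns
-----
int
"""

# Only "Parameters" was recognised, since "Returns" has a short underline.
for error in undetected_section_headers(doc, ["Parameters"]):
    print(error)
```

This deliberately ignores misspelled titles such as "Return": it only reports names that are valid but whose header was missed, which keeps the check simple and makes false positives easy to spot in the CI.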

@gfyoung added the Docs and Code Style labels on May 8, 2019
@jbrockmendel
Member

@datapythonista GL06 and GL07 look like they're checked for in code_checks; is this issue still relevant?

@datapythonista
Member

I think this can be closed. The only pending point should be implemented in numpydoc, if there is interest.
