New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Numpy conventions #226

Merged
merged 23 commits into from Jan 20, 2017

Conversation

Projects
None yet
5 participants
@shacharoo
Copy link
Member

shacharoo commented Dec 24, 2016

No description provided.


@expect(_D213)
@expect("D408: Section underline should be in the line following the "
"section\'s name ('Returns')")

This comment has been minimized.

@Nurdok

Nurdok Dec 28, 2016

Member

No need to escape the single quote.

This comment has been minimized.

@shacharoo

shacharoo Dec 31, 2016

Member

That is, in fact, true.

'{0!r}')
D409 = D4xx.create_error('D409', 'Section underline should match the length '
'of its name',
'len({0!r}) == {1!r}, got {2!r} dashes')

This comment has been minimized.

@Nurdok

Nurdok Dec 28, 2016

Member

Maybe phrase this better without the len?

This comment has been minimized.

@shacharoo

shacharoo Dec 31, 2016

Member

Rephrased to "Expected x dashes in section 'y', got z".

For example, if `line` is " Hello world!!!", returns "Hello world".
"""
result = re("[A-Za-z ]+").match(line.strip())

This comment has been minimized.

@Nurdok

Nurdok Dec 28, 2016

Member

This is a bit simplistic. What about accents (Über) or other languages?

This comment has been minimized.

@shacharoo

shacharoo Dec 31, 2016

Member

The numpy convention only allows the section names specified above, no accents there. I could change the method's name or documentation if it will be clearer.

This is done by checking the following conditions:
* Does the current line has a suffix after the suspected section name?
* Is the previous line not empty?

This comment has been minimized.

@Nurdok

Nurdok Dec 28, 2016

Member

If you only consider a line as a section name if the previous line is empty, when do you return D411?

This comment has been minimized.

@shacharoo

shacharoo Dec 31, 2016

Member

I changed the conditions a bit, it is clearer now.


return not (section_name_suffix != '' and not
prev_line_ends_with_punctuation and not
context.previous_line.strip() == '')

This comment has been minimized.

@Nurdok

Nurdok Dec 28, 2016

Member

There are about 3 layers of negation here. I had to write it down to understand (I hope) what's happening. How about:

is_section = (not section_name_suffix or 
              prev_line_ends_with_punctuation or
              is_blank(context.previous_line)

return is_section

Also, the docstring should be clearer that all conditions should be satisfied for the line to not be considered a section name.

This comment has been minimized.

@shacharoo
next_non_empty_line_offset = 0

for line in context.following_lines:
line_set = ''.join(set(line.strip()))

This comment has been minimized.

@Nurdok

Nurdok Dec 28, 2016

Member

Notice that you can use is_blank the check if a line contains only whitespace.

This comment has been minimized.

@shacharoo

shacharoo Dec 31, 2016

Member

I used it anywhere i tested that. This line is not the case, i want to see that it has only dashes in it.

yield violations.D214(capitalized_section)

suffix = context.line.strip().lstrip(context.section_name)
if suffix != '':

This comment has been minimized.

@Nurdok

Nurdok Dec 28, 2016

Member

Can be if suffix

This comment has been minimized.

@shacharoo

line_after_dashes = \
context.following_lines[next_non_empty_line_offset + 1]
if line_after_dashes.strip() != '':

This comment has been minimized.

@Nurdok

Nurdok Dec 28, 2016

Member

Can be is_blank(line_after_dashes).

This comment has been minimized.

@shacharoo
suspected_section_indices = [i for i, line in enumerate(lines) if
_suspected_as_section(line)]

context = namedtuple('SectionContext', ('section_name',

This comment has been minimized.

@Nurdok

Nurdok Dec 28, 2016

Member

When using namedtuple, the assigned name should be the name of the type, e.g.:

SectionContext = namedtuple('SectionContext', ...)

This comment has been minimized.

@shacharoo
@@ -434,7 +434,8 @@ def test_illegal_convention(env):
_, err, code = env.invoke('--convention=illegal_conv')
assert code == 2, err
assert "Illegal convention 'illegal_conv'." in err
assert 'Possible conventions: pep257' in err
assert 'Possible conventions' in err
assert 'pep257' in err

This comment has been minimized.

@Nurdok

Nurdok Dec 28, 2016

Member

Test that numpy is also there?

This comment has been minimized.

@shacharoo
@Nurdok

This comment has been minimized.

Copy link
Member

Nurdok commented Dec 29, 2016

General comment: I would like to take the functions that deal with the numpy sections out to a different file. Any ideas on how to perform that separation?

@shacharoo

This comment has been minimized.

Copy link
Member

shacharoo commented Dec 31, 2016

I thought about splitting it to 2 parts:

  • A check engine - the part that iterates the errors (the check_source method in the checker class)
  • Convention checkers - a class for each convention that we support.

Thing is, pydocstyle doesn't check specificly PEP257 or Numpy - there are error codes that are not in either, such as D203, D212, and D213. This means that in that design we will have to have an auxiliary class that will hold the methods that check those error codes.

@Nurdok , your opinion?

@Nurdok

Nurdok approved these changes Jan 14, 2017

@Nurdok

This comment has been minimized.

Copy link
Member

Nurdok commented Jan 14, 2017

@farmersez I think having these in one file is good enough for now.
@sigmavirus24, this is a big change, do you have time to take a look before I merge this?

@shacharoo

This comment has been minimized.

Copy link
Member

shacharoo commented Jan 14, 2017

I still haven't updated the documentation, working on it as we speak.

@Nurdok

Nurdok approved these changes Jan 20, 2017

@Nurdok Nurdok merged commit 633031b into PyCQA:master Jan 20, 2017

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@shacharoo shacharoo deleted the shacharoo:numpy-conventions branch Jan 20, 2017

@cdeil

This comment has been minimized.

Copy link

cdeil commented Jan 22, 2017

@farmersez - Thank you very much!

I still haven't updated the documentation, working on it as we speak.

Having some docs for this would be great!
Still, I'll try it out now and see if it gives useful results for projects I'm interested in.
Did you try running it on Numpy?

@jnothman

This comment has been minimized.

Copy link

jnothman commented Jan 23, 2017

This looks interesting, @farmersez. However, in practice what we need in scikit-learn and other numpydoc users are checks for parameter listings:

  • parameters, their type and their description has quite a particular formatting
  • scikit-learn seems to often include a blank line between parameter descriptions (I wish we didn't), in contrast with other numpydoc users.
  • it would be really valuable to be able to compare a docstring parameter list with an expected parameter list (and indeed to patch one into the other), but perhaps this isn't directly the role of the linter.
  • it's a real pity we can't get line numbers within the docstring, although I realise that in some cases this is impossible.
@thisch

This comment has been minimized.

Copy link

thisch commented Apr 18, 2017

ping @farmersez. Can you comment on the previous two comments?

@shacharoo

This comment has been minimized.

Copy link
Member

shacharoo commented Apr 19, 2017

Sorry for the really really really late response 😊

@cdeil, I ran it on the numpy example file (after you caught the D410 bug 😋), not on the whole project or projects that use numpy.

And to follow up @jnothman 's remarks:

  • Currently, pydocstyle only checks that the section exists and that the header is formatted correctly (for any type of section).
  • That's not a problem, you could use convention=numpy with add-ignore=D410 in your project config file.
  • Agreed.
  • Agreed, I'd be happy to see actual line numbers myself.

When I started the numpy validation feature I tried to do all that - pass function parameters to the docstring validation code and check its structure, but in my opinion it got too entangled and was not worth the trouble at the time. I do intend (and/or encourage you) to add actual content validation to sections.

Having said that, the validation I'm talking about is strictly for numpy. If your project has a different documentation style for sections perhaps we should consider a way to allow external validation code running within pydocstyle, though I don't believe that's a thing worth doing. @Nurdok? entry points maybe? 😉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment