D401 implementation is very naive #68

Closed
adiroiban opened this Issue Apr 12, 2014 · 10 comments

adiroiban commented Apr 12, 2014

I have the following docstring.

    @property
    def file_path(self):
        '''Alias for consistency with the rest of pocketlint.'''

The error is

     142: D401: First line should be imperative: 'Alia', not 'Alias'

While the intent is noble, I think that the current implementation is not helpful.

Is there anyone using this check in production code?

If a check cannot be done reliably, I would prefer not to automate it at all, as it is not fun to maintain a list of ignored values.

Thanks!
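For readers following along, the error above is consistent with a check that simply strips a trailing "s" from the first word. The following is a minimal sketch of such a heuristic, reconstructed from the error message alone; the function name `naive_d401` and the "ss" exemption are assumptions, not the project's actual code.

```python
def naive_d401(docstring):
    """Return an error message if the first word ends in a single 's'."""
    words = docstring.strip().split()
    if not words:
        return None
    first = words[0]
    # Assumption: any word ending in one 's' is treated as a
    # third-person verb ("Returns"), so the 's' is chopped off.
    if first.endswith("s") and not first.endswith("ss"):
        return "D401: First line should be imperative: %r, not %r" % (
            first[:-1],
            first,
        )
    return None


print(naive_d401("Alias for consistency with the rest of pocketlint."))
print(naive_d401("Return the path."))
```

Under these assumptions the first call reproduces the "'Alia', not 'Alias'" message, because the check never asks whether the word is a verb in the first place.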

jacebrowning commented Aug 12, 2014

I have found this rule useful, but perhaps it should not apply to properties. All other method docstrings should be imperative (and I can't really think of any false positives), but property docstrings are typically not going to be imperative unless you do something like "Get the value of....".
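One way to exempt properties, sketched here with the standard `ast` module, is to skip any function decorated with `@property` before the imperative check runs. The helper names `is_property` and `functions_needing_d401` and the sample `Report` class are hypothetical, and this only recognizes the plain `@property` decorator, not setters or custom descriptor decorators.

```python
import ast

SOURCE = '''
class Report:
    @property
    def file_path(self):
        """Alias for consistency with the rest of pocketlint."""

    def save(self):
        """Write the report to disk."""
'''


def is_property(func):
    """Return True if the function has a bare @property decorator."""
    return any(
        isinstance(dec, ast.Name) and dec.id == "property"
        for dec in func.decorator_list
    )


def functions_needing_d401(source):
    """List names of functions that the imperative check should visit."""
    tree = ast.parse(source)
    return [
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        and not is_property(node)
    ]


print(functions_needing_d401(SOURCE))
```

Here only `save` is left for the check, so the `file_path` docstring from the original report would no longer be flagged.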

sigmavirus24 commented Aug 13, 2014

Actually, this might read better as """An alias for consistency with the rest of pocketlint."""

jacebrowning commented Dec 19, 2014

I found another good example of a false positive: """Focus the application."""

Perhaps pep257 could have a list of acceptable words that end in s?

sigmavirus24 commented Dec 21, 2014

A whitelist would be unmaintainable in my opinion @jacebrowning

glennmatthews commented Mar 25, 2015

It could be interesting to investigate using NLTK (http://www.nltk.org/) here, although that would almost certainly be massive overkill.

Nurdok commented Mar 25, 2015

lordmauve commented Oct 27, 2015

This could be fixed by listing imperative forms of verbs and comparing with, say, a Porter stemmer (assuming English).

If the stemmed form of the word is in the stemmed list of verbs, then it's probably a verb. But if the unstemmed form is not in the unstemmed list, it's not in the imperative form.

This would have a much lower false positive rate than the current heuristic, but a somewhat larger false negative rate, depending on the size of the word list. But I'm guessing that relatively small lists of verbs would cover a high percentage of docstrings, due to the constrained nature of the language used (outside of domain-specific jargon).

We could generate a reasonably good word list by scanning docstrings of a few largish Python projects and manually removing words that are not imperative verbs.
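The stem-and-compare idea above could look roughly like the sketch below. It is only an illustration: `crude_stem` is a deliberately rough stand-in for a real Porter stemmer, and the tiny `IMPERATIVE_VERBS` list and the function names are made up for the example.

```python
# Hypothetical, deliberately tiny list of verbs in imperative form.
IMPERATIVE_VERBS = {"return", "get", "set", "create", "run", "focus", "check"}


def crude_stem(word):
    """Rough stand-in for a Porter stemmer: strip a few common suffixes."""
    word = word.lower()
    for suffix in ("ing", "es", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    if word.endswith("e") and len(word) > 3:
        word = word[:-1]
    return word


# Map each stem back to the imperative form it came from.
STEMMED_VERBS = {crude_stem(verb): verb for verb in IMPERATIVE_VERBS}


def check_imperative(docstring):
    """Return an error message, or None if there is nothing to report."""
    words = docstring.strip().split()
    if not words:
        return None
    first = words[0].strip(".,:;").lower()
    if first in IMPERATIVE_VERBS:
        return None  # already in imperative form
    suggestion = STEMMED_VERBS.get(crude_stem(first))
    if suggestion is None:
        return None  # not a known verb, so stay silent
    return "D401: First line should be imperative: %r, not %r" % (
        suggestion,
        first,
    )


print(check_imperative("Returns the median."))
print(check_imperative("Alias for consistency with the rest of pocketlint."))
print(check_imperative("Focus the application."))
```

With this sketch "Returns" is flagged with the suggestion "return", while "Alias" and "Focus" pass: "alias" does not stem to any known verb, and "focus" is itself in the list. Verbs missing from the list are never flagged, which is the higher false negative rate mentioned above.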

lordmauve commented Oct 27, 2015

Also, a full grammar analysis is out of scope, so we could never tell whether a docstring like this one is imperative:

    def median(xs):
        """Given a list of numbers xs, return the median."""

We could only flag up words that are verbs but not in an imperative form.

lordmauve commented Oct 27, 2015

I started scanning my codebase for the verbs we use here. We could also add a blacklist approach - there are words that are very much indicators of non-imperative docstrings.
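A blacklist check of this kind might be sketched as follows. The word list is hypothetical and chosen only for illustration, as is the function name; it is not the list from the referenced commit.

```python
# Hypothetical words that suggest the first line is descriptive
# rather than imperative.
BLACKLIST = {"this", "the", "constructor", "helper"}


def blacklisted_first_word(docstring):
    """Return the first word if it is blacklisted, else None."""
    words = docstring.strip().split()
    if not words:
        return None
    first = words[0].strip(".,:;").lower()
    return first if first in BLACKLIST else None


print(blacklisted_first_word("This function returns the median."))
print(blacklisted_first_word("Given a list of numbers xs, return the median."))
```

The first docstring is caught because it opens with "This". The `median` example from the earlier comment still passes, since "Given" is neither blacklisted nor something a word list alone can judge.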

lordmauve added a commit to lordmauve/pydocstyle that referenced this issue Oct 28, 2015

Incomplete work to improve imperative heuristic
Use a stemmer and word list to check for imperative mood in a number of
common verbs.

Also use a blacklist of words that indicate a phrase not in imperative
mood.

PyCQA#68
lordmauve commented Aug 1, 2018

Can this issue be closed now that #235 is merged?

Nurdok closed this Aug 2, 2018
