Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new checker len-as-condition #1154

Merged
merged 2 commits into from Nov 30, 2016

Conversation

@atodorov
Copy link
Contributor

commented Nov 1, 2016

This new checker implements the following statement from PEP8:

For sequences, (strings, lists, tuples), use the fact that empty sequences are false.

Yes: if not seq:
     if seq:

No: if len(seq):
    if not len(seq):

This is still a work in progress mainly around lines 76-79 when we have a Compare statement. ATM I'm not handling statements like 0 < len(S) < 5 or 0 != len(S), etc. Advice is welcome on how to handle this and what sort of error conditions to consider valid.

Background: Recently I've been working with a code base which has lots of if len(something) > 0 or if len(something) != 0 and figured I'll write this patch.

@atodorov atodorov force-pushed the MrSenko:len_checkers branch from bd52740 to 65e04ff Nov 2, 2016

@atodorov

This comment has been minimized.

Copy link
Contributor Author

commented Nov 2, 2016

NOTE: I'm also not sure how to fix the remaining pylint warnings. Need a hint.

@rogalski

This comment has been minimized.

Copy link
Contributor

commented Nov 6, 2016

@atodorov what's unclear with pylint errors? Your new checker simply found these violations in current pylint source. https://github.com/PyCQA/pylint/blob/master/pylint/config.py#L175 is one of errors, I'm sure rest follow the same pattern.

@atodorov atodorov force-pushed the MrSenko:len_checkers branch from 65e04ff to be2f632 Nov 8, 2016

@atodorov

This comment has been minimized.

Copy link
Contributor Author

commented Nov 8, 2016

@rogalski thanks for the hint. I thought the pylint errors were somewhere in the code I wrote and didn't realize the new checker will run against the rest of the project. I've finalized the len_cherker, added some more tests and also updated the rest of the project where the new checker complained.

This PR is ready for review.

@atodorov atodorov force-pushed the MrSenko:len_checkers branch from be2f632 to ba1f6b6 Nov 8, 2016

@atodorov

This comment has been minimized.

Copy link
Contributor Author

commented Nov 16, 2016

ping, any updates on this one ?

@atodorov atodorov force-pushed the MrSenko:len_checkers branch from ba1f6b6 to 1d30a65 Nov 23, 2016

@atodorov

This comment has been minimized.

Copy link
Contributor Author

commented Nov 23, 2016

This is ready to be merged, please review.

@PCManticore

This comment has been minimized.

Copy link
Member

commented Nov 23, 2016

I didn't have time lately for doing a thorough review. In general, it looks good, but I will get back with a more thorough analysis in the following days.

@rowillia
Copy link
Contributor

left a comment

Good stuff! I'm excited to get a chance to use this!

MSGS = {
'W1801': ('Do not use `len(SEQUENCE)` as condition value',
'len-as-condition',
'Used when Pylint detects incorrect use of len(sequence) inside \

This comment has been minimized.

Copy link
@rowillia

rowillia Nov 23, 2016

Contributor

Here and below: Your don't need an explicit line join here. It's always preferable to rely on implicit line joining when possible (I should write a linter for that 😄)

# `not len(S)` must become `not S` regardless if the parent block
# is a test condition or something else (boolean expression)
# e.g. `if not len(S):`
if isinstance(node, UnaryOp) and node.op == 'not' and \

This comment has been minimized.

Copy link
@rowillia

rowillia Nov 23, 2016

Contributor

Here and below: Prefer implied line continuations - https://www.python.org/dev/peps/pep-0008/#multiline-if-statements

Also, it'd be great to indent the _is_len_call call. Right now quickly glancing at the code it's easy to confuse whether or not that's part of the body.

        if (isinstance(node, UnaryOp) and node.op == 'not' and
                _is_len_call(node.operand)):
            self.add_message('len-as-condition', node=node)

def _is_len_call(node):
"""checks if @node is len(SOMETHING)"""
return isinstance(node, Call) and isinstance(node.func, Name) and node.func.name == 'len'

This comment has been minimized.

Copy link
@rowillia

rowillia Nov 23, 2016

Contributor

Can you also ensure node refers to the built-in len?

Unfortunately there's two methods that implement this check in the codebase ☹️

def _is_builtin(node, function):
inferred = utils.safe_infer(node)
if not inferred:
return False
return utils.is_builtin_object(inferred) and inferred.name == function

def _is_builtin(node):
return getattr(node, 'name', None) in ('__builtin__', 'builtins')

pass

if 10 > len('TEST') > 1 > 0:
pass

This comment has been minimized.

Copy link
@rowillia

rowillia Nov 23, 2016

Contributor

Some other test case ideas:

foo = len('TEST') or 42  # Should be fine

if 0 <= len('TEST') < 100:  # Should be fine
  pass

if z and len(x):  # Should fail ideally, but I realize that would mean having to walk the tree looking for `test` calls.
  pass

a = x and len(x)  # Should be fine

This comment has been minimized.

Copy link
@atodorov

atodorov Nov 23, 2016

Author Contributor

added

# a len(S) call is used inside a test condition
# could be if, while, assert or if expression statement
# e.g. `if len(S):`
if hasattr(node.parent, 'test') and _is_len_call(node):

This comment has been minimized.

Copy link
@rowillia

rowillia Nov 23, 2016

Contributor

Do you want this to be only for first order len calls or for complex ones as well? e.g. should

if z and len(x):
    pass

Get a warning? If not, we might want to explicitly call that out in docs.

This comment has been minimized.

Copy link
@atodorov

atodorov Nov 23, 2016

Author Contributor

Good catch. I've made a slight modification to detect whether or not the len() call is nested inside BoolOp, which at some point may have a parent with the test attribute.

"""check for incorrect usage of len() in conditions
"""
import itertools
from astroid.node_classes import Call, Const, Name, UnaryOp

This comment has been minimized.

Copy link
@rowillia

rowillia Nov 23, 2016

Contributor

Elsewhere in this codebase we generally use node_classes directly from astroid (it's __init__.py makes them accessable). For example:

for callfunc in infer_method.nodes_of_class(astroid.Call):

or
https://github.com/PyCQA/pylint/blob/master/pylint/checkers/python3.py#L68

# Licensed under the GPL: https://www.gnu.org/licenses/old-licenses/gpl-2.0.html
# For details: https://github.com/PyCQA/pylint/blob/master/COPYING

"""check for incorrect usage of len() in conditions

This comment has been minimized.

Copy link
@rowillia

rowillia Nov 23, 2016

Contributor

Here and below: Capitalize first letter and add period at the end.

op_3 = ops[ops_idx+2]

# 0 ?? len()
if isinstance(op_1, Const) and (op_1.value == 0) and \

This comment has been minimized.

Copy link
@rowillia

rowillia Nov 23, 2016

Contributor

Consider pulling this out into a function like _is_constant_zero or something to reduce code duplication.

def visit_compare(self, node):
# compare nodes are trickier because the len(S) expression
# may be somewhere in the middle of the node
ops = [('', node.left)]

This comment has been minimized.

Copy link
@rowillia

rowillia Nov 23, 2016

Contributor

I don't quite follow this logic. I believe that it's trying to consider 0 < len(x) < 10 by always setting this up as a 3 way comparison, right? Seems to work but I'd love a quick comment as to how it works 😄

This comment has been minimized.

Copy link
@atodorov

atodorov Nov 23, 2016

Author Contributor

Added a comment in the code but basically this is because of the way astroid creates the Compare nodes. The left most operand is in node.left.

if len(S) == 0:
pass

This comment has been minimized.

Copy link
@rowillia

rowillia Nov 23, 2016

Contributor

Consider adding an example of how to correctly handle this and linking to https://www.python.org/dev/peps/pep-0008/#programming-recommendations for our less experienced python developer friends 😄

This comment has been minimized.

Copy link
@atodorov

atodorov Nov 23, 2016

Author Contributor

added

@atodorov atodorov force-pushed the MrSenko:len_checkers branch from 1d30a65 to c309ad8 Nov 23, 2016

@atodorov

This comment has been minimized.

Copy link
Contributor Author

commented Nov 23, 2016

@rowillia thanks for the review. I'm working my way through the comments and will reply where necessary.

@atodorov atodorov force-pushed the MrSenko:len_checkers branch from c309ad8 to b8f9c3b Nov 23, 2016

we detect that a condition uses ``len(SEQUENCE)`` incorrectly. Instead
one could use ``if SEQUENCE`` or ``if not SEQUENCE``.

For instance, all of the examples below are incorrect:

This comment has been minimized.

Copy link
@PCManticore

PCManticore Nov 24, 2016

Member

I would not use incorrect here, rather something along the lines "all the examples below could be written in a more natural way".


"""Check for incorrect usage of len() in conditions."""
import itertools
import astroid

This comment has been minimized.

Copy link
@PCManticore

PCManticore Nov 24, 2016

Member

Separate stdlib from third party imports.

This comment has been minimized.

Copy link
@PCManticore

PCManticore Nov 24, 2016

Member

Not sure I like this being a separate file. We tend to have checkers grouped by a general concept, such as refactoring or checkers related to classes. I would say this belongs into refactoring.py.



def _is_len_call(node):
"""Checks if @node is len(SOMETHING)."""

This comment has been minimized.

Copy link
@PCManticore

PCManticore Nov 24, 2016

Member

Use node rather than @node

@PCManticore

This comment has been minimized.

Copy link
Member

commented Nov 24, 2016

Looks pretty nice. Left a couple of minor comments.

node.func.name == 'len')

def _is_constant_zero(node):
return isinstance(node, astroid.Const) and (node.value == 0)

This comment has been minimized.

Copy link
@PCManticore

PCManticore Nov 24, 2016

Member

You can drop the parens for node.value == 0.

# messages
msgs = MSGS
priority = -2
# configuration options

This comment has been minimized.

Copy link
@PCManticore

PCManticore Nov 24, 2016

Member

These comments are superfluous.


# we're finally out of any nested boolean operations so check if
# the parent is indeed an if, while, assert of if expression statement
if hasattr(parent, 'test'):

This comment has been minimized.

Copy link
@PCManticore

PCManticore Nov 24, 2016

Member

I prefer an explicit instance check, something as in ``isinstance(parent, (astroid.If, ...))```

@atodorov atodorov force-pushed the MrSenko:len_checkers branch from b8f9c3b to 8b33951 Nov 24, 2016

@atodorov

This comment has been minimized.

Copy link
Contributor Author

commented Nov 24, 2016

@PCManticore - thanks for the review, updated and merged with refactoring.py

@@ -203,7 +203,7 @@ def _has_abstract_methods(node):
The methods should be made abstract by decorating them
with `abc` decorators.
"""
return len(utils.unimplemented_abstract_methods(node)) > 0
return bool(utils.unimplemented_abstract_methods(node))

This comment has been minimized.

Copy link
@rogalski

rogalski Nov 24, 2016

Contributor

This is not an improvement, this is a regression. Perfectly intuitive and easy to read code on the left, weird trick on the right.

Checker should be enabled only when truthiness of value is checked. Return statement it's not that case.

return not utils.unimplemented_abstract_methods(node)  # correct
return len(utils.unimplemented_abstract_methods(node)) > 0  # correct
return bool(utils.unimplemented_abstract_methods(node))  # unusual

This comment has been minimized.

Copy link
@atodorov

atodorov Nov 24, 2016

Author Contributor

I've made a small modification to the existing code. It will traverse the parent nodes until it reaches a node like if or while where the trithiness of the value is checked. It should be fine now.

@atodorov atodorov force-pushed the MrSenko:len_checkers branch from 8b33951 to ada0f84 Nov 24, 2016

@rowillia
Copy link
Contributor

left a comment

@atodorov Looks great! Couple more small nits. Most important bit is making sure we get the category correct.

The pypy failure looks unrelated, but kicking this PR with those fixes should help us be sure 😄

if not S:
pass
if S:

This comment has been minimized.

Copy link
@rowillia

rowillia Nov 28, 2016

Contributor

Nit: I would just leave the two examples, no need to repeat them 😄


# configuration section name
name = 'len'
msgs = {'W1801': ('Do not use `len(SEQUENCE)` as condition value',

This comment has been minimized.

Copy link
@rowillia
return isinstance(node, astroid.Const) and node.value == 0

def _node_is_test_condition(node):
# the node is an if, while, assert of if expression statement

This comment has been minimized.

Copy link
@rowillia

rowillia Nov 28, 2016

Contributor

If this is a docstring it should be formatted as such instead of as a comment.


@utils.check_messages('len-as-condition')
def visit_unaryop(self, node):
# `not len(S)` must become `not S` regardless if the parent block

This comment has been minimized.

Copy link
@rowillia

rowillia Nov 28, 2016

Contributor

Convert to a Docstring



class LenChecker(checkers.BaseChecker):
"""Checks for incorrect usage of len() inside conditions. pep8 states:

This comment has been minimized.

Copy link
@rowillia

rowillia Nov 28, 2016

Contributor

Super small nit: Move pep8 states:... to next line to clean up one-line docstring for this method. Also, pep8 should be capitalized.

@atodorov atodorov force-pushed the MrSenko:len_checkers branch from ada0f84 to 2d0a1f2 Nov 29, 2016

@atodorov

This comment has been minimized.

Copy link
Contributor Author

commented Nov 29, 2016

@rowillia updated & rebased

@rowillia

This comment has been minimized.

Copy link
Contributor

commented Nov 30, 2016

@atodorov Great, Thanks! Appveyor failure looks unrelated.

@rowillia rowillia merged commit 4ba1660 into PyCQA:master Nov 30, 2016

2 of 3 checks passed

continuous-integration/appveyor/pr AppVeyor build failed
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
coverage/coveralls Coverage increased (+0.04%) to 90.626%
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.