Allow no whitespace after unary operator (E225) #9

Closed
florentx opened this Issue Oct 20, 2009 · 9 comments

Comments

Projects
None yet
3 participants

Such code should be considered compliant with PEP 8:

if not -5 < x < +5:
    c = alpha[:-i]

Here is the patch, which uses the tokens instead of iterating on the logical_line string.
I've added some doctests.

--- a/pep8.py      2009-10-20 10:08:22.450102000 +0200
+++ b/pep8.py     2009-10-22 23:04:27.710133690 +0200
 @@ -115,11 +115,12 @@
 WHITESPACE = ' \t'

 # Longer operators (containing others) must come first.
-OPERATORS = """
-!= <> ** //
-+= -= *= /= %= ^= &= |= == >>= <<= <= >=
-+  -  *  /  %  ^  &  |  =  >>  <<  <  >
+BINARY_OPERATORS = """
++= != **= //= /= %= ^= &= |= == >>= <<= <= >=
+-= <>  *= //  /  %  ^  &  |  =  >>  <<  <  >
 """.split()
+UNARY_OPERATORS = ['**', '*', '+', '-']
+OPERATORS = BINARY_OPERATORS + UNARY_OPERATORS

 options = None
 args = None
@@ -406,8 +407,8 @@
             return found, "E224 tab after operator"


-def missing_whitespace_around_operator(logical_line):
-    """
+def missing_whitespace_around_operator(logical_line, tokens):
+    r"""
     - Always surround these binary operators with a single space on
       either side: assignment (=), augmented assignment (+=, -= etc.),
       comparisons (==, <, >, !=, <>, <=, >=, in, not in, is, is not),
@@ -424,45 +425,50 @@
     Okay: baz(**kwargs)
     Okay: negative = -1
     Okay: spam(-1)
+    Okay: alpha[:-i]
+    Okay: if not -5 < x < +5:\n    pass
+    Okay: lambda *args, **kw: (args, kw)

     E225: i=i+1
     E225: submitted +=1
     E225: x = x*2 - 1
     E225: hypot2 = x*x + y*y
     E225: c = (a+b) * (a-b)
+    E225: c = alpha -4
+    E225: z = x **y
     """
-    line = logical_line
     parens = 0
-    pos = 1
-    while pos < len(line):
-        if line[pos] == '(':
+    need_space = False
+    prev_type = tokenize.OP
+    prev_text = prev_end = None
+    for token_type, text, start, end, line in tokens:
+        if token_type in (tokenize.NL, tokenize.NEWLINE):
+            continue
+        if text == '(':
             parens += 1
-        elif line[pos] == ')':
+        elif text == ')':
             parens -= 1
-        for operator in OPERATORS:
-            if line[pos:].startswith(operator):
-                prefix = line[:pos].rstrip()
-                start_argument = prefix.endswith(',') or prefix.endswith('(')
-                ignore = False
-                if operator == '-' and line[pos + 1] in '0123456789':
-                    # Allow unary minus operator: -123.
-                    ignore = True
-                if operator == '=' and parens:
-                    # Allow keyword args or defaults: foo(bar=None).
-                    ignore = True
-                if operator in '**' and parens and start_argument:
+        if need_space:
+            if start == prev_end:
+                return prev_end, "E225 missing whitespace around operator"
+            need_space = False
+        elif token_type == tokenize.OP:
+            if text == '=' and parens:
+                # Allow keyword args or defaults: foo(bar=None).
+                pass
+            elif text in BINARY_OPERATORS:
+                need_space = True
+            elif text in UNARY_OPERATORS:
+                if not (prev_type == tokenize.OP or
+                   (prev_type == tokenize.NAME and iskeyword(prev_text))):
+                    # Allow unary operators: -123, -x, +1.
                     # Allow argument unpacking: foo(*args, **kwargs).
-                    ignore = True
-                if line[pos - 1] != ' ' and not ignore:
-                    return pos, "E225 missing whitespace around operator"
-                pos += len(operator)
-                if pos >= len(line):
-                    break
-                if line[pos] != ' ' and not ignore:
-                    return pos, "E225 missing whitespace around operator"
-                break # Don't consider shorter operators at this position.
-        else:
-            pos += 1
+                    need_space = True
+            if need_space and start == prev_end:
+                return prev_end, "E225 missing whitespace around operator"
+        prev_type = token_type
+        prev_text = text
+        prev_end = end


 def whitespace_around_comma(logical_line):

As a side-effect, I see a big improvement of performance with this modification :

~ $ time bin/pep8.py --repeat /usr/lib/python2.5/*py >/dev/null

real    0m43.316s
user    0m43.043s
sys 0m0.044s
~ $ time bin/mypep8.py --repeat /usr/lib/python2.5/*py >/dev/null

real    0m13.854s
user    0m13.737s
sys 0m0.024s
~ $ 

I've updated the above patch in order to fix the issue reported by Jek: #11
I've added the doctest: "Okay: lambda _args, *_kw: (args, kw)"

It passes all tests: "--doctest" and "--testsuite testsuite".

jek commented Oct 22, 2009

Came across another one, looks like it should go into BINARY_OPERATORS: //=

added.

Owner

cburroughs commented Oct 23, 2009

Github's little issue tracker doesn't seem to note edits. florentx have you edited your patch after jek's comment?

(It it is easier for you I would be happy to accept pull requests instead of patches.)

I installed "hggit" and I pushed the patch in my repo, with additional testcases.
Please review and merge in master.

(Yes I had edited the patch after jek comments.)

Owner

cburroughs commented Oct 24, 2009

This branch is ready for review. Testcases included.
It fixes issues #9 and #10 and improve performance by 60%.

Pull request received will review.

outdated

This issue was closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment