Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

Loading…

Implementing continuation line indentation recommendations from PEP8 #64

Closed
wants to merge 7 commits into from

2 participants

@samv

Hey there,

I have just done some work to write pep8 checks for continuation line parsing. I don't know what your normal procedure is for introducing new checks which may frustrate some people, but I think these are pretty important. It will, for instance, allow pep8.py to accept and reject the very first examples in the PEP8 document marked as "Yes" and "No". I kept every single error as its own code, though perhaps one or two of them could be folded into the same code.

Two thoughts occurred to me towards making it easier to add more checks to pep8.py: one would be to make the checkers "yield" each error they find, allowing a skipped error to not mask a desired error, and secondly some kind of general "make errors more recent than v1.1 warnings" or some such, to allow people who newly encounter the errors to have time to deal with them. I'd be happy to work on such features if it would help acceptance of these new checks.

---8<---
PEP8 has some specific recommendations early on which relate to
indenting continuation lines. Specifically, it distinguishes between
'visual indenting' and 'hanging indents'. This change implements these
rules, adding several new errors (for now):

  • E120 - traps on a visual indent with hanging indent lines following
  • E121 - traps overly–indented lines inside visual indent regions
  • E122 - catches the evil and wrong indenting default that emacs' python-mode applies to 'loose fingernails' (not canonical)
  • E123 - catches continuation lines missing extra indentation when the previous line started a hanging indent
  • E124 - catches continuation lines not aligned on 4 spaces; note this required a fix to the E23.py test
  • E125 - catches continuation lines with 8 or more space indents relative to the previous line without good reason
  • E126 - ensures that indentation leaves blocks visually distinct

All of these errors are directly justifiable from text and examples in
the PEP8 document, and some examples are directly copied from PEP8. The
exception to this is E122, which the PEP8 standard appears neutral on
but nonetheless I feel is intrinsically related with the rules which are
defined here; to not enforce it would permit multiple valid indent
levels for closing brackets which start lines, leading to ambiguity.
All the PEP8 examples leave the closing bracket on the previous line,
avoiding this question. This E122 rule agrees with K&R, KNF, Linux,
this VIM auto–indent script:

http://www.vim.org/scripts/script.php?script_id=974

and the fix to python-mode achieved by the elisp fragment on:

http://stackoverflow.com/revisions/5361478/2

See testsuite/E12.py for many examples.

Sam Vilain Add a hybrid checker for continuation line indentation
PEP8 has some specific recommendations early on which relate to
indenting continuation lines.  Specifically, it distinguishes between
'visual indenting' and 'hanging indents'.  This change implements these
rules, adding several new errors (for now):

 * E120 - traps on a visual indent with hanging indent lines following
 * E121 - traps overly–indented lines inside visual indent regions
 * E122 - catches the evil and wrong indenting default that emacs'
          python-mode applies to 'loose fingernails' (not canonical)
 * E123 - catches continuation lines missing extra indentation when the
          previous line started a hanging indent
 * E124 - catches continuation lines not aligned on 4 spaces; note this
          required a fix to the E23.py test
 * E125 - catches continuation lines with 8 or more space indents
          relative to the previous line without good reason
 * E126 - ensures that indentation leaves blocks visually distinct

All of these errors are directly justifiable from text and examples in
the PEP8 document, and some examples are directly copied from PEP8.  The
exception to this is E122, which the PEP8 standard appears neutral on
but nonetheless I feel is intrinsically related with the rules which are
defined here; to not enforce it would permit multiple valid indent
levels for closing brackets which start lines, leading to ambiguity.
All the PEP8 examples leave the closing bracket on the previous line,
avoiding this question.  This E122 rule agrees with K&R, KNF, Linux,
this VIM auto–indent script:

  http://www.vim.org/scripts/script.php?script_id=974

and the fix to python-mode achieved by the elisp fragment on:

  http://stackoverflow.com/revisions/5361478/2

See testsuite/E12.py for many examples.
771bebd
@florentx
Collaborator

This proposal is interesting.
However it is not an easy task to enforce some rules without being too much coercive.

For example I ran it on the pep8.py source file, and it yielded a bunch of errors.
When I look at some cases, it's not obvious that "fixing" them improve readability.

$ python pep8.py --show-source --select E12 pep8.py 

pep8.py:136:5: E120 continuation line insufficiently indentated for visual indent; hanging indents should have no arguments on the first line
    '%=', '^=', '&=', '|=', '==', '/=', '//=', '<=', '>=', '<<=', '>>=',
    ^
pep8.py:509:33: E123 continuation line missing indent or outdented
                                over_indent == 4 and indent_next):
                                ^
pep8.py:518:37: E123 continuation line missing indent or outdented
                                    start, 'E122 lines starting with a '
                                    ^
pep8.py:543:21: E123 continuation line missing indent or outdented
                    d=depth,
                    ^
pep8.py:553:21: E123 continuation line missing indent or outdented
                    d=depth,
                    ^
pep8.py:599:13: E126 statement with indented block ends with continuation line indented by 4 spaces
            (not keyword.iskeyword(prev_text))):
            ^
pep8.py:782:28: E121 continuation line over-indented for visual indent
                           or not text.startswith('# ')):
                           ^
pep8.py:839:13: E126 statement with indented block ends with continuation line indented by 4 spaces
            not LAMBDA_REGEX.search(before)):           # lambda x: x
            ^
pep8.py:1150:21: E120 continuation line insufficiently indentated for visual indent; hanging indents should have no arguments on the first line
                    (token[2][0], pos, tokenize.tok_name[token[0]], token[1]))
                    ^
pep8.py:1173:25: E120 continuation line insufficiently indentated for visual indent; hanging indents should have no arguments on the first line
                        self.blank_lines_before_comment)
                        ^
pep8.py:1461:25: E121 continuation line over-indented for visual indent
                        "comma separated patterns (default: %s)" %
                        ^
pep8.py:1465:25: E121 continuation line over-indented for visual indent
                        "matching these comma separated patterns (default: "
                        ^
pep8.py:1479:25: E121 continuation line over-indented for visual indent
                        "to standard error and set exit code to 1 if "
                        ^
Sam Vilain added some commits
Sam Vilain Fix continuation lines false positive on indented blocks
A block which started at an initial indent followed by a hanging indent
would not work due to an array being badly primed.  Fix it, and while
we're fixing small things, remove some breakpoint/debugging code and an
overly long line.
c98b761
Sam Vilain E12: eat your own dogfood: visual or hanging indent, pick one
The ultimate goal and test of any automatic PEP8 test is that of
readability: the check should detect a large number of un–readable, or at
least potentially confusing blocks, and when it fails there should be an
obvious alternate form to switch to which is more readable, or at least,
no less readable, and not awkward.  That way, the net effect of the
stringency is positive, and it is more likely that seasoned developers
whose opinion on code readability is the most valuable will get behind it.

Here, the new E12 test is applied to the pep8.py file itself, and each
modification discussed in more detail in order of incidence.

In the first hunk, a long list of string tokens follows an opening bracket;
and this is one situation where this form is actually quite readable.  This
form can easily be translated into hanging indent by just moving the tokens
onto the next line.  This is a relatively neutral transform; both the
before and after are similarly readable.  The code will accept the closing
brackets at the end of the list, or flush left on the following line, which
I decided looked better for this block.

Next is an if block with parentheses: the visual indented block
unfortunately coincides with the following block (because len("if (") == 4)
and so the following block is not visually distinct; so this change
improves readability by making the "return" statement stand out.

Next up there's an if statement where the continuation line has been
manually indented such that two repeated set of symbols line up and the
difference to their argument value is apparent.  A page is taken from the
GNU style guidelines and an extra pair of parentheses added to permit the
indented visual indent.  This also satisfies another PEP8 recommendation,
which is to put operators before a line break rather than after.  However,
overall this change is quite neutral.

Later in the compound_statements() check there is another if statement
which has E126 again, but also wants to align the tokens on two of the sub–
expressions.  The last line of the statement has plenty of room to move,
though, and gets the permitted extra 4–space indent to separate it visually
from the previous line, improving readability.  The first new test in
E12.py contains two suggested layouts if such a line is not available.

Next up there is a print statement which uses the % formatting operator.
Because this operator typically starts with a long token, and then wants a
list, it can tend to lead to awkward visual indents on a narrow margin.  In
this case the long list of variables was only 2 characters too long, and so
the author outdented it slightly.  As shown, switching to a pure hanging
indent form gets up out of the tight spot (and usually does for problems
like this).  A better solution for the % operator is to use .format(),
which offers vastly improved readability, as demonstrated in the second new
test in E12.py.
0bfe264
@samv

Hi, see the commit message in the "eat your own dogfood" change where I discuss each of these problems in detail, and try to comment on the effects on the readability of the code. Also there was one bug that I fixed which was half the exceptions above. This particular PEP8 recommendation is one that I'm familiar with satisfying (not least, because emacs uses a similar definition of "visual indent", and getting it right permits auto–indent to keep working, while still avoiding that 79–column margin). It's also in my opinion the one PEP8 recommendation that would most drastically improve the dozens of python changes which hit my gerrit queue every day.

Sam Vilain added some commits
Sam Vilain Fix an E120 false positive
If there was a bracketed expression closed on the same line
inside a visual indent, the code expected indenting to continue
at the indent level of the last bracketed sub–expression.  To
fix, update the current visual indent level as they are popped
off the stack.
9e54fd1
Sam Vilain Extend E12* errors to unbracketed continuation lines
If you make a continuation line with a backslash, subject it
to the same rules of how it may be continued.  Of course, PEP8
does say that the backslash is a less preferable form of multi-
line splitting than using brackets.
9705997
@florentx
Collaborator

Here I had a symmetry between line 136 and 137 which help checking that the set is complete.

Actually, I've many cases in my team's code where I have similar continuation which don't start on its own line, but follow directly the opening bracket (, [ or {.
We should probably not activate this particular check with the default configuration (probably I'll add it to the DEFAULT_IGNORE variable).

Ah, I didn't notice that. How about:

BINARY_OPERATORS = frozenset([
    '**=', '*=', '+=', '-=', '!=', '<>',
    '%=', '^=', '&=', '|=', '==', '/=', '//=', '<=', '>=', '<<=', '>>=',
    '%',  '^',  '&',  '|',  '=',  '/',  '//',  '<',  '>',  '<<',
])

Is '>>' intentionally missing?

Collaborator

The '>>' is classified in the UNARY_OPERATORS, because of the Python 2 syntax print >>outfile, "Hello"

@florentx
Collaborator

Unfortunately, this change breaks Python 3 compatibility.
But it's easy to restore it with another pair of parenthesis.

To print(), I presume. In which case, keeping with the rule that there should be no tokens after the opening parens,

print(
    'l.%s\t%s\t%s\t%r' % (
        token[2][0], pos, tokenize.tok_name[token[0]],
        token[1]
    )
)

Not the best, is it. And it seems too common a case to want to force the user to turn it into two tidier statements.

What does work (IMHO) is to visually indent the argument list to print, and hang the argument list to format:

print('l.{line}\t{pos}\t{name}\t{text}'.format(
          line=token[2][0],
          pos=pos,
          name=tokenize.tok_name[token[0]],
          text=repr(token[1]),
      ))

The '.format' part could be moved to the next line:

print('l.{line}\t{pos}\t{name}\t{text}'
      .format(
          line=token[2][0],
          pos=pos,
          name=tokenize.tok_name[token[0]],
          text=repr(token[1]),
      ))

or just visually indent both:

print('l.{line}\t{pos}\t{name}\t{text}'
      .format(line=token[2][0],
              pos=pos,
              name=tokenize.tok_name[token[0]],
              text=repr(token[1])))

Even:

print('l.{line}\t{pos}\t{name}\t{text}'
      .format(line=token[2][0], pos=pos,
              name=tokenize.tok_name[token[0]],
              text=repr(token[1])))
Collaborator

The other problem is that the str.format method does not exist with Python 2.5.

And I think that the choice between % and str.format should not be done on the criteria of PEP8.

Sure. Well, there's always the pure visual indent version of that, too:

print('l.%s\t%s\t%s\t%r' %
      (token[2][0], pos, tokenize.tok_name[token[0]],
       token[1]))

One thing I didn't consider was making switching from a visual to a hanging indent inside a visual indent OK. I could permit that and make this work:

            print('l.%s\t%s\t%s\t%r' % (
                      token[2][0], pos, tokenize.tok_name[token[0]],
                      token[1]
                  ))

In this case, the hanging indent ends up not aligned on a 4-space boundary from the left, but instead at a 4-space boundary from its visual indent level. But that's OK, because we're not using hard tabs.

@florentx
Collaborator

this line was indented with a multiple of 4. I don't see an obvious PEP8 violation.

"When using a hanging indent ... there should be no arguments on the first line ..."

The indent is hanging from the first opening parens, but there are tokens immediately afterwards. It later states, "Arguments on first line forbidden when not using vertical alignment".

@florentx
Collaborator

Thank you for the fixes.
I'll try to run this on some pretty bad Python code base (like the standard library ;) ) and see if I identify more issues.
Then if we decide to merge it, we'll probably put the checks in the DEFAULT_IGNORE variable to be less obtrusive.

For the proposal to "yield" instead of "return" errors, it deserves probably its own issue.

Sam Vilain added some commits
Sam Vilain New continuation line error E127 unnecessary backslash
The continuation backslash is only required if there is not an open
bracket enclosing the end of line.  Make it an error to have it there
anyway.  This broadens the continuation line indenting test to test more
than just indenting, but it is a convenient place to put the test.
2c67b37
Sam Vilain Allow tokens following multi–line string literals to avoid E12*
Multi-line literal strings are a little bit of an outsider when it comes
to continuation lines.  If such a token is used, consider the code to be
OK so long as the next token starts either immediately, or at the same
place you'd expect tokens to start again if the multi–line literal
wasn't there.  This is a corner case, none of the examples are
particularly readable, but it shows what does work.
0cc907e
@samv

I just ran pep8 on the Lib/ directory of python 3.3.0a3:

Herman:~/src/Python-3.3.0a3/Lib$ ~/src/pep8/pep8.py --select=E12 . | wc -l
4316
Herman:~/src/Python-3.3.0a3/Lib$ ~/src/pep8/pep8.py --ignore=E12,E24 . | wc -l
30042

So, it looks like it adds about 14% to the backlog of code clean–ups :-)

@florentx
Collaborator

Hard to be compatible with Python 2.5, with Python 3 and with PEP8 ....

IMHO, something like this should be acceptable:

print('l.%s\t%s\t%s\t%r' % (
    token[2][0], pos, tokenize.tok_name[token[0]],
    token[1],
))
@florentx
Collaborator

I did additional tests with some code base I work with.

  • E122 "lines starting with a closing bracket should be indented to match that of the opening bracket's line" --> OK (message is a bit long)
  • E124 "continuation line not indented on a 4-space boundary" --> OK
  • E125 "continuation line over-indented" --> we have some code where the indentation between brackets is 8-chars
  • E126 "statement with indented block ends with continuation line indented by 4 spaces" --> we have some cases like that because of len('if (') == 4.
  • E127 --> it becomes duplicate of E502 which I've implemented recently

Then I have few cases more problematic with E120, E121 and E123.
The snippets do not look bad, and if I change it, it might lose some readability.

  • E120 "continuation line insufficiently indentated for visual indent; hanging indents should have no arguments on the first line"
#: E120
rv = {'sum': 42}
rv.update(dict.fromkeys((
    'qualif_nr', 'reasonComment_en', 'reasonComment_fr',
    'reasonComment_de', 'reasonComment_it'), '?'))
#: E120
event_obj.write(cursor, user_id, {
    'user': user,
    'summary': text,
    'data': data,
})
#: E120


def qualify_by_address(self, cr, uid, ids, context=None,
        params_to_check=frozenset(QUALIF_BY_ADDRESS_PARAM)):
    """ This gets called by the web server """
  • E121 "continuation line over-indented for visual indent"
#: E121
def unicode2html(s):
    """Convert the characters &<>'" in string s to HTML-safe sequences.
    Convert newline to <br> too."""
    return unicode((s or '').replace('&', '&amp;')
                            .replace('>', '&gt;')
                            .replace('<', '&lt;')
                            .replace("'", '&#39;')
                            .replace('"', '&#34;')
                            .replace('\n', '<br>\n'))
#: E121
_ipv4_re = re.compile('^(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.'
                       '(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.'
                       '(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.'
                       '(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$')
#: E121
# Why add () around the string? it does not make it more readable.
parser.add_option('--count', action='store_true',
                  help="print total number of errors and warnings "
                       "to standard error and set exit code to 1 if "
                       "total is not null")
#: E121
fct("""
    AAA """ + status_2_string)
  • E123 "continuation line missing indent or outdented"
#: E123
if context:
    msg = """\
action: GET-CONFIG
payload:
    ip_address: "%(ip)s"
    username: "%(username)s"
""" % context
@samv

What about this?

print('l.%s\t%s\t%s\t%r' % (
    token[2][0], pos, tokenize.tok_name[token[0]],
    token[1]
), file=some_really_really_long_symbol_name.fh,
    sep="<*=-x-=*>")

It's not obvious what the correct indent level is for the rest of the list outside of that hanging indent. The new version I gave above (with the visual + hanging) is less ambiguous, but I'd consider it reasonable to add a special case when all of the stacked brackets are immediately closed.

Actually this behavior already exists with my code, because I don't consider an opening bracket to start a visual indent; this is valid:

foo = my.func({
    "foo": "bar",
}, "baz")

Again you end up outdented beyond the logical indent. These cases should be handled consistently I think.

@samv

OK, I've looked at the various pieces of code you have shown that it errors on, and in a sense these are false positives and there is probably much room to tinker with it. What about the other side of the coin—did you find it identify many poorly indented lines?

@florentx
Collaborator

The checker worked good on the codebase where I ran it.
It found a bunch of cases, the more controversial I've listed in my previous message.
I already fixed the other cases where the alignment was not right.

There were some cases with 8-chars indent inside brackets (E125).
I am balanced about lowering this indent. On one side 8-chars helps to make the difference between block indent and continuation indent, on the other side 4-chars do not waste whitespace and helps to respect the line-length limit.
On new projects, I always enforce 4-chars, though.

~$ ../pep8_samv/pep8.py -qq --statistics project/
164     E120 continuation line insufficiently indentated for visual indent; hanging indents should have no arguments on the first line
7       E121 continuation line over-indented for visual indent
15      E122 lines starting with a closing bracket should be indented to match that of the opening bracket's line
4       E123 continuation line missing indent or outdented
13      E124 continuation line not indented on a 4-space boundary
87      E125 continuation line over-indented
8       E126 statement with indented block ends with continuation line indented by 4 spaces

I plan to merge this patch soon or later, with some adaptations:

  • the checks will be disabled by default
  • the messages will be shortened a little if possible
  • the code will be adapted for Python 2.5 through 3.2

Something else I don't like, it is the parenthesis after the statements:

if(True):
    assert(True)
    return(True)

It seems it's not forbidden by PEP8, but If you put the parenthesis just after the statement, without space, it looks like a function. It is confusing for some developers.

@florentx florentx closed this in fcc6c11
@florentx
Collaborator

I took the decision to merge this patch.
The checks are disabled in the default configuration.

There's still room for improvements. You can propose new pull requests.

Adaptations:

  • reorder the checks because we usually do not use codes ending with "0" (implicit convention)
  • commented the debug statements which were not compatible Python 2.5 / Python 3.2
  • switched to yield instead of return for upstream compat.
  • removed the check for redundant backslash (duplicate of E502)
@florentx
Collaborator

and thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Commits on May 27, 2012
  1. Add a hybrid checker for continuation line indentation

    Sam Vilain authored
    PEP8 has some specific recommendations early on which relate to
    indenting continuation lines.  Specifically, it distinguishes between
    'visual indenting' and 'hanging indents'.  This change implements these
    rules, adding several new errors (for now):
    
     * E120 - traps on a visual indent with hanging indent lines following
     * E121 - traps overly–indented lines inside visual indent regions
     * E122 - catches the evil and wrong indenting default that emacs'
              python-mode applies to 'loose fingernails' (not canonical)
     * E123 - catches continuation lines missing extra indentation when the
              previous line started a hanging indent
     * E124 - catches continuation lines not aligned on 4 spaces; note this
              required a fix to the E23.py test
     * E125 - catches continuation lines with 8 or more space indents
              relative to the previous line without good reason
     * E126 - ensures that indentation leaves blocks visually distinct
    
    All of these errors are directly justifiable from text and examples in
    the PEP8 document, and some examples are directly copied from PEP8.  The
    exception to this is E122, which the PEP8 standard appears neutral on
    but nonetheless I feel is intrinsically related with the rules which are
    defined here; to not enforce it would permit multiple valid indent
    levels for closing brackets which start lines, leading to ambiguity.
    All the PEP8 examples leave the closing bracket on the previous line,
    avoiding this question.  This E122 rule agrees with K&R, KNF, Linux,
    this VIM auto–indent script:
    
      http://www.vim.org/scripts/script.php?script_id=974
    
    and the fix to python-mode achieved by the elisp fragment on:
    
      http://stackoverflow.com/revisions/5361478/2
    
    See testsuite/E12.py for many examples.
  2. Fix continuation lines false positive on indented blocks

    Sam Vilain authored
    A block which started at an initial indent followed by a hanging indent
    would not work due to an array being badly primed.  Fix it, and while
    we're fixing small things, remove some breakpoint/debugging code and an
    overly long line.
  3. E12: eat your own dogfood: visual or hanging indent, pick one

    Sam Vilain authored
    The ultimate goal and test of any automatic PEP8 test is that of
    readability: the check should detect a large number of un–readable, or at
    least potentially confusing blocks, and when it fails there should be an
    obvious alternate form to switch to which is more readable, or at least,
    no less readable, and not awkward.  That way, the net effect of the
    stringency is positive, and it is more likely that seasoned developers
    whose opinion on code readability is the most valuable will get behind it.
    
    Here, the new E12 test is applied to the pep8.py file itself, and each
    modification discussed in more detail in order of incidence.
    
    In the first hunk, a long list of string tokens follows an opening bracket;
    and this is one situation where this form is actually quite readable.  This
    form can easily be translated into hanging indent by just moving the tokens
    onto the next line.  This is a relatively neutral transform; both the
    before and after are similarly readable.  The code will accept the closing
    brackets at the end of the list, or flush left on the following line, which
    I decided looked better for this block.
    
    Next is an if block with parentheses: the visual indented block
    unfortunately coincides with the following block (because len("if (") == 4)
    and so the following block is not visually distinct; so this change
    improves readability by making the "return" statement stand out.
    
    Next up there's an if statement where the continuation line has been
    manually indented such that two repeated set of symbols line up and the
    difference to their argument value is apparent.  A page is taken from the
    GNU style guidelines and an extra pair of parentheses added to permit the
    indented visual indent.  This also satisfies another PEP8 recommendation,
    which is to put operators before a line break rather than after.  However,
    overall this change is quite neutral.
    
    Later in the compound_statements() check there is another if statement
    which has E126 again, but also wants to align the tokens on two of the sub–
    expressions.  The last line of the statement has plenty of room to move,
    though, and gets the permitted extra 4–space indent to separate it visually
    from the previous line, improving readability.  The first new test in
    E12.py contains two suggested layouts if such a line is not available.
    
    Next up there is a print statement which uses the % formatting operator.
    Because this operator typically starts with a long token, and then wants a
    list, it can tend to lead to awkward visual indents on a narrow margin.  In
    this case the long list of variables was only 2 characters too long, and so
    the author outdented it slightly.  As shown, switching to a pure hanging
    indent form gets up out of the tight spot (and usually does for problems
    like this).  A better solution for the % operator is to use .format(),
    which offers vastly improved readability, as demonstrated in the second new
    test in E12.py.
  4. Fix an E120 false positive

    Sam Vilain authored
    If there was a bracketed expression closed on the same line
    inside a visual indent, the code expected indenting to continue
    at the indent level of the last bracketed sub–expression.  To
    fix, update the current visual indent level as they are popped
    off the stack.
  5. Extend E12* errors to unbracketed continuation lines

    Sam Vilain authored
    If you make a continuation line with a backslash, subject it
    to the same rules of how it may be continued.  Of course, PEP8
    does say that the backslash is a less preferable form of multi-
    line splitting than using brackets.
  6. New continuation line error E127 unnecessary backslash

    Sam Vilain authored
    The continuation backslash is only required if there is not an open
    bracket enclosing the end of line.  Make it an error to have it there
    anyway.  This broadens the continuation line indenting test to test more
    than just indenting, but it is a convenient place to put the test.
  7. Allow tokens following multi–line string literals to avoid E12*

    Sam Vilain authored
    Multi-line literal strings are a little bit of an outsider when it comes
    to continuation lines.  If such a token is used, consider the code to be
    OK so long as the next token starts either immediately, or at the same
    place you'd expect tokens to start again if the multi–line literal
    wasn't there.  This is a corner case, none of the examples are
    particularly readable, but it shows what does work.
This page is out of date. Refresh to see the latest.
Showing with 415 additions and 27 deletions.
  1. +205 −25 pep8.py
  2. +208 −0 testsuite/E12.py
  3. +2 −2 testsuite/E23.py
View
230 pep8.py
@@ -132,9 +132,11 @@ def blank_lines(logical_line, blank_lines, indent_level, line_number)
WHITESPACE = ' \t'
-BINARY_OPERATORS = frozenset(['**=', '*=', '+=', '-=', '!=', '<>',
- '%=', '^=', '&=', '|=', '==', '/=', '//=', '<=', '>=', '<<=', '>>=',
- '%', '^', '&', '|', '=', '/', '//', '<', '>', '<<'])
+BINARY_OPERATORS = frozenset([
+ '**=', '*=', '+=', '-=', '!=', '<>', '%=', '^=', '&=', '|=', '==',
+ '/=', '//=', '<=', '>=', '<<=', '>>=', '%', '^', '&', '|', '=',
+ '/', '//', '<', '>', '<<',
+])
UNARY_OPERATORS = frozenset(['>>', '**', '*', '+', '-'])
OPERATORS = BINARY_OPERATORS | UNARY_OPERATORS
SKIP_TOKENS = frozenset([tokenize.COMMENT, tokenize.NL, tokenize.INDENT,
@@ -415,6 +417,181 @@ def indentation(logical_line, previous_logical, indent_char,
return 0, "E113 unexpected indentation"
+def continuation_line_indentation(logical_line, tokens, indent_level):
+ """
+ Continuation lines should align wrapped elements either vertically using
+ Python's implicit line joining inside parentheses, brackets and braces, or
+ using a hanging indent.
+
+ When using a hanging indent the following considerations should be applied;
+
+ - there should be no arguments on the first line, and
+
+ - further indentation should be used to clearly distinguish itself as a
+ continuation line.
+
+ """
+ first_line = tokens[0][2][0]
+ last_line = tokens[-1][3][0]
+ if first_line == last_line:
+ return
+
+ # indent_next tells us whether the next block is indented; assuming
+ # that it is indented by 4 spaces, then we should not allow 4-space
+ # indents on the final continuation line; in turn, some other
+ # indents are allowed to have an extra 4 spaces.
+ indent_next = logical_line.endswith(':')
+
+ depth = 0
+
+ parens = [] # for looking back to see where a bracket was opened
+ max_physical_line = 0
+ visual_min = [None] # visual indent columns by indent depth
+ rel_indent = [[0, 0]] # relative indents of physical lines
+ last_indent = None
+ last_backslash = []
+ if options.verbose >= 3:
+ print ">>> " + tokens[0][4],
+
+ for token in tokens:
+ token_type, text, start, end, orig_line = token
+ line = start[0] - first_line
+ while len(parens) - 1 < line:
+ parens.append([])
+
+ if line > 0:
+ if max_physical_line < line and not token_type in (
+ tokenize.NEWLINE, tokenize.NL):
+ if depth > 0 and len(last_backslash) >= line and \
+ last_backslash[line - 1] is not None:
+ return(last_backslash[line - 1], "E127 unnecessary "
+ "continuation backslash")
+ # this is the beginning of a continuation line.
+ max_physical_line = line
+ last_indent = start
+ if options.verbose >= 3:
+ print "... " + orig_line,
+
+ # record the initial indent. if lines were missing (eg,
+ # multi-line strings) then pad the list out
+ while len(rel_indent) - 1 < line:
+ rel_indent.append(None)
+ rel_indent[line] = [depth, start[1] - indent_level]
+
+ if depth > 0:
+ # a bracket expression in a continuation line.
+
+ # first, find the line that it was opened on
+ open_line = None
+ for open_line in range(line - 1, -1, -1):
+ if len(parens[open_line]) > 0:
+ break
+ start_col, end_col = parens[open_line][-1]
+ hang = rel_indent[line][1] - rel_indent[open_line][1]
+ else:
+ # an unbracketed continuation line (ie, backslash)
+ start_col = 0
+ hang = rel_indent[line][1]
+
+ # check to see if visual indenting is active
+ min_indent = visual_min[depth]
+ if min_indent is not None:
+ if start[1] < min_indent:
+ return(start, 'E120 continuation line '
+ 'insufficiently indentated for visual '
+ 'indent; hanging indents should have no '
+ 'arguments on the first line')
+
+ # for E122
+ eff_depth = depth
+ if token_type == tokenize.OP and text in ']})':
+ eff_depth -= 1
+
+ # check that this line is either visually indented vs
+ # the opening parens, or a hanging indent.
+ if start[1] == last_token_multiline:
+ # continuing right after a multiline string is OK
+ pass
+ elif start[1] == start_col:
+ # visual. fine.
+ pass
+ elif min_indent is not None:
+ over_indent = start[1] - end_col
+ if start[1] > end_col and not (
+ over_indent == 4 and indent_next):
+ return(start, "E121 continuation line over-"
+ "indented for visual indent")
+ else:
+ # hanging indent.
+ if eff_depth != depth:
+ if hang != 0:
+ return(
+ start, 'E122 lines starting with a '
+ 'closing bracket should be indented '
+ "to match that of the opening "
+ "bracket's line"
+ )
+ elif (hang <= 0):
+ return(start, 'E123 continuation line '
+ 'missing indent or outdented')
+ elif (hang % 4):
+ return(start, 'E124 continuation line not '
+ 'indented on a 4-space boundary')
+ elif (hang > 4) and not (hang == 8 and indent_next):
+ return(start, "E125 continuation line over-"
+ "indented")
+
+ # keep track of bracket depth and look for visual indenting
+ if token_type == tokenize.OP and text in '([{':
+ visual_min.append(None)
+ depth += 1
+ assert(len(visual_min) == depth + 1)
+ if depth > 0:
+ visual_min[depth] = visual_min[depth - 1]
+ parens[line].append([start[1], end[1]])
+ if options.verbose >= 4:
+ print "bracket depth {d} seen, col {c}, visual min = {m}"\
+ .format(
+ d=depth,
+ c=start[1],
+ m=visual_min[depth],
+ )
+ elif len(parens[line]) > 0 and token_type != tokenize.NL:
+ # text after an open parens starts visual indenting
+ if visual_min[depth] is None:
+ visual_min[depth] = start[1]
+ if options.verbose >= 4:
+ print "bracket depth {d} indent to {c}".format(
+ d=depth,
+ c=start[1],
+ )
+
+ if token_type == tokenize.OP and text in ')]}':
+ depth -= 1
+ visual_min.pop()
+ for open_line in range(line, -1, -1):
+ if len(parens[open_line]) > 0:
+ parens[open_line].pop()
+ break
+ if len(parens[open_line]):
+ visual_min[depth] = parens[open_line][-1][1]
+
+ # exceptions that need to processed at the end of lines
+ if len(last_backslash) - 1 < line:
+ while len(last_backslash) < line:
+ last_backslash.append(None)
+ last_backslash.append(
+ (end[0], len(orig_line) - 2) if orig_line.endswith('\\\n')
+ else None
+ )
+
+ last_token_multiline = (end[1] if start[0] != end[0] else None)
+
+ if indent_next and rel_indent[-1][1] == 4:
+ return(last_indent, "E126 statement with indented block ends "
+ "with continuation line indented by 4 spaces")
+
+
def whitespace_before_parameters(logical_line, tokens):
"""
Avoid extraneous whitespace in the following situations:
@@ -438,13 +615,13 @@ def whitespace_before_parameters(logical_line, tokens):
for index in range(1, len(tokens)):
token_type, text, start, end, line = tokens[index]
if (token_type == tokenize.OP and
- text in '([' and
- start != prev_end and
- (prev_type == tokenize.NAME or prev_text in '}])') and
- # Syntax "class A (B):" is allowed, but avoid it
- (index < 2 or tokens[index - 2][1] != 'class') and
- # Allow "return (a.foo for a in range(5))"
- (not keyword.iskeyword(prev_text))):
+ text in '([' and
+ start != prev_end and
+ (prev_type == tokenize.NAME or prev_text in '}])') and
+ # Syntax "class A (B):" is allowed, but avoid it
+ (index < 2 or tokens[index - 2][1] != 'class') and
+ # Allow "return (a.foo for a in range(5))"
+ (not keyword.iskeyword(prev_text))):
return prev_end, "E211 whitespace before '%s'" % text
prev_type = token_type
prev_text = text
@@ -626,8 +803,8 @@ def whitespace_before_inline_comment(logical_line, tokens):
if prev_end[0] == start[0] and start[1] < prev_end[1] + 2:
return (prev_end,
"E261 at least two spaces before inline comment")
- if (len(text) > 1 and text.startswith('# ')
- or not text.startswith('# ')):
+ if (len(text) > 1 and (text.startswith('# ') or not
+ text.startswith('# '))):
return start, "E262 inline comment should start with '# '"
else:
prev_end = end
@@ -684,7 +861,7 @@ def compound_statements(logical_line):
before = line[:found]
if (before.count('{') <= before.count('}') and # {'a': 1} (dict)
before.count('[') <= before.count(']') and # [1:2] (slice)
- not LAMBDA_REGEX.search(before)): # lambda x: x
+ not LAMBDA_REGEX.search(before)): # lambda x: x
return found, "E701 multiple statements on one line (colon)"
found = line.find(';')
if -1 < found:
@@ -994,8 +1171,10 @@ def check_all(self, expected=None, line_offset=0):
pos = '[%s:%s]' % (token[2][1] or '', token[3][1])
else:
pos = 'l.%s' % token[3][0]
- print('l.%s\t%s\t%s\t%r' %
- (token[2][0], pos, tokenize.tok_name[token[0]], token[1]))
+ print 'l.%s\t%s\t%s\t%r' % (
+ token[2][0], pos, tokenize.tok_name[token[0]],
+ token[1]
+ )
self.tokens.append(token)
token_type, text = token[0:2]
if token_type == tokenize.OP:
@@ -1017,7 +1196,8 @@ def check_all(self, expected=None, line_offset=0):
source_line = token[4]
token_start = token[2][1]
if source_line[:token_start].strip() == '':
- self.blank_lines_before_comment = max(self.blank_lines,
+ self.blank_lines_before_comment = max(
+ self.blank_lines,
self.blank_lines_before_comment)
self.blank_lines = 0
if text.endswith('\n') and not parens:
@@ -1305,13 +1485,13 @@ def process_options(arglist=None):
parser.add_option('--first', action='store_false', dest='repeat',
help="show first occurrence of each error")
parser.add_option('--exclude', metavar='patterns', default=DEFAULT_EXCLUDE,
- help="exclude files or directories which match these "
- "comma separated patterns (default: %s)" %
- DEFAULT_EXCLUDE)
+ help=("exclude files or directories which match these "
+ "comma separated patterns (default: %s)" %
+ DEFAULT_EXCLUDE))
parser.add_option('--filename', metavar='patterns', default='*.py',
- help="when parsing directories, only check filenames "
- "matching these comma separated patterns (default: "
- "*.py)")
+ help=("when parsing directories, only check filenames "
+ "matching these comma separated patterns "
+ "(default: *.py)"))
parser.add_option('--select', metavar='errors', default='',
help="select errors and warnings (e.g. E,W6)")
parser.add_option('--ignore', metavar='errors', default='',
@@ -1323,9 +1503,9 @@ def process_options(arglist=None):
parser.add_option('--statistics', action='store_true',
help="count errors and warnings")
parser.add_option('--count', action='store_true',
- help="print total number of errors and warnings "
- "to standard error and set exit code to 1 if "
- "total is not null")
+ help=("print total number of errors and warnings "
+ "to standard error and set exit code to 1 if "
+ "total is not null"))
parser.add_option('--benchmark', action='store_true',
help="measure processing speed")
parser.add_option('--testsuite', metavar='dir',
View
208 testsuite/E12.py
@@ -0,0 +1,208 @@
+#: E120
+print "E120", ("visual",
+ "hanging")
+#: E120
+print "E120", ("under-",
+ "under-indent")
+#: E121
+print "E121", ("over-",
+ "over-indent")
+#: E123
+print "E123", (
+"dent")
+#: E124
+print "E124", (
+ "dent")
+#: E125
+print "E125", (
+ "dent")
+#: E125
+print "E125", (
+ "dent")
+#: Okay
+if (foo == bar and
+ baz == frop):
+ pass
+#: Okay
+if (
+ foo == bar and
+ baz == frop
+):
+ pass
+#: Okay
+if start[1] > end_col and not (
+ over_indent == 4 and indent_next):
+ return(0, "E115 continuation line over-"
+ "indented for visual indent")
+#: Okay
+print "OK", ("visual",
+ "indent")
+#: E120
+# Arguments on first line forbidden when not using vertical alignment
+foo = long_function_name(var_one, var_two,
+ var_three, var_four)
+#: Okay
+# Aligned with opening delimiter
+foo = long_function_name(var_one, var_two,
+ var_three, var_four)
+#: Okay
+# Extra indentation is not necessary.
+foo = long_function_name(
+ var_one, var_two,
+ var_three, var_four)
+#: E120
+print "Okay", ("visual",
+ "indent_two"
+ )
+#: Okay
+print "Okay", ("visual",
+ "indent_three"
+ )
+#: E120
+print "E120", ("visual",
+ "indent_five"
+)
+#: E122 W291
+print "E122", (
+ "bad", "hanging", "close"
+ )
+#: Okay
+print "a-ok", (
+ "there",
+ "dude",
+)
+#: Okay
+print "hello", (
+ "there",
+ "dude")
+#: Okay
+print "hello", (
+ "there", "dude")
+#: Okay
+print "hello", (
+ "there", "dude",
+)
+#: E126
+# Further indentation required as indentation is not distinguishable
+
+
+def long_function_name(
+ var_one, var_two, var_three,
+ var_four):
+ print(var_one)
+#: Okay
+
+
+def long_function_name(
+ var_one, var_two, var_three,
+ var_four):
+ print(var_one)
+#: E124
+result = {
+ 'key1': 'value',
+ 'key2': 'value',
+}
+#: E122
+result = {
+ 'foo': [
+ 'bar', {
+ 'baz': 'frop',
+ }
+ ]
+ }
+#: Okay
+result = {
+ 'foo': [
+ 'bar', {
+ 'baz': 'frop',
+ }
+ ]
+}
+#: Okay
+if bar:
+ return(
+ start, 'E122 lines starting with a '
+ 'closing bracket should be indented '
+ "to match that of the opening "
+ "bracket's line"
+ )
+#: Okay
+# you want vertical alignment, so use a parens
+if ((foo.bar("baz") and
+ foo.bar("frop")
+ )):
+ print "yes"
+
+# also ok, but starting to look like LISP
+if ((foo.bar("baz") and
+ foo.bar("frop"))):
+ print "yes"
+#: Okay
+# print('l.%s\t%s\t%s\t%r' %
+# (token[2][0], pos, tokenize.tok_name[token[0]], token[1]))
+print 'l.{line}\t{pos}\t{name}\t{text}'.format(
+ line=token[2][0],
+ pos=pos,
+ name=tokenize.tok_name[token[0]],
+ text=repr(token[1]),
+)
+#: Okay
+if os.path.exists(os.path.join(path, PEP8_BIN)):
+ cmd = ([os.path.join(path, PEP8_BIN)] +
+ self._pep8_options(targetfile))
+#: E125
+fixed = re.sub(r'\t+', ' ', target[c::-1], 1)[::-1] + \
+ target[c + 1:]
+#: Okay
+fixed = (re.sub(r'\t+', ' ', target[c::-1], 1)[::-1] +
+ target[c + 1:])
+fixed = (
+ re.sub(r'\t+', ' ', target[c::-1], 1)[::-1] +
+ target[c + 1:]
+)
+#: E126
+if foo is None and bar is "frop" and \
+ blah == 'yeah':
+ blah = 'yeahnah'
+#: Okay
+if foo is None and bar is "frop" and \
+ blah == 'yeah':
+ blah = 'yeahnah'
+#: Okay
+"""This is a multi-line
+ docstring."""
+#: E127
+if (foo is None and bar is "e127" and \
+ blah == 'yeah'):
+ blah = 'yeahnah'
+#: Okay
+if blah:
+ # is this actually readable? :)
+ multiline_literal = """
+while True:
+ if True:
+ 1
+""".lstrip()
+ multiline_literal = (
+ """
+while True:
+ if True:
+ 1
+""".lstrip()
+ )
+ multiline_literal = (
+ """
+while True:
+ if True:
+ 1
+"""
+ .lstrip()
+ )
+#: Okay
+if blah:
+ multiline_visual = ("""
+while True:
+ if True:
+ 1
+"""
+ .lstrip())
View
4 testsuite/E23.py
@@ -7,6 +7,6 @@
b = (5, )
result = {
- 'key1': 'value',
- 'key2': 'value',
+ 'key1': 'value',
+ 'key2': 'value',
}
Something went wrong with that request. Please try again.