Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Caret positioned wrong for SyntaxError reported by ast.c #78864

Closed
gvanrossum opened this issue Sep 14, 2018 · 10 comments
Closed

Caret positioned wrong for SyntaxError reported by ast.c #78864

gvanrossum opened this issue Sep 14, 2018 · 10 comments
Labels
3.7 (EOL) end of life 3.8 (EOL) end of life

Comments

@gvanrossum
Copy link
Member

BPO 34683
Nosy @gvanrossum, @ericvsmith, @benjaminp, @emilyemorehouse, @ammaraskar, @pablogsal, @tirkarthi
PRs
  • bpo-34683: Make SyntaxError column offsets consistently 1-indexed #9338
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2018-09-24.21:13:59.977>
    created_at = <Date 2018-09-14.17:22:22.118>
    labels = ['3.7', '3.8']
    title = 'Caret positioned wrong for SyntaxError reported by ast.c'
    updated_at = <Date 2018-09-25.12:50:04.568>
    user = 'https://github.com/gvanrossum'

    bugs.python.org fields:

    activity = <Date 2018-09-25.12:50:04.568>
    actor = 'ammar2'
    assignee = 'none'
    closed = True
    closed_date = <Date 2018-09-24.21:13:59.977>
    closer = 'gvanrossum'
    components = []
    creation = <Date 2018-09-14.17:22:22.118>
    creator = 'gvanrossum'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 34683
    keywords = ['patch']
    message_count = 10.0
    messages = ['325366', '325375', '325380', '325382', '325463', '326286', '326287', '326288', '326320', '326351']
    nosy_count = 7.0
    nosy_names = ['gvanrossum', 'eric.smith', 'benjamin.peterson', 'emilyemorehouse', 'ammar2', 'pablogsal', 'xtreak']
    pr_nums = ['9338']
    priority = 'low'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue34683'
    versions = ['Python 3.4', 'Python 3.5', 'Python 3.6', 'Python 3.7', 'Python 3.8']

    @gvanrossum
    Copy link
    Member Author

    Input file with a subtle error: a number where an assignment target is required:

    for 1 in []: pass

    Run it, it gives a SyntaxError. Note how the caret pointing to the incorrect token is position one to the left of where you'd expect it:

    File "s.py", line 1
    for 1 in []: pass
    ^
    SyntaxError: can't assign to literal

    For every syntax error I've seen that's produced by ast.c this seems to be the case -- the caret is always positioned 1 too far left.

    I tried to understand how this is happening but my AST-fu is lacking. It seems this has been happening since ast.c started added column numbers -- in Python 2.7 there's no caret at all, but in 3.4 and later there's a caret and it has the same problem. (Also in 3.3; I don't have 3.2 or older 3.x lying around to test.)

    @gvanrossum gvanrossum added 3.7 (EOL) end of life 3.8 (EOL) end of life labels Sep 14, 2018
    @ericvsmith
    Copy link
    Member

    I'm doing some fairly major surgery on line and column numbers for fixing f-string errors. I'll see if I can include this case, too.

    @benjaminp
    Copy link
    Contributor

    I think ast.c is in the right here and there are two complementary bugs in caret printing and the parser.

    print_error_text does this:

        while (--offset > 0)
            PyFile_WriteString(" ", f);

    which is off-by-one if offset is zero-indexed (which it should be IMO).

    The parser has an off-by-one error in the other direction. When it reports an error, it reports the offset of the tokenizer's "cur" buffer. This is generally advanced past the location that actually resulted in a parser error. I believe the parser should be fixed to report the error offset as the offset of the start of the token that caused the error.

    @ericvsmith
    Copy link
    Member

    I think Benjamin's diagnosis is correct. In which case, my f-string error reporting changes (see bpo-34364) won't intersect with a fix to this issue.

    @ammaraskar
    Copy link
    Member

    Added a PR that is an implementation of Benjamin's suggestion. The column offset returned has been made 0-indexed and errors now point to the start of the parsed token.

    This makes errors like def class show up as

    def class
    ^

    instead of

    def class
    ^

    Also added tests for the original example and some more errors from ast.c

    @gvanrossum
    Copy link
    Member Author

    New changeset 025eb98 by Guido van Rossum (Ammar Askar) in branch 'master':
    bpo-34683: Make SyntaxError column offsets consistently 1-indexed (gh-9338)
    025eb98

    @gvanrossum
    Copy link
    Member Author

    Fixed by #53584. Thanks Ammar!

    @gvanrossum
    Copy link
    Member Author

    Hm, should this be backported? I think it's safest not to.

    @benjaminp
    Copy link
    Contributor

    Wait, why did you make offsets 1-indexed? My suggestion was to make everything zero indexed...

    @ammaraskar
    Copy link
    Member

    There's some context on the github PR but essentially it boiled down to:

    • It would make it inconsistent with line offsets which are 1-indexed.
    • Breaking change to a public C API (PyErr_SyntaxLocationObject)
    • IDLE and other tools that catch syntax errors rely on it being 1-indexed and this would be a breaking change for them. As a quick example of that I found this vim plugin: vim-syntastic/syntastic@6c91e8d

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life 3.8 (EOL) end of life
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants