Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parser does not handle Python3.6 f-strings #4

Closed
LiraNuna opened this issue Jun 5, 2017 · 7 comments
Closed

Parser does not handle Python3.6 f-strings #4

LiraNuna opened this issue Jun 5, 2017 · 7 comments

Comments

@LiraNuna
Copy link
Contributor

LiraNuna commented Jun 5, 2017

An exception is raised when trying to parse python3.6's new f-strings.

  File "ensure_trailing_commas/trailing_comma_finder.py", line 118, in find_missing_trailing_commas
    atok = asttokens.ASTTokens(source_code, parse=True)
  File "asttokens/asttokens/asttokens.py", line 61, in __init__
    self.mark_tokens(self._tree)
  File "asttokens/asttokens/asttokens.py", line 72, in mark_tokens
    MarkTokens(self).visit_tree(root_node)
  File "asttokens/asttokens/mark_tokens.py", line 47, in visit_tree
    util.visit_tree(node, self._visit_before_children, self._visit_after_children)
  File "asttokens/asttokens/util.py", line 148, in visit_tree
    pvalue, post_value = previsit(current, par_value)
  File "asttokens/asttokens/mark_tokens.py", line 51, in _visit_before_children
    token = self._code.get_token_from_utf8(node.lineno, col) if col is not None else None
  File "asttokens/asttokens/asttokens.py", line 123, in get_token_from_utf8
    return self.get_token(lineno, self._line_numbers.from_utf8_col(lineno, col_offset))
  File "asttokens/asttokens/line_numbers.py", line 48, in from_utf8_col
    return offsets[max(0, min(len(offsets), utf8_column))]
IndexError: list index out of range
@dsagal
Copy link
Member

dsagal commented Jun 5, 2017

There is unfortunately an open bug in Python with annotating AST nodes from f-strings with proper line and column info: https://bugs.python.org/issue29051. In addition, the f-string literal is a single token, but the expressions inside would also need to be tokens: it would require some adjustments to when and how we parse tokens; in particular, the standard tokenize module does not currently help there.

I am open to proposals for how to handle it.

@LiraNuna
Copy link
Contributor Author

LiraNuna commented Jun 5, 2017

I think there are two issues here: The exception and the lack of inside-tokens.
I would suggest starting by making the code not throw an exception on valid code.

It seems like even though fstrings are a single token, it should still be possible to use the locations given by the tokenize module:

$ python3.6 -m tokenize test2
0,0-0,0:            ENCODING       'utf-8'        
1,0-5,3:            STRING         "f'''\n{\nFOO\n}\n'''"
5,3-5,4:            NEWLINE        '\n'           
6,0-6,0:            ENDMARKER      ''

Maybe I'm not understanding the bug, however

@dsagal
Copy link
Member

dsagal commented Jun 5, 2017

Actually, could you share the full test case that produces the error. I am not seeing it with a just a single f-string literal.

@LiraNuna
Copy link
Contributor Author

LiraNuna commented Jun 6, 2017

Okay so after experimentation it seems like this is a min case:

def t():
    return f'{function(kwarg=24)}'

will give:

  File "asttokens/line_numbers.py", line 48, in from_utf8_col
    return offsets[max(0, min(len(offsets), utf8_column))]
IndexError: list index out of range

but

f'{function(kwarg=24)}'

alone will give a different exception:

  File "asttokens/util.py", line 58, in expect_token
    token.start[0], token.start[1] + 1))
ValueError: Expected token NAME:'kwarg', got STRING:"f'{function(kwarg=24)}'" on line 1 col 1

Most likely happening because of the same issue, but reporting both instances since they "look" different

dsagal added a commit that referenced this issue Jun 6, 2017
@dsagal
Copy link
Member

dsagal commented Jun 6, 2017

I pushed a fix to master, with a test case that includes your examples. The fix is simply to avoid descending inside an f-string, since we have neither locations nor tokens for the inner nodes.

Please check that it solves your issue. You can keep this bug open, or close and create a new one. Either way, the remaining bug is that asttokens does not include or annotate the AST subtrees inside fstrings.

@LiraNuna
Copy link
Contributor Author

LiraNuna commented Jun 6, 2017

The fix is simply to avoid descending inside an f-string

I'm guessing this explains this exception? AttributeError: 'Call' object has no attribute 'first_token' (Maybe assign None instead?)

That's a fair resolution for now since python lacks the ability to parse it. Hopefully when this issue resolves, this library would be able to improve.

Thanks for the fast resolution!

P.S maybe a version bump is in order?

@LiraNuna LiraNuna closed this as completed Jun 6, 2017
LiraNuna added a commit to LiraNuna/py-ensure-trailing-commas that referenced this issue Jun 6, 2017
Because of a python 3.6.1 bug described in https://bugs.python.org/issue29051, asttokens currently does not descend these nodes.
Therefore, we override the visit method to completely ignore those nodes to avoid an exception.

More information about this resolution at gristlabs/asttokens#4
@dsagal
Copy link
Member

dsagal commented Jun 6, 2017

Version bumped, and uploaded package with fix to pypi.

The exception you mention -- yeah, I guess it makes sense. Hmm... With my workaround, I don't visit those nodes at all. To assign None, I'd have to visit them, but remember that they are under a JoinedStr and skip most of the processing. That's kind of annoying. How much of an issue is it? Might it make sense to let it hang until we can just solve it (issue #6) properly?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants