Python 3.4 gives wrong col_offset for Call nodes returned from ast.parse #65494
The following program gives the correct result in Python versions older than 3.4, but an incorrect result in 3.4:
import ast
tree = ast.parse("sin(0.5)")
first_stmt = tree.body[0]
call = first_stmt.value
print("col_offset of call expression:", call.col_offset)
print("col_offset of func of the call:", call.func.col_offset)
It should print: but in 3.4 it prints: |
... also, lineno is wrong for both the Call and the call's func when the func and the arguments are on different lines:
import ast
tree = ast.parse("(sin\n(0.5))")
first_stmt = tree.body[0]
call = first_stmt.value
print("col_offset of call expression:", call.col_offset)
print("col_offset of func of the call:", call.func.col_offset)
print("lineno of call expression:", call.lineno)
print("lineno of func of the call:", call.func.lineno) # lineno-s should be 1 for both call and func |
I suspect this was an intentional result of bpo-16795. |
Regarding bpo-16795, the documentation says "The lineno is the line number of source text and the col_offset is the UTF-8 byte offset of the first token that generated the node", not that lineno and col_offset indicate a suitable position to mention in error messages related to this node. IMO lineno and col_offset should remain a predictable means of finding the (beginning of the) source text of the node. Error-reporting code could inspect the situation and compute suitable locations itself. Alternatively, these attributes could be left for the purposes mentioned in bpo-16795, and parser developers could introduce new attributes in AST nodes that indicate both the start and end positions of the corresponding source. (Hopefully this would also resolve bpo-18374 and bpo-16806.) |
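Since the documentation quoted above defines col_offset as a UTF-8 byte offset, a tool that highlights source text has to convert it to a character index before slicing a str. A minimal sketch of that conversion follows; the helper name char_column and the sample source are made up for illustration and are not part of the ast module.

```python
import ast

def char_column(line: str, byte_offset: int) -> int:
    """Convert a UTF-8 byte offset within `line` to a character index.

    Hypothetical helper, not provided by the ast module.
    """
    return len(line.encode("utf-8")[:byte_offset].decode("utf-8"))

# "é" occupies two bytes in UTF-8, so byte and character columns
# diverge for everything that follows it on the line.
source = 'x = "é" + y'
name_y = next(
    node for node in ast.walk(ast.parse(source))
    if isinstance(node, ast.Name) and node.id == "y"
)

byte_col = name_y.col_offset            # offset counted in UTF-8 bytes
char_col = char_column(source, byte_col)  # offset counted in characters
print(byte_col, char_col, source[char_col])
```

For pure-ASCII source the two columns coincide, which is why the difference is easy to overlook.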
Just found out that ast.Attribute in Python 3.4 has a similar problem |
This is caused by https://hg.python.org/cpython/rev/7c5c678e4164/ Could we just revert https://hg.python.org/cpython/rev/7c5c678e4164/ ? |
It is now very hard to determine accurate locations for an expression such as (x+y).attr, as the column offset of the leftmost subexpression of the expression is not the same as the column offset of the location. |
This also breaks the col_offset for subscripts like x[y] and, of course, any statement with one of these expressions as its leftmost sub-expression. |
New changeset 7d1c32ddc432 by Benjamin Peterson in branch '3.4': New changeset 8ab6b404248c by Benjamin Peterson in branch 'default': |
Why did you not CC me in this discussion? It is not very nice to have this behaviour changed back from what I relied upon in a minor version without notice. Which regression was effectively caused by this patch, except for the documentation being out of date? |
You are on the nosy list. You should have got sent an email. This bug is the regression. |
Hmm, strange, I did not receive any emails. "Incorrect" by what definition of incorrect? The word does not really help to clarify the issue you see with this change, since the behaviour was changed on purpose. What is the (preferably real-world) application which is broken by this change? |
The column offset has always been the offset of the start of the expression. Therefore the expression Our static analysis tool is a real-world use case: Presumably the submitter of this issue also had a real-world use case. |
Yes, I also need col_offset to work as advertised because of a real world use case: Thonny (http://thonny.cs.ut.ee/) is a visual Python debugger which highlights the (sub)expression about to be evaluated. |
But if you need the start of the full expression, can't you just go up in the "parent" chain until the parent is not an expression any more? Could an additional API be introduced which provides the value I am looking for as well as the one you need? I was not on the nosy list by the way, I just put myself there after I commented. And that was after 3.4.3, after I noticed my software was suddenly broken by a patch release of Python. |
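For reference, the "go up in the parent chain" idea could look roughly like the sketch below. Nodes produced by ast.parse carry no parent reference, so the sketch builds its own mapping first; build_parent_map and outermost_expression are hypothetical helper names, not part of the ast module.

```python
import ast

def build_parent_map(tree):
    """Map each child node to its parent (hypothetical helper).

    Context nodes such as ast.Load may be shared between parents, so
    their entries are not reliable; only expression nodes are looked
    up here.
    """
    parents = {}
    for parent in ast.walk(tree):
        for child in ast.iter_child_nodes(parent):
            parents[child] = parent
    return parents

def outermost_expression(node, parents):
    """Climb the parent chain while the parent is still an expression."""
    while isinstance(parents.get(node), ast.expr):
        node = parents[node]
    return node

tree = ast.parse("(x+y).attr")
parents = build_parent_map(tree)
name_x = next(
    node for node in ast.walk(tree)
    if isinstance(node, ast.Name) and node.id == "x"
)
outer = outermost_expression(name_x, parents)
print(type(outer).__name__, outer.col_offset)
```

Note that this finds the enclosing expression, not its starting column; the start still has to come from that node's own lineno and col_offset, which is the point under dispute in this issue.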
How do I get the start of The primary purpose of the locations is for tracebacks, not for static tools. A third-party parser that supported full, accurate locations would be great, but I don't think the builtin parser is the place for it. |
I've run the tests from the first and second comments using Python 3.5.0, and it seems to produce correct results:
>>> import ast
>>> tree = ast.parse("sin(0.5)")
>>> first_stmt = tree.body[0]
>>> call = first_stmt.value
>>> print("col_offset of call expression:", call.col_offset)
col_offset of call expression: 0
>>> print("col_offset of func of the call:", call.func.col_offset)
col_offset of func of the call: 0
>>> tree = ast.parse("(sin\n(0.5))")
>>> first_stmt = tree.body[0]
>>> call = first_stmt.value
>>> print("col_offset of call expression:", call.col_offset)
col_offset of call expression: 1
>>> print("col_offset of func of the call:", call.func.col_offset)
col_offset of func of the call: 1
>>> print("lineno of call expression:", call.lineno)
lineno of call expression: 1
>>> print("lineno of func of the call:", call.lineno)
lineno of func of the call: 1 |
There is still a problem with col_offset in some situations. For example, the col_offset of the ast.Attribute should be 4 but is 0 instead:
>>> for x in ast.walk(ast.parse('foo.bar')):
...     if hasattr(x, 'col_offset'):
...         print("%s: %d" % (x, x.col_offset))
...
<_ast.Expr object at 0x7fcdc84722b0>: 0
<_ast.Attribute object at 0x7fcdc84723c8>: 0
<_ast.Name object at 0x7fcdc8472438>: 0
Is there any solution to this problem? It causes problems with Python support in KDevelop (kdev-python). |
Radek, the source corresponding to the Attribute node does start at col 0 in your example |
Aivar, I have to admit that my knowledge of this is limited, but as I understand it, the attribute is "bar" in the "foo.bar" expression. I can get the beginning of the assignment with
>>> ast.parse('foo.bar').body[0].value.value.col_offset
0
But how can I get the position of 'bar'? My guess is this:
>>> ast.parse('foo.bar').body[0].value.col_offset
but it still returns 0. Why do these two col_offsets return the same value? How can I get the position of 'bar' in 'foo.bar'? |
The ast.Attribute node actually means "the attribute of something", i.e. the node includes this "something" as a subnode.
I don't know a good way to do this, because bar is not an AST node in Python. If Python AST nodes included information about where a node ends in the source, I would take the ending column of node.value (foo in your example) and add 2. In my own program (http://thonny.cs.ut.ee, a Python IDE for beginners) I'm using a really contrived algorithm for determining the end positions of nodes. See the function mark_text_ranges here: https://bitbucket.org/plas/thonny/src/b8860704c99d47760ffacfaa335d2f8772721ba4/thonny/ast_utils.py?at=master&fileviewer=file-view-default I'm not happy with my solution, but I don't know any other way. |
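For readers on newer interpreters: the sketch below assumes Python 3.8 or later, where AST nodes carry end_lineno and end_col_offset. That was not the case for the versions discussed in this issue. Because an Attribute node ends where its attribute name ends, the name's start can be derived from the node's end; attr_name_col is a hypothetical helper name, and it assumes the name appears in the source exactly as stored in node.attr.

```python
import ast

def attr_name_col(node: ast.Attribute) -> int:
    """Return the UTF-8 byte column where the attribute name starts.

    Hypothetical helper; assumes Python 3.8+ (end_col_offset exists).
    The column is on line node.end_lineno, which may differ from
    node.lineno if the expression spans several lines.
    """
    return node.end_col_offset - len(node.attr.encode("utf-8"))

plain = ast.parse("foo.bar").body[0].value
spaced = ast.parse("foo . bar").body[0].value

print(plain.col_offset, attr_name_col(plain))
print(spaced.col_offset, attr_name_col(spaced))
```

Working backwards from the end of the node, rather than adding a fixed amount to the end of node.value, also copes with whitespace around the dot.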