-
-
Notifications
You must be signed in to change notification settings - Fork 29.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-30465: Fix lineno and col_offset in fstring AST nodes #1800
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
Location information (``lineno`` and ``col_offset``) in f-strings is now | ||
(mostly) correct. This fixes tools like flake8 from showing warnings on the | ||
wrong line (typically the first line of the file). |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -8,6 +8,7 @@ | |
#include "node.h" | ||
#include "ast.h" | ||
#include "token.h" | ||
#include "pythonrun.h" | ||
|
||
#include <assert.h> | ||
|
||
|
@@ -4270,6 +4271,55 @@ decode_bytes_with_escapes(struct compiling *c, const node *n, const char *s, | |
return result; | ||
} | ||
|
||
/* Shift locations for the given node and all its children by adding `lineno` | ||
and `col_offset` to existing locations. */ | ||
static void fstring_shift_node_locations(node *n, int lineno, int col_offset) | ||
{ | ||
n->n_col_offset = n->n_col_offset + col_offset; | ||
for (int i = 0; i < NCH(n); ++i) { | ||
if (n->n_lineno && n->n_lineno < CHILD(n, i)->n_lineno) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. When There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Some nodes don't contain col_offset and lineno information at all. I don't recall now which ones those are but the tests were failing without this additional check. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
|
||
/* Shifting column offsets unnecessary if there's been newlines. */ | ||
col_offset = 0; | ||
} | ||
fstring_shift_node_locations(CHILD(n, i), lineno, col_offset); | ||
} | ||
n->n_lineno = n->n_lineno + lineno; | ||
} | ||
|
||
/* Fix locations for the given node and its children. | ||
|
||
`parent` is the enclosing node. | ||
`n` is the node which locations are going to be fixed relative to parent. | ||
`expr_str` is the child node's string representation, incuding braces. | ||
*/ | ||
static void | ||
fstring_fix_node_location(const node *parent, node *n, char *expr_str) | ||
{ | ||
char *substr = NULL; | ||
char *start; | ||
int lines = LINENO(parent) - 1; | ||
int cols = parent->n_col_offset; | ||
/* Find the full fstring to fix location information in `n`. */ | ||
while (parent && parent->n_type != STRING) | ||
parent = parent->n_child; | ||
if (parent && parent->n_str) { | ||
substr = strstr(parent->n_str, expr_str); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What if the f-string contains several equivalent subexpressions? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Then we'll report line number and column information for the first occurence in all occurences. This is not currently solvable without tracking already applied occurences, etc. I don't think it's worth fixing, that would complicate this patch. If we ever redo f-strings to be untokenized properly on first pass, then this will work as expected. The hack to do a separate pass on formatted values is the reason why this entire workaround is necessary in the first place. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Don't There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I tried that. Unfortunately, you'd need to pass the entire string as a new argument through all the In other words, this edge case requires an inordinate amount of work to get right. Let's treat this as a separate improvement in the future? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree that it's not worth complicating the implementation. Especially because the largest use case is to get error reporting working, and I don't see how the error could be on a second instance of the exact same expression. |
||
if (substr) { | ||
start = substr; | ||
while (start > parent->n_str) { | ||
if (start[0] == '\n') | ||
break; | ||
start--; | ||
} | ||
cols += substr - start; | ||
/* Fix lineno in mulitline strings. */ | ||
while ((substr = strchr(substr + 1, '\n'))) | ||
lines--; | ||
} | ||
} | ||
fstring_shift_node_locations(n, lines, cols); | ||
} | ||
|
||
/* Compile this expression in to an expr_ty. Add parens around the | ||
expression, in order to allow leading spaces in the expression. */ | ||
static expr_ty | ||
|
@@ -4278,6 +4328,7 @@ fstring_compile_expr(const char *expr_start, const char *expr_end, | |
|
||
{ | ||
PyCompilerFlags cf; | ||
node *mod_n; | ||
mod_ty mod; | ||
char *str; | ||
Py_ssize_t len; | ||
|
@@ -4287,9 +4338,10 @@ fstring_compile_expr(const char *expr_start, const char *expr_end, | |
assert(*(expr_start-1) == '{'); | ||
assert(*expr_end == '}' || *expr_end == '!' || *expr_end == ':'); | ||
|
||
/* If the substring is all whitespace, it's an error. We need to catch | ||
this here, and not when we call PyParser_ASTFromString, because turning | ||
the expression '' in to '()' would go from being invalid to valid. */ | ||
/* If the substring is all whitespace, it's an error. We need to catch this | ||
here, and not when we call PyParser_SimpleParseStringFlagsFilename, | ||
because turning the expression '' in to '()' would go from being invalid | ||
to valid. */ | ||
for (s = expr_start; s != expr_end; s++) { | ||
char c = *s; | ||
/* The Python parser ignores only the following whitespace | ||
|
@@ -4315,9 +4367,19 @@ fstring_compile_expr(const char *expr_start, const char *expr_end, | |
str[len+2] = 0; | ||
|
||
cf.cf_flags = PyCF_ONLY_AST; | ||
mod = PyParser_ASTFromString(str, "<fstring>", | ||
Py_eval_input, &cf, c->c_arena); | ||
mod_n = PyParser_SimpleParseStringFlagsFilename(str, "<fstring>", | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Mentioning There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good catch, will fix. |
||
Py_eval_input, 0); | ||
if (!mod_n) { | ||
PyMem_RawFree(str); | ||
return NULL; | ||
} | ||
/* Reuse str to find the correct column offset. */ | ||
str[0] = '{'; | ||
str[len+1] = '}'; | ||
fstring_fix_node_location(n, mod_n, str); | ||
mod = PyAST_FromNode(mod_n, &cf, "<fstring>", c->c_arena); | ||
PyMem_RawFree(str); | ||
PyNode_Free(mod_n); | ||
if (!mod) | ||
return NULL; | ||
return mod->v.Expression.body; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about some test that checks the output of eval()? This code doesn't work as-is, because the NameError doesn't include the line number. But I assume there's a way to capture it and inspect it.
I suggest this because this is the behavior that the end user really cares about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait, never mind. I was conflating this with bpo-31140.