Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-130775: Validate negative locations in ast #130795

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

sobolevn
Copy link
Member

@sobolevn sobolevn commented Mar 3, 2025

Copy link
Member

@picnixz picnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we cannot bypass the ast.c layer and directly end up in the assemble.c, then this should be the correct approach (I didn't know where we validate the locations).

Comment on lines +211 to +214
{'lineno': -2, 'col_offset': 0},
{'lineno': 0, 'col_offset': -2},
{'lineno': 0, 'col_offset': -2, 'end_col_offset': -2},
{'lineno': -2, 'end_lineno': -2, 'col_offset': 0},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Off topic: are we already testing that passing a non-int is fine?
  • Maybe test with values that are excessively large (either in + or -) just for overflow checks.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will open a new issue for that, if we don't. Thanks for the idea.

Co-authored-by: Victor Stinner <vstinner@python.org>
Comment on lines +1 to +2
Validate negative :mod:`ast` locations, we only allow ``-1`` as a special
value.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it's a user-facing change, I think we should also have a What's New entry, and a versionchanged in ast.*. Users may be surprised. I also don't know whether you want to backport the change or not.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't consider this as a feature, because negative locations do not make sense, they are also documented to be positive numbers in ast.rst

So, I plan to backport :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I would agree that this is a bug fix for a crash in compile, not a new feature.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a new error for ast.parse though.

Copy link
Member Author

@sobolevn sobolevn Mar 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, looks like ast.parse is not affected:

>>> import ast
... 
... tree = ast.Module(body=[
...     ast.Import(names=[ast.alias(name='traceback', lineno=0, col_offset=0)], lineno=0, \
col_offset=-2)
... ], type_ignores=[])
... 
>>> ast.parse(tree)
Module(body=[Import(names=[alias(name='traceback', asname=None)])], type_ignores=[])
>>> compile(tree, "<string>", "exec")
Traceback (most recent call last):
  File "<python-input-2>", line 1, in <module>
    compile(tree, "<string>", "exec")
    ~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: AST node has invalid location (0, 0, -2, -2)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if you pass optimize=2 to ast.parse?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this works:

>>> ast.parse(tree, optimize=2)
Traceback (most recent call last):
  File "<python-input-2>", line 1, in <module>
    ast.parse(tree, optimize=2)
    ~~~~~~~~~^^^^^^^^^^^^^^^^^^
  File "/Users/sobolev/Desktop/cpython/Lib/ast.py", line 53, in parse
    return compile(source, filename, mode, flags,
                   _feature_version=feature_version, optimize=optimize)
ValueError: AST node has invalid location (0, 0, -2, -2)

vstinner
vstinner previously approved these changes Mar 3, 2025
Copy link
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I just suggest a shorter test name :-)

@@ -204,6 +204,27 @@ def test_compilation_of_ast_nodes_with_default_end_position_values(self):
# Check that compilation doesn't crash. Note: this may crash explicitly only on debug mode.
compile(tree, "<string>", "exec")

def test_compilation_of_ast_nodes_with_negative_position_values(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def test_compilation_of_ast_nodes_with_negative_position_values(self):
def test_invalid_location(self):

@@ -0,0 +1,2 @@
Validate negative :mod:`ast` locations, we only allow ``-1`` as a special
value.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this change should appear in what's new in 3.14 as a change to ast.parse() because previously this would not generate an error there.

ValueError,
'AST node has invalid location',
):
compile(tree, "<string>", "exec")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd test that ast.parse(tree) raises the error as well.

@@ -297,6 +297,7 @@ write_location_info_entry(struct assembler* a, location loc, int isize)
int line_delta = loc.lineno - a->a_lineno;
int column = loc.col_offset;
int end_column = loc.end_col_offset;
// Values are validated in `VALIDATE_POSITIONS` in `ast.c`:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Values are validated in `VALIDATE_POSITIONS` in `ast.c`:

Python/ast.c Outdated
if (node->lineno < -1 || node->end_lineno < -1 || \
node->col_offset < -1 || node->end_col_offset < -1) { \
PyErr_Format(PyExc_ValueError, \
"AST node has invalid location (%d, %d, %d, %d)", \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"AST node has invalid location (%d, %d, %d, %d)", \
"AST node with negative location (%d, %d, %d, %d)", \

@@ -40,6 +40,15 @@ static int validate_typeparam(type_param_ty);
node->col_offset, node->end_col_offset, node->lineno, node->end_lineno); \
return 0; \
} \
if (node->lineno < -1 || node->end_lineno < -1 || \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we keep allowing -1 I don't see much point in disallowing other values. It's a potential compatibility break for users passing in other invalid values, so we may be better off just making any negative value work.

@vstinner vstinner dismissed their stale review March 7, 2025 16:08

I remove my LGTM vote, I didn't notice that -1 value is still allowed. I'm not sure about rejecting values <= -2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants