ast.unparse doesn't observe the new PEP701 string delimiter rules #108469

Closed
2 tasks done
tonybaloney opened this issue Aug 25, 2023 · 8 comments
Labels
3.12 bugs and security fixes 3.13 bugs and security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@tonybaloney
Contributor

tonybaloney commented Aug 25, 2023

Bug report

Checklist

  • I am confident this is a bug in CPython, not a bug in a third-party project
  • I have searched the CPython issue tracker,
    and am confident this bug has not been reported before

CPython versions tested on:

3.12, CPython main branch

Operating systems tested on:

macOS

Output from running 'python -VV' on the command line:

Python 3.12.0b4 (v3.12.0b4:97a6a41816, Jul 11 2023, 11:19:02) [Clang 13.0.0 (clang-1300.0.29.30)] on darwin

A clear and concise description of the bug:

The ast.unparse Unparser doesn't seem to respect PEP 701. For example, if you use double quotes inside an f-string replacement field and then unparse the AST, the output shows a backslash:

>>> import ast
>>> code = 'f" something { my_dict["key"] } something else "'
>>> tree = ast.parse(code)
>>> ast.unparse(tree)
'f" something {my_dict[\'key\']} something else "'
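A minimal sketch for checking a round trip like the one above. The helper names `roundtrip` and `same_ast` are hypothetical and not part of the `ast` module. The sample uses single quotes inside the replacement field so that it also parses on interpreters older than 3.12. Because the REPL echoes the `repr` of the returned string, printing the result as well helps separate escapes added by `repr` from characters that are actually in the unparsed source.

```python
import ast


def roundtrip(source: str) -> str:
    """Parse source and turn the resulting tree back into source text."""
    return ast.unparse(ast.parse(source))


def same_ast(left: str, right: str) -> bool:
    """Compare two snippets by their dumped ASTs (positions are ignored)."""
    return ast.dump(ast.parse(left)) == ast.dump(ast.parse(right))


# Single quotes inside the replacement field keep this parseable before 3.12.
source = "f\" something { my_dict['key'] } something else \""
result = roundtrip(source)

print(repr(result))  # what the REPL echoes, including repr escaping
print(result)        # the characters ast.unparse actually produced
print(same_ast(source, result))
```

Comparing dumped ASTs rather than raw text is a deliberate choice here: `ast.unparse` is free to normalize whitespace and quote style, so only the tree is expected to survive the round trip.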

Furthermore, if you use the nested f-string example from the PEP, unparsing the AST fails with a ValueError:

>>> f"{f"{f"{f"{f"{f"{1+1}"}"}"}"}"}"
'2'
>>> code = 'f"{f"{f"{f"{f"{f"{1+1}"}"}"}"}"}"'
>>> import ast
>>> tree = ast.parse(code)
>>> tree
<ast.Module object at 0x10992fed0>
>>> ast.unparse(tree)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/ast.py", line 1777, in unparse
    return unparser.visit(ast_obj)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/ast.py", line 859, in visit
    self.traverse(node)
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/ast.py", line 850, in traverse
    super().visit(node)
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/ast.py", line 407, in visit
    return visitor(node)
           ^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/ast.py", line 874, in visit_Module
    self._write_docstring_and_traverse_body(node)
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/ast.py", line 867, in _write_docstring_and_traverse_body
    self.traverse(node.body)
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/ast.py", line 848, in traverse
    self.traverse(item)
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/ast.py", line 850, in traverse
    super().visit(node)
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/ast.py", line 407, in visit
    return visitor(node)
           ^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/ast.py", line 889, in visit_Expr
    self.traverse(node.value)
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/ast.py", line 850, in traverse
    super().visit(node)
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/ast.py", line 407, in visit
    return visitor(node)
           ^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/ast.py", line 1240, in visit_JoinedStr
    self._write_fstring_inner(value)
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/ast.py", line 1268, in _write_fstring_inner
    self.visit_FormattedValue(node)
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/ast.py", line 1281, in visit_FormattedValue
    raise ValueError(
ValueError: Unable to avoid backslash in f-string expression part
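One way to probe this across interpreter versions is sketched below. `try_unparse` is a hypothetical helper, and the three status strings are made up for illustration. The nested source is kept in a string literal so the script itself still compiles on versions that predate PEP 701, where `ast.parse` is expected to reject it instead.

```python
import ast
from typing import Optional, Tuple


def try_unparse(source: str) -> Tuple[str, Optional[str]]:
    """Return (status, text); status is 'ok', 'syntax-error' or 'unparse-error'."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        # Interpreters without PEP 701 cannot parse reused quotes at all.
        return "syntax-error", None
    try:
        return "ok", ast.unparse(tree)
    except ValueError as exc:
        # The failure mode shown in the traceback above.
        return "unparse-error", str(exc)


nested = 'f"{f"{f"{f"{f"{f"{1+1}"}"}"}"}"}"'
status, text = try_unparse(nested)
print(status, text)
```

Which status is reported depends on the interpreter running the script, so the sketch makes no claim about the output on any particular release.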

@tonybaloney tonybaloney added the type-bug An unexpected behavior, bug, or error label Aug 25, 2023
@tonybaloney
Contributor Author

CC @pablogsal @isidentical

@AlexWaygood AlexWaygood added stdlib Python modules in the Lib dir 3.12 bugs and security fixes 3.13 bugs and security fixes labels Aug 25, 2023
@pablogsal
Member

@isidentical can you take a look?

@isidentical
Sponsor Member

@isidentical can you take a look?

Yep!

@isidentical
Sponsor Member

@tonybaloney I just saw #108553. Are you interested in providing a fix in that PR too, or is it a test-case-only PR? I'd be happy to help or answer questions about the old f-string unparsing logic if you have any.

cpython/Lib/ast.py

Lines 1226 to 1260 in 4116592

    def visit_JoinedStr(self, node):
        self.write("f")
        if self._avoid_backslashes:
            with self.buffered() as buffer:
                self._write_fstring_inner(node)
            return self._write_str_avoiding_backslashes("".join(buffer))

        # If we don't need to avoid backslashes globally (i.e., we only need
        # to avoid them inside FormattedValues), it's cosmetically preferred
        # to use escaped whitespace. That is, it's preferred to use backslashes
        # for cases like: f"{x}\n". To accomplish this, we keep track of what
        # in our buffer corresponds to FormattedValues and what corresponds to
        # Constant parts of the f-string, and allow escapes accordingly.
        fstring_parts = []
        for value in node.values:
            with self.buffered() as buffer:
                self._write_fstring_inner(value)
            fstring_parts.append(
                ("".join(buffer), isinstance(value, Constant))
            )

        new_fstring_parts = []
        quote_types = list(_ALL_QUOTES)
        for value, is_constant in fstring_parts:
            value, quote_types = self._str_literal_helper(
                value,
                quote_types=quote_types,
                escape_special_whitespace=is_constant,
            )
            new_fstring_parts.append(value)

        value = "".join(new_fstring_parts)
        quote_type = quote_types[0]
        self.write(f"{quote_type}{value}{quote_type}")

@tonybaloney
Contributor Author

@tonybaloney I just saw #108553. Are you interested in providing a fix in that PR too, or is it a test-case-only PR? I'd be happy to help or answer questions about the old f-string unparsing logic if you have any.

@isidentical I'd be happy to submit a PR. I looked at the code and couldn't really work out what it was doing or why, though. There are some private methods for escaping strings or using a different string delimiter, which I guess would be redundant from 3.12, but I couldn't tell whether that was the only change needed.

@isidentical
Sponsor Member

There are some private methods for escaping strings or using a different string delimiter, which I guess would be redundant from 3.12, but I couldn't tell whether that was the only change needed.

I think so. Most of that code was trying to get rid of backslashes and to figure out which quote we could use for nested f-strings, which isn't a problem anymore.
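To make the quote-picking idea concrete, here is a deliberately simplified sketch. It is not the logic in `Lib/ast.py`: `pick_quote` and `CANDIDATE_QUOTES` are hypothetical names, and the sketch ignores backslashes and newlines entirely. It only shows why an unparser that may not reuse the enclosing quote has to search for a delimiter that the body does not contain, and why that search can run out of options.

```python
# Hypothetical candidate delimiters, tried in order of preference.
CANDIDATE_QUOTES = ("'", '"', '"""', "'''")


def pick_quote(body: str, candidates=CANDIDATE_QUOTES) -> str:
    """Pick the first delimiter that can wrap body without any escaping."""
    for quote in candidates:
        if quote in body:
            continue  # the delimiter would terminate the literal early
        if body.endswith(quote[0]):
            continue  # a trailing quote char would merge with the closer
        return quote
    raise ValueError("no delimiter avoids every quote in the body")


for body in ("plain", "my_dict['key']", 'a\'b"c', 'a\'b"'):
    quote = pick_quote(body)
    print(quote + body + quote)
```

Each extra level of nesting consumes one more delimiter under the pre-3.12 rules, which fits the picture of a deeply nested f-string leaving nothing to choose from; with PEP 701 the same quote can be reused, so a search like this is no longer needed for nesting.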

@sunmy2019
Member

We'd better move fast. We won't make the Sept 4th release, so let's aim for the Oct 2nd release.

pablogsal pushed a commit that referenced this issue Sep 5, 2023
… [3.12] (#108553)

Co-authored-by: sunmy2019 <59365878+sunmy2019@users.noreply.github.com>
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Sep 5, 2023
…PEP701 [3.12] (pythonGH-108553)

(cherry picked from commit 2c4c26c)

Co-authored-by: Anthony Shaw <anthony.p.shaw@gmail.com>
Co-authored-by: sunmy2019 <59365878+sunmy2019@users.noreply.github.com>
pablogsal pushed a commit that referenced this issue Sep 5, 2023
… PEP701 [3.12] (GH-108553) (#108960)

Co-authored-by: Anthony Shaw <anthony.p.shaw@gmail.com>
Co-authored-by: sunmy2019 <59365878+sunmy2019@users.noreply.github.com>