@pablogsal pablogsal commented Jun 29, 2024

@pablogsal
Member Author

@lysnikolaou do you mind taking a look at this approach? I think it's not too bad, and it solidifies the concept of being in a format spec a bit more.

conversion_val = (int)'r';
}

expr_ty format_expr = format ? (expr_ty) format->result : NULL;
Member Author

This is needed because, when there are debug expressions, we wrap the values in a node that is then wrapped again in turn. That is not what 3.11 and earlier do, so we need to detect that case and unwrap.
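
For context, the `conversion_val = (int)'r'` line in the hunk above corresponds to what the `ast` module reports as `conversion=114` for a debug expression that has no explicit conversion and no format spec. A minimal sketch to observe this from Python follows; the helper name `debug_conversion` is made up for illustration and is not part of the PR.

```python
import ast


def debug_conversion(source: str) -> int:
    """Return the `conversion` field of the first FormattedValue found.

    `source` is the text of a single f-string expression, e.g. 'f"{x=}"'.
    Hypothetical helper, for illustration only.
    """
    tree = ast.parse(source, mode="eval")
    for node in ast.walk(tree):
        if isinstance(node, ast.FormattedValue):
            return node.conversion
    raise ValueError("no FormattedValue found in source")
```

With this, `debug_conversion('f"{x=}"')` should equal `ord("r")`, while `debug_conversion('f"{x=:>5}"')` should be `-1`, because a format spec suppresses the implicit `!r`.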

Member

@lysnikolaou lysnikolaou left a comment

I like the approach, although it seems like the unpacking does not work correctly.

return MAKE_TOKEN(_PyTokenizer_syntaxerror(tok, "f-string: expressions nested too deeply"));
}
TOK_GET_MODE(tok)->kind = TOK_REGULAR_MODE;
current_tok->in_format_spec = 0;
Member

Shouldn't this always be 0 here, if we've correctly reset it when the format spec is ending?

@pablogsal
Member Author

@lysnikolaou Ugh... this is much more complicated indeed. We need to restructure how we propagate the format specifier and how we capture the f-string expressions on format specifiers :(

@pablogsal pablogsal marked this pull request as draft July 1, 2024 15:16
@pablogsal
Member Author

Also, the AST is weird because it drops the {. Check this out:

python3.11 -m ast <<< 'f"{datetime.datetime.now():h1{y2=}}"'
Module(
   body=[
      Expr(
         value=JoinedStr(
            values=[
               FormattedValue(
                  value=Call(
                     func=Attribute(
                        value=Attribute(
                           value=Name(id='datetime', ctx=Load()),
                           attr='datetime',
                           ctx=Load()),
                        attr='now',
                        ctx=Load()),
                     args=[],
                     keywords=[]),
                  conversion=-1,
                  format_spec=JoinedStr(
                     values=[
                        Constant(value='h1y2='),
                        FormattedValue(
                           value=Name(id='y2', ctx=Load()),
                           conversion=114)]))]))],
   type_ignores=[])

Notice that the Constant node is 'h1y2=' without the { in the middle :S
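
To compare this across interpreter versions without reading the full dump, a small sketch like the one below can list just the pieces of the format spec. The helper name `format_spec_parts` is hypothetical, and the exact result for the debug-expression case depends on the Python version, which is the point of this PR.

```python
import ast


def format_spec_parts(source: str):
    """List (node type, detail) pairs for the format spec of the first
    replacement field in an f-string. Hypothetical helper, illustration only.
    """
    tree = ast.parse(source, mode="eval")
    joined = tree.body
    if not isinstance(joined, ast.JoinedStr):
        raise ValueError("expected an f-string expression")
    field = next(v for v in joined.values if isinstance(v, ast.FormattedValue))
    if field.format_spec is None:
        return []  # the replacement field has no format spec at all
    parts = []
    for node in field.format_spec.values:
        if isinstance(node, ast.Constant):
            parts.append(("Constant", node.value))
        elif isinstance(node, ast.FormattedValue):
            inner = node.value
            name = inner.id if isinstance(inner, ast.Name) else type(inner).__name__
            parts.append(("FormattedValue", name))
        else:
            # Anything else (for example a nested JoinedStr) is reported by type.
            parts.append((type(node).__name__, None))
    return parts
```

Calling `format_spec_parts('f"{now:h1{y2=}}"')` on 3.11 should mirror the dump above (a single merged Constant followed by a FormattedValue); other versions may split or nest the nodes differently.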

@pablogsal pablogsal marked this pull request as ready for review July 1, 2024 20:45
@pablogsal
Member Author

I think this should do the trick. Boy this was hard to fix :S

@pablogsal
Member Author

Hummm I will investigate the failure soon

res = _PyAST_JoinedStr(spec, lineno, col_offset, end_lineno,
end_col_offset, p->arena);
} else {
res = _PyPegen_concatenate_strings(p, spec,
Member Author

This code is there so that we merge and concatenate the Constant and JoinedStr nodes. See what the tree originally looks like here: #121150 (comment)

Member

I don't understand why the tree in #121150 is wrong. Also, running that same example with the latest state of your branch gives me exactly the same one.

Member Author

Sorry for the confusion. That tree is the correct one, but without this call it basically has a bunch of Constants in a row and the JoinedStr nodes are nested (comment out the code and check it to see what I mean).
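
The effect described here can be approximated in Python over `ast` nodes. This is only a rough sketch of the idea (splice nested JoinedStr children into the parent, then merge adjacent string Constants); it is not a port of `_PyPegen_concatenate_strings`, and the names `flatten_joined_str` and `_is_str_constant` are invented for the example.

```python
import ast


def _is_str_constant(node) -> bool:
    """True for a Constant node holding a str (False for None or other nodes)."""
    return isinstance(node, ast.Constant) and isinstance(node.value, str)


def flatten_joined_str(values):
    """Flatten nested JoinedStr nodes and merge adjacent string Constants.

    Illustrative approximation only, not the C implementation.
    """
    flat = []
    for node in values:
        if isinstance(node, ast.JoinedStr):
            # Splice the children of a nested JoinedStr into the parent list.
            flat.extend(flatten_joined_str(node.values))
        else:
            flat.append(node)

    merged = []
    for node in flat:
        prev = merged[-1] if merged else None
        if _is_str_constant(node) and _is_str_constant(prev):
            # Two string Constants in a row collapse into one.
            merged[-1] = ast.Constant(value=prev.value + node.value)
        else:
            merged.append(node)
    return merged
```

Feeding it `[Constant('h1'), JoinedStr([Constant('y2='), FormattedValue(...)])]` should yield a single `Constant('h1y2=')` followed by the FormattedValue, which is the shape of the 3.11 tree quoted earlier in the thread.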

}
n_flattened_elements++;
break;
case JoinedStr_kind:
Member Author

This code accommodates the case where we call this function with nodes other than JoinedStr and Constant.

Member

How would this happen? It probably shouldn't, but I may be missing something.

Member Author

Because I am flattening collections that contain new nodes in https://github.com/python/cpython/pull/121150/files#r1662410566 (it's a new call over a new array that was not happening before)

  while (end_quote_size != current_tok->f_string_quote_size) {
      int c = tok_nextc(tok);
-     if (tok->done == E_ERROR) {
+     if (tok->done == E_ERROR || tok->done == E_DECODE) {
Member Author

This was a bug: if we entered the E_DECODE state, we were not returning early enough.

Member

Recently came across this in another branch I'm working on. If we check for this here, we can probably remove the if (tok->decoding_erred) about 10 lines below, right?

Member Author

Good idea, will do

Member

@lysnikolaou lysnikolaou left a comment

A few comments/questions.

  // brackets, we can bypass it here.
- if (peek == '}' && !in_format_spec) {
+ int cursor = current_tok->curly_bracket_depth;
+ if (peek == '}' && !in_format_spec && cursor == 0) {
Member

Why do we need the cursor check here?

pablogsal and others added 2 commits July 4, 2024 15:32
Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
@pablogsal
Member Author

@lysnikolaou can you take another look so we can get this into rc1?

@pablogsal pablogsal added the "needs backport to 3.13" (bugs and security fixes) and "needs backport to 3.12" (only security fixes) labels Jul 16, 2024
@pablogsal
Member Author

I'm landing this so it gets into the last 3.13 beta and gets some coverage. We can fix anything @lysnikolaou doesn't like later.

@pablogsal pablogsal merged commit c46d64e into python:main Jul 16, 2024
@miss-islington-app

Thanks @pablogsal for the PR 🌮🎉.. I'm working now to backport this PR to: 3.12, 3.13.
🐍🍒⛏🤖

@miss-islington-app

Sorry, @pablogsal, I could not cleanly backport this to 3.13 due to a conflict.
Please backport using cherry_picker on command line.

cherry_picker c46d64e0ef8e92a6b4ab4805d813d7e4d6663380 3.13

@miss-islington-app

Sorry, @pablogsal, I could not cleanly backport this to 3.12 due to a conflict.
Please backport using cherry_picker on command line.

cherry_picker c46d64e0ef8e92a6b4ab4805d813d7e4d6663380 3.12

@bedevere-app

bedevere-app bot commented Jul 16, 2024

GH-121868 is a backport of this pull request to the 3.13 branch.

@bedevere-app bedevere-app bot removed the "needs backport to 3.13" (bugs and security fixes) label Jul 16, 2024
pablogsal added a commit to pablogsal/cpython that referenced this pull request Jul 16, 2024
…ressions (pythonGH-121150)

(cherry picked from commit c46d64e)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
pablogsal added a commit that referenced this pull request Jul 16, 2024
pablogsal added a commit to pablogsal/cpython that referenced this pull request Jul 20, 2024
…ressions (pythonGH-121150)

(cherry picked from commit c46d64e)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
@bedevere-app

bedevere-app bot commented Jul 20, 2024

GH-122063 is a backport of this pull request to the 3.12 branch.

@bedevere-app bedevere-app bot removed the "needs backport to 3.12" (only security fixes) label Jul 20, 2024