
Fix handling of invalid conversion characters #61

Conversation

isidentical (Collaborator)

No description provided.

@isidentical (Collaborator, Author)

We only have a single f-string conversion-character related error left after this PR, which is the following: "f'{3!ª}'". The problem is that it used to fail because the old f-string parser did not normalize identifiers (or rather, did not treat the character after the ! as an identifier). Now we treat it as an identifier, and by the time we check whether it is a/s/r we never see the real token's contents, since pegen has already decoded it for us and handed us the normalized version.

We could make them soft keywords (I think those aren't normalized, but I'm not super sure; @pablogsal or @lysnikolaou can probably correct me on this), OR we could simply allow this edge case (or hard-code it in the tokenizer, which might look a bit ugly).
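
For reference, this is the normalization at play, illustrated with the standard unicodedata module (CPython normalizes identifiers with NFKC per PEP 3131, and "ª" normalizes to a plain "a"):

>>> import unicodedata
>>> # NFKC is the normalization CPython applies to identifiers (PEP 3131);
>>> # once the character after "!" is tokenized as an identifier, "ª" becomes "a".
>>> unicodedata.normalize("NFKC", "ª")
'a'
>>> # A check that only sees the normalized text can no longer distinguish
>>> # "ª" from a valid conversion character:
>>> unicodedata.normalize("NFKC", "ª") in ("a", "s", "r")
True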

isidentical marked this pull request as ready for review on March 28, 2023 at 21:11
@pablogsal (Owner)

> We only have a single f-string conversion-character related error left after this PR, which is the following: "f'{3!ª}'". The problem is that it used to fail because the old f-string parser did not normalize identifiers (or rather, did not treat the character after the ! as an identifier). Now we treat it as an identifier, and by the time we check whether it is a/s/r we never see the real token's contents, since pegen has already decoded it for us and handed us the normalized version.
>
> We could make them soft keywords (I think those aren't normalized, but I'm not super sure; @pablogsal or @lysnikolaou can probably correct me on this), OR we could simply allow this edge case (or hard-code it in the tokenizer, which might look a bit ugly).

Let's simply allow the edge case; I don't want to add crazy complexity just for this particular case.
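
Concretely, allowing the edge case means the normalized identifier is accepted as the conversion, so on this branch "f'{3!ª}'" should behave like "f'{3!a}'" (a sketch of the expected behavior, not output from this exact build):

>>> f"{3!ª}"   # "ª" NFKC-normalizes to "a", so this acts like f"{3!a}"
'3'
>>> ascii(3)   # the !a conversion applies ascii()
'3'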

pablogsal merged commit 59ec33d into pablogsal:fstring-grammar-rebased-after-sprint on Mar 29, 2023
@pablogsal (Owner)

Hummm, actually this PR makes a lot more tests fail, so I think I am going to revert it for now.

@pablogsal (Owner)

30 failures -> 52 failures

@isidentical (Collaborator, Author)

Uh, in my local branch it was something like 28 to 24? (on test_fstring). Will double-check.

@isidentical (Collaborator, Author)

@pablogsal I can't seem to reproduce your findings 🤔 Is there a chance I'm missing something when trying this?

Before (current revision of the fstring-grammar-rebased-after-sprint branch):

❯ ./python -m test test_fstring
test test_fstring failed -- multiple errors occurred; run in verbose mode for details
test_fstring failed (30 failures)

== Tests result: FAILURE ==

1 test failed:
    test_fstring

Total duration: 863 ms
Tests result: FAILURE

After (actually before your revert / after this PR):

❯ git checkout HEAD^
HEAD is now at e1940e5e77 Fix test_type_comments
❯ [...]
❯ ./python -m test test_fstring
test test_fstring failed -- multiple errors occurred; run in verbose mode for details
test_fstring failed (24 failures)

== Tests result: FAILURE ==

1 test failed:
    test_fstring

Total duration: 820 ms
Tests result: FAILURE

isidentical added a commit to isidentical/cpython that referenced this pull request on Mar 30, 2023:
Note: this time we are allowing normalized Unicode identifiers as conversion characters, as per the discussion in pablogsal#61 (comment)
@pablogsal (Owner)

Hummmm, let me check in the main CI what happens if I revert your revert.

@pablogsal (Owner)

Does #62 include this PR as well?

@isidentical (Collaborator, Author)

Yep, with the above edge case fixed as well.
