bpo-46503: Prevent an assert from firing when parsing some invalid \N sequences in f-strings. #30865

Merged
merged 2 commits into from Jan 25, 2022
4 changes: 4 additions & 0 deletions Lib/test/test_fstring.py
@@ -746,12 +746,16 @@ def test_misformed_unicode_character_name(self):
         # differently inside f-strings.
         self.assertAllRaise(SyntaxError, r"\(unicode error\) 'unicodeescape' codec can't decode bytes in position .*: malformed \\N character escape",
                             [r"f'\N'",
+                             r"f'\N '",
+                             r"f'\N  '",  # See bpo-46503.
                              r"f'\N{'",
                              r"f'\N{GREEK CAPITAL LETTER DELTA'",

                              # Here are the non-f-string versions,
                              #  which should give the same errors.
                              r"'\N'",
+                             r"'\N '",
+                             r"'\N  '",
                              r"'\N{'",
                              r"'\N{GREEK CAPITAL LETTER DELTA'",
                              ])
@@ -0,0 +1 @@
+Fix an assert when parsing some invalid \N escape sequences in f-strings.
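
The user-visible behavior the NEWS entry and the new tests describe can be checked from Python. This is a minimal sketch, assuming a CPython build that includes this fix; `syntax_error_message` is a hypothetical helper written for this illustration, not part of the PR. It compiles the two-space case from bpo-46503 as both an f-string and a plain string and collects the resulting error message.

```python
def syntax_error_message(source):
    """Return the SyntaxError message raised when compiling `source`, or None.

    Hypothetical helper for illustration only; the PR itself uses
    assertAllRaise() in Lib/test/test_fstring.py.
    """
    try:
        compile(source, "<example>", "eval")
    except SyntaxError as exc:
        return exc.msg
    return None


# The f-string case from bpo-46503 (two spaces after \N) and its
# non-f-string counterpart; both are expected to report the same error.
messages = {
    source: syntax_error_message(source)
    for source in (r"f'\N  '", r"'\N  '")
}

# A well-formed \N{name} escape still works inside an f-string.
delta = eval(r"f'\N{GREEK CAPITAL LETTER DELTA}'")

for source, message in messages.items():
    print(ascii(source), "->", message)
```

Both sources should be rejected with a "malformed \N character escape" error rather than tripping an assertion in the parser, while the well-formed escape evaluates to a single character.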
16 changes: 14 additions & 2 deletions Parser/string_parser.c
@@ -442,12 +442,23 @@ fstring_find_literal(Parser *p, const char **str, const char *end, int raw,
         if (!raw && ch == '\\' && s < end) {
             ch = *s++;
             if (ch == 'N') {
+                /* We need to look at and skip matching braces for "\N{name}"
+                   sequences because otherwise we'll think the opening '{'
+                   starts an expression, which is not the case with "\N".
+                   Keep looking for either a matched '{' '}' pair, or the end
+                   of the string. */
+
                 if (s < end && *s++ == '{') {
                     while (s < end && *s++ != '}') {
                     }
                     continue;
                 }
-                break;
+
+                /* This is an invalid "\N" sequence, since it's a "\N" not
+                   followed by a "{".  Just keep parsing this literal.  This
+                   error will be caught later by
+                   decode_unicode_with_escapes(). */
+                continue;
             }
             if (ch == '{' && warn_invalid_escape_sequence(p, ch, t) < 0) {
                 return -1;
@@ -491,7 +502,8 @@ fstring_find_literal(Parser *p, const char **str, const char *end, int raw,
             *literal = PyUnicode_DecodeUTF8Stateful(literal_start,
                                                     s - literal_start,
                                                     NULL, NULL);
-        } else {
+        }
+        else {
             *literal = decode_unicode_with_escapes(p, literal_start,
                                                    s - literal_start, t);
         }
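
The scanning logic in the `\N` branch of `fstring_find_literal` can be modeled in a few lines of Python. This is a rough sketch that mirrors only the control flow shown in the hunk above; `find_literal_end` is a hypothetical name, and the model deliberately ignores raw strings, doubled braces, and invalid-escape warnings, all of which the real C function handles.

```python
def find_literal_end(text):
    """Return the index where the literal part of an f-string body ends.

    Hypothetical model of the patched loop: the literal ends at the first
    '{' or '}' that is not inside a "\\N{name}" escape, or at len(text).
    """
    pos = 0
    end = len(text)
    while pos < end:
        ch = text[pos]
        pos += 1
        if ch == "\\" and pos < end:
            ch = text[pos]
            pos += 1
            if ch == "N":
                if pos < end:
                    following = text[pos]
                    pos += 1  # like `*s++` in the C code, always consumed
                    if following == "{":
                        # Skip up to and including the matching '}', or stop
                        # at the end of the text if there is none.
                        while pos < end:
                            pos += 1
                            if text[pos - 1] == "}":
                                break
                        continue
                # Invalid "\N": keep scanning instead of ending the literal
                # early. The real parser reports the error later, in
                # decode_unicode_with_escapes().
                continue
        if ch in "{}":
            return pos - 1
    return end
```

With the old `break`, the model would have stopped in the middle of `\N  ` and returned a literal that ended neither at a brace nor at the end of the string, which is the situation the assert guarded against.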