Add PEP701 support #3822

Merged
merged 91 commits on Apr 22, 2024

Changes from 3 commits

Commits (91)
48ad67c
Add PEP701 support
tusharsadhwani Jul 29, 2023
175942b
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 29, 2023
fb2943b
Merge branch 'main' into fstrings-pep701
tusharsadhwani Jul 29, 2023
9e344f4
Add FSTRING_START and FSTRING_MIDDLE tokenizing
tusharsadhwani Aug 13, 2023
dbdb02c
Support escaping of `{{`
tusharsadhwani Aug 15, 2023
5acb397
typo
tusharsadhwani Aug 15, 2023
20d7497
Merge branch 'main' into fstrings-pep701
tusharsadhwani Aug 28, 2023
4a69ffa
fix some problems with triple quoted strings
tusharsadhwani Aug 27, 2023
ee30cde
Add support for FSTRING_MIDDLE and FSTRING_END
tusharsadhwani Aug 27, 2023
e7b5850
bugfix and simplify the regexes
tusharsadhwani Aug 29, 2023
88af1c1
Fix small regex problems
tusharsadhwani Sep 6, 2023
c1ecc14
fix newline type
tusharsadhwani Sep 6, 2023
644c5cc
turn endprog into endprog_stack
tusharsadhwani Sep 10, 2023
b23cdfd
Support fstrings with no braces
tusharsadhwani Sep 11, 2023
bbbac0a
Add grammar changes
tusharsadhwani Sep 17, 2023
dadaa64
fix some locations
tusharsadhwani Sep 18, 2023
a57e404
remove padding from fstring_middle and fstring_end
tusharsadhwani Sep 18, 2023
fff25fb
Fix some positions
tusharsadhwani Sep 18, 2023
95cd0ba
fix edge cases with padding
tusharsadhwani Sep 18, 2023
caafa75
fix nested fstrings bug
tusharsadhwani Sep 20, 2023
838f627
Fix bugs in multiline fstrings
tusharsadhwani Sep 20, 2023
f5abd4b
support fstring_middle ending with newline
tusharsadhwani Sep 20, 2023
fd3e5e1
fix edge case for triple quoted strings
tusharsadhwani Sep 23, 2023
0c69069
Add string normalization
tusharsadhwani Sep 23, 2023
c4d457e
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 23, 2023
ace80e0
small bugfixes
tusharsadhwani Sep 23, 2023
c0a99c8
fix some bugs that I introduced just now
tusharsadhwani Sep 23, 2023
b02cf2a
strings and fstrings can have implicit concat
tusharsadhwani Sep 23, 2023
acd3c79
don't normalize docstring prefixes
tusharsadhwani Sep 23, 2023
5bca062
Add !r format specifier support
tusharsadhwani Sep 23, 2023
b755281
Support non nested format specifiers
tusharsadhwani Sep 24, 2023
8f7ecdf
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 24, 2023
00dc7ac
fix walrus edge case
tusharsadhwani Sep 24, 2023
306b9e9
empty FSTRING_MIDDLE should not be truncated
tusharsadhwani Sep 24, 2023
7323840
support rf" tokens
tusharsadhwani Sep 24, 2023
4fc656d
fix fstring feature detection
tusharsadhwani Sep 24, 2023
ea70516
fix edge cases in format specifier tokenizing
tusharsadhwani Sep 30, 2023
4b80fe1
fix that one bug with depending on parenlev
tusharsadhwani Oct 1, 2023
420867d
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 1, 2023
160ef4e
fix line location for triple quoted strings
tusharsadhwani Oct 2, 2023
edf3d79
try to fix mypy errors
tusharsadhwani Oct 2, 2023
23bee77
commit unstaged change
tusharsadhwani Oct 2, 2023
6997d14
Add `fstring_format_spec` to symbols
tusharsadhwani Oct 2, 2023
4e201fc
fix possible cause of mypyc crash
tusharsadhwani Oct 2, 2023
3e39af1
Merge branch 'main' into fstrings-pep701
tusharsadhwani Oct 2, 2023
17a9063
Fix edge case with wrapping format specs
tusharsadhwani Oct 6, 2023
c1f5e82
Merge branch 'main' into fstrings-pep701
tusharsadhwani Oct 8, 2023
d0af0c1
Add FSTRING_PARSING as a feature
tusharsadhwani Oct 14, 2023
48348fa
Merge branch 'main' into fstrings-pep701
tusharsadhwani Oct 14, 2023
78c1e9c
Add test case
tusharsadhwani Oct 15, 2023
6931c92
Add two todos in test case
tusharsadhwani Oct 15, 2023
53ca71c
tiny changes
tusharsadhwani Oct 15, 2023
188583c
Merge branch 'main' into fstrings-pep701
tusharsadhwani Dec 12, 2023
442228d
Merge branch 'main' into fstrings-pep701
hauntsaninja Jan 6, 2024
e97dd01
fix merge
hauntsaninja Jan 6, 2024
cf9b415
Update src/black/strings.py
hauntsaninja Jan 6, 2024
2429e72
Merge branch 'main' into fstrings-pep701
tusharsadhwani Jan 12, 2024
1318b32
Merge branch 'main' into fstrings-pep701
JelleZijlstra Feb 12, 2024
9737159
changelog
JelleZijlstra Feb 12, 2024
e220c10
Lint, remove unused function
JelleZijlstra Feb 12, 2024
7ef92db
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Feb 12, 2024
62e0b2b
fix debug visitor test
tusharsadhwani Mar 17, 2024
df38ea0
fix most tests
tusharsadhwani Mar 25, 2024
150a4fe
fix whitespace getting removed after fstring colon
tusharsadhwani Mar 25, 2024
c4487cb
remove unnecessary continue
tusharsadhwani Mar 25, 2024
97640eb
Merge branch 'main' into fstrings-pep701
tusharsadhwani Mar 25, 2024
ece7452
don't use removeprefix
tusharsadhwani Mar 25, 2024
dfd3455
formatting
tusharsadhwani Mar 25, 2024
a81bae3
add minimum version
tusharsadhwani Mar 25, 2024
0435144
fix the one failing test
tusharsadhwani Mar 27, 2024
99f8eb7
fix couple more bugs
tusharsadhwani Mar 28, 2024
3e56204
don't format fstrings at all
tusharsadhwani Apr 2, 2024
9495f5e
address comments
tusharsadhwani Apr 2, 2024
89a4d71
Merge branch 'main' into fstrings-pep701
tusharsadhwani Apr 2, 2024
cf76482
flake8
tusharsadhwani Apr 2, 2024
bbff3de
fix failing test
tusharsadhwani Apr 2, 2024
0fef83c
undo default change
tusharsadhwani Apr 2, 2024
c570360
remove todo
tusharsadhwani Apr 2, 2024
2a697c8
fix: \N{} case
tusharsadhwani Apr 4, 2024
a5f943b
make test a little better
tusharsadhwani Apr 8, 2024
1ab815b
tweak regex to fix edge cases
tusharsadhwani Apr 14, 2024
324cacb
Merge branch 'main' into fstrings-pep701
tusharsadhwani Apr 14, 2024
019df7b
fix edge case with nested multiline strings
tusharsadhwani Apr 22, 2024
40c9890
Merge branch 'main' into fstrings-pep701
tusharsadhwani Apr 22, 2024
a64939d
whitespace
tusharsadhwani Apr 22, 2024
7df45fb
fix multiline formatspec todo
tusharsadhwani Apr 22, 2024
36e04d2
add another test case
tusharsadhwani Apr 22, 2024
25941cd
Merge branch 'main' into fstrings-pep701
tusharsadhwani Apr 22, 2024
eb05cd4
Revert "Remove node-specific logic from visit_default (#4321)"
tusharsadhwani Apr 22, 2024
5d727ec
Revert "Revert "Remove node-specific logic from visit_default (#4321)""
JelleZijlstra Apr 22, 2024
ab2f43c
fix
JelleZijlstra Apr 22, 2024
5 changes: 4 additions & 1 deletion src/blib2to3/pgen2/token.py
@@ -66,7 +66,10 @@
 ASYNC: Final = 57
 ERRORTOKEN: Final = 58
 COLONEQUAL: Final = 59
-N_TOKENS: Final = 60
+FSTRING_START: Final = 60
+FSTRING_MIDDLE: Final = 61
+FSTRING_END: Final = 62
+N_TOKENS: Final = 63
 NT_OFFSET: Final = 256
 # --end constants--

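The token.py change slots three f-string token types in where `N_TOKENS` used to be and bumps `N_TOKENS` to one past the last of them. The sketch below is a standalone illustration of that numbering invariant, not the real module: the values are copied from the diff, while `tok_name_sketch` and `is_terminal` are hypothetical names introduced only for this example.

```python
# Values copied from the diff; this is NOT the real blib2to3 token module.
COLONEQUAL = 59
FSTRING_START = 60
FSTRING_MIDDLE = 61
FSTRING_END = 62
N_TOKENS = 63
NT_OFFSET = 256

# Hypothetical lookup table from token number to name, written out by hand
# so the example does not depend on module globals.
tok_name_sketch = {
    COLONEQUAL: "COLONEQUAL",
    FSTRING_START: "FSTRING_START",
    FSTRING_MIDDLE: "FSTRING_MIDDLE",
    FSTRING_END: "FSTRING_END",
}


def is_terminal(token_type: int) -> bool:
    """Assumed convention: terminal tokens are numbered below NT_OFFSET."""
    return token_type < NT_OFFSET


# The new tokens are contiguous and N_TOKENS stays one past the highest one.
new_tokens = [FSTRING_START, FSTRING_MIDDLE, FSTRING_END]
assert new_tokens == list(range(COLONEQUAL + 1, N_TOKENS))
assert all(is_terminal(t) for t in new_tokens)
```

Keeping the numbers contiguous means any code that iterates `range(N_TOKENS)` picks up the new types without further changes.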
113 changes: 105 additions & 8 deletions src/blib2to3/pgen2/tokenize.py
@@ -27,6 +27,7 @@
 function to which the 5 fields described above are passed as 5 arguments,
 each time a new token is found."""

+import io
 import sys
 from typing import (
 Callable,
@@ -50,12 +51,17 @@
 DEDENT,
 ENDMARKER,
 ERRORTOKEN,
+FSTRING_END,
+FSTRING_MIDDLE,
+FSTRING_START,
 INDENT,
+LBRACE,
 NAME,
 NEWLINE,
 NL,
 NUMBER,
 OP,
+RBRACE,
 STRING,
 tok_name,
 )
@@ -66,7 +72,7 @@
 import re
 from codecs import BOM_UTF8, lookup

-from . import token
+from blib2to3.pgen2 import token

 __all__ = [x for x in dir(token) if x[0] != "_"] + [
 "tokenize",
@@ -468,10 +474,12 @@ def generate_tokens(
 raise TokenError("EOF in multi-line string", strstart)
 endmatch = endprog.match(line)
 if endmatch:
+endquote = endmatch.group(0)
 pos = end = endmatch.end(0)
-yield (
-STRING,
+yield from tokenize_string(
 contstr + line[:end],
+startquote,
+endquote,
 strstart,
 (lnum, end),
 contline + line,
@@ -590,15 +598,19 @@ def generate_tokens(
 stashed = None
 yield (COMMENT, token, spos, epos, line)
 elif token in triple_quoted:
-endprog = endprogs[token]
+startquote = token
+endprog = endprogs[startquote]
 endmatch = endprog.match(line, pos)
 if endmatch: # all on one line
+endquote = endmatch.group(0)
 pos = endmatch.end(0)
 token = line[start:pos]
 if stashed:
 yield stashed
 stashed = None
-yield (STRING, token, spos, (lnum, pos), line)
+yield from tokenize_string(
+token, startquote, endquote, spos, (lnum, pos), line
+)
 else:
 strstart = (lnum, start) # multiple lines
 contstr = line[start:]
@@ -627,7 +639,18 @@ def generate_tokens(
 if stashed:
 yield stashed
 stashed = None
-yield (STRING, token, spos, epos, line)
+
+if initial in single_quoted:
+startquote = initial
+elif token[:2] in single_quoted:
+startquote = token[:2]
+else:
+startquote = token[:3]
+
+endquote = token[-1]
+yield from tokenize_string(
+token, startquote, endquote, spos, epos, line
+)
 elif initial.isidentifier(): # ordinary name
 if token in ("async", "await"):
 if async_keywords or async_def:
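The hunk above picks `startquote` by checking whether the token opens with a bare quote, a one-letter prefix plus quote, or a two-letter prefix plus quote, and takes the last character as `endquote`. A minimal sketch of the same idea follows; it scans for the first quote character instead of consulting `single_quoted`, and `split_quotes` and `QUOTE_CHARS` are hypothetical names that do not exist in blib2to3. It only covers single-quoted (non-triple) tokens.

```python
from typing import Tuple

QUOTE_CHARS = "\"'"  # hypothetical helper constant for this sketch


def split_quotes(token: str) -> Tuple[str, str]:
    """Return (startquote, endquote) for a single-quoted string token."""
    for index, char in enumerate(token):
        if char in QUOTE_CHARS:
            # Everything up to and including the first quote is the opener
            # (for example f" or rb'); the closer is the final character.
            return token[: index + 1], token[-1]
    raise ValueError(f"not a string token: {token!r}")
```

For example, `split_quotes('f"x"')` gives the opener `f"` and the closer `"`, which matches what the `token[:2]` branch in the diff would select for that token.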
@@ -694,8 +717,82 @@ def generate_tokens(
 yield (ENDMARKER, "", (lnum, 0), (lnum, 0), "")


+def tokenize_string(
+string: str,
+startquote: str,
+endquote: str,
+startpos: Coord,
+endpos: Coord,
+line: str,
+) -> GoodTokenInfo:
+if not string.startswith(("f", "F")):
+# regular strings can still be returned as usual
+yield (STRING, string, startpos, endpos, line)
+return
+
+lnum = startpos[0]
+yield (FSTRING_START, startquote, startpos, (lnum, len(startquote)), line)
+pos = len(startquote)
+max = len(string) - len(endquote)
+while pos < max:
+opening_bracket_index = string.find("{", pos)
+if opening_bracket_index == -1:
+string_part = string[pos:max]
+yield (FSTRING_MIDDLE, string_part, (lnum, pos), (lnum, max), line)
+pos = max
+else:
+string_part = string[pos:opening_bracket_index]
+yield (
+FSTRING_MIDDLE,
+string_part,
+(lnum, pos),
+(lnum, opening_bracket_index),
+line,
+)
+yield (
+LBRACE,
+"{",
+(lnum, opening_bracket_index),
+(lnum, opening_bracket_index + 1),
+line,
+)
+pos = opening_bracket_index + 1
+
+# TODO: skip over {{
+if pos < max:
+inner_source = string[pos:max]
+curly_brace_level = 1
+startpos = pos
+for token in generate_tokens(io.StringIO(inner_source).readline):
+pos = startpos + token[3][1]
+
+if token[0] == OP and token[1] == "{":
+curly_brace_level += 1
+elif token[0] == OP and token[1] == "}":
+curly_brace_level -= 1
+
+if curly_brace_level == 0:
+yield (
+RBRACE,
+"}",
+(lnum, pos),
+(lnum, pos + 1),
+line,
+)
+break
+
+token_with_updated_pos = (
+token[0],
+token[1],
+(token[2][0], startpos + token[2][1]),
+(token[3][0], startpos + token[3][1]),
+token[4],
+)
+yield token_with_updated_pos
+
+yield (FSTRING_END, endquote, (lnum, max), endpos, line)
+
+
 if __name__ == "__main__": # testing
 if len(sys.argv) > 1:
 tokenize(open(sys.argv[1]).readline)
 else:
 tokenize(sys.stdin.readline)
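To make the shape of the token stream easier to follow, here is a simplified, standalone sketch of splitting an f-string into start, middle, brace, and end pieces. It is not the PR's `tokenize_string`: it emits plain `(name, text)` pairs without positions, returns the replacement-field source as a single placeholder `EXPR` piece instead of re-tokenizing it with `generate_tokens`, skips empty middle pieces, and shows one possible way to treat `{{` and `}}` as literal text. The name `split_fstring` and the `EXPR` label are assumptions of this example, and it ignores triple quotes, nested quotes, format specs, and backslash escapes.

```python
from typing import Iterator, Tuple

Piece = Tuple[str, str]  # (token name, source text)


def split_fstring(source: str) -> Iterator[Piece]:
    """Split a single-line, single-quoted f-string literal into pieces."""
    # The opening piece is the prefix letters plus the first quote character.
    quote_index = min(i for i in (source.find('"'), source.find("'")) if i != -1)
    start, end = source[: quote_index + 1], source[-1]
    yield ("FSTRING_START", start)

    pos, limit = len(start), len(source) - len(end)
    literal = ""  # literal text collected for the next FSTRING_MIDDLE
    while pos < limit:
        if source[pos : pos + 2] in ("{{", "}}"):
            literal += source[pos : pos + 2]  # escaped brace stays literal
            pos += 2
        elif source[pos] == "{":
            if literal:
                yield ("FSTRING_MIDDLE", literal)
                literal = ""
            yield ("LBRACE", "{")
            # Find the matching close brace, counting nested braces.
            depth, close = 1, pos + 1
            while close < limit and depth:
                depth += {"{": 1, "}": -1}.get(source[close], 0)
                close += 1
            if depth:
                raise ValueError("unterminated replacement field")
            yield ("EXPR", source[pos + 1 : close - 1])
            yield ("RBRACE", "}")
            pos = close
        else:
            literal += source[pos]
            pos += 1
    if literal:
        yield ("FSTRING_MIDDLE", literal)
    yield ("FSTRING_END", end)
```

Calling `list(split_fstring('f"hi {name}!"'))` yields the pieces `FSTRING_START`, `FSTRING_MIDDLE` (`hi `), `LBRACE`, `EXPR` (`name`), `RBRACE`, `FSTRING_MIDDLE` (`!`), and `FSTRING_END`, in that order.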