[mypyc] Introduce FormatOp and add a tokenizer for .format() call #10935

97littleleaf11 · 2021-08-05T05:46:29Z

Description

This PR adds a tokenizer that converts a str.format() format string into literals and specifiers, which makes the code structure of translate_str_format clearer.
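To illustrate what "literals and specifiers" means here, the following is a minimal sketch built on the standard library's string.Formatter().parse. It is not the code in format_str_tokenizer.py; the function name tokenize_format_str and the convention that the literal list always has one more entry than the specifier list are assumptions made for this example.

```python
from string import Formatter
from typing import List, Tuple


def tokenize_format_str(format_str: str) -> Tuple[List[str], List[str]]:
    """Split a str.format() string into literal pieces and specifiers.

    Hypothetical helper for illustration only. The result always has
    len(literals) == len(specifiers) + 1 (an assumption of this sketch).
    """
    literals: List[str] = []
    specifiers: List[str] = []
    pending = ''  # literal text collected since the last specifier
    for literal, field_name, format_spec, conversion in Formatter().parse(format_str):
        pending += literal
        if field_name is None:
            # No replacement field in this chunk (trailing text or escaped braces).
            continue
        literals.append(pending)
        pending = ''
        # Rebuild the specifier text from its parsed parts.
        spec = '{' + field_name
        if conversion:
            spec += '!' + conversion
        if format_spec:
            spec += ':' + format_spec
        specifiers.append(spec + '}')
    literals.append(pending)
    return literals, specifiers


literals, specifiers = tokenize_format_str('Hello {}, you are {:d}!')
print(literals)    # ['Hello ', ', you are ', '!']
print(specifiers)  # ['{}', '{:d}']
```

Keeping the two lists aligned this way means a caller can interleave them directly: literal, converted value, literal, and so on.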

This PR also introduces FormatOp. Compared to ConversionSpecifier, FormatOp has fewer attributes and indicates compile-time optimizations. For example, a conversion from any object to string can be written as several different ConversionSpecifiers, such as '%s', '{}' or '{:{}}', but there is only one corresponding FormatOp.

Currently FormatOp is just an Enum for convenience. We might add several attributes later and upgrade it to a class if we need to support more conversions.

To help future optimizations, these parts of the code are extracted into new functions:

generate_format_ops, which shrinks ConversionSpecifier into FormatOp
convert_expr, which converts the expressions into the desired results
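The relationship between specifiers and FormatOp can be sketched as below. This is a simplified illustration, not the merged implementation: StubSpecifier is a hypothetical stand-in for ConversionSpecifier that models only the whole_seq attribute, the enum has a single member, and returning None for an unsupported specifier is an assumption of this sketch.

```python
from dataclasses import dataclass
from enum import Enum, unique
from typing import List, Optional


@unique
class FormatOp(Enum):
    """Compile-time description of one conversion (only STR in this sketch)."""
    STR = 's'


@dataclass
class StubSpecifier:
    """Hypothetical stand-in for ConversionSpecifier; only whole_seq is modeled."""
    whole_seq: str


# Different spellings that all mean "convert any object to str".
_STR_SPELLINGS = {'%s', '{}', '{:{}}'}


def generate_format_ops(specifiers: List[StubSpecifier]) -> Optional[List[FormatOp]]:
    """Map each specifier to a FormatOp, or return None if any is unsupported."""
    format_ops = []
    for spec in specifiers:
        if spec.whole_seq in _STR_SPELLINGS:
            format_ops.append(FormatOp.STR)
        else:
            # Assumption: the caller falls back to a slower generic path.
            return None
    return format_ops


ops = generate_format_ops([StubSpecifier('%s'), StubSpecifier('{}'), StubSpecifier('{:{}}')])
print(ops)  # [<FormatOp.STR: 's'>, <FormatOp.STR: 's'>, <FormatOp.STR: 's'>]
```

Three different specifier spellings collapse into the same op, which is the point of having fewer attributes than ConversionSpecifier.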

JukkaL

Thanks for cleaning up the code! Left several minor comments, mostly about docstrings/missing functionality.

JukkaL · 2021-08-06T10:25:46Z

mypyc/irbuild/format_str_tokenizer.py

+    format_ops = []
+    for spec in specifiers:
+        # TODO: Match specifiers instead of using whole_seq
+        if spec.whole_seq == '%s' or spec.whole_seq == '{:{}}':


What about %d? Will it be supported in a follow-up PR?

It would be better to make this generic over different kinds of formatting instead of special casing percent sign formatting, etc. Do you plan to improve this in a follow-up PR (I assume that this is what the TODO comment is about)?
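One way to read "generic over different kinds of formatting" is to classify a specifier by its conversion type instead of comparing the whole string. The sketch below is only an illustration of that idea and not what a follow-up PR does: conversion_type is a hypothetical helper, it looks only at the type character, and a real implementation would also have to handle or reject flags, width and precision.

```python
from typing import Optional


def conversion_type(whole_seq: str) -> Optional[str]:
    """Return the conversion type character of a specifier, or None.

    Hypothetical helper: works on the specifier text for both '%' style
    and '{}' style, and treats a missing type as 's'.
    """
    if whole_seq.startswith('%'):
        last = whole_seq[-1]
        return last if last.isalpha() else None
    if whole_seq.startswith('{') and whole_seq.endswith('}'):
        field, _, format_spec = whole_seq[1:-1].partition(':')
        if '!' in field:
            # Explicit !r / !s / !a conversions are out of scope for this sketch.
            return None
        if format_spec and format_spec[-1].isalpha():
            return format_spec[-1]
        return 's'
    return None


for seq in ['%s', '{}', '{:{}}', '%d', '{:d}']:
    print(seq, conversion_type(seq))
```

With a helper like this, '%s', '{}' and '{:{}}' all yield 's', and '%d' and '{:d}' both yield 'd', so the mapping to an op no longer depends on which formatting style was used.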

JukkaL

LGTM

97littleleaf11 added 7 commits August 3, 2021 18:16

Refactor

b58ccd8

Rename

6470086

Merge branch 'master' of https://github.com/python/mypy into modify-format-call

d09ce59

Add FormatOp

2ac7f4c

Revert is_valid

58c9381

Add generate_format_ops

2f22544

Fix

478576c

97littleleaf11 changed the title ~~[mypyc] Tokenizer for format call~~ [mypyc] Introduce FormatOp and add a tokenizer for .format() call Aug 5, 2021

97littleleaf11 marked this pull request as ready for review August 5, 2021 13:39

97littleleaf11 added 2 commits August 6, 2021 01:19

Merge branch 'master' of https://github.com/python/mypy into modify-format-call

28f4365

Delete unused imports

980a9e6

JukkaL reviewed Aug 6, 2021

97littleleaf11 added 2 commits August 6, 2021 20:01

Set generate_format_op as internal function

0ad827f

Add docstring

705202e

JukkaL approved these changes Aug 6, 2021

JukkaL merged commit 58c0a05 into python:master Aug 6, 2021

97littleleaf11 deleted the modify-format-call branch August 7, 2021 09:26
