New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[mypyc] Introduce FormatOp and add a tokenizer for .format() call #10935
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for cleaning up the code! Left several minor comments, mostly about docstrings/missing functionality.
format_ops = [] | ||
for spec in specifiers: | ||
# TODO: Match specifiers instead of using whole_seq | ||
if spec.whole_seq == '%s' or spec.whole_seq == '{:{}}': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about %d
? Will it be supported in a follow-up PR?
It would be better to make this generic over different kinds of formatting instead of special casing percent sign formatting, etc. Do you plan to improve this in a follow-up PR (I assume that this is what the TODO comment is about)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Description
This PR adds a tokenizer that convert a str.format() format string into literals and specifiers. By doing so, the code structure of
translate_str_format
is clearer.This PR also introduces
FormatOp
. Compare toConversionSpecifier
,FormatOp
has fewer attributes and indicates compile time optimizations. For example, to mark a conversion from any object to string,ConversionSpecifier
may have several representations, like '%s', '{}' or '{:{}}'. However, there would only exist one correspondingFormatOp
.Currently
FormatOp
is just an Enum for convenience. We might add several attributes later and upgrade it to a class if we need to support more conversions.To help for the future optimization, these parts of code are extracted into new functions:
generate_format_ops
that shrinkConversionSpecifier
intoFormatOp
convert_expr
that can help convert the expressions into desired results.