-
-
Notifications
You must be signed in to change notification settings - Fork 30.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve f-string implementation: FORMAT_VALUE opcode #69669
Comments
Currently, the f-string f'a{3!r:10}' evaluates to bytecode that does the same thing as: ''.join(['a', format(repr(3), '10')]) That is, it literally calls the functions format() and repr(). The same holds true for str() and ascii() with !s and !a, respectively. By redefining format, str, repr, and ascii, you can break or pervert the computation of the f-string's value: >>> def format(v, fmt=None): return '42'
...
>>> f'{3}'
'42' It's always been my intention to fix this. This patch adds an opcode FORMAT_VALUE, which instead of looking up format, etc., directly calls PyObject_Format, PyObject_Str, PyObject_Repr, and PyObject_ASCII. Thus, you can no longer modify what an f-string produces merely by overriding the named functions. In addition, because I'm now saving the name lookups and function calls, performance is improved. Here are the times without this patch: $ ./python -m timeit -s 'x="test"' 'f"{x}"'
1000000 loops, best of 3: 0.3 usec per loop
$ ./python -m timeit -s 'x="test"' 'f"{x!s}"'
1000000 loops, best of 3: 0.511 usec per loop
$ ./python -m timeit -s 'x="test"' 'f"{x!r}"'
1000000 loops, best of 3: 0.497 usec per loop
$ ./python -m timeit -s 'x="test"' 'f"{x!a}"'
1000000 loops, best of 3: 0.461 usec per loop And with this patch: $ ./python -m timeit -s 'x="test"' 'f"{x}"'
10000000 loops, best of 3: 0.02 usec per loop
$ ./python -m timeit -s 'x="test"' 'f"{x!s}"'
100000000 loops, best of 3: 0.02 usec per loop
$ ./python -m timeit -s 'x="test"' 'f"{x!r}"'
10000000 loops, best of 3: 0.0896 usec per loop
$ ./python -m timeit -s 'x="test"' 'f"{x!a}"'
10000000 loops, best of 3: 0.0923 usec per loop So a 90%+ speedup, for these simple cases. Also, now f-strings are faster than %-formatting, at least for some types: $ ./python -m timeit -s 'x="test"' '"%s"%x'
10000000 loops, best of 3: 0.0755 usec per loop
$ ./python -m timeit -s 'x="test"' 'f"{x}"'
10000000 loops, best of 3: 0.02 usec per loop Note that people often "benchmark" %-formatting with code like the following. But the optimizer converts this to a constant string, so it's not a fair comparison: $ ./python -m timeit '"%s"%"test"'
100000000 loops, best of 3: 0.0161 usec per loop These microbenchmarks aren't the end of the story, since the string concatenation also takes some time. That's another optimization I might implement in the future. Thanks to Mark and Larry for some advice on this. |
Small cleanups. Fixed a bug in PyCompile_OpcodeStackEffect. |
This patch addresses Larry's review, plus bumps the bytecode magic number. |
Oops. Forgot to include the diff with that last message. But it turns out it doesn't work, anyway, because I put the #define's in opcode.h, which is generated (so my code got deleted!). I'll try to find some reasonable .h file to use and submit a new patch soon. |
I know this issue is slowly turning into "make Eric update outdated docs", but if you find that https://docs.python.org/devguide/compiler.html#introducing-new-bytecode is outdated, could you update that doc? |
It's Lib/opcode.py. |
Brett: I'll take a look. Serhiy: I'm looking for a place to put some #defines related to the bit masks and bit values that my FORMAT_VALUE opcode is using for opargs. One option is to just put them in Tools/scripts/generate_opcode_h.py, so that they end up in the generated opcode.h, but that seems a little sleazy. I can't find a better place they'd belong, though. Specifically, I want to put these lines into a .h file to use by ceval.c and compile.c: /* Masks and values for FORMAT_VALUE opcode. */
#define FVC_MASK 0x3
#define FVS_MASK 0x4
#define FVC_NONE 0x0
#define FVC_STR 0x1
#define FVC_REPR 0x2
#define FVC_ASCII 0x3
#define FVS_HAVE_SPEC 0x4 |
Does the dis module need these constants? If no, you can use either ceval.h or compile.h. |
Thanks, Serihy. I looked at those, and neither one is a great fit. But not having a better option, I went with ceval.h. Here's the patch. |
Some formatting improvements. I removed one of the optimizations I was doing, because it's also done in PyObject_Format(). I plan on moving other optimizations into PyObject_Format(), but I'll open a separate issue for that. I swapped the order of the parameters on the stack, so that I could use the micro-optimization of TOP() and SET_TOP(). I'll commit this shortly. |
It looks to me that FVS_MASK and FVS_HAVE_SPEC are duplicates. FVS_HAVE_SPEC is set but FVS_MASK is tested. |
Right, they're the same because it's a single bit. You 'and' with a mask to get the bits you want, and you 'or' together the values. It's an old habit left over from my bit-twiddling days. I guess the test could really be: have_fmt_spec = (oparg & FVS_MASK) == FVS_HAVE_SPEC; to make it more clear what I'm doing. It's easier to see the same thing with the FVC_MASK and FVC_* values, since that field is multiple bits. |
The MASK idiom is nice and I think it's good to be exposed to |
New changeset 1ddeb2e175df by Eric V. Smith in branch 'default': |
New changeset 4734713a31ed by Eric V. Smith in branch 'default': |
Brett: https://docs.python.org/devguide/compiler.html#introducing-new-bytecode looks correct (and reminded me to update dis.rst!). |
Just noticed a small typo: formattedd Also, needs |
New changeset 93fd7adbc7dd by Eric V. Smith in branch 'default': |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: