You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Currently the COMPARE_OP instruction performs one of four different tasks.
We should break it up into four different instructions, that each performs only one of those tasks.
The four tasks are:
Rich comparison (>, <, ==, !=, >=, <=)
Identity comparison (is, is not)
Contains test (in, not in)
The current implementation involves an unnecessary extra dispatch to determine which task to perform.
Comparisons are common operations, so this extra call and unpredictable branch has a cost.
In addition, testing for exception matching is always followed by a branch, so the test and branch can be combined.
I propose adding three new instructions and changing the meaning of COMPARE_OP.
COMPARE_OP should only perform rich comparisons, and should call PyObject_RichCompare directly.
IS_OP performs identity tests, performs no calls and cannot fail.
CONTAINS_OP tests for 'in and 'not in' and should call PySequence_Contains directly.
JUMP_IF_NOT_EXC_MATCH Tests whether the exception matches and jumps if it does not.
I think is a good idea, being the only problem that I see that as the opcode targets are limited we would be burning 3 more. I specially like the proposal for JUMP_IF_NOT_EXC_MATCH as PyCmp_EXC_MATCH is only used in the code for the try-except and is always followed by a POP_JUMP_IF_FALSE.
As a curiosity, it would be good to have an idea on performance gains of specializing the comparison.
Few years ago I experimented with a special opcode for exception matching. It could make the bytecode a tiny bit smaller and faster, but since catching an exception in Python is relatively expensive, it would not have significant performance benefit.
As for splitting COMPARE_OP for comparison, identity and containment tests, all these operations look the same from the compiler side, they correspond the same AST node. Introducing new opcodes will complicate the compiler.
Moving work from the interpreter to the compiler is always a good idea.
Performance: The compiler is run once per code unit, the interpreter thousands or millions of times.
The compiler is easier to test. Just match the expected bytecode with the actual bytecode.
The output can be sanity checked by visual inspection.
Although I expect a performance boost, I think this is a worthwhile improvement whether or not it helps
performance as it makes the instructions better focused.
Pablo, currently there are 117 opcodes, increasing that to 120 is not a problem.
Also there is no reason why we are limited to 256 opcodes in the long term.
Plus, I'm 4 opcodes in credit, thanks to https://bugs.python.org/issue33387 :)
Raymond, regarding the performance of COMPARE_OP, it is not just branch prediction that matters. With this change, the number of (hardware) instructions executed is always reduced, even if branch prediction is no better.
Serhiy, the benefit of having a special opcode for exception matching is not really to speed up exception matching, but to avoid slowing down other tests.