bpo-46528: Attempt SWAPs at compile-time #30970
1% faster:
|
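Could we maybe factor out a helper function like

```c
static int
next_safe_store(basicblock *block, int i, int expected_lineno)
{
    /* b_iused is the number of instructions in use, so the last valid
       index is b_iused - 1. */
    for (; i < block->b_iused; i++) {
        struct instr *instruction = &block->b_instr[i];
        /* Stop at a line boundary when the caller asked for one line only. */
        if (expected_lineno != -1 && instruction->i_lineno != expected_lineno) {
            return -1;
        }
        switch (instruction->i_opcode) {
            case NOP:
                /* NOPs do not touch the stack; skip them. */
                continue;
            case STORE_FAST:
            case POP_TOP:
                /* Found the next instruction that consumes the top of the stack. */
                return i;
            default:
                return -1;
        }
    }
    return -1;
}
```

And then call it once to get the first stack-consumer, and then |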
I'm assuming we don't need to worry that this is quadratic if there are a bunch of non-simplifiable swaps followed by a bunch of stores? Also, are cases like the following possible?
|
Yeah, I like that idea.
I might be missing something... isn't this only quadratic if we successfully fold lots of swaps? We move right-to-left, and bail on the first one that doesn't work. I'm okay with quadratic behavior here, since N is generally going to be very small (1 or 2 swaps of depth 2 or 3). Pattern matching code can be worse, but even then it's still rare to see anything worse than a handful of shallow-ish swaps.
I had given that some thought, but I wasn't able to create any code that compiled that way by hand. The long runs of Maybe there is a more general solution than this that can handle those cases, though. I just wouldn't want to add much more complexity to support it if we're not sure of the payoff. |
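For readers following along, here is a minimal Python sketch of the idea under discussion: a `SWAP(n)` followed by `n` stack consumers can be dropped if the first and n-th consumers trade places. Everything here is an assumption made for illustration: the tuple-based instruction format, the toy interpreter `run`, and the name `fold_swaps` are hypothetical and are not the code in this PR. The sketch ignores line numbers and NOPs, and it refuses to fold when two stores target the same name, since reordering those would change which value wins.

```python
def run(instructions, env):
    """Interpret a toy instruction list against a dict of local variables."""
    stack = []
    for op, arg in instructions:
        if op == "LOAD":
            stack.append(env[arg])
        elif op == "STORE":
            env[arg] = stack.pop()
        elif op == "POP":
            stack.pop()
        elif op == "SWAP":
            # SWAP(n) exchanges the top of the stack with the n-th item from the top.
            stack[-1], stack[-arg] = stack[-arg], stack[-1]
        else:
            raise ValueError(f"unknown op: {op}")
    return env


def fold_swaps(instructions):
    """Fold SWAP(n) into the n STORE/POP instructions that follow it.

    Scans right to left and stops at the first SWAP that cannot be folded,
    mirroring the "bail on the first one that doesn't work" description.
    """
    code = list(instructions)
    for i in range(len(code) - 1, -1, -1):
        op, n = code[i]
        if op != "SWAP":
            continue
        consumers = code[i + 1 : i + 1 + n]
        names = [arg for c_op, arg in consumers if c_op == "STORE"]
        foldable = (
            len(consumers) == n
            and all(c_op in ("STORE", "POP") for c_op, _ in consumers)
            and len(names) == len(set(names))  # hypothetical safety check
        )
        if not foldable:
            break
        # Swapping the first and n-th consumers has the same net effect as the SWAP.
        code[i + 1], code[i + n] = code[i + n], code[i + 1]
        del code[i]
    return code


# Toy encoding of `a, b = b, a`.
swap_ab = [("LOAD", "b"), ("LOAD", "a"), ("SWAP", 2), ("STORE", "a"), ("STORE", "b")]
folded = fold_swaps(swap_ab)
print(folded)
print(run(swap_ab, {"a": 1, "b": 2}) == run(folded, {"a": 1, "b": 2}))
```

Each fold does a constant amount of work plus a scan of at most `n` consumers, so the cost only grows when many deep swaps fold successfully, which matches the argument above that the quadratic case is rare in practice.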
I think that's right -- I guess that would only not pay off if we have a very strange million-line source file that only gets run once or something. Probably not worth worrying about too much. |
Nah, it would need to be one really long line. 🙂 Something like this:

```python
def f(seq):
    match seq:
        case a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z, _:
            pass
```

|
Can we do similar things for
or is that for a separate PR? |
No, we need to be pretty conservative with this. The order of attribute stores and fast local loads are user-visible. They can both raise exceptions, and attribute access can be customized with arbitrary Python code. |
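A small sketch of why attribute stores cannot be reordered freely: a class can observe every store through `__setattr__`, so the left-to-right order of targets is visible to user code. The `Recorder` class below is hypothetical and exists only to illustrate the point.

```python
class Recorder:
    """Records the order in which its attributes are assigned."""

    def __init__(self):
        # Bypass our own __setattr__ so the log itself is not recorded.
        object.__setattr__(self, "log", [])

    def __setattr__(self, name, value):
        self.log.append((name, value))
        object.__setattr__(self, name, value)


r = Recorder()
r.a, r.b = 1, 2  # targets are assigned left to right
print(r.log)
```

Storing to `r.b` before `r.a` would change the recorded log, which is the kind of user-visible difference the comment above is guarding against.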
(Maybe you could do something clever like only loading |
I think

```python
def f(x, y):
    class A:
        def __init__(self, value):
            self.value = value
        def __del__(self):
            print(a, b)
        def __repr__(self):
            return f"<value={self.value}>"
    a = A(333)
    b = A(444)
    a, b = x, y

f(1, 2)
```

Before:
After:
I know the exact timing of when |
I think that @markshannon believes our ordering guarantees don't apply to finalizers. I'm not entirely convinced of that argument myself, though. I'll just leave |
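To make the finalizer concern concrete, here is a minimal sketch, assuming CPython's reference counting, where rebinding a local name finalizes the old object immediately. The `Tracked` class and both helper functions are hypothetical. The sketch uses separate assignment statements rather than tuple unpacking so that the result does not depend on how any particular compiler version orders the stores.

```python
events = []


class Tracked:
    """Appends its name to `events` when it is finalized."""

    def __init__(self, name):
        self.name = name

    def __del__(self):
        events.append(self.name)


def rebind_a_then_b():
    a = Tracked("old a")
    b = Tracked("old b")
    a = None  # under reference counting, "old a" is finalized here
    b = None  # and "old b" here


def rebind_b_then_a():
    a = Tracked("old a")
    b = Tracked("old b")
    b = None
    a = None


rebind_a_then_b()
first_order = list(events)
events.clear()
rebind_b_then_a()
second_order = list(events)
print(first_order, second_order)
```

The two functions bind the same final values, yet the finalizers run in a different order. Whether that ordering counts as a guarantee is exactly the open question in this thread; other implementations without reference counting may finalize at other times.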
@sweeneyde if this looks good to you, I'll go ahead and merge it today. (Congrats on your promotion, by the way!) |
I just looked over it again and it looks good to me (modulo merge conflicts). Thanks - I appreciate it! |
https://bugs.python.org/issue46528