-
Notifications
You must be signed in to change notification settings - Fork 7.9k
Implement ??= operator #3747
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement ??= operator #3747
Conversation
@nikic I'm not that amazed at the runtime checking of liveness ranges; is it not possible to simply store two live ranges for DUP_TMP? In the example above for V5 from L6 to L12 and from L16 to L17? Also, shouldn't it be named COPY_TMP? I associate DUP with ZVAL_DUP, which actually duplicates arrays. And COPY is for refcount only operations. |
That's an interesting idea. I suspect that we have assumptions that the opcode above the start of the live range defines the variable. But if this can be made to work then I would prefer it as well.
Yeah, makes sense. |
Not at runtime at least for |
Opcache has this assumption at least here: https://github.com/php/php-src/blob/master/ext/opcache/Optimizer/dce.c#L517 We could check for the case where the defining opline is a jump and assume it's a split range in that case. @dstogov Do you have any opinion on this? |
I would prefer to reject the idea, if it can't be implemented in a simple way. What was wrong with original implementation https://github.com/php/php-src/pull/1795/files ? May be we may suppress them using BP_VAR_W instead of BP_VAR_RW, or BP_VAR_RW_IS or BP_VAR_RW + some additional flag. |
@nikic Got it. Another idea - JMP_IF_NOT_NULL opcode that keeps source operand alive. L0 (2): INIT_FCALL_BY_NAME 0 string("foo") Unfortunately, this won't work with overloaded arrays and objects, but I'm not sure about their support in your implementation, as well. In general, this may be solved, introducing JMP_IF_NOT_NULL_DIM and others, that keep both arguments alive. |
@dstogov This would furthermore require |
@nikic you are right (INDIRECT pointers may become dangled). Then, I don't see an efficient implementation at all. |
@dstogov The engine itself doesn't get significantly more complicated if we just have multiple single live ranges for the same var. Just opcache needs handling of that... |
I have pushed a new version implementing @bwoebi's suggestion to represent this as two live-ranges. I'm not sure if this is better or not. I think ideally opcache would not deal with live-ranges and instead fully recompute them after optimization. But special cases like this make this hard. |
@nikic I'm wondering whether it's possible to emit live ranges as a separate step after compilation? We just need to maintain a list of non-consuming and jumping opcode operands then. Is there actually any reason / benefit why we are squeezing their generation into the main compilation step? |
@dstogov interactive mode is on a block-boundary. So should not be problematic here as we have no live ranges existing across a block (with block I mean like foreach, switch, ...). |
I'll give the separate live range computation a try (separately from this feature). |
@nikic great. ping me about result. |
759c8b3
to
fbb9879
Compare
Rebased over liveness changes. What I ended up doing here is add special handling for COPY_TMP during live range calculation and emit two live ranges for this case. It's not pretty, but I think this is okay as long as COPY_TMP is the only opcode with the live range splitting problem it's better to handle it as a special case than make the overall algorithm more complicated. |
Looks good to me, thanks :-) |
@nikic Unfortunately, I broke this PR by recent optimization commits, but form source review, I don't see any serious problems now. |
9a3b36a
to
a8909a8
Compare
Updated for the live range optimizations and merged as a50198d. |
ossfuzz has discovered a problem with conjunction with <?php
assert(y)[y] ??= y;
CC @iluuu1994 |
RFC: https://wiki.php.net/rfc/null_coalesce_equal_operator
The RFC has been accepted a long time ago, but never landed due to implementation issues. I believe I now have a correct and working implementation, though unfortunately not a simple one.
$a ??= $b
is the same as$a ?? ($a = $b)
with the caveat that$a
should only be evaluated once, insofar as this is possible. In particular in$a[foo()] ?? $b
we should only evaluatefoo()
once.To understand the implementation, it is instructive to look at the opcodes generated for this code snippet:
This gives us:
The notable parts are:
BP_VAR_IS
chain for the isset check. Any expressions used here are duplicated using aZEND_COPY_TMP
opcode. This opcode duplicates the passed value without destroying it.BP_VAR_W
chain for the assignment. Here we do not reevaluate expressions and instead use theCOPY_TMP
results.COPY_TMP
s.COPY_TMP
results are non-contiguous. There is a liveness hole between the use of the variable in the isset branch and theJMP
over the isset branch. Special code during liveness calculation handles this case and emits two separate live-ranges, so we can keep handling this as usual at runtime.Overall this makes the implementation rather complex and subtle for such a superficially simple feature, but I was not able to come up with a simpler solution that satisfies all implementation constraints.