Fix <op>= RHS/LHS eval order #992
I fixed a related bug earlier related to
Because the RHS,
The existing (and incorrect) fix in the current code base is that for top level expressions the RHS is assumed not to have side effects on the LHS binding. I mistakenly assumed the LHS would be evaluated when the operator is applied, but the LHS is actually conceptually evaluated before the RHS is ever looked at, and that looked up value is used for the final operation. With the mistaken assumption the top level shortcut would be safe. But with the correct evaluation order, a top level The peephole optimizer doesn't have enough state now to optimize the sequence later: JUMPs crossing the optimized instructions would need to be fixed too.
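To make the evaluation order described above concrete, here is a small sketch in plain JavaScript. The function names are made up for illustration; the behavior follows the standard ECMAScript compound assignment rules (read the LHS value, evaluate the RHS, apply the operator, store the result).

```javascript
// Illustrative helpers (hypothetical names) showing that the LHS value of
// x <op>= y is looked up before the RHS is evaluated.
function compoundWithSideEffect() {
  var x = 10;
  // The RHS assigns x = 100 as a side effect, but the LHS value (10)
  // has already been read, so the result is 10 + 1.
  x += (x = 100, 1);
  return x;
}

function chainedCompound() {
  var a = 1;
  // Parsed as a += (a += 1). The outer LHS value is 1; the inner
  // assignment evaluates to 2; the outer operation computes 1 + 2.
  a += a += 1;
  return a;
}

console.log(compoundWithSideEffect()); // 11
console.log(chainedCompound());        // 3
```

If the LHS were instead read after the RHS had run, the first function would return 101 and the second 4, which is the kind of discrepancy this pull is about.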
Hmm, I guess one relatively simple approach would be:
In other words, if the RHS is in a plain constant/register form, and no code has been emitted, there cannot be side effects in the RHS and we can optimize away the temporary. The RHS can be a constant or anything that constant folds without emitting code and the optimization would kick in. |
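As a rough sketch of that check (a toy model only, not Duktape's actual C code): assume the compiler can tell what form the RHS ivalue ended up in and how many instructions had been emitted before and after evaluating it. The `kind` field and the function name below are assumptions made for illustration.

```javascript
// Hypothetical model: rhs.kind is 'const', 'reg', or something else, and the
// two counters are the number of emitted instructions before and after the
// RHS was evaluated into an ivalue.
function canSkipTemporary(rhs, emittedBefore, emittedAfter) {
  var plainForm = rhs.kind === 'const' || rhs.kind === 'reg';
  var noCodeEmitted = emittedAfter === emittedBefore;
  // Only when both hold can the RHS not have had side effects on the LHS.
  return plainForm && noCodeEmitted;
}

console.log(canSkipTemporary({ kind: 'const', value: 10 }, 5, 5)); // true
console.log(canSkipTemporary({ kind: 'reg', reg: 'y' }, 5, 5));    // true
console.log(canSkipTemporary({ kind: 'reg', reg: 't0' }, 5, 7));   // false
```

The check is deliberately conservative: any emitted instruction disables the shortcut, even if that instruction happens to be side effect free.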
Sounds good. It would be a shame to lose a performance optimization for something as common as assignment. |
Actually looking at the current code and comments, the case being optimized in the current implementation is the resulting ivalue for a For that case the top level assumption seems safe, i.e. if the assignment is at the top level, there's no outer expression to deal with and a temporary is unnecessary. The issue in this pull is actually different and is related to |
Keeping the optimized form while fixing the bug will involve at least a few cups of coffee. I'll be back later :)
f2ccd99
to
aead63b
Compare
Ok, I think I got the fix and the optimization approach seems to be working. In particular, these come out as a single ADD opcode:
var x = 10, y = 20;
x += 10; // constant
x += 10 * 400; // constant folded into a constant
x += y; // register bound variable, no code is emitted for evaluation so no side effects
x += 'foo' + 'bar' + 'quux'; // constant folded into a constant
But for example this involves an unnecessary temporary:
var x = 10, y = 20;
x += y + 1; // computation would be safe because y is register bound and side effect free
I guess that's fair, detecting the second case reliably (in particular, without false positives) would presumably require an IR. |
Although... isn't the entire |
Since there's no IR that expression basically looks like What would be possible is tracking some sort of flags / state as ivalues are combined (for example, are they side effect free) and then detecting that when the RHS had been evaluated. |
Note that ivalues are either plain values or binary operations on two plain values. They are in effect miniature IR trees which have a fixed size, and as the compiler proceeds it "collapses" the hypothetical IR to the immediate ivalues at hand. An ivalue never references another ivalue - that would actually be an expression tree. The compiler document link I pointed to provides a few very concrete examples of this. |
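A toy model of that idea, written in JavaScript for readability, might look like the following. This is a sketch under assumptions, not Duktape's data structures: `ToyCompiler`, its fields, and the instruction format are all invented here. The point it tries to show is that an arithmetic ivalue holds two plain values only, so combining it with anything else first forces it into a register, which is when code gets emitted.

```javascript
// Hypothetical toy model of ivalues. A plain value is a constant or a
// register; an ivalue is a plain value or ONE binary operation on two
// plain values. An ivalue never contains another ivalue.
function ToyCompiler() {
  this.code = [];     // emitted instructions: [opcode, dst, left, right]
  this.nextTemp = 0;  // index of the next free temporary register
}

ToyCompiler.prototype.konst = function (v) { return { t: 'const', v: v }; };
ToyCompiler.prototype.reg = function (name) { return { t: 'reg', r: name }; };

// Force an ivalue into a plain value, emitting code only if needed.
ToyCompiler.prototype.toPlain = function (iv) {
  if (iv.t !== 'arith') { return iv; }
  var dst = 't' + (this.nextTemp++);
  this.code.push([iv.op, dst, iv.left, iv.right]);
  return this.reg(dst);
};

// Combine two ivalues with '+': constant fold when possible, otherwise
// collapse both operands to plain values and return a new arith ivalue.
ToyCompiler.prototype.add = function (a, b) {
  if (a.t === 'const' && b.t === 'const') { return this.konst(a.v + b.v); }
  return { t: 'arith', op: 'ADD', left: this.toPlain(a), right: this.toPlain(b) };
};

var c = new ToyCompiler();
var folded = c.add(c.konst(10), c.konst(400));  // const 410, no code emitted
var pending = c.add(c.reg('y'), c.konst(1));    // arith ivalue, still no code
var outer = c.add(pending, c.reg('z'));         // forces 'pending' into t0
console.log(c.code.length);                     // 1
```

In this model the compiler only ever sees the ivalues currently at hand, which is why a question like "was the whole RHS side effect free?" needs extra state to answer.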
77a051d
to
62a886d
Compare
I see, so it's something like recursive descent but at the expression level. That makes sense then why optimizations are so difficult. |
Well no, it's recursive descent at the statement level. But it's top-down operator parsing with ivalues-based code emission for expressions. The compiler.rst document provides a summary for this and links to the top-down operator parsing paper explaining that technique. |
Top-down operator parsing could also be used to construct a traditional IR and just compile from there. The ivalue based approach is a memory saving technique which essentially works with a hypothetical IR tree on-the-fly, with a tiny "window" into the IR tree represented by the most immediate ivalues at hand. Ivalues are then combined as we go on, each such combination triggering code emission, temporary register allocation, constant allocation, etc. |
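For readers unfamiliar with the technique, here is a minimal sketch of top-down operator parsing that emits code while parsing instead of building a tree. It is a simplification and not how Duktape's compiler is written: the function name, token handling, and textual "opcodes" are invented for illustration, and it emits code eagerly on every combination rather than keeping a pending ivalue.

```javascript
// Toy top-down operator precedence parser for numbers, identifiers,
// '+', '*' and parentheses. All names here are hypothetical.
function compileExpr(src) {
  var tokens = src.match(/\d+|[A-Za-z_]\w*|[+*()]/g) || [];
  var pos = 0;
  var code = [];
  var nextTemp = 0;
  var BINDING_POWER = { '+': 10, '*': 20 };
  var OPCODE = { '+': 'ADD', '*': 'MUL' };

  function show(v) { return v.t === 'const' ? String(v.v) : v.r; }

  // Parse a token in prefix position: literal, identifier, or '(' expr ')'.
  function nud() {
    var tok = tokens[pos++];
    if (tok === '(') {
      var inner = expr(0);
      pos++;  // skip ')'
      return inner;
    }
    if (/^\d/.test(tok)) { return { t: 'const', v: Number(tok) }; }
    return { t: 'reg', r: tok };
  }

  // Combine two values: fold constants, otherwise emit into a new temporary.
  function combine(op, left, right) {
    if (left.t === 'const' && right.t === 'const') {
      return { t: 'const', v: op === '+' ? left.v + right.v : left.v * right.v };
    }
    var dst = 't' + (nextTemp++);
    code.push(OPCODE[op] + ' ' + dst + ', ' + show(left) + ', ' + show(right));
    return { t: 'reg', r: dst };
  }

  // Parse operators whose binding power exceeds rbp (left associative).
  function expr(rbp) {
    var left = nud();
    while (pos < tokens.length && BINDING_POWER[tokens[pos]] > rbp) {
      var op = tokens[pos++];
      left = combine(op, left, expr(BINDING_POWER[op]));
    }
    return left;
  }

  return { result: show(expr(0)), code: code };
}

console.log(compileExpr('10 * 400'));   // { result: '4000', code: [] }
console.log(compileExpr('a + b * 2'));  // code: [ 'MUL t0, b, 2', 'ADD t1, a, t0' ]
```

Note how temporaries get allocated in evaluation order as a side effect of the recursion, with no tree ever held in memory.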
I'll give |
The reason the compiler uses that technique by the way is that I originally just targeted ES5 and was looking for the most memory efficient parser which was still capable of doing simple constant folding and such. Top-down operator parsing turned out to be a handy approach for allocating temporaries in the right order, and the ivalue technique allows expressions to be parsed without needing to decide beforehand whether the expression will ultimately be RHS or LHS -- that decision can be done in the final step when e.g. a property access ivalue hasn't yet been forced into a GETPROP or a PUTPROP, in effect deciding its RHS/LHS role. For ES6 it's still going to be a challenge to remain memory efficient for the low memory targets. So I'm basically trying to look for a solution with some of these characteristics:
Overall ES6 will probably require at least a statement level IR, i.e. an IR tree constructed for each statement and then thrown away. Multiple passes would then be needed for hoisting variable declarations and such. Assuming single pass parsing and a full function IR is going to be a low memory challenge, but would otherwise make things much simpler. So there are some trade-offs involved. What I don't want to do is choose a new structure which is low memory hostile and then try to somehow make that work well for low memory targets. While premature optimization is not usually a good idea, choosing a structure which actively works against that is also very counterproductive. So ideally both low memory issues and compilation quality issues (optimizations) would be addressed simultaneously. |
One viable but boring option would be to leave low memory targets at ES5 with the current compiler (maintaining it as necessary for opcode format changes) but develop an ES6 compiler for non-low-memory targets. The ES6 compiler wouldn't then need to be as memory conservative which would make its structure easier to maintain. But overall maintaining two compilers would be awkward, and eventually they would have conflicting interests with respect to what bytecode and various other internals should look like. So, I'm very much trying to avoid this outcome ... :) |
Compilers are one of the most interesting parts in my opinion. Low memory scalable compilers especially :) The current compiler was written with a lot of trial-and-error (I think I rewrote it at least 2 times) and I didn't have many metrics back then. So with better metrics for footprint, performance, etc, it'll be much nicer to rewrite the compiler. Hopefully there'd be a solid month or two to work on it soon :) |
Optimization to avoid a temporary for x <op>= y works for any RHS which doesn't emit code when evaluated to an ivalue, e.g.:
* A plain constant or any expression which constant folds to a constant, e.g.: x += 4 and x += 'foo' + 'bar'.
* A register-bound variable, e.g. x += y.
The optimization doesn't have enough state to detect safe cases such as register bound 'y' in: x += y + 1.
* A few RegExp issues have been resolved via ES6 RegExp syntax.
62a886d
to
49cd5c6
Compare
I thought this might make multiple chained compound assignments to the same variable idempotent, but it turns out that's not the case, either in Duktape or other engines. Huh. In any case the behavior seems to be correct now. Thanks for the quick fix. :-) |
What kind of compound assignment do you mean? |
By "compound assignment" I was referring to the Basically I assumed that |
Sorry my question was ambiguous, but yes I meant what kind of concrete idempotent expression you were thinking about. Yes, The semantics are not immediately obvious (and for me it'd be more natural if the LHS was evaluated after the RHS had potentially updated the LHS variable). Incidentally, the test262 test suite doesn't cover this.
This seems to confuse everyone, not just us: :) |
Hm... so I was looking through the changelogs for old Duktape versions to get a sense for how far it's evolved since I started work on minisphere, and found a very similar bug to this was already fixed in v1.1.2, #118 (which it seems got opened as a result of my post on SphereDev, go figure :). How come that fix didn't automatically fix this bug too? |
That's the bug I was referring to above too (the one I fixed earlier). It's a different case. In this case it was about evaluating LHS before RHS for the OP in |
In x <op>= y the value of x (LHS) should be evaluated before RHS. This sometimes matters when chained <op>= expressions are used, see #987 (comment).
Tasks: