
Fix <op>= RHS/LHS eval order #992

Merged
merged 5 commits into master from fix-op-assign-eval-order on Oct 5, 2016

Conversation

@svaarala (Owner) commented Oct 4, 2016

In x <op>= y the value of x (the LHS) should be evaluated before the RHS. This sometimes matters when chained <op>= expressions are used; see #987 (comment).
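For illustration (a hypothetical snippet, not taken from the referenced issue), the order is observable whenever the RHS reassigns the LHS variable:

var x = 1;
x += (x = 3);
// The LHS value (1) must be read before the RHS assignment runs,
// so x becomes 1 + 3 = 4, not 3 + 3 = 6.
print(x);  // 4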

Tasks:

  • Add bug testcase
  • Extend bug testcase to cover all LHS cases (reg-bound variable, slow path variable, property, etc)
  • Slow but correct fix, check bytecode
  • Optimize for common cases if possible
  • Fix backtracking issue if temp reg load is shuffled
  • Run shuffle torture test
  • Releases entry

@svaarala added the bug label Oct 4, 2016
@svaarala added this to the v2.0.0 milestone Oct 4, 2016
@svaarala mentioned this pull request Oct 4, 2016
@svaarala (Owner, Author) commented Oct 4, 2016

I fixed a related bug concerning <op>= evaluation earlier. The basic difficulty is that with a simple implementation x += 4 generates:

LDREG temp, x
ADD x, temp, 4

Because the RHS, 4, is side effect free in this case, the preferred output would be:

ADD x, x, 4

The existing (and incorrect) fix in the current code base is that for top level expressions the RHS is assumed not to have side effects on the LHS binding. I mistakenly assumed the LHS would be evaluated when the operator is applied, but the LHS is actually conceptually evaluated before the RHS is ever looked at, and that looked-up value is used for the final operation. With the mistaken assumption the top level shortcut would be safe.

But with the correct evaluation order, a top level <op>= is not safe to optimize unless one is certain the RHS is side effect free. There's unfortunately no easy way to do that. I'll have to see if I can somehow manage it at least for plain constants, because generating that unnecessary load wouldn't be nice.
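For example (an illustrative case, not the original report), the top level shortcut would compute the wrong value whenever the RHS mutates the LHS variable:

var x = 1;
function bump() { x = 10; return 5; }
x += bump();
// Correct result: the old x (1) is read before the RHS, so x becomes 1 + 5 = 6.
// With the unsafe ADD x, x, <rhs> shortcut the ADD would read the already
// updated x (10) and produce 15.
print(x);  // 6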

The peephole optimizer doesn't have enough state right now to optimize the sequence later: JUMPs crossing the optimized instructions would need to be fixed up too.

@svaarala (Owner, Author) commented Oct 4, 2016

Hmm, I guess one relatively simple approach would be:

  • Emit the temporary load for the LHS.
  • Evaluate the RHS to an ivalue. No code is generated for e.g. a constant, the ivalue just represents the constant.
  • Coerce the ivalue to a simple value, i.e. a constant or register. This resolves ivalues like 1 + 2 or 1 + x.
  • Check if any code has been emitted since emitting the temporary load for LHS. If not, remove the temporary load and assume the RHS is side effect free.

In other words, if the RHS is in a plain constant/register form, and no code has been emitted, there cannot be side effects in the RHS and we can optimize away the temporary. The RHS can be a constant or anything that constant folds without emitting code and the optimization would kick in.
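As a rough self-contained sketch of that check (a toy simulation only; all names and data structures are made up for illustration and are not Duktape's actual compiler internals):

var code = [];                          // emitted "bytecode"
function emit(ins) { code.push(ins); }

function compileOpAssign(op, lhsReg, compileRhs) {
    emit([ 'LDREG', 'temp', lhsReg ]);  // 1. temporary load for the LHS
    var mark = code.length;             //    remember how much code exists now
    var rhs = compileRhs();             // 2.-3. evaluate RHS to a plain constant/register
    if (code.length === mark) {
        // 4. nothing was emitted while evaluating the RHS, so it cannot have
        //    side effects; drop the temporary load and use the LHS register directly
        code.pop();
        emit([ op, lhsReg, lhsReg, rhs ]);
    } else {
        emit([ op, lhsReg, 'temp', rhs ]);
    }
}

// x += 10: RHS is a constant, no code emitted -> single ADD
compileOpAssign('ADD', 'x', function () { return 10; });
// x += f(): RHS needs a CALL, so the temporary load must be kept
compileOpAssign('ADD', 'x', function () { emit([ 'CALL', 'temp2', 'f' ]); return 'temp2'; });
print(JSON.stringify(code));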

@fatcerberus (Contributor)

Sounds good. It would be a shame to lose a performance optimization for something as common as assignment.

@svaarala (Owner, Author) commented Oct 4, 2016

Actually, looking at the current code and comments, the case being optimized in the current implementation is the resulting ivalue for an x <op>= y expression. Conceptually the ivalue result must NOT be the x register binding but a fresh temporary, because the x register value may change if the <op>= expression is inside an outer expression which further mutates x.

For that case the top level assumption seems safe, i.e. if the assignment is at the top level, there's no outer expression to deal with and a temporary is unnecessary.

The issue in this pull is actually different and is related to <op>= expressions only, while the result ivalue issue also affects plain assignment.

@svaarala (Owner, Author) commented Oct 4, 2016

This will involve at least a few cups of coffee to keep the optimized form while fixing the bug. I'll be back later :)

@svaarala force-pushed the fix-op-assign-eval-order branch 3 times, most recently from f2ccd99 to aead63b on October 4, 2016 20:54
@svaarala (Owner, Author) commented Oct 4, 2016

Ok, I think I've got the fix, and the optimization approach seems to be working. In particular, each of these comes out as a single ADD opcode:

var x = 10, y = 20;

x += 10;  // constant
x += 10 * 400;  // constant folded into a constant
x += y;  // register bound variable, no code is emitted for evaluation so no side effects
x += 'foo' + 'bar' + 'quux';  // constant folded into a constant

But for example this involves an unnecessary temporary:

var x = 10, y = 20;

x += y + 1;  // computation would be safe because y is register bound and side effect free

@fatcerberus (Contributor)

I guess that's fair; detecting the second case reliably (in particular, without false positives) would presumably require an IR.

@fatcerberus (Contributor)

Although... isn't the entire x += y + 1 expression available as an ivalue while parsing it? So it should be possible, assuming there are no function calls or similar involved, to look at the expression and see if there are any side effects, no?

@svaarala (Owner, Author) commented Oct 4, 2016

Since there's no IR, that expression basically looks like x += <something> when parsing the += expression, and like y + 1 when parsing the RHS. There isn't a point where the whole expression is visible to the compiler.

What would be possible is tracking some sort of flags / state as ivalues are combined (for example, whether they are side effect free) and then checking those flags once the RHS has been evaluated.

@svaarala (Owner, Author) commented Oct 4, 2016

Note that ivalues are either plain values or binary operations on two plain values. They are in effect miniature IR trees which have a fixed size, and as the compiler proceeds it "collapses" the hypothetical IR to the immediate ivalues at hand. An ivalue never references another ivalue - that would actually be an expression tree. The compiler document link I pointed to provides a few very concrete examples of this.
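To make that concrete, a purely illustrative sketch (the field names are invented and are not Duktape's actual duk_ivalue layout):

// A "plain" ivalue: just a value spec (constant or register), e.g. the constant 1.
var plain = { type: 'plain', x: { kind: 'const', idx: 0 } };

// An "arith" ivalue: a binary operation on two plain value specs, e.g. 1 + x
// where x is bound to register 3.
var arith = {
    type: 'arith',
    op: 'ADD',
    x: { kind: 'const', idx: 0 },   // the constant 1
    y: { kind: 'reg', idx: 3 }      // register holding x
};

// Neither field can hold another ivalue; to combine (1 + x) with something else
// the compiler first "collapses" arith by emitting an ADD into a temporary
// register and then continues with a plain ivalue referencing that temporary.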

@svaarala (Owner, Author) commented Oct 4, 2016

test-assign-add.js                  : duk.O2  3.73 duk.O2.master  3.80 duk.O2.150  5.29
test-assign-addto-nan.js            : duk.O2  1.13 duk.O2.master  1.15 duk.O2.150  1.57
test-assign-addto.js                : duk.O2  3.72 duk.O2.master  3.71 duk.O2.150  5.26
test-assign-boolean.js              : duk.O2  4.74 duk.O2.master  4.74 duk.O2.150  4.82
test-assign-const-int.js            : duk.O2  2.45 duk.O2.master  2.45 duk.O2.150  2.59
test-assign-const-int2.js           : duk.O2  4.68 duk.O2.master  4.67 duk.O2.150  9.07
test-assign-const.js                : duk.O2  3.64 duk.O2.master  3.64 duk.O2.150  4.26
test-assign-literal.js              : duk.O2  3.78 duk.O2.master  3.78 duk.O2.150  4.23
test-assign-proplhs-reg.js          : duk.O2  3.38 duk.O2.master  3.39 duk.O2.150  3.71
test-assign-proprhs.js              : duk.O2  3.66 duk.O2.master  3.64 duk.O2.150  4.19
test-assign-reg.js                  : duk.O2  2.79 duk.O2.master  2.79 duk.O2.150  2.82

@fatcerberus (Contributor)

I see, so it's something like recursive descent but at the expression level. That explains why optimizations are so difficult, then.

@svaarala (Owner, Author) commented Oct 4, 2016

Well, no, it's recursive descent at the statement level, but top-down operator parsing with ivalue-based code emission for expressions. The compiler.rst document provides a summary of this and links to the top-down operator parsing paper explaining the technique.

@svaarala (Owner, Author) commented Oct 4, 2016

Top-down operator parsing could also be used to construct a traditional IR and just compile from there. The ivalue based approach is a memory saving technique which essentially works with a hypothetical IR tree on-the-fly, with a tiny "window" into the IR tree represented by the most immediate ivalues at hand. Ivalues are then combined as we go on, each such combination triggering code emission, temporary register allocation, constant allocation, etc.

@fatcerberus (Contributor)

I'll give compiler.rst a good read when I have some more free time. Now that I've played with the compiler code I'm kind of intrigued. :-)

@svaarala (Owner, Author) commented Oct 4, 2016

The reason the compiler uses that technique, by the way, is that I originally targeted just ES5 and was looking for the most memory efficient parser that was still capable of simple constant folding and such. Top-down operator parsing turned out to be a handy approach for allocating temporaries in the right order, and the ivalue technique allows expressions to be parsed without needing to decide beforehand whether the expression will ultimately be an RHS or an LHS -- that decision can be made in the final step, when e.g. a property access ivalue hasn't yet been forced into a GETPROP or a PUTPROP, which in effect decides its RHS/LHS role.

For ES6 it's still going to be a challenge to remain memory efficient for the low memory targets. So I'm basically trying to look for a solution with some of these characteristics:

  • Allow enough IR to parse destructuring and other ES6 constructs.
  • Scale down to low memory targets and ES5 parsing. Not necessarily as memory efficient as the current compiler but should be quite close.
  • Scale up to ES6 parsing, and allow optional optimization modules to be tacked on. This should be much more modular than the current compiler logic, which is (quite intentionally) very tightly coupled. For example, tree walks doing constant folding, inlining known-to-be-safe constants, etc.

Overall ES6 will probably require at least a statement level IR, i.e. an IR tree constructed for each statement and then thrown away. Multiple passes would then be needed for hoisting variable declarations and such. Single pass parsing with a full function IR is going to be a low memory challenge, but would otherwise make things much simpler. So there are some trade-offs involved.

What I don't want to do is choose a new structure which is low memory hostile and then try to somehow make that work well for low memory targets. While premature optimization is not usually a good idea, choosing a structure which actively works against that is also very counterproductive. So ideally both low memory issues and compilation quality issues (optimizations) would be addressed simultaneously.

@svaarala (Owner, Author) commented Oct 4, 2016

One viable but boring option would be to leave low memory targets at ES5 with the current compiler (maintaining it as necessary for opcode format changes) but develop an ES6 compiler for non-low-memory targets. The ES6 compiler wouldn't then need to be as memory conservative which would make its structure easier to maintain. But overall maintaining two compilers would be awkward, and eventually they would have conflicting interests with respect to what bytecode and various other internals should look like.

So, I'm very much trying to avoid this outcome ... :)

@svaarala (Owner, Author) commented Oct 5, 2016

> Now that I've played with the compiler code I'm kind of intrigued. :-)

Compilers are one of the most interesting parts in my opinion. Low memory scalable compilers especially :)

The current compiler was written with a lot of trial-and-error (I think I rewrote it at least 2 times) and I didn't have many metrics back then. So with better metrics for footprint, performance, etc, it'll be much nicer to rewrite the compiler.

Hopefully there'd be a solid month or two to work on it soon :)

Optimization to avoid a temporary for x <op>= y works for any RHS
which doesn't emit code when evaluated to an ivalue, e.g.:

* A plain constant or any expression which constant folds to a
  constant, e.g.: x += 4 and x += 'foo' + 'bar'.

* A register-bound variable, e.g. x += y.

The optimization doesn't have enough state to detect safe cases
such as register bound 'y' in: x += y + 1.
* A few RegExp issues have been resolved via ES6 RegExp syntax.
@svaarala merged commit 740fc30 into master Oct 5, 2016
@svaarala deleted the fix-op-assign-eval-order branch October 5, 2016 02:01
@fatcerberus (Contributor)

I thought this might make multiple chained compound assignments to the same variable idempotent, but it turns out that's not the case, either in Duktape or other engines. Huh.

In any case the behavior seems to be correct now. Thanks for the quick fix. :-)

@svaarala (Owner, Author) commented Oct 5, 2016

What kind of compound assignment do you mean?

@fatcerberus (Contributor)

By "compound assignment" I was referring to the X <op>= Y construct, as opposed to "simple assignment", X = Y. Those are the terms I usually see for the two operations.

Basically I assumed that x += x += x += ... += 1 was idempotent regardless of how many += you chain, because I mistakenly assumed x wouldn't be updated at all until the semicolon. What actually happens is that it gets updated at each step.

@svaarala (Owner, Author) commented Oct 5, 2016

Sorry, my question was ambiguous; yes, I meant what kind of concrete idempotent expression you were thinking about. Yes, x += x += 1 updates x on each step, but on each step the LHS is evaluated first.

The semantics are not immediately obvious (and for me it'd be more natural if the LHS were evaluated after the RHS had potentially updated the LHS variable). Incidentally, the test262 test suite doesn't cover this.
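A concrete walkthrough of the chained case (working through the steps by hand):

var x = 1;
x += x += 1;
// Outer +=: the LHS value of x (1) is read first.
// Inner +=: reads x (1), adds 1, stores 2 into x, and yields 2.
// Outer +=: 1 (the previously read LHS value) + 2 = 3 is stored into x.
print(x);  // 3
// If the LHS were instead read after the RHS, the result would be 2 + 2 = 4.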

@fatcerberus (Contributor)

This seems to confuse everyone, not just us:
http://stackoverflow.com/questions/13106754/chaining-compound-assignment-operators-in-javascript

:)

@fatcerberus (Contributor)

Hm... so I was looking through the changelogs for old Duktape versions to get a sense of how far it's evolved since I started work on minisphere, and found that a very similar bug was already fixed in v1.1.2, #118 (which it seems got opened as a result of my post on SphereDev, go figure :). How come that fix didn't automatically fix this bug too?

@svaarala (Owner, Author)

That's the bug I was referring to above (the one I fixed earlier). It's a different case: in this pull, the issue is about evaluating the LHS before the RHS for the OP in <OP>=, while #118 was about where the result of the RHS evaluation is stored before continuing evaluation, especially for a sequence of assignments.
