Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement function inlining #2053

Merged
merged 1 commit into from
Jun 5, 2017
Merged

implement function inlining #2053

merged 1 commit into from
Jun 5, 2017

Conversation

alexlamsl
Copy link
Collaborator

Enhance single-use function substitution.

fixes #281

@alexlamsl alexlamsl changed the title implement function inlining [WIPimplement function inlining Jun 4, 2017
@alexlamsl alexlamsl changed the title [WIPimplement function inlining [WIP] implement function inlining Jun 4, 2017
@alexlamsl alexlamsl changed the title [WIP] implement function inlining implement function inlining Jun 4, 2017
@alexlamsl
Copy link
Collaborator Author

alexlamsl commented Jun 4, 2017

test/benchmark.js

Size 540220b #2053
jquery 86851 86796
angular 174584 174337
math 468972 468060
bootstrap 36859 36839
react 206276 205934
ember 528045 526995
lodash 71387 70419
d3 213887 212518
Time 540220b #2053
jquery 1.343 1.406
angular 2.375 2.515
math 10.879 6.203
bootstrap 0.500 0.500
react 2.281 2.406
ember 4.500 4.843
lodash 1.484 1.593
d3 3.296 3.443

@kzc
Copy link
Contributor

kzc commented Jun 4, 2017

Was curious about timings, so I looked at a couple of files...

Without this PR:

$ /usr/bin/time bin/uglifyjs math.js -cm | wc -c
       12.23 real        12.89 user         0.17 sys
  468972

$ /usr/bin/time bin/uglifyjs ember.prod.js -cm | wc -c
        5.05 real         5.68 user         0.12 sys
  528045

With this PR:

$ /usr/bin/time bin/uglifyjs math.js -cm | wc -c
        6.43 real         7.26 user         0.15 sys
  468060

$ /usr/bin/time bin/uglifyjs ember.prod.js -cm | wc -c
        5.24 real         5.89 user         0.13 sys
  526995

So math.js is not only smaller, it runs in half the time. Any idea why?

math.js was an outlier - most other inputs ran with the same time or slightly slower.

@alexlamsl
Copy link
Collaborator Author

My quick diff over math.js with uglifyjs -cb bracketize shows a metric ton of function factory(...) {...} being inlined, and they seem to be generated by webpack.

I'm currently scratching my head as to where that speed-up comes from, as it might be indicative of a bug.

@alexlamsl
Copy link
Collaborator Author

@kzc do you happen to know if there are other examples of large-ish codebases that goes through webpack? May be that would help with the investigation.

@kzc
Copy link
Contributor

kzc commented Jun 4, 2017

I don't know any off-hand, but most large projects use webpack these days. So they shouldn't be too hard to find.

If this optimization inlines an integral part of webpack packaging then it could be a big win.

@kzc
Copy link
Contributor

kzc commented Jun 4, 2017

math.js aside, this PR slows uglifyjs -mc timings by 4% on average. In all the case I've tried it produces smaller output.

@kzc
Copy link
Contributor

kzc commented Jun 4, 2017

@alexlamsl Have you tried node 8 yet? Do you know if that node 7 speed regression was addressed with the official release?

@alexlamsl
Copy link
Collaborator Author

Have you tried node 8 yet? Do you know if that node 7 speed regression was addressed with the official release?

I've eradicated 7 from my systems as soon as 8.0.0 hits, but I haven't got round to evaluating its performance yet.

@alexlamsl
Copy link
Collaborator Author

Unfortunately the factory pattern is math.js-specific:

https://github.com/josdejong/mathjs/blob/8e8e02bec98cf862e62dc2ebc23fc07b359815e1/lib/core/function/config.js#L124

Just finished 🚿 and brewed ☕, so here we go...

@kzc
Copy link
Contributor

kzc commented Jun 4, 2017

Could you summarize the criteria for inlined functions in this PR? Looking at the patch it appears to only consider functions with a single return statement as the body. So it's an extension of the existing single-return function inlining taking non-constant expressions and parameters into account?

Might you consider extending this optimization to non-returning single line functions like the following?

$ echo 'function foo(x) { console.log(x); } foo(1);' | bin/uglifyjs -c toplevel
!function(x){console.log(x)}(1);
$ echo 'function foo(x) { return console.log(x), void 0; } foo(1);' | bin/uglifyjs -c toplevel
console.log(1);

This pattern is very common. Non returning functions Functions without return statements have an implicit return of undefined, as you know. Functions with a throw statement would have to be excluded, and perhaps others.

@alexlamsl
Copy link
Collaborator Author

The speed-up is pretty much within reduce_vars:

$ uglifyjs math.js --timings -c reduce_vars=0 -b bracketize -o min.js
- parse: 0.562s
- scope: 0.281s
- compress: 8.313s
- mangle: 0.000s
- properties: 0.000s
- output: 0.516s
- total: 9.672s

$ uglifyjs math.js --timings -c reduce_vars=1 -b bracketize -o min.js
- parse: 0.515s
- scope: 0.266s
- compress: 3.852s
- mangle: 0.000s
- properties: 0.000s
- output: 0.484s
- total: 5.117s

@alexlamsl
Copy link
Collaborator Author

Could you summarize the criteria for inlined functions in this PR? Looking at the patch it appears to only consider functions with a single return statement in the body.

That and the function is nameless - the rationale being that there can be no inner variable/function defintions from the function body which may otherwise pollute the parent scope.

Might you consider extending this optimization to non-returning single line functions like the following?

AST_Return ensures simple substituion, e.g. if you consider a variant of your example as console.log(function(){console.log(x)}); which we'll need to insert undefined at the end.

I was planning to address general function inlining in a future PR, though now that you've mentioned it, I guess it's no harm making this PR also do:

  • single, simple statement
  • convert to sequence of statement body + void 0

That way it's not so different from the existing AST_Return case, and we rely on sequences etc. to satisfy the condition about and ensure no definitions or control flows to invalid our assumption.

In short, thanks for the suggestion 😉 👍

@kzc
Copy link
Contributor

kzc commented Jun 4, 2017

Even if inlining functions could be extended to single-statement function bodies containing just AST_Call or AST_Sequence, that would handle most cases. It would avoid the troublesome cases involving variable declarations and throw.

@alexlamsl
Copy link
Collaborator Author

@kzc to avoid any confusion from my babbling above - yes it's a great idea, I'm working on it as we speak 👻

@alexlamsl
Copy link
Collaborator Author

@kzc added support for single AST_SimpleStatement inlining - have to update one of your mocha tests for /*#__PURE__*/ to workaround the new aggressiveness.

@kzc
Copy link
Contributor

kzc commented Jun 4, 2017

Looks good.

$ echo 'var o={}; function foo(x){x.p = 1; x.q = 2} foo(o); console.log(o);' | bin/uglifyjs -c toplevel,passes=3
var o={};o.p=1,o.q=2,console.log(o);
$ echo 'function foo(x){console.log(x)} function bar(y){foo(y)} bar(2);' | bin/uglifyjs -c toplevel,passes=3
console.log(2);

@alexlamsl
Copy link
Collaborator Author

test/jetstream.js caught something - investigating.

@alexlamsl alexlamsl changed the title implement function inlining [WIP] implement function inlining Jun 4, 2017
@kzc
Copy link
Contributor

kzc commented Jun 4, 2017

Probably worth fuzzing this PR for a few million cycles as well. Although I don't know if ufuzz.js creates any optimization candidates for this PR - would have to confirm it does, and if not, extend it.

@alexlamsl
Copy link
Collaborator Author

It should, as other optimisations would reduce some of the generated functions to a single statement, most likely of AST_Sequence.

@kzc
Copy link
Contributor

kzc commented Jun 4, 2017

Yeah, I found such a case:

                {
                    var foo_2 = function f10() {
                        function f11(a_1) {
                            c = 1 + c, "number" >> /[a2][^e]+$/ <= ("number" ^ 5) > (true >= -3 > ("bar" ^ Infinity));
                            c = 1 + c, (-4 >= "foo") % ([] != "object") > (-3, 38..toString()) + {} % undefined;
                        }
                        var b_1 = f11();
                    }(--b + (typeof b !== "symbol"));
                }

bin/uglifyjs -c produces:

var foo_2=function(){c=1+c,c=1+c,38..toString()}(--b);

bin/uglifyjs -c passes=2 produces:

var foo_2=(c=1+c,c=1+c,void 38..toString());

Edit: note the bug above in the removal of the --b side effect.

@kzc
Copy link
Contributor

kzc commented Jun 4, 2017

The math.js -mc timing decrease largely went away with this latest PR - from 6s to 10s now.

@kzc
Copy link
Contributor

kzc commented Jun 4, 2017

@alexlamsl The result in #2053 (comment) shows a bug in the final transform regarding the removal of --b.

@alexlamsl
Copy link
Collaborator Author

The math.js -mc timing decrease largely went away with this latest PR - from 6s to 10s now.

Grrr... not sure if want. Oh well.

I think I've missed checking AST_Call.args for side-effects - currently that might have caused some execution re-ordering which may be why test/jetstream.js is failing.

@kzc
Copy link
Contributor

kzc commented Jun 4, 2017

missed checking AST_Call.args for side-effects

Exactly what was seen in test case above.

@kzc
Copy link
Contributor

kzc commented Jun 5, 2017

True enough. I was just excited by the prospect of using that flag for the first time.

@alexlamsl
Copy link
Collaborator Author

I was hoping I don't need to change any code as well - running with aforementioned modification now.

@alexlamsl
Copy link
Collaborator Author

Yup, that change made test/jetstream.js passes.

Now, to laser focus:

--- a/lib/output.js
+++ b/lib/output.js
@@ -593,6 +593,7 @@ function OutputStream(options) {
     // a function expression needs parens around it when it's provably
     // the first token to appear in a statement.
     PARENS(AST_Function, function(output){
+        if (output.parent() instanceof AST_Dot && output.parent().property == "prototype") return true;
         if (first_in_statement(output)) {
             return true;
         }

@alexlamsl
Copy link
Collaborator Author

Confirmed - this is all there is between pass and fail:

--- a/good.js
+++ b/bad.js
@@ -32938,12 +32938,12 @@ var PDFJS = {};
             },
             isStream: !0
         }, Stream;
-    }(), DecodeStream = ((function(str) {
+    }(), DecodeStream = (function(str) {
         for (var length = str.length, bytes = new Uint8Array(length), n = 0; length > n; ++n) {
             bytes[n] = str.charCodeAt(n);
         }
         Stream.call(this, bytes);
-    }).prototype = Stream.prototype, function() {
+    }.prototype = Stream.prototype, function() {
         function DecodeStream() {
             this.pos = 0, this.bufferLength = 0, this.eof = !1, this.buffer = null;
         }

@kzc
Copy link
Contributor

kzc commented Jun 5, 2017

Nice detective work.

Are you sure it's not a general ECMAScript issue? Did you test on other browsers?

Even if it's a WebKit bug I'd be surprised if it was specific to the prototype property. I'd suggest using:

     PARENS(AST_Function, function(output){
+        var p = output.parent();
+        if (p instanceof AST_PropAccess && p.expression === this) return true;

Accessing properties directly on a function expression is so rare in practise (because it's useless) that we might always emit the parens in that case.

Edit: removed an incorrect Closure example that differed from the issue in question.

In a future PR maybe we can drop a property being set directly on a function expression as part of unused leaving the RHS side effects.

@kzc
Copy link
Contributor

kzc commented Jun 5, 2017

Another data point - Babel 5 adds the parens:

$ echo 'var a = function(){}.prop = {};' | babel
"use strict";
var a = (function () {}).prop = {};
$ babel --version
5.8.29 (babel-core 5.8.38)

Babel 6 does not add the parens.

@alexlamsl
Copy link
Collaborator Author

Let me get home to test this on the full range of web browsers - I only know Node.js has no trouble with lack of parenthesis thus far.

As for side_effects elimination of function(){}.foo = bar, I believe it falls under the broader category of <constant>.prop = value which isn't being optimised at the moment. If we were to do that, do you think we should guard this with pure_getters as well?

@alexlamsl
Copy link
Collaborator Author

Are you sure it's not a general ECMAScript issue?

<sarcasm>I don't have $400B to gain access to the required specification.</sarcasm>

@alexlamsl
Copy link
Collaborator Author

Minimal test case:

(function() {1 + 1}.a = 1);
platform result
IE 11 1
Firefox 43 1
Opera 12 1
Safari 5 SyntaxError: Expected token ')'
Safari 10 1
Chrome 49 1
Chrome 58 1
Node.js 0.10 ✔️
Node.js 4 ✔️
Node.js 8 ✔️
Chakracore ✔️

@alexlamsl
Copy link
Collaborator Author

Tried phantomjs 2.5.0-beta - no dice.

@kzc
Copy link
Contributor

kzc commented Jun 5, 2017

(function() {1 + 1}.a = 1);
platform result
Safari 9 1
jxcore SpiderMonkey v34 1

So phantomjs-prebuilt@2.1.14 uses a version of WebKit before Safari 9.

@alexlamsl Are you inclined to have uglify emit the unnecessary parens behind an output option ("webkit" perhaps) defaulting to true or false? I think phantomjs is an important enough target to warrant enabling the parens by default for this rare pattern.

As for side_effects elimination of function(){}.foo = bar, I believe it falls under the broader category of <constant>.prop = value which isn't being optimised at the moment. If we were to do that, do you think we should guard this with pure_getters as well?

A function expression is no more constant than is a RegExp literal. I think side_effects is more appropriate, but pure_getters="strict" also works. When an AST_Function instance ref count is exactly 1, I think (function(){}).prop should be side effect free on the LHS, and the pattern (function(){}).prop = value should also be side effect free. Compound assignment operators in expressions like (function(){}).prop += value should not be considered side effect free.

But then again, this particular optimization could be full of snakes for little gain. Maybe best to avoid it.

@alexlamsl
Copy link
Collaborator Author

Are you inclined to have uglify emit the unnecessary parens behind an output option ("webkit" perhaps) defaulting to true or false? I think phantomjs is an important enough target to warrant enabling the parens by default for this rare pattern.

webkit flag sounds good, I'd prefer false by default alongside ie8 (& safari10 on harmony)

A function expression is no more constant than is a RegExp literal.

Indeed - I was thinking of side_effects and pure_getters="strict".

If somebody is evil enough to define setter on Function.prototype.prop we'll be screwed.

Anyway, let's leave that for another PR and focus on getting test/jetstream.js back on its feet first. I'll get a separate PR for webkit, then rebase this PR to move forward.

@alexlamsl
Copy link
Collaborator Author

@kzc #2053 (comment) should be addressed by 71acf65

@alexlamsl
Copy link
Collaborator Author

Rebased after #2056

@kzc anything yet to be done with this PR? Otherwise I'll merge if Travis ✅

@alexlamsl alexlamsl changed the title [WIP] implement function inlining implement function inlining Jun 5, 2017
@kzc
Copy link
Contributor

kzc commented Jun 5, 2017

Could you please add a #__PURE__ test to prevent future possible regressions with inline?

#2053 (comment)

@alexlamsl
Copy link
Collaborator Author

There is a mocha test which covers that:
https://github.com/mishoo/UglifyJS2/blob/e667f0acb87484617f506b561ecf0497ef9bdf25/test/mocha/minify.js#L173-L182

But I'll make one as you suggested inside test/compress/issue-281.js, if not because it's easier to read over there.

- empty body
- single `AST_Return`
- single `AST_SimpleStatement`
- avoid `/*#__PURE__*/`

Miscellaneous
- enhance single-use function substitution

fixes mishoo#281
@alexlamsl
Copy link
Collaborator Author

/*@__PURE__*/ test added in 853a660#diff-d8c7273ae91398df62e2497f4a7c4443R422

@kzc
Copy link
Contributor

kzc commented Jun 5, 2017

LGTM - well done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature: Inline functions
2 participants