Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug: compound function call unsafe to combine #1631

Closed
pvdz opened this issue Mar 22, 2017 · 68 comments · Fixed by #1634
Closed

Bug: compound function call unsafe to combine #1631

pvdz opened this issue Mar 22, 2017 · 68 comments · Fixed by #1634
Labels

Comments

@pvdz
Copy link
Contributor

pvdz commented Mar 22, 2017

  • Bug report or feature request?
    bug

  • uglify-js version (uglifyjs -V)
    uglify-js 2.8.14

  • JavaScript input - ideally as small as possible.

var pc = 0;

function f(x) {
  pc = 200;
  return 100;
}

function x() {
  var t = f();
  pc += t;
  return pc;
}

console.log(x()); // 300

(code updated)

  • The uglifyjs CLI command executed or minify() options used.
  • An example of JavaScript output produced and/or the error or warning.
function f(x){return pc=200,100}function x(){return pc+=f()}var pc=0;console.log(x()); // 100

Got bitten by this bug today because there was something asserting the variable (y above) which was stripped from dist build. So when minifying that build it would make that x += f() which broke the build.

@pvdz
Copy link
Contributor Author

pvdz commented Mar 22, 2017

Details: https://qfox.nl/weblog/404

@alexlamsl
Copy link
Collaborator

Looks like ES6: https://github.com/mishoo/UglifyJS2#uglifyjs-2

Note: release versions of uglify-js only support ECMAScript 5 (ES5). If you wish to minify ES2015+ (ES6+) code then please use the harmony development branch.

With the stated version of uglify-js and the example input, I get:

Parse error at ..\test.js:1,4
let x = 10;
    ^
SyntaxError: Unexpected token: name (x)
   at stack.get (Function code:87:17)

@pvdz
Copy link
Contributor Author

pvdz commented Mar 22, 2017

Ah excuse me, the problem is not related to let.

@alexlamsl Please re-open this.

The problem is the compound operator. I'll update the original description not to use let (used it out of habit)

@alexlamsl
Copy link
Collaborator

With the updated example I get:

function f(){return x=20,5}var x=10,y=f();x+=y;

which gives me x === 25 - so I still don't see where the issue is.

@kzc
Copy link
Contributor

kzc commented Mar 22, 2017

Cannot reproduce with default compress options:

$ cat 1631.js 

    var x = 10;
    function f() {
      x = 20;
      return 5;
    }
    var y = f(); // x = 20, y = 5
    x += y; // "20 + 5", so x = 25
$ bin/uglifyjs -V
uglify-js 2.8.14
$ bin/uglifyjs 1631.js -c 
function f(){return x=20,5}var x=10,y=f();x+=y;

Please provide an example that will fail from the command line as per example above.

@pvdz
Copy link
Contributor Author

pvdz commented Mar 22, 2017

Fair enough. The case was burried in a large auto-generated build file. Here's the concise test case that breaks:

$ cat z
var pc = 0;
function f(x) {
  pc = 200;
  return 100;
}
function x() {
  var t = f();
  pc += t;
  return pc;
}
console.log(x()); // 300
$ node_modules/.bin/uglifyjs --version
uglify-js 2.8.14
$ node_modules/.bin/uglifyjs z -c
WARN: Collapsing variable t [z:9,8]
function f(x){return pc=200,100}function x(){return pc+=f()}var pc=0;console.log(x());

The problem being pc += f() breaking the case outlined earlier.

function f(x){return pc=200,100}function x(){return pc+=f()}var pc=0;console.log(x()); // -> 100

@alexlamsl
Copy link
Collaborator

Thank you for the working example.

$ node test.js
300

$ uglify-js test.js -c | node
WARN: Collapsing variable t [test.js:8,8]
100

$ uglify-js test.js -c collapse_vars=0 | node
300

@alexlamsl alexlamsl reopened this Mar 22, 2017
@alexlamsl
Copy link
Collaborator

@kzc so I think this is related to our discovery back in #1620 (comment), and in this case collapse_vars will need to take compound assignment as effectively using the AST_SymbolRef twice?

@kzc
Copy link
Contributor

kzc commented Mar 22, 2017

I have a fix. Some collapse_vars tests involving assignment will be non-optimal as result.

--- a/lib/compress.js
+++ b/lib/compress.js
@@ -614,6 +614,7 @@ merge(Compressor.prototype, {
                                 || node instanceof AST_IterationStatement
                                 || (parent instanceof AST_If          && node !== parent.condition)
                                 || (parent instanceof AST_Conditional && node !== parent.condition)
+                                || (node instanceof AST_Assign && node.operator.length > 1)
                                 || (parent instanceof AST_Binary
                                     && (parent.operator == "&&" || parent.operator == "||")
                                     && node === parent.right)

Edit: patch was revised.

@kzc
Copy link
Contributor

kzc commented Mar 22, 2017

Maybe fix can be refined if the assignment RHS does not have side effects.

@pvdz
Copy link
Contributor Author

pvdz commented Mar 22, 2017

Trying to construct the test case I noticed that the case was already handled properly in general. It was only when it got wrapped in a local function that it broke.

Unfortunately I'm not up to speed with uglify internals to be of much more help.

@alexlamsl
Copy link
Collaborator

Found a similar but different failure mode which #1631 (comment) does not cover:

// test.js
var a = 0, b = 1;
function f() {
  a = 2;
  return 4;
}
function g() {
  var t = f();
  b = a + t;
  return b;
}
console.log(g());

gives:

$ node test.js
6

$ uglify-js test.js -c | node
WARN: Collapsing variable t [test.js:8,10]
4

$ uglify-js test.js -c collapse_vars=0 | node
6

@kzc
Copy link
Contributor

kzc commented Mar 22, 2017

@alexlamsl Oh boy. That's a problem that is not easily solved. In the expression b = a + t; the variable a is treated as side-effect-free, which it is. However f() has a dependency on a beforehand.

Tracing into each function for possible side-effects is not tenable - full program flow analysis would be required. This could be a show-stopper for collapse_vars.

Perhaps a side-effect-free variable used in the statement (such as a above) cannot be skipped over if it is not in the same scope as the statement under analysis.

@alexlamsl
Copy link
Collaborator

Perhaps a side-effect-free variable used in the statement (such as a above) cannot be skipped over if it is not in the same scope as the statement under analysis.

That is probably true, just as with the case of undeclared variables (though for a different reason).

@alexlamsl alexlamsl added the bug label Mar 22, 2017
@kzc
Copy link
Contributor

kzc commented Mar 22, 2017

This is a more general fix that should handle both cases:

--- a/lib/compress.js
+++ b/lib/compress.js
@@ -614,6 +614,7 @@ merge(Compressor.prototype, {
                                 || node instanceof AST_IterationStatement
                                 || (parent instanceof AST_If          && node !== parent.condition)
                                 || (parent instanceof AST_Conditional && node !== parent.condition)
+                                || (node instanceof AST_SymbolRef && node.definition().scope !== self)
                                 || (parent instanceof AST_Binary
                                     && (parent.operator == "&&" || parent.operator == "||")
                                     && node === parent.right)

@alexlamsl PTAL

@alexlamsl
Copy link
Collaborator

Bad news 😅

// test.js
function g() {
  var a = 0, b = 1;
  function f() {
    a = 2;
    return 4;
  }
  var t = f();
  b = a + t;
  return b;
}
console.log(g());

gives:

$ node test.js
6

$ uglify-js test.js -c reduce_vars=0 | node
WARN: Collapsing variable t [test.js:8,10]
WARN: Dropping unused variable b [test.js:2,13]
4

But as illustrated, reduce_vars seems to cover this one up (though may not be the case in general)

$ uglify-js test.js -c | node
WARN: Dropping unused function f [test.js:3,11]
WARN: Dropping unused variable b [test.js:2,13]
6

$ uglify-js test.js -c
WARN: Dropping unused function f [test.js:3,11]
WARN: Dropping unused variable b [test.js:2,13]
function g(){var a=0,t=function(){return a=2,4}();return a+t}console.log(g());

@pvdz
Copy link
Contributor Author

pvdz commented Mar 22, 2017

if it is not in the same scope as the statement under analysis.

Why would scope make a difference here?

function g() {
  var a = 0, b = 1;
  function f() {
    a = 2;
    return 4;
  }
  var t = f();
  b = a + t;
  return b;
}
console.log(g());

Same scope, why would this not have the same problem?

@kzc
Copy link
Contributor

kzc commented Mar 22, 2017

Why would scope make a difference here?

A function call can only change the value of a var in different scope.

@alexlamsl
Copy link
Collaborator

It would have worked, I think, if collapse_vars scans inside the inner function.

The external function can never alter the inner/same-scope variable, that's true. But the inner function could - but so could the TreeWalker/TreeTransformer be able to reach within and scan for usage.

@alexlamsl
Copy link
Collaborator

@kzc #1631 (comment) is tested with #1631 (comment) applied.

@kzc
Copy link
Contributor

kzc commented Mar 22, 2017

It would have worked, I think, if collapse_vars scans inside the inner function.

It would never end - everything within it would have to scanned as well, ad nauseam.

@alexlamsl
Copy link
Collaborator

It would never end - everything within it would have to scanned as well, ad nauseam.

... except that inner function is already being scanned, which is why a couldn't be collapsed in b = a + t;

Can we reuse that information somehow?

@kzc
Copy link
Contributor

kzc commented Mar 22, 2017

as illustrated, reduce_vars seems to cover this one up (though may not be the case in general)

I see that but how exactly did reduce_vars prevent the problem in this case?

$ bin/uglifyjs test.js
function g(){var a=0,b=1;function f(){a=2;return 4}var t=f();b=a+t;return b}console.log(g());
$ bin/uglifyjs test.js -c collapse_vars=0
WARN: Dropping unused function f [test.js:3,11]
WARN: Dropping unused variable b [test.js:2,13]
function g(){var a=0,t=function(){return a=2,4}();return a+t}console.log(g());

@kzc
Copy link
Contributor

kzc commented Mar 22, 2017

except that inner function is already being scanned

Every function called within it would have to be scanned as well. It's non-trivial at that point.

@alexlamsl
Copy link
Collaborator

I see that but how exactly did reduce_vars prevent the problem in this case?

It optimises function f(){...} var t = f(); into var t = function(){...};, which seems to then suppresses collapse_vars from acting on t.

@alexlamsl
Copy link
Collaborator

Every function called within it would have to be scanned as well. It's non-trivial at that point.

Only if they are also function defined within this scope. Any outer functions won't be able to access the inner variable in question.

And any functions within this scope would have been scanned anyway, so no need to trace and re-scan?

@kzc
Copy link
Contributor

kzc commented Mar 22, 2017

The t in statement var a=0,t=function(){return a=2,4}();return a+t; won't be collapsed due to this specific condition:

https://github.com/mishoo/UglifyJS2/blob/ee95c1b38bcf0fbb6c676e98540c1d33f669e936/lib/compress.js#L610

@kzc
Copy link
Contributor

kzc commented Mar 22, 2017

Nice.

If you're good with the newly revised patch, I'm good with it.

@alexlamsl
Copy link
Collaborator

Two mocha failures:

// actual
function foo(o){var n=2*o;print("Foo:",n)}var print=console.log.bind(console);
// expected
function foo(o){print("Foo:",2*o)}var print=console.log.bind(console);
// actual
var print=console.log.bind(console),a=function(n){return 3*n}(3),b=function(n){return n/2}(12);print("qux",a,b),function(n){var o=2*n;print("Foo:",o)}(11);
// expected
var print=console.log.bind(console);print("qux",function(n){return 3*n}(3),function(n){return n/2}(12)),function(n){print("Foo:",2*n)}(11);

@alexlamsl
Copy link
Collaborator

Ah, they have the same issue - var print; blocking inner functions from collapsing their variables.

@kzc
Copy link
Contributor

kzc commented Mar 22, 2017

We can live with that I think.

@alexlamsl
Copy link
Collaborator

... and because we can't tell var n=2*o from say var n=o() apart, we can't optimise any further.

So yeah, let's accept this sub-optimal case for now.

@kzc
Copy link
Contributor

kzc commented Mar 22, 2017

@alexlamsl Thanks for your help.

@alexlamsl
Copy link
Collaborator

@kzc any time! 😎

@pvdz
Copy link
Contributor Author

pvdz commented Mar 22, 2017

Great work. Thanks!

@pvdz
Copy link
Contributor Author

pvdz commented Mar 23, 2017

I've been thinking about the collapse var general case. You don't know what happens underwater with observable (and affectable) side effects. I think you can't really deal with this. Consider this contrived example:

function f() {
  var a = 'fail';

  function g() {
    a = 'passe';
    return 's';
  }

  var o = {
    toString: g,
  };

  var t = '' + o;
  a += t;

  console.log(a);
}
f(); // "passes"
$ node_modules/.bin/uglifyjs --version
uglify-js 2.8.14
$ node_modules/.bin/uglifyjs z -c
WARN: Collapsing variable t [z:15,7]
WARN: Collapsing variable o [z:14,15]
function f(){function g(){return a="passe","s"}var a="fail";a+=""+{toString:g},console.log(a)}f();
// "fails"

(Side note: that could have safely been console.log(a +=""+{toString:g}) but perhaps slightly too complex for analysis and too much golfing)

While the inlining of the object is nice, the point is that the assignment of the function may happen sneakily. The string concat may look innocent but it may not be. I'm not sure you could ever statically detect this case.

There might be a generic workaround by rewriting a += '' + o to a = a + '' + o instead.

On that note, here's another case that you may want to consider;

function f() {
  var a = 'fail';

  function g() {
    a = 'passe';
    return 's';
  }

  var o = {
    toString: g,
  };

  var t = '' + o;
  a = a + t;

  console.log(a);
}
f();
$ node_modules/.bin/uglifyjs z -c
WARN: Collapsing variable t [z:15,10]
WARN: Collapsing variable o [z:14,15]
function f(){function g(){return a="passe","s"}var a="fail";a+=""+{toString:g},console.log(a)}f();

to be precise; it rewrites the explicit a = a + .. to a += ... which could lead to the same problem as above. This may be considered a different bug with the same outcome.

The toString assignment here is contrived but could easily happen in an abstracted function, a class constructor, or whatever. And dynamic so you can't possibly detect it at parse time.

Now var x = <expr>; y += x; to y += x could be rewritten to y = y + <expr> which is still shorter than the original but safer in terms of the compound assignment problem in this ticket.

Initially I thought there were counter examples to that rewrite but right now I can't think of any. All examples I can think of actually rely on the compound assignment "caching" the value like x++ does.

@pvdz
Copy link
Contributor Author

pvdz commented Mar 23, 2017

And #1631 (comment) is going to be an even worse problem for ES6+ code.

@alexlamsl
Copy link
Collaborator

Your example works on the latest version of uglify-js:

$ node test.js
passes

$ uglify-js --version
uglify-js 2.8.15

$ uglify-js test.js -c
function f(){function g(){return a="passe","s"}var a="fail",o={toString:g},t=""+o;a+=t,console.log(a)}f();

$ uglify-js test.js -c | node
passes

@pvdz
Copy link
Contributor Author

pvdz commented Mar 23, 2017

That's great (I mean that) but do you think it holds in the generic case? This is contrived and the artifact is pretty obvious and only contained in the same scope. Does it actually taint and track tainted variables to conclude x + y is a problem not to be collapsed? There are so many outs that I'm not expecting this to be the case...

@alexlamsl
Copy link
Collaborator

If you are asking for a formal mathematical proof on the level of how Regular Expression is computable and correct, no I haven't come up with one yet.

I think we are taking the approach of incremental progress, which means if you can find a corner case which uglify-js generates bad code, please file an issue and I'd be delighted to investigate and fix the problem.

@pvdz
Copy link
Contributor Author

pvdz commented Mar 23, 2017

I'm just worried that somewhere a developer is going to cry because his code magically breaks due to a minifier being too eager. However, I can't judge on how safe the heuristics are for applying the collapse-var rewrite and I think you can. So while you don't have to proof something about regular expressions, you could at least let me know what you think about the generic case outlined.

Right now all I see you do is check the explicit test case and bail, like happened when I opened this ticket. It still led to exposing a clear bug.

I'm not asking for sugar. But could you at least give me your thoughts about why this is or isn't going to be a problem for uglifyjs? Rather than just copy-paste-run-shrug, again. There's many syntactical cues that could cause the heuristics in uglifyjs to decide to bail on the collapse. Just like the initial example in this ticket needed some more work before actually exposing a bug. In this case it could be the object literal or the literal appearance of toString.

So please, if you tell me why you think this generic case won't be a problem for uglifyjs, I could be convinced to dig deeper into building a test case that it fails on. I do have some related background on the matter.

@pvdz
Copy link
Contributor Author

pvdz commented Mar 23, 2017

FWIW, did some quick testing with 2.8.15 and couldn't produce a failing case. So that's good!

@alexlamsl
Copy link
Collaborator

Thanks for the testing. I'm afraid the best I can do is point you to the code, as it is the layout of which we tackle each optimisation.

A good place to start is perhaps understanding what TreeWalker and TreeTransformer do.

@kzc
Copy link
Contributor

kzc commented Mar 23, 2017

@qfox This is a volunteer effort. No one is paid to work on Uglify. We work on it because we find it to be interesting.

Although we try to ensure code correctness with extensive unit tests while balancing minification size and speed, no results are guaranteed. Every new release can potentially inadvertantly introduce a new bug while fixing an old one despite our best efforts. Users are encouraged to lock a version of uglify-js that they've tested against using yarn or npm shrinkwrap. Users are also free to disable minification options if there is an issue. People are welcome to submit bug reports and contribute fixes.

@pvdz
Copy link
Contributor Author

pvdz commented Mar 23, 2017

@kzc Well that wasn't necessary. I think you got me wrong there. I'm not getting paid to help out either. But that also means that I don't have the time to dig deep into uglifyjs code and heuristics.

Somebody familiar with the code should be able to point out easily whether or not and why a certain problem would or wouldn't be a problem. That's all I was asking. My offered help here is just as voluntary, sans the credits. Which I don't need.

@kzc
Copy link
Contributor

kzc commented Mar 23, 2017

@qfox No slight intended. Just offering my perspective on the thankless job of helping to maintain open source projects in general.

Any help is welcome - and indeed your bug report brought to light a flaw in collapse_vars.

We could certainly beef up testing. In another post you mentioned you have some experience in code fuzzing. We were hoping to build a fuzz test that executes original and minified code and compares the results. There's already some basic fuzzing in test/mozilla-ast.js just for AST trees, but it does not execute code.

@pvdz
Copy link
Contributor Author

pvdz commented Mar 23, 2017

So, it's old, but here's the fuzzer for my ES5 parser: http://qfox.github.io/zeparser2/test/tokenfuzzer.html

It concats random tokens then runs it through eval and through the parser to check whether or not they both pass or fail. It uncovered some obscure bugs in my parser, but also a few super obscure ones in browsers. Good times.

You can do a similar approach for minification testing. Create random code, include the emergency brake boilerplate I mentioned for loops, check the outcome, and match that against minified outcome. Rinse and repeat.

I know OS can be hard and the reward is knowing the software is used in millions of setups.

@pvdz
Copy link
Contributor Author

pvdz commented Mar 23, 2017

Btw I would recommend writing a fuzzer from scratch. It's quite fun and rewarding, especially once it starts finding its first bugs. They're surprisingly easy to write, especially once you're at the level of maintaining a minifier anyways. Seeing a fuzzer run is just magic :)

@kzc
Copy link
Contributor

kzc commented Mar 23, 2017

@qfox That's good advice. Thank you.

As @alexlamsl mentioned this minifier doesn't have formal proofs to back it up, just heuristics we think are safe and generally produce smaller code. A fuzzer that executes code would certainly help.

@pvdz
Copy link
Contributor Author

pvdz commented Mar 23, 2017

@kzc here's a setup: https://github.com/qfox/uglyfuzzer

So far it didn't find anything, and it's relatively slow. I think something that just calls uglifyjs from within a nodejs script would be much faster. But whatever, I just whipped this up quickly. Should get you going if you want to build on that. If you need more help let me know (would suggest a new ticket for it).

@kzc
Copy link
Contributor

kzc commented Mar 23, 2017

here's a setup: https://github.com/qfox/uglyfuzzer

@qfox - Great stuff. Clearly it's already paying off as evidenced by #1639

If you look at test/run-tests.js you'll see a way to run code in isolation in the same process. Should make it 100 times faster.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants