[WIP] improve mangle #2219

Closed
wants to merge 1 commit into from

alexlamsl
Collaborator

Assign shorter names to symbols with higher frequency of occurrence.

@alexlamsl
Collaborator Author

This frequency-based rearrangement is confined to the scope level to preserve the inter-scope mangle_names() execution order.
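
As a rough illustration of the idea (a toy sketch, not the code in this PR): symbols in a single scope are sorted by reference count, and the shortest names go to the most-referenced ones. `shortName`, `mangleScopeByFrequency` and the sample symbols are hypothetical. A real mangler also has to avoid reserved words and names visible from enclosing scopes, which this ignores.

```javascript
// Hypothetical sketch: none of these names come from UglifyJS itself.
const ALPHABET = "abcdefghijklmnopqrstuvwxyz";

// Produce the n-th short name: a, b, ..., z, aa, ab, ...
function shortName(n) {
    let name = "";
    do {
        name = ALPHABET[n % 26] + name;
        n = Math.floor(n / 26) - 1;
    } while (n >= 0);
    return name;
}

// symbols: [{ name, references }] for one scope only.
// Returns a map from original name to mangled name.
function mangleScopeByFrequency(symbols) {
    const sorted = symbols.slice().sort(function(m, n) {
        return n.references - m.references;
    });
    const mapping = {};
    sorted.forEach(function(sym, i) {
        mapping[sym.name] = shortName(i);
    });
    return mapping;
}

const mapping = mangleScopeByFrequency([
    { name: "counter", references: 2 },
    { name: "element", references: 40 },
    { name: "options", references: 7 },
]);
console.log(mapping); // element gets "a", options "b", counter "c"
```

Because the sort never looks outside the scope it is given, the order in which scopes are visited stays unchanged.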

Benchmark results for master (first set) and #2219 (second set):
https://code.jquery.com/jquery-3.2.1.js
- parse: 0.250s
- scope: 0.125s
- compress: 0.953s
- mangle: 0.172s
- properties: 0.000s
- output: 0.125s
- total: 1.625s

Original: 268039 bytes
Uglified: 86834 bytes
SHA1 sum: 503352031d907fd9aa67d4add98089435c32f039

https://cdnjs.cloudflare.com/ajax/libs/angular.js/1.6.4/angular.js
- parse: 0.468s
- scope: 0.267s
- compress: 1.515s
- mangle: 0.187s
- properties: 0.000s
- output: 0.125s
- total: 2.562s

Original: 1249863 bytes
Uglified: 174359 bytes
SHA1 sum: e8672dee5536c6881d185b54d0c3aa8afd39e0a2

https://cdnjs.cloudflare.com/ajax/libs/mathjs/3.9.0/math.js
- parse: 0.953s
- scope: 0.437s
- compress: 3.406s
- mangle: 0.438s
- properties: 0.000s
- output: 0.328s
- total: 5.562s

Original: 1590107 bytes
Uglified: 468105 bytes
SHA1 sum: 9b76f60c1b8163d19b86a9a901cc1d7c88f94195

https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.js
- parse: 0.125s
- scope: 0.061s
- compress: 0.360s
- mangle: 0.047s
- properties: 0.000s
- output: 0.032s
- total: 0.625s

Original: 69707 bytes
Uglified: 36838 bytes
SHA1 sum: aea95bcc21c959627da6a379a41891372c5977c0

https://unpkg.com/react@15.3.2/dist/react.js
- parse: 0.484s
- scope: 0.266s
- compress: 1.703s
- mangle: 0.156s
- properties: 0.000s
- output: 0.125s
- total: 2.734s

Original: 701412 bytes
Uglified: 205920 bytes
SHA1 sum: 03c1812d33723bbd68af98aa411e9e77147f7ab7

http://builds.emberjs.com/tags/v2.11.0/ember.prod.js
- parse: 0.812s
- scope: 0.469s
- compress: 2.609s
- mangle: 0.438s
- properties: 0.000s
- output: 0.297s
- total: 4.625s

Original: 1852178 bytes
Uglified: 527037 bytes
SHA1 sum: 1b2564cfd645fbd95fbb4c1bde271ec56615af69

https://cdn.jsdelivr.net/lodash/4.17.4/lodash.js
- parse: 0.281s
- scope: 0.156s
- compress: 1.656s
- mangle: 0.110s
- properties: 0.000s
- output: 0.062s
- total: 2.265s

Original: 539590 bytes
Uglified: 70429 bytes
SHA1 sum: 4937c39669484feed26b3986622972b3aeebb37c

https://cdnjs.cloudflare.com/ajax/libs/d3/4.5.0/d3.js
- parse: 0.546s
- scope: 0.344s
- compress: 2.266s
- mangle: 0.250s
- properties: 0.000s
- output: 0.172s
- total: 3.578s

Original: 451131 bytes
Uglified: 212511 bytes
SHA1 sum: 07583fafb6c4da54872a71338930e94336f19cfe

With #2219:
https://code.jquery.com/jquery-3.2.1.js
- parse: 0.266s
- scope: 0.157s
- compress: 0.843s
- mangle: 0.140s
- properties: 0.000s
- output: 0.141s
- total: 1.562s

Original: 268039 bytes
Uglified: 85997 bytes
SHA1 sum: 2cd7e8ec23d2a9199fd5324ddde8424a3ca259e8

https://cdnjs.cloudflare.com/ajax/libs/angular.js/1.6.4/angular.js
- parse: 0.484s
- scope: 0.296s
- compress: 1.813s
- mangle: 0.172s
- properties: 0.000s
- output: 0.235s
- total: 3.000s

Original: 1249863 bytes
Uglified: 173894 bytes
SHA1 sum: 43dc464d9578d9a6faf203b983d5fbc1a3256703

https://cdnjs.cloudflare.com/ajax/libs/mathjs/3.9.0/math.js
- parse: 0.890s
- scope: 0.438s
- compress: 3.797s
- mangle: 0.359s
- properties: 0.000s
- output: 0.328s
- total: 5.812s

Original: 1590107 bytes
Uglified: 467525 bytes
SHA1 sum: 30cdc3207d0bba143338e4392d41bf36e867cb7e

https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.js
- parse: 0.140s
- scope: 0.078s
- compress: 0.328s
- mangle: 0.047s
- properties: 0.000s
- output: 0.032s
- total: 0.625s

Original: 69707 bytes
Uglified: 36838 bytes
SHA1 sum: 67aa1007f96a45693cb6d4c4394104b1531364b4

https://unpkg.com/react@15.3.2/dist/react.js
- parse: 0.484s
- scope: 0.266s
- compress: 1.500s
- mangle: 0.234s
- properties: 0.000s
- output: 0.141s
- total: 2.625s

Original: 701412 bytes
Uglified: 205906 bytes
SHA1 sum: 2fbf3476fad28cf9d7d3b355c8ae10a6b864b382

http://builds.emberjs.com/tags/v2.11.0/ember.prod.js
- parse: 1.000s
- scope: 0.531s
- compress: 2.828s
- mangle: 0.344s
- properties: 0.000s
- output: 0.297s
- total: 5.000s

Original: 1852178 bytes
Uglified: 526942 bytes
SHA1 sum: 58a900ad09a243dfa59b68b8e42af3877e72920d

https://cdn.jsdelivr.net/lodash/4.17.4/lodash.js
- parse: 0.296s
- scope: 0.141s
- compress: 1.688s
- mangle: 0.187s
- properties: 0.000s
- output: 0.078s
- total: 2.390s

Original: 539590 bytes
Uglified: 70131 bytes
SHA1 sum: ab5d535d892c9cd137db8b0c5e283e368dff2654

https://cdnjs.cloudflare.com/ajax/libs/d3/4.5.0/d3.js
- parse: 0.562s
- scope: 0.313s
- compress: 2.390s
- mangle: 0.266s
- properties: 0.000s
- output: 0.187s
- total: 3.718s

Original: 451131 bytes
Uglified: 211614 bytes
SHA1 sum: 3a9db2bb284ae25ad94a10fadd7210cf4ef20dad

@alexlamsl
Collaborator Author

So even after this PR, babili still produces a smaller minified d3.js than us (210,713 vs 211,614)

@kzc
Contributor

kzc commented Jul 8, 2017

Nice PR! Across the board mangle size reduction for little to no extra uglify time.

I assume the symbol frequencies are computed after compress?

@alexlamsl
Collaborator Author

> So even after this PR, babili still produces a smaller minified d3.js than us (210,713 vs 211,614)

Maybe it's due to ES6...

$ uglifyjs -V
uglify-es 3.0.24

$ node test/benchmark.js --ecma 8 -mc

https://cdnjs.cloudflare.com/ajax/libs/d3/4.5.0/d3.js
- parse: 0.593s
- scope: 0.438s
- compress: 2.844s
- mangle: 0.312s
- properties: 0.000s
- output: 0.219s
- total: 4.406s

Original: 451131 bytes
Uglified: 202730 bytes
SHA1 sum: ba7cf790877f4329878987283325f7c002ea7d69

@alexlamsl
Collaborator Author

> I assume the symbol frequencies are computed after compress?

It's computed by AST_Toplevel.figure_out_scope():
https://github.com/mishoo/UglifyJS2/blob/bdeadffbf582b393dbc14a45b3e69ddf16f47690/lib/scope.js#L209
https://github.com/mishoo/UglifyJS2/blob/bdeadffbf582b393dbc14a45b3e69ddf16f47690/lib/scope.js#L297

which we do (again) right before mangle_names():
https://github.com/mishoo/UglifyJS2/blob/bdeadffbf582b393dbc14a45b3e69ddf16f47690/lib/minify.js#L136

@kzc
Contributor

kzc commented Jul 8, 2017

> Maybe it's due to ES6...

I had noticed that in uglify-es testing as well, but it's not the case for babili. You can pipe the output of babili d3.js through uglify-js to verify that it is ES5.

@alexlamsl
Collaborator Author

> I had noticed that in uglify-es testing as well, but it's not the case for babili. You can pipe the output of babili d3.js through uglify-js to verify that it is ES5.

Thanks for the information - I shall learn how to use babili once, produce that minified file, then investigate...

@kzc
Contributor

kzc commented Jul 8, 2017

There's a big showstopper problem with this PR - although it produces smaller non-gzip output, gzip output is significantly larger:

master:

$ bin/uglifyjs d3.js -mc | wc -c
  212511

$ bin/uglifyjs d3.js -mc | gzip | wc -c
   71560

with this PR (#2219):

$ bin/uglifyjs d3.js -mc | wc -c
  211614

$ bin/uglifyjs d3.js -mc | gzip | wc -c
   73190

We can't use this PR in this state.

@alexlamsl alexlamsl changed the title from "improve mangle" to "[WIP] improve mangle" Jul 8, 2017
@alexlamsl
Collaborator Author

> There's a big showstopper problem with this PR - although it produces smaller non-gzip output, gzip output is significantly larger:

Oops - maybe those c's are a good hint that I've screwed something up.

@alexlamsl
Collaborator Author

So babili does this:

var ic = Math.E, ac = Math.tan, dc = Math.acos, oc = Math.asin, lc = Math.exp, rc = Math.atan, cc = Math.atan2, sc = Math.sin, uc = Math.cos, pc = Math.PI, _c = Math.min, fc = Math.round, hc = Math.pow, gc = Math.log, yc = Math.abs, xc = Math.floor, mc = Math.max, bc = Math.ceil, vc = Math.sqrt;

Which for d3.js would save a lot of bytes as they are used in a lot of places.
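
A back-of-the-envelope sketch of why that aliasing shrinks the raw output: each use gets shorter by the difference between the expression and the alias, at the one-off cost of an `alias=expr,` entry in a shared `var` statement. `aliasSavings` and the use counts are hypothetical, and this only models non-gzip bytes.

```javascript
// Net bytes saved by replacing `uses` occurrences of `expr` with `alias`,
// after paying for one `alias=expr,` entry in a shared var statement.
function aliasSavings(expr, alias, uses) {
    const perUse = expr.length - alias.length;
    const declaration = alias.length + 1 + expr.length + 1; // alias=expr,
    return uses * perUse - declaration;
}

console.log(aliasSavings("Math.sqrt", "vc", 1));  // -6, a net loss
console.log(aliasSavings("Math.sqrt", "vc", 2));  // 1, roughly break-even
console.log(aliasSavings("Math.sqrt", "vc", 30)); // 197
```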

@kzc
Contributor

kzc commented Jul 8, 2017

> So babili does this:
>
> var ic = Math.E, ac = Math.tan, dc = Math.acos, oc = Math.asin, lc = Math.exp, rc = Math.atan, cc = Math.atan2, sc = Math.sin, uc = Math.cos, pc = Math.PI, _c = Math.min, fc = Math.round, hc = Math.pow, gc = Math.log, yc = Math.abs, xc = Math.floor, mc = Math.max, bc = Math.ceil, vc = Math.sqrt;

It also explains why babili's gzip sizes are larger.

I would not be in favor of such an optimization in uglify.

}
});
a.sort(function(m, n) {
return n.references.length - m.references.length;
Contributor

I don't think symbol references are updated in compress, so this would present the pre-compress value.

Collaborator Author

figure_out_scope() is called again after compress() and right before mangle_names(), as I mentioned in the code references in #2219 (comment)

... or have I missed something? 😅

Collaborator Author

I missed mentioning that within figure_out_scope(), init_scope_vars() is called on each AST_Scope, which resets all references to previous instances of SymbolDef:
https://github.com/mishoo/UglifyJS2/blob/94e5e00c0321b92aca1abf170c12a02d6c3275b5/lib/scope.js#L117
https://github.com/mishoo/UglifyJS2/blob/94e5e00c0321b92aca1abf170c12a02d6c3275b5/lib/scope.js#L262-L263

Contributor

...wait - my mistake - it's figure_out_scope() that determines all the references in the first place, is it not?

Collaborator Author

> it's figure_out_scope() that determines all the references in the first place, is it not?

Yup - reduce_vars does rebuild some of that within reset_opt_flags(), but whenever figure_out_scope() is called everything is reset to a clean slate.

Contributor

+        a.sort(function(m, n) {
+            return n.references.length - m.references.length;

I vaguely recall seeing that mangle approach attempted a couple years back and abandoned due to gzip results. I could be mistaken.

Collaborator Author

In that case, all the more reason to try and get to the bottom of this - mysteries are there to be solved 😜

@alexlamsl
Collaborator Author

So it's that Safari workaround which messes up the character frequency:
https://github.com/mishoo/UglifyJS2/blob/94e5e00c0321b92aca1abf170c12a02d6c3275b5/lib/scope.js#L346-L361

I'm soooo tempted to gate this behind the safari flag and default to false 👻

@kzc
Contributor

kzc commented Jul 8, 2017

> So it's that Safari workaround which messes up the character frequency

We can't change the default unfortunately - old safari browsers are still fairly commonly used in the wild. Can we gate it behind an option defaulting to true? That way it can be disabled.

@kzc
Contributor

kzc commented Jul 8, 2017

What d3.js uglify -mc size do you get without the safari workaround?

@alexlamsl
Collaborator Author

> We can't change the default unfortunately - old safari browsers are still fairly commonly used in the wild. Can we gate it behind an option defaulting to true? That way it can be disabled.

If we are going to have that as default behaviour, then the only approach that seems useful is for me to spend some effort and make this new mangle work around that workaround's quirks.

Don't worry about my grumpiness earlier - just letting some steam off before getting back to code 😅

@alexlamsl
Collaborator Author

> What d3.js uglify -mc size do you get without the safari workaround?

Exact same size - so my discovery above is unrelated to this file's gzip size increase.

@alexlamsl
Collaborator Author

You know what - let me add gzip to test/benchmark.js and we can come back to this PR after that.
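
A minimal sketch of the kind of measurement being discussed, assuming Node's built-in `zlib` module. The `sizes` helper and the sample input are made up for illustration; this is not the actual change to test/benchmark.js, and the byte counts may differ slightly from what the `gzip` command line tool reports.

```javascript
const zlib = require("zlib");

// Report both the raw and the gzipped byte count of a piece of output.
function sizes(code) {
    const raw = Buffer.byteLength(code, "utf8");
    const gzipped = zlib.gzipSync(Buffer.from(code, "utf8")).length;
    return { raw: raw, gzipped: gzipped };
}

// Stand-in for minified output; a real run would pass the uglified code.
const sample = "function f(a,b){return a+b}".repeat(50);
const result = sizes(sample);
console.log("Uglified: " + result.raw + " bytes");
console.log("GZipped:  " + result.gzipped + " bytes");
```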

@kzc
Contributor

kzc commented Jul 8, 2017

> Don't worry about my grumpiness earlier

I hadn't noticed. Correcting my faulty assumptions or understanding may be a bit humbling - but welcome.

@kzc
Contributor

kzc commented Jul 8, 2017

I think sorting by frequency produces less optimal results than the basic mangle algorithm because LZ77 maintains a sliding window. In the initial window it's a case of first come, first served when assigning bit patterns to strings. It's difficult to predict how gzip will compress something, as the complete algorithm has to be run on a given data set to know for sure.

Also, functions are moved to the top with hoist_funs. The function parameters and vars tend to use the same highly sought after mangle names more frequently. If shorter length mangle names are assigned to non-local fixed symbols then that's a problem.

@alexlamsl
Collaborator Author

Using master, looks like hoist_funs doesn't win often when it comes to gzipped size:

Results with `-mc` (first set) and `-mc hoist_funs=0` (second set):
https://code.jquery.com/jquery-3.2.1.js
- parse: 0.296s
- scope: 0.140s
- compress: 1.015s
- mangle: 0.109s
- properties: 0.000s
- output: 0.078s
- total: 1.638s

Original: 268039 bytes
Uglified: 86834 bytes
GZipped:  30398 bytes
SHA1 sum: ce1f26ed06d363c5e18436b49626a3f0499d7488

https://cdnjs.cloudflare.com/ajax/libs/angular.js/1.6.4/angular.js
- parse: 0.451s
- scope: 0.344s
- compress: 1.703s
- mangle: 0.234s
- properties: 0.000s
- output: 0.125s
- total: 2.857s

Original: 1249863 bytes
Uglified: 174359 bytes
GZipped:  60441 bytes
SHA1 sum: 9f195dbd07c0e6e8938a9f877786efc5e4326cff

https://cdnjs.cloudflare.com/ajax/libs/mathjs/3.9.0/math.js
- parse: 0.826s
- scope: 0.468s
- compress: 4.688s
- mangle: 0.594s
- properties: 0.000s
- output: 0.328s
- total: 6.904s

Original: 1590107 bytes
Uglified: 468105 bytes
GZipped:  119157 bytes
SHA1 sum: 4cc6257feff454238dfca34a240845168a50099d

https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.js
- parse: 0.140s
- scope: 0.047s
- compress: 0.342s
- mangle: 0.062s
- properties: 0.000s
- output: 0.047s
- total: 0.638s

Original: 69707 bytes
Uglified: 36838 bytes
GZipped:  9683 bytes
SHA1 sum: 7799998e18bab3ef6303a25e2eea970aa70b2b0c

https://unpkg.com/react@15.3.2/dist/react.js
- parse: 0.497s
- scope: 0.188s
- compress: 1.484s
- mangle: 0.204s
- properties: 0.000s
- output: 0.234s
- total: 2.607s

Original: 701412 bytes
Uglified: 205920 bytes
GZipped:  62345 bytes
SHA1 sum: 813aca089612ed417e5b0f77bebf0f23cdbf9740

http://builds.emberjs.com/tags/v2.11.0/ember.prod.js
- parse: 0.919s
- scope: 0.657s
- compress: 2.875s
- mangle: 0.734s
- properties: 0.000s
- output: 0.453s
- total: 5.638s

Original: 1852178 bytes
Uglified: 527037 bytes
GZipped:  128876 bytes
SHA1 sum: b81bde14e6b36a05ac052f9e979f29a232bbbd43

https://cdn.jsdelivr.net/lodash/4.17.4/lodash.js
- parse: 0.250s
- scope: 0.134s
- compress: 1.707s
- mangle: 0.125s
- properties: 0.000s
- output: 0.078s
- total: 2.294s

Original: 539590 bytes
Uglified: 70429 bytes
GZipped:  24493 bytes
SHA1 sum: 9bbb3a60fe7a46c9ed54c5f0337cc3dcfffa6f7f

https://cdnjs.cloudflare.com/ajax/libs/d3/4.5.0/d3.js
- parse: 0.591s
- scope: 0.328s
- compress: 2.204s
- mangle: 0.312s
- properties: 0.000s
- output: 0.469s
- total: 3.904s

Original: 451131 bytes
Uglified: 212511 bytes
GZipped:  71302 bytes
SHA1 sum: 54c673fa340b5fed38c3f9c42b8494108af33c96

With `-mc hoist_funs=0`:
https://code.jquery.com/jquery-3.2.1.js
- parse: 0.312s
- scope: 0.172s
- compress: 1.094s
- mangle: 0.172s
- properties: 0.000s
- output: 0.140s
- total: 1.890s

Original: 268039 bytes
Uglified: 86209 bytes
GZipped:  30022 bytes
SHA1 sum: bfd16c9da21a8a83875c9cb5d86af5f8add05302

https://cdnjs.cloudflare.com/ajax/libs/angular.js/1.6.4/angular.js
- parse: 0.546s
- scope: 0.297s
- compress: 1.797s
- mangle: 0.266s
- properties: 0.000s
- output: 0.265s
- total: 3.171s

Original: 1249863 bytes
Uglified: 174523 bytes
GZipped:  60143 bytes
SHA1 sum: 2901c876866363b6f6bd0c00b092eab3cbc3c819

https://cdnjs.cloudflare.com/ajax/libs/mathjs/3.9.0/math.js
- parse: 1.015s
- scope: 0.609s
- compress: 3.782s
- mangle: 0.375s
- properties: 0.000s
- output: 0.328s
- total: 6.109s

Original: 1590107 bytes
Uglified: 467672 bytes
GZipped:  118937 bytes
SHA1 sum: ba44d8e59bb8f11b5d1a2999614f416dcd52990b

https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.js
- parse: 0.125s
- scope: 0.061s
- compress: 0.407s
- mangle: 0.063s
- properties: 0.000s
- output: 0.047s
- total: 0.703s

Original: 69707 bytes
Uglified: 36839 bytes
GZipped:  9579 bytes
SHA1 sum: 2aa377fdbd84073ea41dc044d80250fbcc021138

https://unpkg.com/react@15.3.2/dist/react.js
- parse: 0.515s
- scope: 0.297s
- compress: 1.766s
- mangle: 0.312s
- properties: 0.000s
- output: 0.125s
- total: 3.015s

Original: 701412 bytes
Uglified: 206263 bytes
GZipped:  62331 bytes
SHA1 sum: 0c25557c46cc77a3e374e1d44bc40e9796f4e400

http://builds.emberjs.com/tags/v2.11.0/ember.prod.js
- parse: 1.000s
- scope: 0.671s
- compress: 2.750s
- mangle: 0.391s
- properties: 0.000s
- output: 0.281s
- total: 5.093s

Original: 1852178 bytes
Uglified: 528099 bytes
GZipped:  129221 bytes
SHA1 sum: b63c9a045f3b4c73bed5b25057e450e5ae5b1d84

https://cdn.jsdelivr.net/lodash/4.17.4/lodash.js
- parse: 0.281s
- scope: 0.234s
- compress: 1.844s
- mangle: 0.187s
- properties: 0.000s
- output: 0.125s
- total: 2.671s

Original: 539590 bytes
Uglified: 70995 bytes
GZipped:  24814 bytes
SHA1 sum: 7b255967e7f6a62e195d2e36631dcf100815b409

https://cdnjs.cloudflare.com/ajax/libs/d3/4.5.0/d3.js
- parse: 0.687s
- scope: 0.391s
- compress: 2.250s
- mangle: 0.250s
- properties: 0.000s
- output: 0.187s
- total: 3.765s

Original: 451131 bytes
Uglified: 212771 bytes
GZipped:  71059 bytes
SHA1 sum: 4dc0ac1627ed15312c0497bdd7787bcd217188c1

@kzc
Contributor

kzc commented Jul 8, 2017

> Using master, looks like hoist_funs doesn't win often when it comes to gzipped size

hoist_funs is not a gzip win - neutral to slight negative. But for non-gzip sizes it appears to be a win on average. jquery notably disables hoist_funs as it favors smaller gzip size.

Interesting dilemma. In a perfect world we'd like to optimize for both gzip and non-gzip sizes. But that's not likely. I think it's best to give the users the flexibility to choose their options.

@kzc
Contributor

kzc commented Jul 8, 2017

@alexlamsl FYI - mangle sort option: 83a4ebf and #83

The sort option was removed (ignored actually) in #991 due to bugs seen in #877 and #990.

@alexlamsl
Collaborator Author

> In a perfect world we'd like to optimize for both gzip and non-gzip sizes.

That's what I'm aiming for, just need to understand all the intricacies first 😓

Would you mind reminding me why hoist_funs produces smaller uglified output?

@alexlamsl
Collaborator Author

Almost tempted to have a script that does exhaustive search for the list of options that would bring minimum uglified/gzipped size for a given input.
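
Such a search could look roughly like the sketch below, which tries every on/off combination of a few option names and keeps the smallest result. `searchOptions` is hypothetical, and `fakeMeasure` is a stand-in with made-up numbers; a real script would minify and gzip the input inside `measure`. The number of combinations doubles with every option added, so this only scales to a handful of flags.

```javascript
// Enumerate every on/off combination of the given option names and return
// the combination for which measure(options) is smallest.
function searchOptions(names, measure) {
    let best = null;
    for (let mask = 0; mask < (1 << names.length); mask++) {
        const options = {};
        names.forEach(function(name, i) {
            options[name] = Boolean(mask & (1 << i));
        });
        const size = measure(options);
        if (best === null || size < best.size) {
            best = { options: options, size: size };
        }
    }
    return best;
}

// Stand-in cost function with made-up numbers, for illustration only.
function fakeMeasure(options) {
    let size = 1000;
    if (options.hoist_funs) size += 5;
    if (options.loops) size += 2;
    if (options.unused) size -= 20;
    return size;
}

const best = searchOptions(["hoist_funs", "loops", "unused"], fakeMeasure);
console.log(best);
```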

@alexlamsl
Collaborator Author

> hoist_funs is not a gzip win - neutral to slight negative. But for non-gzip sizes it appears to be a win on average. jquery notably disables hoist_funs as it favors smaller gzip size.

They also disable loops and (surprisingly) unused - I wonder why.

@alexlamsl
Collaborator Author

At least with jquery-3.2.1.js, gzipped output is smaller with hoist_funs & loops disabled, but not for unused=false.

@kzc
Contributor

kzc commented Jul 8, 2017

This PR magically avoids those issues by only sorting within AST_Scope rather than globally.

We could reintroduce the mangle sort option defaulting to false, if for no other reason than that mangle sort proved to be risky historically - but fuzzing was not available back then.

> Would you mind reminding me why hoist_funs produces smaller uglified output?

I think it produces smaller non-gzip output because it generally assigns the most favorable short mangle variables to function parameters and local variables. If functions are hoisted the theory goes that parameters will be more likely to be assigned those favorable slots - especially with small functions with local variables. In non-gzip output, frequency is the most important factor. But in gzip output one must consider the effects of both relative order and frequency due to LZ building a sliding window.

> They also disable loops and (surprisingly) unused - I wonder why.

Disabling loops I understand - sometimes it wins and sometimes it loses. But disabling unused is a complete mystery to me.

@alexlamsl
Collaborator Author

loops seems to suffer from a gzip issue similar to that of hoist_funs:

Results with `-mc` (first set) and `-mc loops=0` (second set):
https://code.jquery.com/jquery-3.2.1.js
- parse: 0.250s
- scope: 0.156s
- compress: 1.015s
- mangle: 0.188s
- properties: 0.000s
- output: 0.094s
- total: 1.703s

Original: 268039 bytes
Uglified: 86834 bytes
GZipped:  30398 bytes
SHA1 sum: ce1f26ed06d363c5e18436b49626a3f0499d7488

https://cdnjs.cloudflare.com/ajax/libs/angular.js/1.6.4/angular.js
- parse: 0.484s
- scope: 0.328s
- compress: 1.688s
- mangle: 0.187s
- properties: 0.000s
- output: 0.141s
- total: 2.828s

Original: 1249863 bytes
Uglified: 174359 bytes
GZipped:  60441 bytes
SHA1 sum: 9f195dbd07c0e6e8938a9f877786efc5e4326cff

https://cdnjs.cloudflare.com/ajax/libs/mathjs/3.9.0/math.js
- parse: 0.828s
- scope: 0.469s
- compress: 3.578s
- mangle: 0.437s
- properties: 0.000s
- output: 0.328s
- total: 5.640s

Original: 1590107 bytes
Uglified: 468105 bytes
GZipped:  119157 bytes
SHA1 sum: 4cc6257feff454238dfca34a240845168a50099d

https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.js
- parse: 0.156s
- scope: 0.047s
- compress: 0.375s
- mangle: 0.047s
- properties: 0.000s
- output: 0.046s
- total: 0.671s

Original: 69707 bytes
Uglified: 36838 bytes
GZipped:  9683 bytes
SHA1 sum: 7799998e18bab3ef6303a25e2eea970aa70b2b0c

https://unpkg.com/react@15.3.2/dist/react.js
- parse: 0.453s
- scope: 0.234s
- compress: 1.641s
- mangle: 0.250s
- properties: 0.000s
- output: 0.125s
- total: 2.703s

Original: 701412 bytes
Uglified: 205920 bytes
GZipped:  62345 bytes
SHA1 sum: 813aca089612ed417e5b0f77bebf0f23cdbf9740

http://builds.emberjs.com/tags/v2.11.0/ember.prod.js
- parse: 0.859s
- scope: 0.516s
- compress: 2.625s
- mangle: 0.343s
- properties: 0.000s
- output: 0.282s
- total: 4.625s

Original: 1852178 bytes
Uglified: 527037 bytes
GZipped:  128876 bytes
SHA1 sum: b81bde14e6b36a05ac052f9e979f29a232bbbd43

https://cdn.jsdelivr.net/lodash/4.17.4/lodash.js
- parse: 0.281s
- scope: 0.171s
- compress: 1.719s
- mangle: 0.172s
- properties: 0.000s
- output: 0.141s
- total: 2.484s

Original: 539590 bytes
Uglified: 70429 bytes
GZipped:  24493 bytes
SHA1 sum: 9bbb3a60fe7a46c9ed54c5f0337cc3dcfffa6f7f

https://cdnjs.cloudflare.com/ajax/libs/d3/4.5.0/d3.js
- parse: 0.640s
- scope: 0.344s
- compress: 2.312s
- mangle: 0.250s
- properties: 0.000s
- output: 0.188s
- total: 3.734s

Original: 451131 bytes
Uglified: 212511 bytes
GZipped:  71302 bytes
SHA1 sum: 54c673fa340b5fed38c3f9c42b8494108af33c96

With `-mc loops=0`:
https://code.jquery.com/jquery-3.2.1.js
- parse: 0.234s
- scope: 0.173s
- compress: 1.093s
- mangle: 0.171s
- properties: 0.000s
- output: 0.125s
- total: 1.796s

Original: 268039 bytes
Uglified: 86916 bytes
GZipped:  30362 bytes
SHA1 sum: 21b61b91e6d67d7fc089d4f55c6e1aff6a9e10ba

https://cdnjs.cloudflare.com/ajax/libs/angular.js/1.6.4/angular.js
- parse: 0.500s
- scope: 0.297s
- compress: 1.687s
- mangle: 0.172s
- properties: 0.000s
- output: 0.125s
- total: 2.781s

Original: 1249863 bytes
Uglified: 174396 bytes
GZipped:  60431 bytes
SHA1 sum: 80261a11ddd3f5529931d3cabc61530bae7dbb92

https://cdnjs.cloudflare.com/ajax/libs/mathjs/3.9.0/math.js
- parse: 0.906s
- scope: 0.500s
- compress: 3.640s
- mangle: 0.438s
- properties: 0.000s
- output: 0.313s
- total: 5.797s

Original: 1590107 bytes
Uglified: 468234 bytes
GZipped:  119189 bytes
SHA1 sum: 4efb1b89cc7a6493ca6101c6c1f1d05c6ba236ab

https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.js
- parse: 0.109s
- scope: 0.046s
- compress: 0.313s
- mangle: 0.047s
- properties: 0.000s
- output: 0.031s
- total: 0.546s

Original: 69707 bytes
Uglified: 36838 bytes
GZipped:  9683 bytes
SHA1 sum: 7799998e18bab3ef6303a25e2eea970aa70b2b0c

https://unpkg.com/react@15.3.2/dist/react.js
- parse: 0.484s
- scope: 0.203s
- compress: 1.531s
- mangle: 0.157s
- properties: 0.000s
- output: 0.140s
- total: 2.515s

Original: 701412 bytes
Uglified: 205993 bytes
GZipped:  62329 bytes
SHA1 sum: f8389d82598b5e2686fd4db62b967db743bccb3a

http://builds.emberjs.com/tags/v2.11.0/ember.prod.js
- parse: 0.859s
- scope: 0.609s
- compress: 2.547s
- mangle: 0.406s
- properties: 0.000s
- output: 0.282s
- total: 4.703s

Original: 1852178 bytes
Uglified: 527143 bytes
GZipped:  128860 bytes
SHA1 sum: 3a85a58d8c5b64a891f0d56f14dd19456c357054

https://cdn.jsdelivr.net/lodash/4.17.4/lodash.js
- parse: 0.218s
- scope: 0.140s
- compress: 1.688s
- mangle: 0.110s
- properties: 0.000s
- output: 0.109s
- total: 2.265s

Original: 539590 bytes
Uglified: 70543 bytes
GZipped:  24415 bytes
SHA1 sum: 04d38faa401fed1bf247bf0dae9bf78005e58ca8

https://cdnjs.cloudflare.com/ajax/libs/d3/4.5.0/d3.js
- parse: 0.546s
- scope: 0.312s
- compress: 2.188s
- mangle: 0.235s
- properties: 0.000s
- output: 0.187s
- total: 3.468s

Original: 451131 bytes
Uglified: 212638 bytes
GZipped:  71258 bytes
SHA1 sum: 7688e0152afd34d520cd4a48b52e852becca7857

@alexlamsl
Collaborator Author

> We could reintroduce the mangle sort option defaulting to false, if for no other reason than that mangle sort proved to be risky historically - but fuzzing was not available back then.

Now that I did a refresher course on Lempel-Ziv stuff, I think sorting by frequency is most likely the wrong approach if we care about gzipped output. Nothing in frequency accounts for locality.

What might work is if we turn the order of mangle() the other way round, i.e. mangle innermost AST_Scopes first instead of the current top-down scanning.
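
To make the two visiting orders concrete, here is a toy sketch on a made-up scope tree; `topDown`, `bottomUp` and the tree itself are hypothetical and do not reflect how UglifyJS walks its AST. Top-down corresponds to a pre-order walk, innermost-first to a post-order walk.

```javascript
// Visit a scope before its nested scopes (current behaviour as described).
function topDown(scope, visit) {
    visit(scope);
    scope.children.forEach(function(child) {
        topDown(child, visit);
    });
}

// Visit nested scopes first, then the enclosing scope.
function bottomUp(scope, visit) {
    scope.children.forEach(function(child) {
        bottomUp(child, visit);
    });
    visit(scope);
}

// Toy scope tree; `children` are nested scopes.
const tree = {
    id: "toplevel",
    children: [
        { id: "outer", children: [{ id: "inner", children: [] }] },
        { id: "sibling", children: [] },
    ],
};

const pre = [];
topDown(tree, function(scope) { pre.push(scope.id); });
const post = [];
bottomUp(tree, function(scope) { post.push(scope.id); });

console.log(pre.join(" > "));  // toplevel > outer > inner > sibling
console.log(post.join(" > ")); // inner > outer > sibling > toplevel
```

With the innermost-first order, an enclosing scope would have to pick names that do not collide with whatever its nested scopes already chose for variables they reference from outside, which the sketch does not attempt.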

@kzc
Contributor

kzc commented Jul 8, 2017

loops appears to be just a slight win for non-gzip output on average.

@kzc
Contributor

kzc commented Jul 8, 2017

> What might work is if we turn the order of mangle() the other way round, i.e. mangle innermost AST_Scopes first instead of the current top-down scanning.

That's certainly worth a try.

Or maybe use a combined weighting of locality and frequency. The locality weighting for a symbol could "decay" the further you are away from that symbol's last use.
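
One possible reading of that suggestion, purely as a sketch: multiply a symbol's frequency by a factor that shrinks exponentially with the distance from its last use. The `weight` function, the `decay` parameter and the sample numbers are all assumptions made for illustration; nothing like this exists in the PR.

```javascript
// Hypothetical weighting: frequency scaled down by distance from the last use.
function weight(symbol, position, decay) {
    const distance = Math.max(0, position - symbol.lastUse);
    return symbol.frequency * Math.pow(decay, distance);
}

const frequentButFar = { frequency: 8, lastUse: 0 };
const rareButNear = { frequency: 2, lastUse: 9 };

const far = weight(frequentButFar, 10, 0.5);
const near = weight(rareButNear, 10, 0.5);
console.log(near > far); // true: the recently used symbol outranks the frequent one
```

A decay close to 1 makes this behave like plain frequency sorting, while a small decay makes it almost purely positional.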

@alexlamsl
Collaborator Author

Seems like while ➡️ for isn't much of a win gzip-wise. Using the following patch:

--- a/lib/compress.js
+++ b/lib/compress.js
@@ -2746,7 +2746,7 @@ merge(Compressor.prototype, {
                 if (!has_loop_control) return self.body;
             }
         }
-        if (self instanceof AST_While) {
+        if (false && self instanceof AST_While) {
             return make_node(AST_For, self, self).optimize(compressor);
         }
         return self;

Results on master (first set) and with the patch (second set):
https://code.jquery.com/jquery-3.2.1.js
- parse: 0.234s
- scope: 0.172s
- compress: 0.922s
- mangle: 0.109s
- properties: 0.000s
- output: 0.078s
- total: 1.515s

Original: 268039 bytes
Uglified: 86834 bytes
GZipped:  30398 bytes
SHA1 sum: ce1f26ed06d363c5e18436b49626a3f0499d7488

https://cdnjs.cloudflare.com/ajax/libs/angular.js/1.6.4/angular.js
- parse: 0.515s
- scope: 0.250s
- compress: 1.656s
- mangle: 0.188s
- properties: 0.000s
- output: 0.125s
- total: 2.734s

Original: 1249863 bytes
Uglified: 174359 bytes
GZipped:  60441 bytes
SHA1 sum: 9f195dbd07c0e6e8938a9f877786efc5e4326cff

https://cdnjs.cloudflare.com/ajax/libs/mathjs/3.9.0/math.js
- parse: 0.890s
- scope: 0.423s
- compress: 3.562s
- mangle: 0.359s
- properties: 0.000s
- output: 0.328s
- total: 5.562s

Original: 1590107 bytes
Uglified: 468105 bytes
GZipped:  119157 bytes
SHA1 sum: 4cc6257feff454238dfca34a240845168a50099d

https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.js
- parse: 0.125s
- scope: 0.077s
- compress: 0.360s
- mangle: 0.047s
- properties: 0.000s
- output: 0.047s
- total: 0.656s

Original: 69707 bytes
Uglified: 36838 bytes
GZipped:  9683 bytes
SHA1 sum: 7799998e18bab3ef6303a25e2eea970aa70b2b0c

https://unpkg.com/react@15.3.2/dist/react.js
- parse: 0.484s
- scope: 0.250s
- compress: 1.656s
- mangle: 0.156s
- properties: 0.000s
- output: 0.141s
- total: 2.687s

Original: 701412 bytes
Uglified: 205920 bytes
GZipped:  62345 bytes
SHA1 sum: 813aca089612ed417e5b0f77bebf0f23cdbf9740

http://builds.emberjs.com/tags/v2.11.0/ember.prod.js
- parse: 0.781s
- scope: 0.593s
- compress: 2.532s
- mangle: 0.359s
- properties: 0.000s
- output: 0.344s
- total: 4.609s

Original: 1852178 bytes
Uglified: 527037 bytes
GZipped:  128876 bytes
SHA1 sum: b81bde14e6b36a05ac052f9e979f29a232bbbd43

https://cdn.jsdelivr.net/lodash/4.17.4/lodash.js
- parse: 0.281s
- scope: 0.141s
- compress: 1.703s
- mangle: 0.093s
- properties: 0.000s
- output: 0.078s
- total: 2.296s

Original: 539590 bytes
Uglified: 70429 bytes
GZipped:  24493 bytes
SHA1 sum: 9bbb3a60fe7a46c9ed54c5f0337cc3dcfffa6f7f

https://cdnjs.cloudflare.com/ajax/libs/d3/4.5.0/d3.js
- parse: 0.578s
- scope: 0.312s
- compress: 2.281s
- mangle: 0.235s
- properties: 0.000s
- output: 0.187s
- total: 3.593s

Original: 451131 bytes
Uglified: 212511 bytes
GZipped:  71302 bytes
SHA1 sum: 54c673fa340b5fed38c3f9c42b8494108af33c96

With the patch:
https://code.jquery.com/jquery-3.2.1.js
- parse: 0.296s
- scope: 0.156s
- compress: 0.969s
- mangle: 0.094s
- properties: 0.000s
- output: 0.078s
- total: 1.593s

Original: 268039 bytes
Uglified: 86902 bytes
GZipped:  30369 bytes
SHA1 sum: edd9a97cec9369778578828a8eca48b995865843

https://cdnjs.cloudflare.com/ajax/libs/angular.js/1.6.4/angular.js
- parse: 0.531s
- scope: 0.234s
- compress: 1.813s
- mangle: 0.172s
- properties: 0.000s
- output: 0.125s
- total: 2.875s

Original: 1249863 bytes
Uglified: 174391 bytes
GZipped:  60431 bytes
SHA1 sum: 113137561ac8dea1765f6c4e2710c2815b2980b9

https://cdnjs.cloudflare.com/ajax/libs/mathjs/3.9.0/math.js
- parse: 0.875s
- scope: 0.499s
- compress: 3.547s
- mangle: 0.454s
- properties: 0.000s
- output: 0.328s
- total: 5.703s

Original: 1590107 bytes
Uglified: 468194 bytes
GZipped:  119182 bytes
SHA1 sum: 0c9c859857b2eb4ee92ad23f7e750cbd8ed0575a

https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.js
- parse: 0.125s
- scope: 0.061s
- compress: 0.360s
- mangle: 0.047s
- properties: 0.000s
- output: 0.032s
- total: 0.625s

Original: 69707 bytes
Uglified: 36838 bytes
GZipped:  9683 bytes
SHA1 sum: 7799998e18bab3ef6303a25e2eea970aa70b2b0c

https://unpkg.com/react@15.3.2/dist/react.js
- parse: 0.468s
- scope: 0.235s
- compress: 1.453s
- mangle: 0.234s
- properties: 0.000s
- output: 0.125s
- total: 2.515s

Original: 701412 bytes
Uglified: 205945 bytes
GZipped:  62322 bytes
SHA1 sum: 955eb64b85724f3899e94889bf1867b98f2c92c9

http://builds.emberjs.com/tags/v2.11.0/ember.prod.js
- parse: 0.796s
- scope: 0.564s
- compress: 2.765s
- mangle: 0.437s
- properties: 0.000s
- output: 0.297s
- total: 4.859s

Original: 1852178 bytes
Uglified: 527114 bytes
GZipped:  128862 bytes
SHA1 sum: 8727bf1cbd7f95a3efd2ed48f5dde1a6dd69e627

https://cdn.jsdelivr.net/lodash/4.17.4/lodash.js
- parse: 0.250s
- scope: 0.156s
- compress: 1.750s
- mangle: 0.125s
- properties: 0.000s
- output: 0.078s
- total: 2.359s

Original: 539590 bytes
Uglified: 70543 bytes
GZipped:  24415 bytes
SHA1 sum: 04d38faa401fed1bf247bf0dae9bf78005e58ca8

https://cdnjs.cloudflare.com/ajax/libs/d3/4.5.0/d3.js
- parse: 0.625s
- scope: 0.390s
- compress: 2.266s
- mangle: 0.250s
- properties: 0.000s
- output: 0.172s
- total: 3.703s

Original: 451131 bytes
Uglified: 212631 bytes
GZipped:  71256 bytes
SHA1 sum: 526561e41a1a7ca8be16ca34f09f522c8de2044c

@kzc
Contributor

kzc commented Jul 8, 2017

> Seems like while ➡️ for isn't much of a win gzip-wise. Using the following patch

It looks like just statistical noise either way.

@kzc
Contributor

kzc commented Jul 8, 2017

> Nothing in frequency accounts for locality.

It's not just locality that's important to LZ - relative order matters too. Notice how mangle on master produces functions with generally the same parameters in the same order. Such recurring strings of parameters with the same names and ordering can be assigned a shorter bit sequence.
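
A small experiment along those lines, assuming Node's built-in `zlib`: two inputs with identical raw length and the same three parameter names, one with a fixed parameter order and one with the order varied per function. The generated functions, `build` and `nextIndex` are made up for the demonstration and are far more regular than real minified code.

```javascript
const zlib = require("zlib");

// All six orderings of the same three parameter names (same byte length each).
const ORDERS = ["n,t,e", "n,e,t", "t,n,e", "t,e,n", "e,n,t", "e,t,n"];

// Small deterministic generator so that every run picks the same orderings.
let seed = 1;
function nextIndex(limit) {
    seed = (seed * 48271) % 2147483647;
    return seed % limit;
}

// Concatenate 300 tiny functions whose parameter list comes from pickOrder().
function build(pickOrder) {
    let code = "";
    for (let i = 0; i < 300; i++) {
        code += "function(" + pickOrder() + "){return n+t+e}";
    }
    return code;
}

const consistent = build(function() { return ORDERS[0]; });
const shuffled = build(function() { return ORDERS[nextIndex(ORDERS.length)]; });
const consistentGz = zlib.gzipSync(Buffer.from(consistent)).length;
const shuffledGz = zlib.gzipSync(Buffer.from(shuffled)).length;

console.log("raw:  " + consistent.length + " vs " + shuffled.length);
console.log("gzip: " + consistentGz + " vs " + shuffledGz);
```

The raw sizes are equal by construction, so any difference in the gzip line comes from ordering alone.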

@alexlamsl
Collaborator Author

Tried bottom-up instead of top-down mangling, but doesn't win all the time in test/benchmark.js, so I think it's time I declare a strategic retreat.

@alexlamsl alexlamsl closed this Jul 11, 2017
@alexlamsl alexlamsl deleted the mango branch July 11, 2017 17:47
@kzc
Contributor

kzc commented Jul 11, 2017

@alexlamsl That's unfortunate. It's tough to improve on the present mangle algorithm with respect to reducing gzip size - a few have tried over the years - including me. There are probably too many factors in play other than variable names to come up with a more optimal solution other than (grossly inefficient) brute force remangling.

@alexlamsl
Collaborator Author

@kzc what I discovered so far is that disturbing the sequence of mangled names as little as possible produces the smallest gzip size, even without frequency assignment. Obviously, however, the nature of cross-scope variable usage and assignment makes this incredibly difficult.
