Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ES6] UglifyJS.AST_Node.from_mozilla_ast(acorn ast) compatibility issues #968

Closed
rwaldron opened this issue Feb 11, 2016 · 29 comments · Fixed by #4870
Closed

[ES6] UglifyJS.AST_Node.from_mozilla_ast(acorn ast) compatibility issues #968

rwaldron opened this issue Feb 11, 2016 · 29 comments · Fixed by #4870

Comments

@rwaldron
Copy link

var acorn = require('acorn');
var uglify = require('uglify-js');
var source = 'x => x;';
var ast = uglify.AST_Node.from_mozilla_ast(
  acorn.parse(source, {
    ecmaVersion: 6,
  })
);

This will result in a thrown exception:

TypeError: MOZ_TO_ME[node.type] is not a function

So I added in some logging to see which node.type failed:

Node {
  type: 'ArrowFunctionExpression',
  start: 0,
  end: 6,
  id: null,
  generator: false,
  expression: true,
  params: [ Node { type: 'Identifier', start: 0, end: 1, name: 'x' } ],
  body: Node { type: 'Identifier', start: 5, end: 6, name: 'x' } }

From there I went to see if the Parser API was up to date, which it is, but the naming is different:

interface ArrowExpression <: Function, Expression {
    type: "ArrowExpression";
    params: [ Pattern ];
    defaults: [ Expression ];
    rest: Identifier | null;
    body: BlockStatement | Expression;
    generator: boolean;
    expression: boolean;
}
@kzc
Copy link
Contributor

kzc commented Feb 11, 2016

ArrowFunctionExpression
ArrowExpression

Either way, that's an ES6 construct. The release version of uglify-js only supports ES5. You could always run your code through babel and import it into the release version of uglify.

The uglify harmony branch may be what you're looking for, but I'm not sure about the completeness of its mozilla AST (a.k.a. ESTree) import:

https://github.com/mishoo/UglifyJS2/tree/harmony

@rwaldron
Copy link
Author

Either way, that's an ES6 construct.

Yes, I'm very familiar, that was the point of this bug report: there is no documentation—not even a mention in the readme that the example I gave won't work.

You could always run your code through babel and import it into the release version of uglify.

Yes, many people have suggested this, but it's not an acceptable solution. The code is intended to execute in a Node.js runtime which runs on a limited resource, remote SoC. There is benefit to "compressing" the source code size before deploying it to the remote board. There is no benefit to running that same code through babel, then uglify. Let me illustrate:

Given:

(source.js)

function List() {
  Array.from(arguments, (arg, i) => this[i] = arg);
  this.length = arguments.length;
}

var l = new List(1, 2, 3, 4);

First, through babel:
(post-babel.js)

"use strict";

function List() {
  var _this = this;

  Array.from(arguments, function (arg, i) {
    return _this[i] = arg;
  });
  this.length = arguments.length;
}

var l = new List(1, 2, 3, 4);

Then through uglify:
(post-babel-uglify.js)

"use strict"
function List(){var t=this
Array.from(arguments,function(n,r){return t[r]=n}),this.length=arguments.length}var l=new List(1,2,3,4)

It's bigger than it started out—but really only because of the inclusion of "use strict":

$ tree -h
.
├── [ 144]  post-babel-uglify.js
├── [ 198]  post-babel.js
└── [ 137]  source.js

So, if I remove the "use strict":

$ tree -h
.
├── [ 131]  post-babel-uglify.js
├── [ 183]  post-babel.js
└── [ 137]  source.js

Big "win" of 6 whole bytes.

To make my point further, if I manually applied the same minification to my original example:

function List(){Array.from(arguments,(n,i)=>this[i]=n)
this.length=arguments.length}var l=new List(1,2,3,4)

...I've shaved off 29 bytes:

$ tree -h
.
├── [ 131]  post-babel-uglify.js
├── [ 183]  post-babel.js
├── [ 108]  post-manual-minification.js
└── [ 137]  source.js

Of course, this is a trivial example—you can imagine the benefits being much greater for larger, more diverse and complex code bases. Imagine what will happen when rest parameter syntax is available:

function List(...items) {
  Array.from(items, (arg, i) => this[i] = arg);
  this.length = items.length;
}

var l = new List(1, 2, 3, 4);

Will balloon up to:

function List() {
  var _this = this;

  for (var _len = arguments.length, items = Array(_len), _key = 0; _key < _len; _key++) {
    items[_key] = arguments[_key];
  }

  Array.from(items, function (arg, i) {
    return _this[i] = arg;
  });
  this.length = items.length;
}

var l = new List(1, 2, 3, 4);

And uglify to:

function List(){var r,t,n,e=this
for(r=arguments.length,t=Array(r),n=0;r>n;n++)t[n]=arguments[n]
Array.from(t,function(r,t){return e[t]=r}),this.length=t.length}var l=new List(1,2,3,4)

And of course, make one manually for the sake of comparison:

function List(...e){Array.from(e,(n,i)=>this[i]=n)
this.length=e.length}var l=new List(1,2,3,4)

Check this size comparison:

$ tree -h
.
├── [ 185]  post-babel-uglify.js
├── [ 305]  post-babel.js
├── [  96]  post-manual-minification.js
└── [ 137]  source.js

A difference of eliminating 41 bytes, vs adding 48 bytes.

@kzc
Copy link
Contributor

kzc commented Feb 20, 2016

Not surprising.

You're always welcome to contribute a pull request to get the uglify harmony branch into shape. We're all volunteering our time here.

@mikesherov
Copy link

I assume this is because Acorn and Esprima now use the ESTree AST format which was agreed upon a while ago. When the es6 bits were standardized, slight variations from the then-unstable proposed SpiderMonkey notation for arrow functions were added. Fix seems simple enough.

@rwaldron
Copy link
Author

Thanks, but we're too busy keeping JSHint up to date to do the work for every other project.

Sorry, I was just frustrated and being a jerk. Please accept my apology

@kzc
Copy link
Contributor

kzc commented Feb 21, 2016

No worries.

Could you please prefix this Issue's title with [ES6] for future reference?

For what it's worth, the UglifyJS2 harmony branch (largely coded by @fabiosantoscode) does a comparable job to minimizing ES6 compared to your manual effort:

git clone https://github.com/mishoo/UglifyJS2.git
cd UglifyJS2
# checkout HEAD of harmony branch
git checkout 0b303379c0cdc33a8c14c97ab29148d981b4887e
npm install
cat << EOF | bin/uglifyjs -c -m
function List(...items) {
  Array.from(items, (arg, i) => this[i] = arg);
  this.length = items.length;
}
var l = new List(1, 2, 3, 4);
EOF

Above script produces the output:

function List(...t){Array.from(t,((t,n)=>this[n]=t)),this.length=t.length}var l=new List(1,2,3,4);

Are there any other javascript-based ES6-to-ES6 minifiers that do a decent job? It's not clear to me whether babel presently supports ES6-to-ES6 minifying.

@fabiosantoscode
Copy link
Contributor

Hello! I skipped the moz-ast to uglifyjs-ast translation in es6 features, except for some little things!

It's probably normal that its support is not super complete yet. A quick check reveals that the word "arrow" is not even in the lib/mozilla-ast file.

@avdg
Copy link
Contributor

avdg commented Dec 2, 2016

Related to #1117

@alexlamsl alexlamsl changed the title UglifyJS.AST_Node.from_mozilla_ast(acorn ast) compatibility issues [ES6] UglifyJS.AST_Node.from_mozilla_ast(acorn ast) compatibility issues Feb 24, 2017
@kzc
Copy link
Contributor

kzc commented May 23, 2017

@alexlamsl If you're looking for a challenge, supporting full ES6 ESTree conversion to/from Uglify AST would be a very useful addition. That way uglify could optionally use the acorn parser, and interoperate with the ES2015 ESTree ecosystem of tools.

The describe_ast() differences between 3.x and uglify-es:

@@ -10 +10 @@
-                AST_Lambda (name argnames uses_arguments) "Base class for functions" {
+                AST_Lambda (name argnames uses_arguments is_generator) "Base class for functions" {
@@ -12,0 +13 @@
+                    AST_Arrow "An ES6 Arrow function ((a) => b)"
@@ -14,0 +16,4 @@
+                AST_Class (name extends properties) "An ES6 class" {
+                    AST_DefClass "A class definition"
+                    AST_ClassExpression "A class expression."
+                }
@@ -34 +39,3 @@
-                AST_ForIn (init name object) "A `for ... in` statement"
+                AST_ForIn (init name object) "A `for ... in` statement" {
+                    AST_ForOf "A `for ... of` statement"
+                }
@@ -49 +56 @@
-        AST_Definitions (definitions) "Base class for `var` nodes (variable declarations/initializations)" {
+        AST_Definitions (definitions) "Base class for `var` or `const` nodes (variable declarations/initializations)" {
@@ -50,0 +58,2 @@
+            AST_Let "A `let` statement"
+            AST_Const "A `const` statement"
@@ -51,0 +61 @@
+        AST_Export (exported_definition exported_value is_default exported_names module_name) "An `export` statement"
@@ -52,0 +63,8 @@
+    AST_Expansion (expression) "An expandible argument, such as ...rest, a splat, such as [1,2,...all], or an expansion in a variable declaration, such as var [first, ...rest] = list"
+    AST_ArrowParametersOrSeq (expressions) 'A set of arrow function parameters or a sequence expression. This is used because when the parser sees a "(" it could be the start of a seq, or the start of a parameter list of an arrow function.'
+    AST_Destructuring (names is_array) "A destructuring of several names. Used in destructuring assignment and with destructuring function argument names"
+    AST_PrefixedTemplateString (template_string prefix) "A templatestring with a prefix, such as String.raw`foobarbaz`"
+    AST_TemplateString (segments) "A template string literal"
+    AST_TemplateSegment (value raw) "A segment of a template string literal"
+    AST_NameImport (foreign_name name) "The part of the import statement that imports names from a module."
+    AST_Import (imported_name imported_names module_name) "An `import` statement"
@@ -67,0 +86 @@
+        AST_DefaultAssign "A default assignment expression like in `(a = 3) => a`"
@@ -74,2 +93,3 @@
-        AST_ObjectSetter "An object setter property"
-        AST_ObjectGetter "An object getter property"
+        AST_ObjectSetter (quote static) "An object setter property"
+        AST_ObjectGetter (quote static) "An object getter property"
+        AST_ConciseMethod (quote static is_generator) "An ES6 concise method inside an object or class"
@@ -78,2 +98 @@
-        AST_SymbolAccessor "The name of a property accessor (setter/getter function)"
-        AST_SymbolDeclaration (init) "A declaration symbol (symbol in var, function name or argument, symbol in catch)" {
+        AST_SymbolDeclaration (init) "A declaration symbol (symbol in var/const, function name or argument, symbol in catch)" {
@@ -82,0 +102,7 @@
+            AST_SymbolBlockDeclaration "Base class for block-scoped declaration symbols" {
+                AST_SymbolConst "A constant declaration"
+                AST_SymbolLet "A block-scoped `let` declaration"
+                AST_SymbolDefClass "Symbol naming a class's name in a class declaration. Lexically scoped to its containing scope, and accessible within the class."
+                AST_SymbolCatch "Symbol naming the exception in catch"
+                AST_SymbolImport "Symbol refering to an imported name"
+            }
@@ -85 +111 @@
-            AST_SymbolCatch "Symbol naming the exception in catch"
+            AST_SymbolClass "Symbol naming a class's name. Lexically scoped to the class."
@@ -86,0 +113,2 @@
+        AST_SymbolMethod "Symbol in an object defining a method"
+        AST_SymbolImportForeign "A symbol imported from a module, but it is defined in the other module, and its real name is irrelevant for this module's purposes"
@@ -90,0 +119 @@
+        AST_Super "The `super` symbol"
@@ -91,0 +121 @@
+    AST_NewTarget "A reference to new.target"
@@ -107,0 +138 @@
+    AST_Yield (expression is_star) "A `yield` statement"

@alexlamsl
Copy link
Collaborator

@kzc I guess the exercise involves making up some code snippets that would generate these AST_Node types, then feed it through acorn to try and work out what the corresponding new ESTree class names etc.

The bigger problem is making sure the code snippets cover all corner cases of these node types.

@kzc
Copy link
Contributor

kzc commented May 23, 2017

The bigger problem is making sure the code snippets cover all corner cases of these node types.

There's no shortage of ES6 code bases on github. :-)

@kzc
Copy link
Contributor

kzc commented May 23, 2017

At least one of those new AST classes looks like an implementation detail bleeding through: AST_ArrowParametersOrSeq. Some of the harmony classes may need some refactoring.

@alexlamsl
Copy link
Collaborator

There's no shortage of ES6 code bases on github.

At least in terms of prioritising which bits to fix up first, that would come in handy.

Some of the harmony classes may need some refactoring.

I have so far kept a safe distance away from lib/ast.js on harmony to preserve sanity. But I guess I'll have to go through it eventually... 🙈

@kzc
Copy link
Contributor

kzc commented May 23, 2017

I have so far kept a safe distance away from lib/ast.js on harmony to preserve sanity. But I guess I'll have to go through it eventually...

Think of it this way - you get a chance to learn ES6! It's the future - so I'm told.

@alexlamsl
Copy link
Collaborator

I was hoping Skynet to have activated by now. 🤖

@kzc
Copy link
Contributor

kzc commented May 23, 2017

Like my father used to say, and his father before him: "The best way to learn a computer language is to minify it."

@alexlamsl
Copy link
Collaborator

UGLIFY-COBOL must have been fun.

@fabiosantoscode
Copy link
Contributor

I'm really sorry about AST_ArrowParametersOrSeq! I also hated it.

It was necessary because you don't know whether you're parsing an arrow function or a seq (parentheses with commas) until the parentheses close. I think we could get around it by changing the toleniser to peek and count parens until they close and peek to check for a => token

@kzc
Copy link
Contributor

kzc commented May 23, 2017

@fabiosantoscode No worries. Thanks for the tip and giving us a working ES6 implementation from which to build on!

@kzc
Copy link
Contributor

kzc commented May 23, 2017

@alexlamsl Could you post a JS input that will exercise this line in the harmony parser?

https://github.com/mishoo/UglifyJS2/blob/075b648bb1d50d5f2b4341ffb88c4da20a4b4e77/lib/parse.js#L2715

I think it's not reachable. npm test runs fine with this patch:

--- a/lib/parse.js
+++ b/lib/parse.js
@@ -2710,13 +2710,7 @@ function parse($TEXT, options) {
             commas = true;
         }
         if (exprs.length == 1) {
-            var expr = exprs[0];
-            if (!(expr instanceof AST_SymbolRef) || !is("arrow", "=>")) return expr;
-            return arrow_function(new AST_ArrowParametersOrSeq({
-                start: expr.start,
-                end: expr.end,
-                expressions: [expr]
-            }));
+            return exprs[0];
         }
         return new AST_Sequence({
             start       : start,

@fabiosantoscode
Copy link
Contributor

fabiosantoscode commented May 23, 2017 via email

@kzc
Copy link
Contributor

kzc commented May 23, 2017

x(x=>x)

No luck...

--- a/lib/parse.js
+++ b/lib/parse.js
@@ -2711,7 +2711,9 @@ function parse($TEXT, options) {
         }
         if (exprs.length == 1) {
             var expr = exprs[0];
+            console.error("*** before if");
             if (!(expr instanceof AST_SymbolRef) || !is("arrow", "=>")) return expr;
+            console.error("*** after if");
             return arrow_function(new AST_ArrowParametersOrSeq({
                 start: expr.start,
                 end: expr.end,
$ echo 'x(x=>x)' | bin/uglifyjs
*** before if
*** before if
*** before if
x(x=>x);

@alexlamsl
Copy link
Collaborator

I think it's not reachable. npm test runs fine with this patch:

I've one-up yours and yeah, nothing crashes:

             if (!(expr instanceof AST_SymbolRef) || !is("arrow", "=>")) return expr;
+croak("RAWR");
             return arrow_function(new AST_ArrowParametersOrSeq({

If you look at maybe_assign() which is the function filling in all the exprs elements, it already handles arrows:

https://github.com/mishoo/UglifyJS2/blob/075b648bb1d50d5f2b4341ffb88c4da20a4b4e77/lib/parse.js#L2707
https://github.com/mishoo/UglifyJS2/blob/075b648bb1d50d5f2b4341ffb88c4da20a4b4e77/lib/parse.js#L2674-L2682

So I agree this code is no longer reachable on latest harmony.

@kzc
Copy link
Contributor

kzc commented May 23, 2017

So I agree this code is no longer reachable on latest harmony.

Should just replace it with the implementation on master. Might improve the speed of uglify-es parse somewhat as that code is executed on every expression.

@alexlamsl
Copy link
Collaborator

Should just replace it with the implementation on master.

Except that it doesn't handle spread in that case, so the specialised logic is still required.

PR coming up in a moment.

@kzc
Copy link
Contributor

kzc commented Jun 27, 2017

The code is intended to execute in a Node.js runtime which runs on a limited resource, remote SoC. There is benefit to "compressing" the source code size before deploying it to the remote board.

@rwaldron I stumbled across the project you were likely referring to and see it's now using uglify-es:

tessel/t2-cli#1247

These additional uglify.minify() options might shave a few more bytes off the deployed code size:

    compress: {
        ecma: 6,
        passes: 3,
        pure_getters: true,
        // other existing options
    },
    output: {
        ecma: 6,
        // other existing options
    },

@rwaldron
Copy link
Author

@kzc oh nice! I will definitely pursue this suggestion :)

@fabiosantoscode
Copy link
Contributor

I'm currently having a look at this issue. A file that I made combining the ES6 features showcased in test/compress/harmony.js is parsed, converted to mozilla AST, and back, and then compared that to the originally parsed tree in a mocha test, which is passing. I'm pretty close.

@fabiosantoscode
Copy link
Contributor

This has been fixed as of my repository https://github.com/fabiosantoscode/uglify-es

alexlamsl added a commit to alexlamsl/UglifyJS that referenced this issue Apr 25, 2021
alexlamsl added a commit to alexlamsl/UglifyJS that referenced this issue Apr 25, 2021
alexlamsl added a commit to alexlamsl/UglifyJS that referenced this issue Apr 25, 2021
alexlamsl added a commit to alexlamsl/UglifyJS that referenced this issue Apr 25, 2021
alexlamsl added a commit that referenced this issue Apr 25, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants