Be able to support building JS 1.8/ES.next code #72

Closed
jrburke opened this Issue Jan 5, 2012 · 15 comments

Projects

None yet

5 participants

@jrburke
Member
jrburke commented Jan 5, 2012

Right now uglifyjs is used to get an AST to walk looking for define calls and dependencies to generate the build. This will likely fail on JS 1.8 code.

Look at swapping out the AST generator to something else if uglifyjs will not work. I've plaid with narcissus lexer/parser in the past, look at it again. Just need enough of a an AST to get the define/require calls, so it can be tolerant/skip other things.

This is related to this ticket enabling runtime loading of JS 1.8 code.

@asutherland

I assume the path forward for this now is for parse.js to use the esprima parser like transform.js does. When you are back, do you think you'll have some time to look at/do this, or should @mozsquib (the fellow who got all fancy and went and used 'let' :) or I pursue this in case you are expecting to be busy?

@jrburke
Member
jrburke commented Jul 18, 2012

I marked this with the Next label which means: "look at it for the next release, but do not block if there are critical bug fixes that should get out asap". However, your use cases are probably important for me to address, so I'll push this up in my queue.

I likely will not get to it until next week though given some vacation and travel this week, but happy to entertain a pull request in the meantime. Give a holler if someone picks it up though, so that I do not duplicate efforts. Some pointers, or at least what I was thinking of doing:

It basically means converting build/jslib/parse.js to not use uglifyjs anymore and just use esprima instead. There are some unit tests that can be run by typing the following in the build/tests directory:

./alln.sh

It assumes though that requirejs and requriejs/text are checked out as siblings to the r.js directory. If you wanted to just get a first pass done without needing to check out all those files, then comment out the 'tests/builds' line in build/tests/all.js and then just run the tests via node ../../r.js all.js

That test may still assume you need to build r.js to run the tests, do that by running this in the top level of this repo:

node dist.js

For the esprima work in parse.js, I was thinking of just using the tokenizing stream as done in the hm plugin instead of going with the AST in the hopes that it applied less contextual knowledge about JS and therefore may allow more stuff to be parsed. However, that may just be silly thinking on my part, and it may lead to wordier code.

It also may mean finding the right branch of esprima for this to work, which means grabbing that esprima.js and wrapping it in a define call as specifically done in this repo.

Or wait until next week. :)

@jrburke
Member
jrburke commented Jul 23, 2012

I did a pass at this, the r.js snapshot is available in this poorly named branch:

https://raw.github.com/jrburke/r.js/bug27-parse18/dist/r.js

This version uses esprima for all parse operations, except of course optimize: 'uglify', so do not optimize code with uglify, need a JS 1.8-aware minifier for that (not sure if there is one).

Unfortunately, I do not think it will work as-is with JS 1.8 syntax since esprima is still working on supporting destructuring:

http://code.google.com/p/esprima/issues/detail?id=241

But I think I was able to work around that issue by using onBuildRead and onBuildWrite processors in the build config for the build. Example:

https://github.com/jrburke/r.js/blob/bug27-parse18/build/tests/lib/js18/build.js

The regexps used there may not be complete/robust, so I appreciate any code samples that fail so I can update the regexps while we wait on the esprima issue for destructuring is resolved.

Please give it a try, the more we can shake it down now, the more likely I can get this in for 2.0.5.

@jimporter

It seems to work for "let", but unfortunately for me, I'm using a lot of other JS 1.8 language features. I might be able to work around them with some more regexps...

@jrburke
Member
jrburke commented Jul 25, 2012

@mozsquib if you can list out what you use, then I'll see what can be done to support them. esprima will likely support harmony stuff over time, but if there is some odd js 1.8-only things then we may need to roll our own workarounds for them.

@jimporter

Here's what I'm using that is causing problems:

  • yield
  • abbreviated syntax for functions like "function() foo"
  • array destructuring, including this hard-to-regex-for line: "[this.namespaceName, this.localTagName] = pieces;"
  • EDIT: and array comprehensions
@asutherland

For my own edification, I wanted to double check that the things we are using are in ES.next. For the current draft, these things are:

  • yield: 13.4 "Generator Definitions"
  • destructuring assignment: 11.13.1 "Destructuring Assignment". Arrays are under "ArrayAssignmentPattern"
  • array comprehensions: 11.1.4.2 "Array Comprehension"

It appears certain things are not in ES.next:

  • "function() foo". I don't see this. 13.1 "Function Definitions" provides the syntax for FunctionDeclaration and FunctionExpression, and neither of these seems to suggest optional {}'s. Arrow functions (13.2) have a ConciseBody form, however, but their syntax is totes different, yo.

So I guess fix your uses of "function() foo" and see if that makes it happier at all?

@asutherland

Er, and that is from the "July 8, 2012" "Current working draft" on http://wiki.ecmascript.org/doku.php?id=harmony:specification_drafts

@jrburke
Member
jrburke commented Jul 25, 2012

I just posted an update to the onBuild* to handle yield:

https://github.com/jrburke/r.js/blob/bug27-parse18/build/tests/lib/js18/build.js

and I'm working on the array stuff now.

@jimporter

I managed to get this working with regexes:

  onBuildRead: function (id, path, contents) {
    // Remove destructuring, minimal functions, and yields so parsing works.
    // Do this until esprima has a fix for this, among other things:
    // http://code.google.com/p/esprima/issues/detail?id=241

    var destructRe  = /((?:var|let|const)\s*)(\{[^\}]+\})(\s*(?:=|in\b))/g;
    var destructRe2 = /((?:var|let|const)\s*)(\[[^\]]+\])(\s*(?:=|in\b))/g;
    var destructRe3 = /\n(\s*)(\[[^\]]+\])(\s*(?:=|in\b))/g;
    var getterRe    = /get\s+\w+\s*\(\).*,/g;
    var minifuncRe  = /(function\s*\(.*?\))(?=\s*\w)/g;
    var yieldRe     = /yield/g;
    return contents.replace(destructRe,  '$1/*DESTRUCT$2DESTRUCT*/_$3')
                   .replace(destructRe2, '$1/*DESTRUCT$2DESTRUCT*/_$3')
                   .replace(destructRe3, '$1/*DESTRUCT$2DESTRUCT*/_$3')
                   .replace(getterRe,    '/*GETTER$&GETTER*/')
                   .replace(minifuncRe,  '/*MINIFUNC$1MINIFUNC*/')
                   .replace(yieldRe,     'return YIELD+');
  },
  onBuildWrite: function (id, path, contents) {
    // Restore destructuring, minimal functions, and yields.

    var restructRe = /\/\*DESTRUCT(.*?)DESTRUCT\*\/_/g;
    var getterRe   = /\/\*GETTER(.*?)GETTER\*\//g;
    var minifuncRe = /\/\*MINIFUNC(.*?)MINIFUNC\*\//g;
    var yieldRe    = /return YIELD\+/g;
    return contents.replace(restructRe, '$1')
                   .replace(getterRe,   '$1')
                   .replace(minifuncRe, '$1')
                   .replace(yieldRe,    'yield');
  }

I handle yield the way I do because we have "yield" in comments in various places, and surrounding yield with /**/ breaks the comments.

@jrburke
Member
jrburke commented Jul 26, 2012

@mozsquib thanks, I updated the test build.js to include your regexps, done in commit d5ec8da

@jrburke jrburke closed this in 1dafb63 Aug 3, 2012
@ariya
ariya commented Aug 9, 2012

According to http://wiki.ecmascript.org/doku.php?id=harmony:generators, yield works when the function is a generator.

Try this with Esprima's online parser (harmony version), http://esprima.googlecode.com/git-history/harmony/demo/parse.html:

function* foo() { yield true; }

Obviously, I can try to have the tolerant mode accepts the use of yield even when it's not legally allowed.

@ariya
ariya commented Aug 9, 2012

Esprima can understand array destructuring assignment:

[c, d] = [d, c];

Note that it does not work yet with variable declaration (var/let).

@jrburke
Member
jrburke commented Aug 9, 2012

@ariya right now I'm just using master esprima, had some trouble when I tried the harmony branch at the time I was wrapping up this ticket, but I believe it is better now. One weirdness, not directly related to this ticket though, was the double module token bug I filed and has already been fixed. I was also going to wait on

http://code.google.com/p/esprima/issues/detail?id=241

and figured that would probably land in the harmony branch when it was ready.

Thanks for a great parsing tool!

@darky
darky commented Dec 25, 2014

Maybe it's time to try harmony branch of esprima now?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment