Bundle arithmetic now supports parens. #535

arackaf · 2016-03-26T16:52:37Z

I really, really wanted to support this without re-writing your tokenizing code, but there was just no way. A regex ninja may have been able to pull it off, but even then it would have been literally impossible to support nested parens. So the code now manually walks the string, pulling out what it needs. All your prior code to account for canonicalization and single-modules I left as is.

The tests I previously reported are still failing, but it does look like it's a Windows-only issue; they were failing from the moment I cloned the repo.

Tests updated. Docs updated.

…entheses-support-in-bundle-arithmetic

guybedford · 2016-03-27T14:43:01Z

Wow, this is really amazing. Thank you so much for working on this.

The main feedback I have is to handle syntax errors. For example getNextOperator should verify that the given character is a valid operator, and getNextIdentifier should verify that the given identifier is not an operator character (as was provided by https://github.com/systemjs/builder/pull/535/files#diff-2e0496fccad838c52136f5d29f3e85e2L27). The other thing that opens up here is to throw a syntax error when there are unmatching parenthesis too? Or at least confirming what happens currently makes sense there? +1 to the whitespace handling :)

The rest of the feedback would all just be code style and naming stuff (eg since it's no longer operator / module name, changing the name to operator / expression everywhere), which I know can be a pain, but let me know if you're happy for me to provide that sort of feedback? Otherwise I can always do a manual merge with my own mostly stylistic edits (keeping your commit and attribution in-tact) if you're ok with that too.

…entheses-support-in-bundle-arithmetic

arackaf · 2016-03-27T17:09:58Z

Ah - good feedback. I've got your master merged in locally (ostensibly I should now be able to npm run test so thanks for that.) I'll add some better error handling - question - do you have tests for this already? I don't see anything for bundling something like bundle.trace('a +') and catching an expected error.

Do you have them which I'm missing, or should I add?

And WebStorm handles local variable renames quite simply, so that's no big deal at all.

I should have that done tonight.

guybedford · 2016-03-27T17:28:10Z

No there aren't any tests for the syntax error cases currently, if you'd like to add these that is welcome.

guybedford · 2016-03-27T17:28:34Z

And just let me know if / when you want me to go ahead with the style feedback?

arackaf · 2016-03-27T18:11:37Z

OH - yeah go ahead and shoot with the style feedback. I tried my best to match your styles, so sorry about that if I didn't quite.

arackaf · 2016-03-27T22:00:03Z

@guybedford - If you don't like the wording on these, now's the time to tell me :)

arackaf · 2016-03-27T22:08:58Z

Ok I renamed that variable, and also added some needed validation. (The validation messages do NOT normally show in the test console, unless you manually enable them).

Validation checks verify against missing operators, ie a + b c, missing identifiers, ie a + b + + c and unclosed parens.

Let me know if you think there's anything else needed.

guybedford · 2016-03-28T06:46:58Z

lib/arithmetic.js

+      expandPromise = expandPromise.then(function() {
+        return Promise.resolve(expandedTreePromise)
+               .then(function(fullTree){
+                 expandedOperations = expandedOperations.concat({ preExpandedTree: fullTree, operator: operation.operator })


Can this still be called bulkOperation instead of preExpandedTree or am I missing something here?

Oh right I see what you are doing here, running the actual arithmetic during this phase recursively.

To avoid conflating steps I think I'd rather take one of two approaches here:

Instead of a single traceExpression separate it out into traceExpression and traceBulkExpression, where traceBulkExpression handles this sort of logic instead.

During this expansion step, recursively expand globs only, leaving the arithmetic syntax in-tact.

I'd be happy with either method here, let me know if that makes sense.

To clarify the reason being here that each "step" of the algorithm should have a clear purpose. To do globbing and paren expansion together is an arbitrary conflation just to take advantage of the same loop, something which can rather be split up into two loops (corresponding to either (1) or (2)).

guybedford · 2016-03-28T07:02:12Z

One other thing here - previously expressions were defined as being separated by the string matching /\s[\+\-\&]\s/. Since we now allow arbitrary spaces as the terminator I think this will break the ability to support spaces in file names.

That is, currently we can do:

jspm bundle 'My App/x.js + My App/y.js'

With this PR it would no longer be possible to bundle paths with spaces in the names.

So unfortunately that seems like a tokenizing adjustment may be necessary to use a regex like the above. We could still still adjust this regex to be "space-eating" like /\s+[\+\-&]\s+/.

guybedford · 2016-03-28T07:03:13Z

And thanks for being open to all this feedback. If you're getting sick of it let me know and I'm happy to jump in too.

arackaf · 2016-03-28T14:03:13Z

The problem with splitting with a regex isn't just supporting more than one space: that was the last thing I cared about; it was just a bonus that I wound up being able to support multiple spaces. The problem is that it's hard to get a regex to split in a way that leaves parenthetical expressions alone, and of course it's impossible to have a nested expression that's parsed by a regex, ie (a + b + (c & d)) [citation] :-) http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454

Even if we were to ditch nested expressions, which I'd hate to do, you wind up with this sort of unholy mess

var expressionString = 'x + (a + b + (c & d))'
console.log(expressionString.split(/(\s+\(?=.*\)\s+|[\+\-\&])/));

which still doesn't even work - it's aggressively splitting up the inner expression. And again, even if we could tweak the lookahead to work, it still would never / could never support nested parens.

So can we support filenames with spaces by just quoting them? Ie builder.trace('"My file 1.js" + file2.js');

I could separate out traceExpression with traceExpressionBulk, but they would share a non-trivial amount of code, with the latter being recursive in the same way. For example, consider

bundle.trace('(a + b + (c & d))')

traceExpressionBulk would get a + b + (c & d) and would wind up calling itself to get the results of c & d.

It would be something vaguely like what's below, though that's literally untested and almost certainly wrong. That would leave me duplicated code to handle single operations, like a + b which should probably be re-factored to yet another utility method.

Are you sure we can't leave this in the single, recursive traceExpression method? Parsing things like (a + b + (c & d)) is inherently recursive. Recursively calling the same traceExpression to handle nested expressions seems like a fairly natural solution.

javascript
exports.traceExpression = function traceExpression(builder, expression, traceOpts) {
if (!expression)
throw new Error('A module expression must be provided to trace.');

var operations = parseExpression(expression);
var expandedOperations = [];

// expand any globbing operations in the expression
var expandPromise = Promise.resolve();
operations.forEach(function(operation) {
if (operation.bulkOperation) {
traceExpressionBulk(builder, operation.bulkOperation, traceOpts,expandedOperations);
} else if (operation.operator) {
expandPromise = expandPromise.then(function() {
return Promise.resolve(expandGlobAndCanonicalize(builder, operation))
.then(function (expanded) {
expandedOperations = expandedOperations.concat(expanded);
});
});
}
});

return expandPromise.then(function() {
// chain the operations, applying them with the trace of the next module
return expandedOperations.reduce(function(p, op) {
return p.then(function(curTree) {
// tree . module
if (op.singleModule)
return getTreeModuleOperation(builder, op.operator)(curTree, op.moduleName);

    if (op.preExpandedTree){
      return getTreeOperation(op.operator)(curTree, op.preExpandedTree);
    }
    // tree . tree
    return builder.tracer.traceCanonical(op.moduleName, traceOpts)
    .then(function(nextTrace) {
      return getTreeOperation(op.operator)(curTree, nextTrace.tree);
    });
  });
}, Promise.resolve({}));

});
};

exports.traceExpressionBulk = function traceExpressionBulk(builder, expression, traceOpts, expandedOperations) {
var operations = parseExpression(expression);

// expand any globbing operations in the expression
var expandPromise = Promise.resolve();
operations.forEach(function(operation) {
if (operation.bulkOperation) {
var expandedTreePromise = traceExpressionBulk(builder, operation.bulkOperation, traceOpts)
expandPromise = expandPromise.then(function() {
return Promise.resolve(expandedTreePromise)
.then(function(fullTree){
expandedOperations = expandedOperations.concat({ preExpandedTree: fullTree, operator: operation.operator })
});
});
} else if (operation.operator) {
expandPromise = expandPromise.then(function() {
return Promise.resolve(expandGlobAndCanonicalize(builder, operation))
.then(function (expanded) {
expandedOperations = expandedOperations.concat(expanded);
});
});
}
});
};

arackaf · 2016-03-28T17:49:00Z

Thinking about this more at the gym .... I think the tl;dr is that your builder now supports child expressions. So parseExpression now gets recursively called to handle each such child expression.

I don't think this is a terribly unnatural implementation, but I'd be happy to see alternatives if you really think they're necessary.

Also, for the filenames with the spaces - would your canonicalization stuff handle that? I haven't dug into that part of the code to see what it does, and it's not documented, so I'm not sure if

builder.bundle("a.js + `my file.js` + b.js"

would work.

guybedford · 2016-03-28T18:56:45Z

I don't mean splitting with a regex from the beginning, rather just updating the readOperator step to read using the /\s+[\+\-\&]\s+/ regex rather. that way spaces in module names can continue to be supported.

For the other thing. The problem is not so much that parseExpression is recursive, which is completely right, rather just about the "expand and canonicalize" step being separate to the "evaluate" step. That is because expand and canonicalize is a pre-processing of the expression, while calculating the expression value is a later step conceptually. For example, in future we may make expression simplifications at a symbolic level, allowing us to cut out unnecessary execution. That way we could allow (A + B) - A to not need to trace A at all. If we mix expression evaluation with the pre-compilation, it will make this sort of stuff harder, that is all. So it's just a conceptual separation, which is exactly a duplication of the loop.

arackaf · 2016-03-28T23:41:15Z

@guybedford it took me an embarrassing amount of time to figure out what you meant, but I finally got it (I think). And I gained a much better understanding of the code in the process.

Filenames with spaces are still not done yet, but please take a look at my last commit: I think it achieves what you were wanting with the expression parsing/globbing/canonicalization all happening prior to the actual trees being computed and intersected / added / subtracted.

Thanks for calling this out - the code is much more nicely organized.

arackaf · 2016-03-29T05:10:03Z

Ok - last commit supports files with spaces. Hopefully this is good to go now.

guybedford · 2016-03-29T12:28:08Z

Thanks for your great work on this!

arackaf · 2016-03-29T13:44:26Z

@guybedford my pleasure. Can you drop me a quick line, here or elsewhere, when this is up on npm?

guybedford · 2016-03-30T12:13:03Z

@arackaf sure this has been released in 0.15.14... jspm release to follow now.

arackaf added 2 commits March 26, 2016 11:44

Bundle arithmetic now supports parens.

5df5874

Tests updated. Docs updated.

Merge branch 'master' of https://github.com/systemjs/builder into par…

1c14260

…entheses-support-in-bundle-arithmetic

Merge branch 'master' of https://github.com/systemjs/builder into par…

d73acd7

…entheses-support-in-bundle-arithmetic

Validation for bundle arithmetic

909e6a4

guybedford reviewed Mar 28, 2016
View reviewed changes

parsing and canonicalizing the expression prior to evaluating the trees.

30bd17f

support for files with spaces in them

48bc668

guybedford merged commit b5cf432 into systemjs:master Mar 29, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bundle arithmetic now supports parens. #535

Bundle arithmetic now supports parens. #535

arackaf commented Mar 26, 2016

guybedford commented Mar 27, 2016

arackaf commented Mar 27, 2016

guybedford commented Mar 27, 2016

guybedford commented Mar 27, 2016

arackaf commented Mar 27, 2016

arackaf commented Mar 27, 2016

arackaf commented Mar 27, 2016

guybedford Mar 28, 2016

guybedford Mar 28, 2016

guybedford Mar 28, 2016

guybedford commented Mar 28, 2016

guybedford commented Mar 28, 2016

arackaf commented Mar 28, 2016

arackaf commented Mar 28, 2016

guybedford commented Mar 28, 2016

arackaf commented Mar 28, 2016

arackaf commented Mar 29, 2016

guybedford commented Mar 29, 2016

arackaf commented Mar 29, 2016

guybedford commented Mar 30, 2016

Bundle arithmetic now supports parens. #535

Bundle arithmetic now supports parens. #535

Conversation

arackaf commented Mar 26, 2016

guybedford commented Mar 27, 2016

arackaf commented Mar 27, 2016

guybedford commented Mar 27, 2016

guybedford commented Mar 27, 2016

arackaf commented Mar 27, 2016

arackaf commented Mar 27, 2016

arackaf commented Mar 27, 2016

guybedford Mar 28, 2016

Choose a reason for hiding this comment

guybedford Mar 28, 2016

Choose a reason for hiding this comment

guybedford Mar 28, 2016

Choose a reason for hiding this comment

guybedford commented Mar 28, 2016

guybedford commented Mar 28, 2016

arackaf commented Mar 28, 2016

arackaf commented Mar 28, 2016

guybedford commented Mar 28, 2016

arackaf commented Mar 28, 2016

arackaf commented Mar 29, 2016

guybedford commented Mar 29, 2016

arackaf commented Mar 29, 2016

guybedford commented Mar 30, 2016