Update commentRegExp for better sugared CommonJS syntax dependency parsing #1582

Merged
merged 1 commit into from Sep 4, 2016

Projects

None yet

2 participants

@jrburke
Member
jrburke commented Sep 4, 2016

This is a combo fix for the issues described here:

It uses some of the same techniques described in those tickets, but combined and with tests and passing code validation requirements. The main changes to the regexp:

  • Remove () matching groups that were not used in processing the match, to make it faster: ~500,000 ops/sec vs the old one at ~410,000 ops/sec, using the benchmark mentioned in #1538 when run in node 6.2.1 on my personal laptop. The commentReplace function was updated to remove parameters that corresponded to the match groups removed.
  • Include '"= in the "do not match if starts with those characters" part of the regexp. This allows matching better for protocol relative URLs, and for having an HTML attribute preceding the require('') call on the same line.

This does mean though that it will now falsely match on this sort of construct:

var something = 'something'// require('bad');

However, this sort of regexp based approach will always have edge cases, and for the case above, it seems unlikely to happen in the wild:

  • There would likely be a line return if the "don't use trailing semicolons" style is used. If this above case is hit, the workaround is just to place a line return after the string.
  • This form is more likely to show up in a file that has been minified, but in that case the comment should have been stripped.

So the benefits on catching more true positives seem worth this tradeoff.

Note that this regexp is only used for dynamic loading when there is no dependency array passed in the define call. If the r.js optimizer is used, it finds dependencies via an AST so it would not have any false positives or negatives associated with this regexp, and is always an option if a workaround is needed. The r.js optimizer also writes out a dependency array in the define calls, so it avoids this regexp work at runtime when an optimized build is used.

However, since this is a long-standing regexp in the system, I will rev the release version with this change to 2.3.0 to indicate a feature change. alameda will also be updated to match.

@jrburke jrburke commentRegExp faster and matches more
4c84ff2
@jquerybot jquerybot added the CLA: Valid label Sep 4, 2016
@jrburke jrburke added this to the 2.2.1 milestone Sep 4, 2016
@jrburke
Member
jrburke commented Sep 4, 2016

Using 2.2.1 Milestone for now, but will convert all the 2.2.1 bugs to 2.3.0 before wrapping up the release.

@jrburke jrburke merged commit eb4f1ff into master Sep 4, 2016

3 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details
jQuery Foundation CLA All authors have signed the CLA
Details
@jrburke jrburke deleted the comment-regexp branch Sep 4, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment