Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update commentRegExp for better sugared CommonJS syntax dependency parsing #1582

Merged
merged 1 commit into from Sep 4, 2016

Conversation

@jrburke
Copy link
Member

@jrburke jrburke commented Sep 4, 2016

This is a combo fix for the issues described here:

It uses some of the same techniques described in those tickets, but combined and with tests and passing code validation requirements. The main changes to the regexp:

  • Remove () matching groups that were not used in processing the match, to make it faster: ~500,000 ops/sec vs the old one at ~410,000 ops/sec, using the benchmark mentioned in #1538 when run in node 6.2.1 on my personal laptop. The commentReplace function was updated to remove parameters that corresponded to the match groups removed.
  • Include '"= in the "do not match if starts with those characters" part of the regexp. This allows matching better for protocol relative URLs, and for having an HTML attribute preceding the require('') call on the same line.

This does mean though that it will now falsely match on this sort of construct:

var something = 'something'// require('bad');

However, this sort of regexp based approach will always have edge cases, and for the case above, it seems unlikely to happen in the wild:

  • There would likely be a line return if the "don't use trailing semicolons" style is used. If this above case is hit, the workaround is just to place a line return after the string.
  • This form is more likely to show up in a file that has been minified, but in that case the comment should have been stripped.

So the benefits on catching more true positives seem worth this tradeoff.

Note that this regexp is only used for dynamic loading when there is no dependency array passed in the define call. If the r.js optimizer is used, it finds dependencies via an AST so it would not have any false positives or negatives associated with this regexp, and is always an option if a workaround is needed. The r.js optimizer also writes out a dependency array in the define calls, so it avoids this regexp work at runtime when an optimized build is used.

However, since this is a long-standing regexp in the system, I will rev the release version with this change to 2.3.0 to indicate a feature change. alameda will also be updated to match.

@jrburke jrburke added this to the 2.2.1 milestone Sep 4, 2016
@jrburke
Copy link
Member Author

@jrburke jrburke commented Sep 4, 2016

Using 2.2.1 Milestone for now, but will convert all the 2.2.1 bugs to 2.3.0 before wrapping up the release.

Loading

@jrburke jrburke merged commit eb4f1ff into master Sep 4, 2016
3 checks passed
Loading
@jrburke jrburke deleted the comment-regexp branch Sep 4, 2016
@jrburke jrburke mentioned this pull request Sep 4, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

2 participants