Improve traverse and parse performance #900

Merged
merged 1 commit into from Mar 9, 2016

Projects

None yet

2 participants

@petersondrew

Make use of Object.keys rather than for..in + hasOwnProperty, as the
latter cannot be optimized by Node and results in deoptimized function
calls in the hot path.

Before:

   ticks  total  nonlib   name
    882    4.0%    4.0%  LazyCompile: ~traverse r.js:26797:22
    248    1.1%    1.1%  Stub: CEntryStub
    207    0.9%    0.9%  LazyCompile: *hasOwnProperty native v8natives.js:111:30
    194    0.9%    0.9%  LazyCompile: *ToName native runtime.js:466:16
    153    0.7%    0.7%  LazyCompile: ~parse.recurse r.js:26980:30

After:

   ticks  total  nonlib   name
     56    0.5%    0.5%  LazyCompile: *traverse r.js:26797:22

This has a significant impact on cpu time and wall clock performance. Test suites pass as far as I can tell, however the Rhino tests suites hung up on the sourcemap tests (they did on master as well). Not sure if it's required for this change but I have submitted a CLA just in case.

@petersondrew petersondrew Improve traverse and parse performance
Make use of Object.keys rather than for..in + hasOwnProperty, as the
latter cannot be optimized by Node and results in deoptimized function
calls in the hot path.
991d88a
@jrburke jrburke added this to the 2.1.23 milestone Mar 9, 2016
@jrburke jrburke merged commit f736a69 into requirejs:master Mar 9, 2016
@jrburke
Member
jrburke commented Mar 9, 2016

This is fantastic, thank you! Tested locally and updated the snapshot in dist/r.js with the changes. Noticeable speedup for the tests, looks like it is close to twice as fast to run the node tests with the change.

CLA is much appreciated, thank you. If you filled out the Dojo one, the Dojo and jQuery foundations have merged and I'm mid-process for switching over the CLA links. While it should be fine if you used the Dojo form, it is probably best if you also do the jQuery one: https://contribute.jquery.org/CLA/ which applies to other projects like jQuery, lodash, so gives a lot of contribution possibilities for the future.

Thank you so much for the analysis and the fix, a great improvement.

@petersondrew

You're very welcome! I submitted a jQuery CLA as well to cover the bases.

@petersondrew petersondrew deleted the petersondrew:parse-perf branch Mar 10, 2016
@jrburke jrburke modified the milestone: 2.1.23, 2.2.0 Mar 15, 2016
@jrburke
Member
jrburke commented Mar 17, 2016

@petersondrew I am interested to know more about how you went about identifying the bottleneck. I am still new to those kind of tools for node. No worries though if you do not have time, just idle curiosity. I expect if I just did some more google research I would find the answer.

@petersondrew

Sure, no problem 😄
I noticed the requirejs step in my build script seemed to be taking an inordinate amount of time, especially considering uglify was disabled. To profile the build script I used the --prof flag:
node.exe --prof .\node_modules\grunt-cli\bin\grunt
That creates a tick file in the same directory, named something like isolate-0x*-v8.log. This tick file can then be processed by node to produce output similar to what I posted above using the following:
node --prof-process isolate.log > profile.txt
That file will break down the number of ticks spent in different parts of the code.

If you notice in the first snippet I posted, most of the ticks were spent in LazyCompile: ~traverse. The tilde before the function name is significant here, it means that the function was not optimized. This means it was interpreted by the v8 engine on each execution using the non-optimizing compiler (node really has 2 compilers).

Now, it's not necessarily a given that an unoptimized function will kill your performance, even in the hot path, however this particular function created a double whammy. If we dig in further, we can see that (possibly due to the recursion) node continually optimizes and de-optimizes this function over and over when it finds that its assumptions were incorrect:
node --trace-deopt --trace-opt .\node_modules\grunt-cli\bin\grunt > trace.txt
That trace file will contain a log file showing each function optimization and de-optimization, along with the reason. This reason can sometimes provide enough insight to allow you to modify it in a way that v8 can optimize it successfully. See this excellent wiki for some good examples.

Let me know if you have any other questions.

@jrburke
Member
jrburke commented Mar 17, 2016

@petersondrew, this is awesome, thank you very much! Very educational.

@samreid samreid referenced this pull request in phetsims/chipper Dec 20, 2016
Closed

Update requirejs for a speed boost in builds #532

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment