Improve traverse and parse performance #900

Merged
merged 1 commit into requirejs:master from petersondrew:parse-perf on Mar 9, 2016

Conversation

2 participants
@petersondrew

Make use of Object.keys rather than for..in + hasOwnProperty, as the
latter cannot be optimized by Node and results in deoptimized function
calls in the hot path.
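
For illustration, a minimal sketch of the before/after loop shape (hypothetical traverse functions, not the actual r.js source):

// Before: for..in plus hasOwnProperty. On the V8 versions Node shipped with
// at the time, this pattern could keep the whole function from being optimized.
function traverseForIn(object, visitor) {
    var key, child;
    for (key in object) {
        if (object.hasOwnProperty(key)) {
            child = object[key];
            visitor(key, child);
            if (child && typeof child === 'object') {
                traverseForIn(child, visitor);
            }
        }
    }
}

// After: Object.keys returns a plain array of own enumerable keys, so the
// hasOwnProperty guard is unnecessary and the loop is a simple indexed walk.
function traverseKeys(object, visitor) {
    var i, keys, child;
    keys = Object.keys(object);
    for (i = 0; i < keys.length; i += 1) {
        child = object[keys[i]];
        visitor(keys[i], child);
        if (child && typeof child === 'object') {
            traverseKeys(child, visitor);
        }
    }
}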

Before:

   ticks  total  nonlib   name
    882    4.0%    4.0%  LazyCompile: ~traverse r.js:26797:22
    248    1.1%    1.1%  Stub: CEntryStub
    207    0.9%    0.9%  LazyCompile: *hasOwnProperty native v8natives.js:111:30
    194    0.9%    0.9%  LazyCompile: *ToName native runtime.js:466:16
    153    0.7%    0.7%  LazyCompile: ~parse.recurse r.js:26980:30

After:

   ticks  total  nonlib   name
     56    0.5%    0.5%  LazyCompile: *traverse r.js:26797:22

This has a significant impact on CPU time and wall-clock performance. The test suites pass as far as I can tell; however, the Rhino test suites hung on the sourcemap tests (they did on master as well). Not sure if it's required for this change, but I have submitted a CLA just in case.


jrburke added this to the 2.1.23 milestone Mar 9, 2016

jrburke added a commit that referenced this pull request Mar 9, 2016

Merge pull request #900 from petersondrew/parse-perf
Improve traverse and parse performance

jrburke merged commit f736a69 into requirejs:master Mar 9, 2016

jrburke Mar 9, 2016

Member

This is fantastic, thank you! Tested locally and updated the snapshot in dist/r.js with the changes. Noticeable speedup for the tests; it looks like it is close to twice as fast to run the Node tests with the change.

CLA is much appreciated, thank you. If you filled out the Dojo one: the Dojo and jQuery foundations have merged, and I'm mid-process switching over the CLA links. While it should be fine if you used the Dojo form, it is probably best if you also do the jQuery one (https://contribute.jquery.org/CLA/), which applies to other projects like jQuery and lodash, so it opens up a lot of contribution possibilities for the future.

Thank you so much for the analysis and the fix, a great improvement.


petersondrew Mar 9, 2016

You're very welcome! I submitted a jQuery CLA as well to cover the bases.


petersondrew deleted the petersondrew:parse-perf branch Mar 10, 2016

jrburke modified the milestones: 2.1.23, 2.2.0 Mar 15, 2016

jrburke Mar 17, 2016

Member

@petersondrew I am interested to know more about how you went about identifying the bottleneck. I am still new to those kinds of tools for Node. No worries though if you do not have time; it's just idle curiosity. I expect that with some more Google research I would find the answer.


petersondrew Mar 17, 2016

Sure, no problem 😄
I noticed the requirejs step in my build script seemed to be taking an inordinate amount of time, especially considering uglify was disabled. To profile the build script I used the --prof flag:
node.exe --prof .\node_modules\grunt-cli\bin\grunt
That creates a tick file in the same directory, named something like isolate-0x*-v8.log. This tick file can then be processed by node to produce output similar to what I posted above using the following:
node --prof-process isolate.log > profile.txt
That file will break down the number of ticks spent in different parts of the code.

If you look at the first snippet I posted, most of the ticks were spent in LazyCompile: ~traverse. The tilde before the function name is significant here: it means that the function was not optimized, so it ran through V8's non-optimizing compiler on each execution (Node's V8 really has two compilers).

Now, an unoptimized function won't necessarily kill your performance, even in the hot path; however, this particular function created a double whammy. If we dig in further, we can see that (possibly due to the recursion) Node repeatedly optimizes and then de-optimizes this function when it finds that its assumptions were incorrect:
node --trace-deopt --trace-opt .\node_modules\grunt-cli\bin\grunt > trace.txt
That trace file will log each function optimization and de-optimization, along with the reason. The reason can sometimes provide enough insight to let you modify the function in a way that V8 can optimize successfully. See this excellent wiki for some good examples.
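
As a concrete illustration of that workflow, V8's --allow-natives-syntax flag exposes intrinsics such as %OptimizeFunctionOnNextCall and %GetOptimizationStatus that let you ask whether a specific function ended up optimized. A minimal sketch (hypothetical file name check-traverse.js; the numeric codes apply to the V8 builds Node shipped with around that time):

// check-traverse.js
// Run with: node --allow-natives-syntax check-traverse.js
function traverse(object, visitor) {
    Object.keys(object).forEach(function (key) {
        visitor(key, object[key]);
    });
}

var sample = { a: 1, b: 2, c: 3 };

// Warm the function up so V8 gathers type feedback, then request
// optimization on the next call.
traverse(sample, function () {});
traverse(sample, function () {});
%OptimizeFunctionOnNextCall(traverse);
traverse(sample, function () {});

// On the V8 builds of that era, a status of 1 meant "optimized" and 2 meant
// "not optimized"; newer V8 versions return a bit field instead.
console.log(%GetOptimizationStatus(traverse));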

Let me know if you have any other questions.


jrburke Mar 17, 2016

Member

@petersondrew, this is awesome, thank you very much! Very educational.

