New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve generator performance #3283

Merged
merged 6 commits into from Jan 22, 2016

Conversation

Projects
None yet
9 participants
@gzzhanghao
Contributor

gzzhanghao commented Jan 19, 2016

This PR aims to fix the performance issue for babel-generator, see T6884.

Here is the result from the same app in the discussion.

===== Origin =====
babylon 265
babel generator 2238
acorn 107
escodegen 355
esprima 95
escodegen 322
===== Optimized =====
babylon 296
babel generator 662
acorn 113
escodegen 355
esprima 106
escodegen 317

@gzzhanghao gzzhanghao changed the title from Improve generator performance T6884 to Improve generator performance Jan 19, 2016

@codecov-io

This comment has been minimized.

Show comment
Hide comment
@codecov-io

codecov-io Jan 19, 2016

Current coverage is 85.12%

Merging #3283 into master will decrease coverage by -0.18% as of e9fae4c

@@            master   #3283   diff @@
======================================
  Files          215     215       
  Stmts        15751   15703    -48
  Branches      3373    3360    -13
  Methods          0       0       
======================================
- Hit          13436   13367    -69
- Partial        679     694    +15
- Missed        1636    1642     +6

Review entire Coverage Diff as of e9fae4c

Powered by Codecov. Updated on successful CI builds.

codecov-io commented Jan 19, 2016

Current coverage is 85.12%

Merging #3283 into master will decrease coverage by -0.18% as of e9fae4c

@@            master   #3283   diff @@
======================================
  Files          215     215       
  Stmts        15751   15703    -48
  Branches      3373    3360    -13
  Methods          0       0       
======================================
- Hit          13436   13367    -69
- Partial        679     694    +15
- Missed        1636    1642     +6

Review entire Coverage Diff as of e9fae4c

Powered by Codecov. Updated on successful CI builds.

Show outdated Hide outdated packages/babel-generator/src/whitespace.js
* Find a token between start and end.
*/
_findToken(test, start, end) {

This comment has been minimized.

@amasad

amasad Jan 20, 2016

Member

can you add type annotations?

@amasad

amasad Jan 20, 2016

Member

can you add type annotations?

@@ -14,6 +14,7 @@ export default class Buffer {
this._indent = format.indent.base;
this.format = format;
this.buf = "";
this.last = "";

This comment has been minimized.

@amasad

amasad Jan 20, 2016

Member

what's the value for maintaining this instance variable? Can't we just simply do this.buf[this.buf.length -1]?

@amasad

amasad Jan 20, 2016

Member

what's the value for maintaining this instance variable? Can't we just simply do this.buf[this.buf.length -1]?

This comment has been minimized.

@gzzhanghao

gzzhanghao Jan 20, 2016

Contributor

There will be a performance issue if we do that.

v8 stores strings created from concatenation as a ConsString. And when we access any character of it, v8 has to perform a flatten operation, that will cost a bunch of time in our case.

Here is a simple node app for testing such a feature.

var code = require('fs').readFileSync('jquery.js', 'utf-8');

var step = 200;
var pieces = [];
for (var i = 0, ii = code.length; i < ii; i += step) {
  pieces.push(code.substring(i, step));
}

var buf = '';
var ts = Date.now();
for (var i = pieces.length - 1; i >= 0; i--) {
  buf += pieces[i];
}
console.log('concat', Date.now() - ts);

ts = Date.now();
buf[0]; // flatten here
console.log('read first', Date.now() - ts);

ts = Date.now();
buf[buf.length - 1];
console.log('read last', Date.now() - ts);

And here is the my result.

concat 8
read first 100
read last 0
@gzzhanghao

gzzhanghao Jan 20, 2016

Contributor

There will be a performance issue if we do that.

v8 stores strings created from concatenation as a ConsString. And when we access any character of it, v8 has to perform a flatten operation, that will cost a bunch of time in our case.

Here is a simple node app for testing such a feature.

var code = require('fs').readFileSync('jquery.js', 'utf-8');

var step = 200;
var pieces = [];
for (var i = 0, ii = code.length; i < ii; i += step) {
  pieces.push(code.substring(i, step));
}

var buf = '';
var ts = Date.now();
for (var i = pieces.length - 1; i >= 0; i--) {
  buf += pieces[i];
}
console.log('concat', Date.now() - ts);

ts = Date.now();
buf[0]; // flatten here
console.log('read first', Date.now() - ts);

ts = Date.now();
buf[buf.length - 1];
console.log('read last', Date.now() - ts);

And here is the my result.

concat 8
read first 100
read last 0

This comment has been minimized.

@amasad

amasad Jan 20, 2016

Member

interesting, thanks for explaining. I'm curious, in our case, how big is the difference?

@amasad

amasad Jan 20, 2016

Member

interesting, thanks for explaining. I'm curious, in our case, how big is the difference?

This comment has been minimized.

@gzzhanghao

gzzhanghao Jan 20, 2016

Contributor

@amasad Result from the same node app.

===== origin/master =====
babylon 250
babel generator 2205
acorn 103
escodegen 345
esprima 92
escodegen 327
===== 76e3c52 (with this.last) =====
babylon 251
babel generator 1043
acorn 104
escodegen 351
esprima 91
escodegen 317
@gzzhanghao

gzzhanghao Jan 20, 2016

Contributor

@amasad Result from the same node app.

===== origin/master =====
babylon 250
babel generator 2205
acorn 103
escodegen 345
esprima 92
escodegen 327
===== 76e3c52 (with this.last) =====
babylon 251
babel generator 1043
acorn 104
escodegen 351
esprima 91
escodegen 317

This comment has been minimized.

@amasad

amasad Jan 20, 2016

Member

oh wow, half! 👍

@amasad

amasad Jan 20, 2016

Member

oh wow, half! 👍

This comment has been minimized.

@jamiebuilds

jamiebuilds Jan 22, 2016

Member

Can we add a code comment here in case someone ever has this question in the future? It would suck if someone removed this optimization unwittingly.

@jamiebuilds

jamiebuilds Jan 22, 2016

Member

Can we add a code comment here in case someone ever has this question in the future? It would suck if someone removed this optimization unwittingly.

This comment has been minimized.

@stereokai

stereokai Jan 22, 2016

@gzzhanghao That's very interesting information. May I ask where/how did you learn that? I'd like to learn things like that as well :)

@stereokai

stereokai Jan 22, 2016

@gzzhanghao That's very interesting information. May I ask where/how did you learn that? I'd like to learn things like that as well :)

This comment has been minimized.

@hzoo

hzoo Jan 22, 2016

Member

Yeah adding a comment would be great!

@hzoo

hzoo Jan 22, 2016

Member

Yeah adding a comment would be great!

This comment has been minimized.

@gzzhanghao

gzzhanghao Jan 23, 2016

Contributor

@stereokai reading the source code may help ;)

@gzzhanghao

gzzhanghao Jan 23, 2016

Contributor

@stereokai reading the source code may help ;)

This comment has been minimized.

@stereokai

stereokai Jan 27, 2016

@gzzhanghao Ain't nobody got time fo' that! ;)
Cheers

@stereokai

stereokai Jan 27, 2016

@gzzhanghao Ain't nobody got time fo' that! ;)
Cheers

Show outdated Hide outdated packages/babel-generator/src/whitespace.js
}
let index = this._findToken(token => token.start - node.start, 0, tokens.length);
if (typeof index === "number") {
while (index && node.start === tokens[index - 1].start) --index;

This comment has been minimized.

@amasad

amasad Jan 20, 2016

Member

For correctness and clarity can we make this index > 1? same below

@amasad

amasad Jan 20, 2016

Member

For correctness and clarity can we make this index > 1? same below

This comment has been minimized.

@gzzhanghao

gzzhanghao Jan 20, 2016

Contributor

@amasad It will fail if we have a comment at the beginning of the file.

// test
// test

But we can use index >= 0 and let _findToken returns -1 when token not found, like [].indexOf.

@gzzhanghao

gzzhanghao Jan 20, 2016

Contributor

@amasad It will fail if we have a comment at the beginning of the file.

// test
// test

But we can use index >= 0 and let _findToken returns -1 when token not found, like [].indexOf.

This comment has been minimized.

@amasad

amasad Jan 20, 2016

Member

ah right because startToken is optional in getNewlinesBetween? Can we change it to index >= 0? If _findToken can return -1 then this condition will be truthy for -1 which seems wrong.

@amasad

amasad Jan 20, 2016

Member

ah right because startToken is optional in getNewlinesBetween? Can we change it to index >= 0? If _findToken can return -1 then this condition will be truthy for -1 which seems wrong.

@amasad

This comment has been minimized.

Show comment
Hide comment
@amasad

amasad Jan 20, 2016

Member

This is awesome! I'm excited for this. Mostly looks good just a couple of suggestions/questions

Member

amasad commented Jan 20, 2016

This is awesome! I'm excited for this. Mostly looks good just a couple of suggestions/questions

@amasad

This comment has been minimized.

Show comment
Hide comment
@amasad

amasad Jan 20, 2016

Member

ok, will merge tomorrow. Do you have any more ideas for optimizations? (this is awesome)

Member

amasad commented Jan 20, 2016

ok, will merge tomorrow. Do you have any more ideas for optimizations? (this is awesome)

@gzzhanghao

This comment has been minimized.

Show comment
Hide comment
@gzzhanghao

gzzhanghao Jan 20, 2016

Contributor

@amasad not now, but I'll keep finding performance issues in the code base : )

Contributor

gzzhanghao commented Jan 20, 2016

@amasad not now, but I'll keep finding performance issues in the code base : )

@hzoo

This comment has been minimized.

Show comment
Hide comment
@hzoo

hzoo Jan 20, 2016

Member

Maybe we should add some perf metrics whenever we updated some of the packages like babylon/babel-generator like in https://phabricator.babeljs.io/T6884#70050?

Unless we want to bring in a larger file, the largest built file we have is babel-polyfill/dist/polyfill

Member

hzoo commented Jan 20, 2016

Maybe we should add some perf metrics whenever we updated some of the packages like babylon/babel-generator like in https://phabricator.babeljs.io/T6884#70050?

Unless we want to bring in a larger file, the largest built file we have is babel-polyfill/dist/polyfill

@gzzhanghao

This comment has been minimized.

Show comment
Hide comment
@gzzhanghao

gzzhanghao Jan 20, 2016

Contributor

@hzoo agree, we can add some identity test case and measure the running time for each package. like escodegen's identity test

Contributor

gzzhanghao commented Jan 20, 2016

@hzoo agree, we can add some identity test case and measure the running time for each package. like escodegen's identity test

@kittens

This comment has been minimized.

Show comment
Hide comment
@kittens

kittens Jan 22, 2016

Member

Thank you @gzzhanghao, you're amazing!

Member

kittens commented Jan 22, 2016

Thank you @gzzhanghao, you're amazing!

@hzoo

This comment has been minimized.

Show comment
Hide comment
@hzoo

hzoo Jan 22, 2016

Member

Would it be possible to run a perf.js test on a PR (would be cool to get a comment on the PR like code coverage)? Or we could do it in between releases too.

We can test the parser on different files/libraries, and then the generation?

Member

hzoo commented Jan 22, 2016

Would it be possible to run a perf.js test on a PR (would be cool to get a comment on the PR like code coverage)? Or we could do it in between releases too.

We can test the parser on different files/libraries, and then the generation?

@robcolburn

This comment has been minimized.

Show comment
Hide comment
@robcolburn

robcolburn Jan 22, 2016

Contributor

Since we are using array-style access on a string, do we need worry about multi-byte characters, ex: Chinese, Emoji.

I'm not super familiar with Babel's code-base, so that may already be broken, or not something we need to concern ourselves with. A test would be great, if we don't have one yet.

Contributor

robcolburn commented Jan 22, 2016

Since we are using array-style access on a string, do we need worry about multi-byte characters, ex: Chinese, Emoji.

I'm not super familiar with Babel's code-base, so that may already be broken, or not something we need to concern ourselves with. A test would be great, if we don't have one yet.

@amasad

This comment has been minimized.

Show comment
Hide comment
@amasad

amasad Jan 22, 2016

Member

Rob: good point, can you please follow up with a task?
@hzoo: great idea! Build it :)

Member

amasad commented Jan 22, 2016

Rob: good point, can you please follow up with a task?
@hzoo: great idea! Build it :)

@hzoo

This comment has been minimized.

Show comment
Hide comment
@hzoo

hzoo Jan 22, 2016

Member

@robcolburn if we don't have a test for it already we'd appreciate a PR for some!

Member

hzoo commented Jan 22, 2016

@robcolburn if we don't have a test for it already we'd appreciate a PR for some!

@robcolburn

This comment has been minimized.

Show comment
Hide comment
@robcolburn

robcolburn Jan 22, 2016

Contributor

@hzoo I didn't see a test for it, so I PR'd a few to @gzzhanghao branch - gzzhanghao#1

Contributor

robcolburn commented Jan 22, 2016

@hzoo I didn't see a test for it, so I PR'd a few to @gzzhanghao branch - gzzhanghao#1

@amasad

This comment has been minimized.

Show comment
Hide comment
@amasad

amasad Jan 22, 2016

Member

@robcolburn Thanks, going to merge this, so can you PR to master instead?

Member

amasad commented Jan 22, 2016

@robcolburn Thanks, going to merge this, so can you PR to master instead?

amasad added a commit that referenced this pull request Jan 22, 2016

Merge pull request #3283 from gzzhanghao/gen-perf
Improve generator performance

@amasad amasad merged commit 7d719d9 into babel:master Jan 22, 2016

2 checks passed

ci/circleci Your tests passed on CircleCI!
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@robcolburn

This comment has been minimized.

Show comment
Hide comment
@robcolburn
Contributor

robcolburn commented Jan 22, 2016

@amasad done #3299

@hzoo hzoo added the area: perf label Jan 23, 2016

@gzzhanghao gzzhanghao deleted the gzzhanghao:gen-perf branch Mar 6, 2016

@babel-bot babel-bot referenced this pull request Sep 7, 2016

Closed

Babel Roadmap (T7154) #4130

10 of 19 tasks complete
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment