Make the code generator write-only to avoid exponential time generation #3565

Merged: 12 commits merged on Jul 7, 2016

loganfsmyth
Member

There's a bit of refactoring in here to get this all to work, but mostly it's the last two commits that are the meat of it:

  1. Factor out all of the code responsible for mutating the output buffer into a discrete set of actions.
  2. Re-implement those actions with a temporary queue so that queued changes can be reversed without requiring any reads from the output string itself.
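
The two steps above can be pictured with a minimal sketch. This is not Babel's actual buffer class: `WriteOnlyBuffer` and its method names are hypothetical, and queued entries are assumed to be single whitespace characters. Pending whitespace lives in a small queue that can be popped, and the last committed character is tracked in a field, so nothing ever reads from the output string until the final `get()`.

```javascript
// Hypothetical sketch of a write-only buffer; names are illustrative only.
class WriteOnlyBuffer {
  constructor() {
    this._out = "";   // committed output: appended to, only read in get()
    this._queue = []; // pending single-character whitespace, still reversible
    this._last = "";  // last committed character, tracked on every write
  }

  // Queue one whitespace character; it is committed only if content follows.
  queue(ch) {
    this._queue.push(ch);
  }

  // Commit queued whitespace, then append real content.
  append(str) {
    this._flush();
    this._write(str);
  }

  // Undo queued trailing spaces without touching the output string.
  removeTrailingSpaces() {
    while (this._queue.length > 0 && this._queue[this._queue.length - 1] === " ") {
      this._queue.pop();
    }
  }

  // Answer "what does the output end with?" from the queue or the tracked field.
  endsWith(ch) {
    const last = this._queue.length > 0
      ? this._queue[this._queue.length - 1]
      : this._last;
    return last === ch;
  }

  // Commit anything still queued and return the finished output.
  get() {
    this._flush();
    return this._out;
  }

  _flush() {
    for (const ch of this._queue) this._write(ch);
    this._queue.length = 0;
  }

  _write(str) {
    if (str.length === 0) return;
    this._out += str;
    this._last = str[str.length - 1];
  }
}

const buf = new WriteOnlyBuffer();
buf.append("var a");
buf.queue(" ");
buf.removeTrailingSpaces(); // the queued space is dropped; the output is untouched
buf.append(";");
console.log(buf.get()); // prints "var a;"
```

The point of the design is that every reversible operation works on the queue, whose size stays small no matter how large the output has grown.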

Our old logic constantly referenced the output value, meaning every reference got more expensive as the output grew larger, leading to an exponential increase in codegen times (in the measurements below, the time roughly quadruples each time the input doubles).

For example, using the script from istanbuljs/babel-plugin-istanbul#5 (comment), which generates an AST array that doubles in size on each run, generation time changes as shown:

Items: the size of the array being generated
Time: the time in ms to generate the code
Length: the number of characters in the output code

Items: 2, time: 9, length: 239
Items: 4, time: 2, length: 465
Items: 8, time: 6, length: 917
Items: 16, time: 6, length: 1840
Items: 32, time: 15, length: 3696
Items: 64, time: 25, length: 7408
Items: 128, time: 93, length: 14917
Items: 256, time: 380, length: 30149
Items: 512, time: 1399, length: 60613
Items: 1024, time: 5301, length: 121614
Items: 2048, time: 20676, length: 246542

to

Items: 2, time: 7, length: 239
Items: 4, time: 5, length: 465
Items: 8, time: 5, length: 917
Items: 16, time: 6, length: 1840
Items: 32, time: 11, length: 3696
Items: 64, time: 3, length: 7408
Items: 128, time: 13, length: 14917
Items: 256, time: 18, length: 30149
Items: 512, time: 45, length: 60613
Items: 1024, time: 63, length: 121614
Items: 2048, time: 117, length: 246542
Items: 4096, time: 266, length: 496398
Items: 8192, time: 460, length: 996110
Items: 16384, time: 980, length: 2014687
Items: 32768, time: 2008, length: 4062687
Items: 65536, time: 3819, length: 8158687
Items: 131072, time: 7359, length: 16443904
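
For readers who want to reproduce this kind of measurement, here is a rough sketch of a doubling benchmark. It is not the script from the linked comment: `generateArrayLiteral` is a hypothetical stand-in for a real code generator, and `benchmark` is an illustrative helper that reports the same three columns as the tables above.

```javascript
// Hypothetical stand-in for a code generator: prints an array literal
// with `items` elements. A real run would call the generator under test.
function generateArrayLiteral(items) {
  const parts = [];
  for (let i = 0; i < items; i++) parts.push(String(i));
  return "[" + parts.join(", ") + "]";
}

// Time `generate` on inputs that double in size and collect
// one { items, time, length } row per run.
function benchmark(generate, maxItems) {
  const rows = [];
  for (let items = 2; items <= maxItems; items *= 2) {
    const start = Date.now();
    const code = generate(items);
    rows.push({ items, time: Date.now() - start, length: code.length });
  }
  return rows;
}

for (const row of benchmark(generateArrayLiteral, 1024)) {
  console.log("Items:", row.items, ", time:", row.time, ", length:", row.length);
}
```

If the time column grows by about the same factor as the length column on each doubling, generation is roughly linear; growth by about four times per doubling points at quadratic behavior.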

codecov-io commented Jul 5, 2016

Current coverage is 87.79%

Merging #3565 into master will decrease coverage by 0.12%

@@             master      #3565   diff @@
==========================================
  Files           194        194          
  Lines          9640       9592    -48   
  Methods        1101       1099     -2   
  Messages          0          0          
  Branches       2204       2198     -6   
==========================================
- Hits           8475       8421    -54   
- Misses         1165       1171     +6   
  Partials          0          0          

Powered by Codecov. Last updated by c561312...65a677d

space(force: boolean = false) {
  if (this._format.compact) return;

  if ((this._buf.hasContent() && !this.endsWith(" ") && !this.endsWith("\n")) || force) {

Member

A micro-optimization: maybe put `force` at the beginning, since the other checks seem more expensive?

Member Author

I left it this way because the `force` param here is only true when printing `VariableDeclaration` nodes, so on the whole it's probably going to be false the vast majority of the time.

Member

👍
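
The trade-off in this exchange comes down to short-circuit evaluation of `||`. The sketch below uses hypothetical stand-ins (`bufferNeedsSpace`, `checksFirst`, `forceFirst`) rather than the real printer code, and simply counts how often the expensive check runs under each ordering.

```javascript
let expensiveCalls = 0;

// Hypothetical stand-in for the buffer checks in the condition under review.
function bufferNeedsSpace() {
  expensiveCalls++;
  return false;
}

// Ordering used in the PR: buffer checks first, `force` last.
function checksFirst(force) {
  return bufferNeedsSpace() || force;
}

// Ordering suggested in the review: `force` first.
function forceFirst(force) {
  return force || bufferNeedsSpace();
}

// Run one ordering and report how many times the expensive check ran.
function countCalls(fn, force) {
  expensiveCalls = 0;
  fn(force);
  return expensiveCalls;
}

console.log(countCalls(checksFirst, false), countCalls(forceFirst, false)); // prints "1 1"
console.log(countCalls(checksFirst, true), countCalls(forceFirst, true)); // prints "1 0"
```

Putting `force` first only saves work when `force` is true, which matches the author's reasoning that the reordering would rarely matter.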

Member

hzoo Jul 5, 2016

I see `endsWith` has `if (Array.isArray(str)) return str.some((s) => this.endsWith(s));`, so this could be an array `[" ", "\n"]`?

Or we could remove that (either way, `str` seems like a weird name for the parameter)

Member Author

I'll ditch the array, we don't need it.
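
To illustrate the two options discussed here, the following sketch compares an array-accepting check with a string-only one. `Printer` is a hypothetical, heavily simplified class that tracks only the last written character, and `endsWithAny` is an illustrative name for the array-accepting variant quoted in the review.

```javascript
// Hypothetical, simplified printer: only the last written character is tracked.
class Printer {
  constructor() {
    this._last = "";
  }

  write(str) {
    if (str.length > 0) this._last = str[str.length - 1];
  }

  // Array-accepting variant, following the snippet quoted in the review.
  endsWithAny(str) {
    if (Array.isArray(str)) return str.some((s) => this.endsWithAny(s));
    return this._last === str;
  }

  // String-only variant: callers combine individual checks with `||`.
  endsWith(ch) {
    return this._last === ch;
  }
}

const printer = new Printer();
printer.write("foo();\n");
console.log(printer.endsWithAny([" ", "\n"])); // prints true
console.log(printer.endsWith(" ") || printer.endsWith("\n")); // prints true
```

Both forms give the same answer; the string-only version just keeps the parameter's type and name honest.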

hzoo added the area: perf and PR: Polish 💅 labels Jul 5, 2016
Member

hzoo commented Jul 5, 2016

👍

loganfsmyth merged commit 193b9b5 into babel:master Jul 7, 2016
loganfsmyth deleted the codegen-append-only branch July 7, 2016 01:33
return;
getLast(): string {
  if (this._queue.length > 0) {
    const last = this._queue[this._queue.length - 1][0];

Member

Should this be `this._queue[0][0]`? Right now, we're checking the last char of the item queued first (since we `#unshift`), not the item most recently queued.

Member Author

Good call, it should be. Luckily, it looks like the only place `getLast` is used is https://github.com/babel/babel/blob/master/packages/babel-generator/src/printer.js#L110, and the only cases that matter there are non-queued tokens, so the queue case pretty much doesn't matter anyway.
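
The indexing issue raised in this thread is easy to see in isolation. In the sketch below, the queue entries are assumed to be one-element arrays whose first element is the queued string (the real entries may carry more fields), and `lastQueuedChar` is a hypothetical helper.

```javascript
// Entries are added with unshift, so index 0 always holds the newest one.
const queue = [];
queue.unshift(["var a"]); // queued first
queue.unshift([" "]);     // queued most recently

// Hypothetical helper: last character of the most recently queued string.
function lastQueuedChar(q) {
  if (q.length === 0) return "";
  const newest = q[0][0];
  return newest[newest.length - 1];
}

const oldest = queue[queue.length - 1][0];
console.log(JSON.stringify(oldest)); // prints "var a" with quotes (the item queued first)
console.log(JSON.stringify(lastQueuedChar(queue))); // prints " " with quotes (from the newest item)
```

Indexing with `length - 1` would only be right if entries were added with `push` instead of `unshift`.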

jridgewell added a commit to jridgewell/babel that referenced this pull request Jul 15, 2016
We can eke out a bit more speed from Babel generator by turning the
buffer into an array as well.
Re: babel#3565

```
Items: 2 , time: 4 length: 114
Items: 4 , time: 3 length: 218
Items: 8 , time: 3 length: 426
Items: 16 , time: 2 length: 861
Items: 32 , time: 5 length: 1741
Items: 64 , time: 2 length: 3501
Items: 128 , time: 4 length: 7106
Items: 256 , time: 8 length: 14530
Items: 512 , time: 12 length: 29378
Items: 1024 , time: 24 length: 59147
Items: 2048 , time: 38 length: 121611
Items: 4096 , time: 71 length: 246539
Items: 8192 , time: 131 length: 496395
Items: 16384 , time: 350 length: 1015260
Items: 32768 , time: 573 length: 2063836
Items: 65536 , time: 1263 length: 4160988
Items: 131072 , time: 2143 length: 8448509
Items: 262144 , time: 4859 length: 17230333
```

to

```
Items: 2 , time: 4 length: 114
Items: 4 , time: 3 length: 218
Items: 8 , time: 9 length: 426
Items: 16 , time: 1 length: 861
Items: 32 , time: 5 length: 1741
Items: 64 , time: 1 length: 3501
Items: 128 , time: 3 length: 7106
Items: 256 , time: 7 length: 14530
Items: 512 , time: 9 length: 29378
Items: 1024 , time: 17 length: 59147
Items: 2048 , time: 30 length: 121611
Items: 4096 , time: 61 length: 246539
Items: 8192 , time: 113 length: 496395
Items: 16384 , time: 307 length: 1015260
Items: 32768 , time: 443 length: 2063836
Items: 65536 , time: 1065 length: 4160988
Items: 131072 , time: 1799 length: 8448509
Items: 262144 , time: 4217 length: 17230333
```
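
The idea in this follow-up commit, keeping the buffer as an array, can be sketched as follows. `ChunkBuffer` is a hypothetical name and this is not the code from the referenced commit: pieces are pushed onto an array and joined once at the end, while the last character is still tracked separately so no lookup needs the joined string.

```javascript
// Hypothetical sketch of an array-backed output buffer.
class ChunkBuffer {
  constructor() {
    this._chunks = []; // output pieces, joined only once in get()
    this._last = "";   // last appended character, tracked on every append
  }

  append(str) {
    if (str.length === 0) return;
    this._chunks.push(str);
    this._last = str[str.length - 1];
  }

  endsWith(ch) {
    return this._last === ch;
  }

  get() {
    return this._chunks.join("");
  }
}

const out = new ChunkBuffer();
out.append("var a");
out.append(" = ");
out.append("1;");
console.log(out.get()); // prints "var a = 1;"
console.log(out.endsWith(";")); // prints true
```

Whether an array of chunks beats repeated string concatenation depends on the JavaScript engine, so the timings above are the better guide than the sketch.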
lock bot added the outdated label Oct 7, 2019
lock bot locked as resolved and limited conversation to collaborators Oct 7, 2019