New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make the code generator write-only to avoid exponential time generation #3565
Conversation
Current coverage is 87.79%@@ master #3565 diff @@
==========================================
Files 194 194
Lines 9640 9592 -48
Methods 1101 1099 -2
Messages 0 0
Branches 2204 2198 -6
==========================================
- Hits 8475 8421 -54
- Misses 1165 1171 +6
Partials 0 0
|
space(force: boolean = false) { | ||
if (this._format.compact) return; | ||
|
||
if ((this._buf.hasContent() && !this.endsWith(" ") && !this.endsWith("\n")) || force) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A small micro optimization: Maybe putting force
at the beginning, as the other checks seem more expensive?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left it this way because the force
param here is only true
when printing VariableDeclaration
nodes, so on the whole it's probably going to be false
the vast majority of the time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see endsWith has if (Array.isArray(str)) return str.some((s) => this.endsWith(s));
so this could be an array [" ", "\n"]
?
Or we remove that (either way the name seems weird as str
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll ditch the array, we don't need it.
👍 |
3e2aea6
to
65a677d
Compare
return; | ||
getLast(): string { | ||
if (this._queue.length > 0) { | ||
const last = this._queue[this._queue.length - 1][0]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be this._queue[0][0]
? Right now, we're checking the last char of the item queued first (since we #unshift
), not the item most recently queued.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call, it should be. Luckily, looks like in this case, the only place getLast
is used is https://github.com/babel/babel/blob/master/packages/babel-generator/src/printer.js#L110, and the only cases that matter are non-queued tokens, so the queue case pretty much doesn't matter anyway.
We can eek out a bit more speed from Babel generator by turning the buffer into an array as well. Re: babel#3565 ``` Items: 2 , time: 4 length: 114 Items: 4 , time: 3 length: 218 Items: 8 , time: 3 length: 426 Items: 16 , time: 2 length: 861 Items: 32 , time: 5 length: 1741 Items: 64 , time: 2 length: 3501 Items: 128 , time: 4 length: 7106 Items: 256 , time: 8 length: 14530 Items: 512 , time: 12 length: 29378 Items: 1024 , time: 24 length: 59147 Items: 2048 , time: 38 length: 121611 Items: 4096 , time: 71 length: 246539 Items: 8192 , time: 131 length: 496395 Items: 16384 , time: 350 length: 1015260 Items: 32768 , time: 573 length: 2063836 Items: 65536 , time: 1263 length: 4160988 Items: 131072 , time: 2143 length: 8448509 Items: 262144 , time: 4859 length: 17230333 ``` to ``` Items: 2 , time: 4 length: 114 Items: 4 , time: 3 length: 218 Items: 8 , time: 9 length: 426 Items: 16 , time: 1 length: 861 Items: 32 , time: 5 length: 1741 Items: 64 , time: 1 length: 3501 Items: 128 , time: 3 length: 7106 Items: 256 , time: 7 length: 14530 Items: 512 , time: 9 length: 29378 Items: 1024 , time: 17 length: 59147 Items: 2048 , time: 30 length: 121611 Items: 4096 , time: 61 length: 246539 Items: 8192 , time: 113 length: 496395 Items: 16384 , time: 307 length: 1015260 Items: 32768 , time: 443 length: 2063836 Items: 65536 , time: 1065 length: 4160988 Items: 131072 , time: 1799 length: 8448509 Items: 262144 , time: 4217 length: 17230333 ```
There's a bit of refactoring in here to get this all to work, but mostly it's the last two commits that are the meat of it:
Our old logic constantly referenced the output value, meaning every reference got more expensive as the output for larger, leading to an exponential increase in codegen times.
For example, using the example script from istanbuljs/babel-plugin-istanbul#5 (comment) which generates doubling AST array, generation time changes as shown:
Items: the size of the array being generated
Time: The time in ms to generate the code
Length: The number of characters in the output code
to