-
-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor tree-shaking rendering #1949
Conversation
Great work! Happy to review when ready. |
exceptions for IfStatements and ExportDefaultDeclarations for now
old logic but can leave trailing comments and line-breaks as the SwitchCase length provided by acorn does not include those)
* Skip unnecessary calculations
* Move responsibility to suppress semicolon to parent
next—even though it can produce less nice code—as otherwise, it was not guaranteed each single line comment was properly closed
e6d1a29
to
6b30972
Compare
before a line-break with the previous declarator
From my side this is ready for review! I hope I did the system bindings correctly as I rewrote this logic simply by looking at the previous output 😳 |
Sure, will get to it. Just testing #1937, if I alter the example to use destructuring I'm getting the statement removed incorrectly: const FONT_HINGE_SIZE = 0.07;
const BACK_HINGE_SIZE = 0.16;
function PointerGauge () {
const { backHinge, frontHinge } = { backHinge: BACK_HINGE_SIZE, frontHinge: FONT_HINGE_SIZE };
return [
//backHinge,
fontHinge
]
}
export default PointerGauge; strips the whole destructuring statement above. |
Ahh, no that was just spelling - seems to be working fine. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, I really like the direction this rendering logic is going.
In a test on a large codebase I'm getting about a 5 - 10% slowdown, do you think we could try and focus on some performance aspects here for landing?
I've added some suggestions to reduce allocations, but will have to see if that does much.
I have tested on Windows here and all is fine, but even so it may be worth also checking that all of the line break logic correctly handles \r
characters too, as it seems quite \n
specific currently.
src/ast/nodes/DoWhileStatement.ts
Outdated
type: NodeType.DoWhileStatement; | ||
body: StatementNode; | ||
body: Node; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure we need to get rid of the concept of a StatementNode here? Quite nice to match the spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It serves no purpose but I can put in StatementNode
as an alias for Node
so that it is there if we need it in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh originally I had the compound ESTree types as representing a union of their members so there was some typing benefit here on the generic interfaces. But I remember now you refactored the Node methods to no longer rely on these union properties, which is cleaner anyway. But perhaps nice to at least keep some level of the semantic if it turns out there are places where this typing information might be useful to reconsider in future.
const declarationEnd = findFirstOccurrenceOutsideComment(code, declarationKeyword) + declarationKeyword.length; | ||
code = code.slice(declarationEnd); | ||
code = code.slice(0, findFirstOccurrenceOutsideComment(code, '{')); | ||
const generatorStarPos = findFirstOccurrenceOutsideComment(code, '*'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor performance suggestion here, although its probably trivial, but instead of creating a new string, the findFirstOccurrenceOutsideComment
method could take an offset argument like the standard string indexOf function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To my shame I must admit I was not actually aware of the second parameter of String.indexOf
😳Yes, of course it makes sense to use it here and in other places. I see how I can rewrite this to reduce the number of slices.
src/utils/renderHelpers.ts
Outdated
let commentStart, commentLength, lineBreakPos; | ||
while (true) { | ||
commentStart = code.indexOf('/'); | ||
lineBreakPos = (~commentStart ? code.slice(0, commentStart) : code).indexOf(searchString); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly these slice calls could be refactored to use an index into the indexOf
method.
code = code.slice(declarationEnd); | ||
code = code.slice(0, findFirstOccurrenceOutsideComment(code, '{')); | ||
const generatorStarPos = findFirstOccurrenceOutsideComment(code, '*'); | ||
if (~generatorStarPos) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I can handle this convention anymore :P
When using binary operators like this in JavaScript, it means the 64-bit JavaScript float is converted into a 32-bit twos-complement integer with rounding, then the not operation is applied.
In addition these can fail for large numbers when the 32 bit limit gets reached, which can still be in the safe operating range of the integer values of a 64 bit float (although the ranges are well out of what Rollup will ever see).
generatorStarPos === 0
is explicit and a single comparison operation in the final assembly over all the above multi-step complexity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I must admit I do not really like this either but running all different versions of checking that indexOf
(or something similar) returned a result through jsben.ch on Chrome, the tilde operator was always fastest
Ok, running with a larger sample set and a slightly different test, the tilde operator actually seems to be on par with !== -1
so I'll use that.
src/utils/renderHelpers.ts
Outdated
let commentStart, commentLength, lineBreakPos; | ||
while (true) { | ||
commentStart = code.indexOf('/'); | ||
lineBreakPos = (~commentStart ? code.slice(0, commentStart) : code).indexOf(searchString); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this really lineBreakPos
or rather searchPos
?
this.declaration.render(code, options); | ||
render (code: MagicString, options: RenderOptions, { start, end }: NodeRenderOptions = {}) { | ||
if (this.declaration === null) { | ||
code.remove(start || this.start, end || this.end); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When will start not be defined here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
@@ -63,9 +59,7 @@ export default class BlockStatement extends StatementBase { | |||
|
|||
render (code: MagicString, options: RenderOptions) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should make start
and end
mandatory arguments for all render calls, defaulting to start = this.start
?
render (code: MagicString, _options: RenderOptions) { | ||
code.remove(this.leadingCommentStart || this.start, this.next || this.end); | ||
render (code: MagicString, _options: RenderOptions, { start, end }: NodeRenderOptions = {}) { | ||
code.remove(start || this.start, end || this.end); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having start and end as arguments with defaults would avoid these scenarios too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As defaults work with object patterns as well, this is certainly an improvement
src/ast/nodes/VariableDeclaration.ts
Outdated
this.end - (code.original[this.end - 1] === ';' ? 1 : 0) | ||
); | ||
let renderedContentEnd; | ||
if (/\n\s*$/.test(code.slice(this.start, separatedNodes[0].start))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
\r
?
src/ast/nodes/VariableDeclaration.ts
Outdated
} | ||
leadingString = ''; | ||
nextSeparatorString = ''; | ||
if (isIdentifier(node.id) && isReassignedPartOfExportsObject(node.id.variable)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps isReassignedAsExportsMember
? Object
as a term isn't descriptive enough for me to immediately think of a member expression form without further prompting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
I am pretty sure the reason for the worse performance is that now each statement in every module and block statement has its boundaries calculated though only very few really need this. This was my naive implementation but seeing this, I will try to change the algorithm to only calculate boundaries when they are actually needed (since not running code is always the best performance improvement...). |
* Get rid of unnecessary defaults * Replace tilde operator
Ok, I did not do everything yet (especially line-breaks still assume "\n") but
I expect especially the latter to provide a very noticeable performance boost. |
With the more accurate timing I'm only seeing a slowdown now from around 260ms for render to 340ms in a large tree. The string search times seems to make sense in that well-written string search algorithms offer better than O(n) complexity, while taking an initial hit in additional mechanics setup. I've created a PR to get the performance timings working again. Perhaps these can be watched closely as any further changes are made... so hard to speculate here. |
Nice to hear that render is still rather fast. I have got some ideas on how to skip boundary detection for variable declarations when not necessary that should further noticeably speed up rendering. Let's see what we get to in the end. |
The last commit seems to have added a good 20ms or so speedup. Funny how it doesn't look like it does much. The performance costs seem pretty negligible at this point so count me approved... just worth making a habit of checking it for all compiler work I think. |
@@ -22,4 +24,16 @@ export default class SwitchCase extends NodeBase { | |||
}); | |||
return addedNewNodes; | |||
} | |||
|
|||
render (code: MagicString, options: RenderOptions) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this render function not have a third argument? Just wondering how much the polymorphism might bite us in performance as predictable singular argument signatures are important to optimizations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But for each SwitchCase, the signature is the same. As far as I know, call signatures are more important. Might be reasonable to consider adding "undefined" as a third parameter when calling render instead. But maybe we wait until we got rollup fit for the V8 web tooling benchmark so that we can talk to the actual experts here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since a switch case is a statement I was thinking of a call site like the render call from renderStatementList
, although I see this is distinguished by needsBoundaries
now.
* Properly handle \r\n line-breaks
Ok, this is again ready for review. I added a few smaller improvements i.e. charCodeAt instead of direct character access and just counting up an index instead of searching in some situations. Most importantly, the expensive variable declaration rendering algorithm will be skipped if none of its transformations are necessary (which is probably the reason for the 20ms speedup). As for line-breaks, I decided to ignore the uncommon "\r"-only line-breaks (the result will still work, it will just not be as pretty). As for "\r\n" line-breaks, those should have already worked in most situations out of the box except in situations where we remove a line-break. This is now accounted for. |
One super minor nit in testing:
Returns |
// discover this module's imports and exports | ||
let lastNode: Node; | ||
|
||
enhance(this.ast, this, this.dynamicImports); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're no longer using this.comments
does that mean this property can be removed entirely, including the use of the onComment
hook?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just tested this and the answer is yes :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately not as it is still required to remove source map comments. This, however, is the only reason why we keep this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh. Can you point me to the code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe at some point we should implement our own algorithm to scan for and remove those comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see. A sourceMappingURL comment can only be a block comment so it's fine to do something custom. Also it can be based on lastIndexOf
as a backwards search through the string would be quicker.
Here's the code I have used for this previously - https://github.com/systemjs/builder/blob/master/lib/sourcemaps.js#L12.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at this example, maybe the issue is not that trivial and we should probably rather trust acorns parsing here and leave it as it is:
var str = ` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise we might break the sourceMap handling of other libraries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's v8's method to do this - https://github.com/v8/v8/blob/8d38c15e04598f9e48e6230327da0e41a2445957/src/inspector/search-util.cc#L16
Note that it doesn't use a lexer at all.
I think it would be much simpler to make this code its own method and deprecate this.comments
entirely. But happy to make this a separate PR too if you prefer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about this further detection is a less stringent problem than removal - and that’s what I got confused here. It makes sense that Rollup has to hold to a higher standard for removal not to break code.
Perhaps we can inline the removal into the parsing code without separately gathering the comments at least.
Including the removal of the Good to merge with those changes. |
Just to be clear, this seems good to land anytime to me, given that the points mentioned are no longer relevant. |
Nice! Depending on what else is lying around here, I guess I'll try to assemble a release tomorrow! |
This PR is now ready to be reviewed.
To try it out, install rollup via
Issues resolved by this PR:
This will refactor core parts of the rendering logic especially with regard to tree-shaking and comment-handling. Previously, tree-shaking was managed by the rendering logic of the
Statement
base node, i.e. nodes that were not included would remove themselves from the tree. This often lead to unwanted white-space and sup-optimal results with regard to comments.Instead, the parent node is now responsible for not rendering its children based on their
.included
flag.To that end, modules, block scopes and switch cases now use a new optimized statement border detection algorithm. For block statements, it will e.g. look like this:
i.e.
For modules this means that any comment starting on the first line of the module will be retained as a header comment while all subsequent comments belong to the first statement.
Also, the logic for variable declarations was thoroughly rewritten to provide, in my opinion, the best possible output in most situations, e.g.
This also affects the way named export declarations are rendered.
The only thing I am not 100% happy about is the complexity of the variable declaration rendering function. Any suggestions for e.g. extracting functionality here are very welcome.