Avoid codesize impact of the pushing in PropertyDeclaration::parse #14950
Looks good to me. @bors-servo r+
📌 Commit 77e132b has been approved by |
Avoid codesize impact of the pushing in PropertyDeclaration::parse

While investigating https://bugzilla.mozilla.org/show_bug.cgi?id=1297322#c19, I realized that the asm generated for this function in release mode is abominable. LLVM, always wanting to please, inlines a bajillion things, resulting in 100k lines of ASM with a lot of redundant bits. We have a thousand calls to `alloc::oom` at the end of the function alone. I'm told that LLVM doesn't hoist things out of switches that well, which might be the case here. The only common allocation here is the pushing (parsing may allocate, but that's not common).

I thought I'd hoist the allocation calls out. All the longhands can share a single `push()` call.

Furthermore, the shorthands have a bunch of sequential push calls for CSS-wide keywords. I'm not sure how well LLVM optimizes those, but we should be `reserve()`ing early anyway; I'm pretty sure that `reserve(n)` followed by n `push()`es, when inlined, will make each push a trivial pointer bump.

There's a further optimization, which I have yet to implement, that @dotdash pointed out: it will let me hoist the push calls for all shorthands into one (not counting any pushes done by shorthand parsing). This should have no perf impact but reduce code size further.

I haven't yet measured the impact here (@mbrubeck, if you have time, would you be able to check how this impacts XUL?), but will do so later today.

r? @mbrubeck
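To make the two ideas in the description concrete, here is a minimal sketch of what "share a single `push()`" and "`reserve(n)` before n pushes" could look like. Everything in it is hypothetical: the `Declaration` enum, the function names, and the string-based matching are simplified stand-ins, not Servo's actual `PropertyDeclaration::parse` code. Whether the compiler really emits less code for the second form is exactly what this PR set out to measure, so treat the comments as intent rather than a guarantee.

```rust
// Hypothetical, simplified stand-in for a parsed declaration.
// This is NOT Servo's real PropertyDeclaration type.
#[derive(Debug, Clone, PartialEq)]
enum Declaration {
    Color(String),
    Width(u32),
    MarginTop(u32),
    MarginRight(u32),
    MarginBottom(u32),
    MarginLeft(u32),
}

// Before: every match arm contains its own `push()`. After inlining, each
// arm may carry its own copy of the grow-and-write path.
fn parse_longhand_push_per_arm(
    name: &str,
    value: &str,
    out: &mut Vec<Declaration>,
) -> Result<(), ()> {
    match name {
        "color" => out.push(Declaration::Color(value.to_owned())),
        "width" => out.push(Declaration::Width(value.parse::<u32>().map_err(|_| ())?)),
        _ => return Err(()),
    }
    Ok(())
}

// After: the arms only build the value, and a single `push()` call site
// after the match is shared by all of them.
fn parse_longhand_shared_push(
    name: &str,
    value: &str,
    out: &mut Vec<Declaration>,
) -> Result<(), ()> {
    let declaration = match name {
        "color" => Declaration::Color(value.to_owned()),
        "width" => Declaration::Width(value.parse::<u32>().map_err(|_| ())?),
        _ => return Err(()),
    };
    out.push(declaration);
    Ok(())
}

// Shorthand expansion: reserve room for all four longhands up front, so the
// pushes that follow do not need to reallocate.
fn expand_margin(value: u32, out: &mut Vec<Declaration>) {
    out.reserve(4);
    out.push(Declaration::MarginTop(value));
    out.push(Declaration::MarginRight(value));
    out.push(Declaration::MarginBottom(value));
    out.push(Declaration::MarginLeft(value));
}
```

Both longhand variants produce the same vector contents; the only difference is how many `push()` call sites the optimizer has to deal with.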
@bors-servo r- not yet 😄
In a build of the incubator/stylo repo at changeset 67109dc09a80, the first patch here decreases the size of The two patches together increase both |
@dotdash any idea why the above would happen? (libxul is the binary this gets compiled into. Debug is off.)
☀️ Test successful - android, arm32, arm64, linux-dev, linux-rel-css, linux-rel-wpt, mac-dev-unit, mac-rel-css, mac-rel-wpt1, mac-rel-wpt2, windows-gnu-dev, windows-msvc-dev
This certainly should have a comment about the reason for the somewhat curious code flow.
It will, if it can be made to actually work 😄
What's the plan for this PR?
It didn't improve anything, and the |
@Manishearth, are there (or should there be) any bugs filed on rustc or LLVM for things they could do better here?
If the number of |
Moved to #15279 (can't reuse PR after the branch has been deleted) |