Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid codesize impact of the pushing in PropertyDeclaration::parse #14950

Closed
wants to merge 2 commits into from

Conversation

Manishearth
Copy link
Member

@Manishearth Manishearth commented Jan 10, 2017

While investigating https://bugzilla.mozilla.org/show_bug.cgi?id=1297322#c19, I realized that the asm generated for this function in release mode is abominable. LLVM, always wanting to please, inlines a bajillion things resulting in 100k lines of ASM with a lot of redundant bits. We have a thousand calls to alloc::oom at the end of the function alone. I'm told that LLVM doesn't hoist things out of switches that well, which might be the case here. The only common allocation here is the pushing (parsing may allocate but that's not common).

I thought I'd hoist the allocation calls out. All the longhands can share a single push() call.

Furthermore, the shorthands have a bunch of sequential push calls for CSS-wide keywords. I'm not sure how well LLVM optimizes those, but we should be reserve()ing early anyway; pretty sure reserve(n) followed by n push()es when inlined will make the push a trivial pointer-bump.

There's a further optimization which I have yet to implement that @dotdash pointed out that will let me hoist the push calls for all shorthands into one (not counting any pushes done by shorthand parsing). This should have no perf impact but reduce code size further.

I haven't yet measured the impact here (@mbrubeck if you have time would you be able to check how this impacts XUL?), but will do so later today.

r? @mbrubeck


This change is Reviewable

@highfive
Copy link

Heads up! This PR modifies the following files:

  • @bholley: components/style/properties/properties.mako.rs
  • @emilio: components/style/properties/properties.mako.rs

@highfive highfive added the S-awaiting-review There is new code that needs to be reviewed. label Jan 10, 2017
@highfive
Copy link

warning Warning warning

  • These commits modify style code, but no tests are modified. Please consider adding a test!

@SimonSapin
Copy link
Member

Looks good to me.

@bors-servo r+

@bors-servo
Copy link
Contributor

📌 Commit 77e132b has been approved by SimonSapin

@highfive highfive assigned SimonSapin and unassigned mbrubeck Jan 10, 2017
@highfive highfive added S-awaiting-merge The PR is in the process of compiling and running tests on the automated CI. and removed S-awaiting-review There is new code that needs to be reviewed. labels Jan 10, 2017
@bors-servo
Copy link
Contributor

⌛ Testing commit 77e132b with merge a5128a5...

bors-servo pushed a commit that referenced this pull request Jan 10, 2017
Avoid codesize impact of the pushing in PropertyDeclaration::parse

While investigating https://bugzilla.mozilla.org/show_bug.cgi?id=1297322#c19, i realized that the asm generated for this function in release mode is abominable. LLVM, always wanting to please, inlines a bajillion things resulting in 100k lines of ASM with a lot of redundant bits. We have thousand calls to `alloc::oom` at the end of the function alone. I'm told that LLVM doesn't hoist things out of switches that well, which might be the case here. The only common allocation here is the pushing (parsing may allocate but that's not common).

I thought I'd hoist the allocation calls out. All the longhands can share a single `push()` call.

Furthermore, the shorthands have a bunch of sequential push calls for CSS-wide keywords. I'm not sure how well LLVM optimizes those, but we should be `reserve()`ing early anyway; pretty sure `reserve(n)` followed by n `push()`es when inlined will make the push a trivial pointer-bump.

There's a further optimization which I have yet to implement that @dotdash pointed out that will let me hoist the push calls for all shorthands into one (not counting any pushes done by shorthand parsing). This should have no perf impact but reduce code size further.

I haven't yet measured the impact here (@mbrubeck if you have time would you be able to check how this impacts XUL?), but will do so later today.

r? @mbrubeck

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/14950)
<!-- Reviewable:end -->
@Manishearth
Copy link
Member Author

@bors-servo r-

not yet 😄

@mbrubeck
Copy link
Contributor

In a build of the incubator/stylo repo at changeset 67109dc09a80, the first patch here decreases the size of PropertyDeclaration::parse by about 1 KiB, but increases total libxul text segment size by 320 bytes (compared to the same changeset without no patches applied).

The two patches together increase both PropertyDeclaration::parse and total libxul text segment size by about 6 KiB.

@Manishearth
Copy link
Member Author

@dotdash any idea why the above would happen? (libxul is the binary this gets compiled into. debug is off)

@bors-servo
Copy link
Contributor

@Ms2ger
Copy link
Contributor

Ms2ger commented Jan 10, 2017

This certainly should have a comment about the reason for the somewhat curious code flow

@Manishearth
Copy link
Member Author

It will, if it can be made to actually work 😄

@jdm jdm added S-needs-code-changes Changes have not yet been made that were requested by a reviewer. and removed S-awaiting-merge The PR is in the process of compiling and running tests on the automated CI. labels Jan 27, 2017
@jdm
Copy link
Member

jdm commented Jan 27, 2017

What's the plan for this PR?

@Manishearth
Copy link
Member Author

It didn't improve anything, and the .reserve() only has a rare perf benefit (nobody uses CSS keywords in shorthands, and if they do a couple of reallocs is nbd), so closing.

@Manishearth Manishearth deleted the hoist-pushing branch January 27, 2017 20:33
@mbrubeck
Copy link
Contributor

@Manishearth, are there (or should there be) any bugs filed on rustc or LLVM for things they could do better here?

@Manishearth
Copy link
Member Author

.reserve() plus a bunch of .push()es should be inlined+optimized into a single bounds check/realloc and a bunch of memmoves. It seems like currently the realloc/bounds check code isn't removed in case of the pushes. No idea how hard this is to fix.

@dotdash @eddyb thoughts?

@SimonSapin
Copy link
Member

If the number of push calls is known statically (i.e. there’s no loop), could this whole pattern be replaced with one extend_from_slice call?

@Manishearth
Copy link
Member Author

Moved to #15279

(can't reuse PR after the branch has been deleted)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-needs-code-changes Changes have not yet been made that were requested by a reviewer.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants