Avoid picture primitive copies via VecHelper #3362
| // Matches the definition of SK_ScalarNearlyZero in Skia. | ||
| const NEARLY_ZERO: f32 = 1.0 / 4096.0; | ||
|
|
||
| /// A typesafe helper that separates new value construction from | ||
| /// vector growing, which allows LLVM to elide the value copy. |
|
|
||
| impl<'a, T> Allocation<'a, T> { | ||
| #[inline(always)] | ||
| pub fn init(self, value: T) -> usize { |
jrmuizel
Nov 27, 2018
Contributor
Maybe add:
// writing is safe because alloc() ensured enough capacity and Allocation holds a mutable borrow to prevent anyone else from breaking this invariant.
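For readers following the thread, here is a minimal sketch of the `alloc().init()` pattern under discussion. It is reconstructed from the names visible in the diff (`VecHelper`, `Allocation`, `init`) and from the invariant described in the comment above; the field names and the exact body of `alloc()` are assumptions and may differ from the code in the PR.

```rust
use std::ptr;

/// A reserved, not-yet-initialized slot at the end of a `Vec`.
/// Field names here are assumptions made for this sketch.
pub struct Allocation<'a, T> {
    vec: &'a mut Vec<T>,
    index: usize,
}

impl<'a, T> Allocation<'a, T> {
    /// Writes `value` into the reserved slot and returns its index.
    #[inline(always)]
    pub fn init(self, value: T) -> usize {
        // Writing is safe because `alloc()` ensured enough capacity, and
        // `Allocation` holds a mutable borrow, so nobody else can shrink or
        // reallocate the vector in between.
        unsafe {
            ptr::write(self.vec.as_mut_ptr().add(self.index), value);
            self.vec.set_len(self.index + 1);
        }
        self.index
    }
}

/// Extension trait that separates growing the vector from constructing
/// the new value.
pub trait VecHelper<T> {
    fn alloc(&mut self) -> Allocation<'_, T>;
}

impl<T> VecHelper<T> for Vec<T> {
    fn alloc(&mut self) -> Allocation<'_, T> {
        let index = self.len();
        // Guarantee room for one more element before the value exists.
        self.reserve(1);
        Allocation { vec: self, index }
    }
}
```

With this in scope, `v.push(x)` becomes `v.alloc().init(x)`. If an `Allocation` is dropped without calling `init`, the length is unchanged and only spare capacity has been reserved.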
emilio
Nov 28, 2018
Member
Are you sure this actually avoids the memcpy / memmove? At least for self arguments a bit back we couldn't work around it like this, see rust-lang/rust#42763.
kvark
Nov 28, 2018
Author
Member
Confirmed in playground. In fn foo() changing the push(xxx) to alloc().init(xxx) removes the memcpy.
jrmuizel
Nov 29, 2018
Contributor
I filed rust-lang/rust#56333 for a problem contributing to the first example not working well in the playground.
jrmuizel
Nov 29, 2018
Contributor
It looks like SmallVec also causes this to not work. I filed rust-lang/rust#56356 about that.
```diff
@@ -338,7 +338,7 @@ impl FrameBuilder {
         let mut profile_counters = FrameProfileCounters::new();
         profile_counters
             .total_primitives
-            .set(self.prim_store.prim_count);
+            .set(self.prim_store.prim_count());
```
|
I'm not able to use @jrmuizel's tool that finds memcpys, since IR validation fails. Is my playground experiment incorrect? Otherwise, which assumptions are wrong?
|
|
|
If you use an enum containing an array instead of a bare array, and initialize
it with a small enum variant, the copy is elided in your example.
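A sketch of what such an experiment might look like is below. The `Item` type and both function names are hypothetical, chosen only for illustration, and whether the copy is actually elided depends on the compiler version and optimization level, so the generated assembly has to be inspected rather than assumed.

```rust
use std::ptr;

/// Hypothetical element type: one tiny variant and one large array variant.
pub enum Item {
    Small(u32),
    Big([u64; 64]),
}

/// Baseline: the value is constructed first and then handed to `push`.
#[inline(never)]
pub fn with_push(v: &mut Vec<Item>, x: u32) {
    v.push(Item::Small(x));
}

/// Grow first, then construct the value at its final location.
#[inline(never)]
pub fn with_reserve_then_write(v: &mut Vec<Item>, x: u32) {
    let index = v.len();
    v.reserve(1);
    // Safe because `reserve(1)` guarantees capacity for one more element
    // and `v` is exclusively borrowed for the duration of the write.
    unsafe {
        ptr::write(v.as_mut_ptr().add(index), Item::Small(x));
        v.set_len(index + 1);
    }
}
```

Compiling both functions in release mode and comparing their assembly for `memcpy` calls is one way to check the claim on a given toolchain.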
|
|
@jrmuizel indeed, I confirmed it with the playground experiment. |
|
@bors-servo r=jrmuizel |
|
|
Avoid picture primitive copies via VecHelper

This is a successor of #3360 that avoids the borrow checker dance via RAII.
Addresses part of #3358.
|
|
|
It looks like this may not have worked:
I wonder if llvm gets sad the enum variant is too big. |
|
It looks like SmallVec is contributing to the sadness:

```rust
#![crate_type = "lib"]
extern crate smallvec;
use smallvec::SmallVec;

#[derive(Default)]
pub struct L {
    a: SmallVec<[f64; 16]>,
    b: SmallVec<[f64; 16]>,
    c: SmallVec<[f64; 16]>
}

pub struct Allocation<T> {
    f: *mut T,
}

use std::ptr;

impl<T> Allocation<T> {
    pub fn init(self, value: T) {
        unsafe { ptr::write(self.f, value) };
    }
}

#[inline(never)]
pub fn bar(a: Allocation<L>) {
    a.init(L{a: SmallVec::new(), b: SmallVec::new(), c: SmallVec::new()});
}
```

compiles to

```asm
playground::bar:
	pushq	%rbx
	subq	$720, %rsp
	movq	%rdi, %rbx
	xorps	%xmm0, %xmm0
	movaps	%xmm0, 288(%rsp)
	movaps	%xmm0, (%rsp)
	movaps	%xmm0, 144(%rsp)
	leaq	432(%rsp), %rdi
	movq	%rsp, %rsi
	movl	$144, %edx
	callq	memcpy@PLT
	leaq	576(%rsp), %rdi
	leaq	144(%rsp), %rsi
	movl	$144, %edx
	callq	memcpy@PLT
	leaq	288(%rsp), %rsi
	movl	$432, %edx
	movq	%rbx, %rdi
	callq	memcpy@PLT
	addq	$720, %rsp
	popq	%rbx
	retq
```
…4c9170a70ef7 (WR PR #3362). r=kats servo/webrender#3362 Differential Revision: https://phabricator.services.mozilla.com/D13355 --HG-- extra : moz-landing-system : lando
Reduce data copies during internation

Excessive copying in `fn intern()` is something we noticed yesterday with @jrmuizel. This PR attempts to improve them in two ways (each in a separate commit):

1. reduce the `Update` enum size by moving the data out
2. adopt entry-like API to accommodate the common pattern of `if index == v.len() { v.push(xxx) } else { v[index] = xxx; }` without panics between element construction and actual assignment.

It builds upon the `VecHelper` work of #3362.
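The entry-like API mentioned in that follow-up might be sketched roughly as follows. This is a simplified, safe illustration and not the code from that PR: the names `VecEntry`, `VecEntryExt`, `entry`, and `set` are assumptions, and the vacant case uses a plain `push` after `reserve(1)` instead of a raw pointer write. The point it shows is that the bounds check and the growth both happen before the value is constructed.

```rust
/// Hypothetical entry type: either a slot past the end or an existing element.
pub enum VecEntry<'a, T> {
    /// `index == len`; capacity for one more element is already reserved.
    Vacant(&'a mut Vec<T>),
    /// `index < len`; points at the element to overwrite.
    Occupied(&'a mut T),
}

impl<'a, T> VecEntry<'a, T> {
    /// Stores `value`, either appending it or overwriting the old element.
    #[inline(always)]
    pub fn set(self, value: T) {
        match self {
            // Capacity was reserved in `entry`, so this push does not reallocate.
            VecEntry::Vacant(vec) => vec.push(value),
            VecEntry::Occupied(slot) => *slot = value,
        }
    }
}

/// Hypothetical extension trait providing the entry-like lookup.
pub trait VecEntryExt<T> {
    fn entry(&mut self, index: usize) -> VecEntry<'_, T>;
}

impl<T> VecEntryExt<T> for Vec<T> {
    fn entry(&mut self, index: usize) -> VecEntry<'_, T> {
        if index < self.len() {
            VecEntry::Occupied(&mut self[index])
        } else {
            // All checks that can panic run here, before the value exists.
            assert_eq!(index, self.len(), "index may be at most one past the end");
            self.reserve(1);
            VecEntry::Vacant(self)
        }
    }
}
```

Under these assumptions, `if index == v.len() { v.push(xxx) } else { v[index] = xxx; }` becomes `v.entry(index).set(xxx)`.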
|
The smallvec example now generates better code with Nightly:

```asm
playground::bar:                        # @playground::bar
# %bb.0:
	subq	$440, %rsp              # imm = 0x1B8
	xorps	%xmm0, %xmm0
	movaps	%xmm0, (%rsp)
	movaps	%xmm0, 144(%rsp)
	movaps	%xmm0, 288(%rsp)
	movq	%rsp, %rsi
	movl	$432, %edx              # imm = 0x1B0
	callq	*memcpy@GOTPCREL(%rip)
	addq	$440, %rsp              # imm = 0x1B8
	retq
```

However, it looks like there's still an extra memcpy.
|
I filed a follow up at rust-lang/rust#58082 |
…4c9170a70ef7 (WR PR #3362). r=kats servo/webrender#3362 Differential Revision: https://phabricator.services.mozilla.com/D13355 UltraBlame original commit: 54d56b9a77fb9d41eff3ce1662ba5fff538365d2
kvark commented Nov 27, 2018 (edited by larsbergstrom)
This is a successor of #3360 that avoids the borrow checker dance via RAII.
Addresses part of #3358.