Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid picture primitive copies via VecHelper #3362

Merged
merged 1 commit into from Nov 28, 2018

Conversation

@kvark
Copy link
Member

kvark commented Nov 27, 2018

This is a successor of #3360 that avoids the borrow checker dance via RAII
Addresses part of #3358


This change is Reviewable

@kvark kvark requested a review from jrmuizel Nov 27, 2018
// Matches the definition of SK_ScalarNearlyZero in Skia.
const NEARLY_ZERO: f32 = 1.0 / 4096.0;

/// A typesafe helper that separates new value construction from
/// vector growing, which allows LLVM to elide the value copy.

This comment has been minimized.

@jrmuizel

jrmuizel Nov 27, 2018

Contributor

"and ideally construct the element in place"


impl<'a, T> Allocation<'a, T> {
#[inline(always)]
pub fn init(self, value: T) -> usize {

This comment has been minimized.

@jrmuizel

jrmuizel Nov 27, 2018

Contributor

Maybe add:
// writing is safe because alloc() ensured enough capacity and Allocation holds a mutable borrow to prevent anyone else from breaking this invariant.

This comment has been minimized.

@emilio

emilio Nov 28, 2018

Member

Are you sure this actually avoids the memcpy / memmove? At least for self arguments a bit back we couldn't work around it like this, see rust-lang/rust#42763.

This comment has been minimized.

@kvark

kvark Nov 28, 2018

Author Member

Confirmed in playground. In fn foo() changing the push(xxx) to alloc().init(xxx) removes the memcpy.

This comment has been minimized.

@jrmuizel

jrmuizel Nov 29, 2018

Contributor

I filed rust-lang/rust#56333 for a problem contributing to the first example not working well in the playground.

This comment has been minimized.

@jrmuizel

jrmuizel Nov 29, 2018

Contributor

It looks like SmallVec also causes this to not work. I filed rust-lang/rust#56356 about that.

@@ -338,7 +338,7 @@ impl FrameBuilder {
let mut profile_counters = FrameProfileCounters::new();
profile_counters
.total_primitives
.set(self.prim_store.prim_count);
.set(self.prim_store.prim_count());

This comment has been minimized.

@jrmuizel

jrmuizel Nov 27, 2018

Contributor

It'd be nice if this only ran if profiling was on.

@kvark kvark force-pushed the kvark:picture-copies2 branch from 83b830f to 18455ee Nov 28, 2018
@kvark kvark changed the title Avoid picture primitive copies via VecHelper [WIP] Avoid picture primitive copies via VecHelper Nov 28, 2018
@kvark
Copy link
Member Author

kvark commented Nov 28, 2018

I'm not able to use @jrmuizel 's tool that find memcopies, since IR validation fails on webrender crate. However, I tried to reproduce this in playground, and in both cases (vec.push(x) and vec.alloc().init(x)) the compiler created the data on the stack first and then issued a single memcpy, so it doesn't appear to be fixing the copy.

Is my playground experiment incorrect? Otherwise, what assumptions are wrong?

@bors-servo
Copy link
Contributor

bors-servo commented Nov 28, 2018

The latest upstream changes (presumably #3359) made this pull request unmergeable. Please resolve the merge conflicts.

@jrmuizel
Copy link
Contributor

jrmuizel commented Nov 28, 2018

@kvark
Copy link
Member Author

kvark commented Nov 28, 2018

@jrmuizel indeed, I confirmed it with the playground experiment.
Therefore, we should proceed with this PR.

@kvark kvark force-pushed the kvark:picture-copies2 branch from 18455ee to d680e32 Nov 28, 2018
@kvark kvark changed the title [WIP] Avoid picture primitive copies via VecHelper Avoid picture primitive copies via VecHelper Nov 28, 2018
@kvark
Copy link
Member Author

kvark commented Nov 28, 2018

@bors-servo r=jrmuizel

@bors-servo
Copy link
Contributor

bors-servo commented Nov 28, 2018

📌 Commit d680e32 has been approved by jrmuizel

@bors-servo
Copy link
Contributor

bors-servo commented Nov 28, 2018

Testing commit d680e32 with merge be65949...

bors-servo added a commit that referenced this pull request Nov 28, 2018
Avoid picture primitive copies via VecHelper

This is a successor of #3360 that avoids the borrow checker dance via RAII
Addresses part of #3358

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/webrender/3362)
<!-- Reviewable:end -->
@bors-servo
Copy link
Contributor

bors-servo commented Nov 28, 2018

☀️ Test successful - status-appveyor, status-taskcluster
Approved by: jrmuizel
Pushing be65949 to master...

@bors-servo bors-servo merged commit d680e32 into servo:master Nov 28, 2018
2 of 3 checks passed
2 of 3 checks passed
continuous-integration/appveyor/pr Waiting for AppVeyor build to complete
Details
Taskcluster (pull_request) TaskGroup: success
Details
homu Test successful
Details
@kvark kvark deleted the kvark:picture-copies2 branch Nov 28, 2018
@jrmuizel
Copy link
Contributor

jrmuizel commented Nov 28, 2018

It looks like this may not have worked:

memcpy of 1592 in _ZN9webrender22display_list_flattener20DisplayListFlattener20pop_stacking_context17hc7e8ce6ec929e9eaE @
  write<webrender::picture::PicturePrimitive> /rustc/6f93e93af6f823948cc13d2938957757c6486d88/src/libcore/ptr.rs:739
  init<webrender::picture::PicturePrimitive> webrender/src/util.rs:33
  pop_stacking_context webrender/src/display_list_flattener.rs:1220
memcpy of 1592 in _ZN9webrender22display_list_flattener20DisplayListFlattener20pop_stacking_context17hc7e8ce6ec929e9eaE @
  write<webrender::picture::PicturePrimitive> /rustc/6f93e93af6f823948cc13d2938957757c6486d88/src/libcore/ptr.rs:739
  init<webrender::picture::PicturePrimitive> webrender/src/util.rs:33
  pop_stacking_context webrender/src/display_list_flattener.rs:1187
memcpy of 1592 in _ZN9webrender22display_list_flattener20DisplayListFlattener21push_stacking_context17h500d71065754d89aE @
  write<webrender::picture::PicturePrimitive> /rustc/6f93e93af6f823948cc13d2938957757c6486d88/src/libcore/ptr.rs:739
  init<webrender::picture::PicturePrimitive> webrender/src/util.rs:33
  cut_flat_item_sequence webrender/src/display_list_flattener.rs:2207
  push_stacking_context webrender/src/display_list_flattener.rs:994
memcpy of 1592 in _ZN9webrender22display_list_flattener20DisplayListFlattener20pop_stacking_context17hc7e8ce6ec929e9eaE @
  write<webrender::picture::PicturePrimitive> /rustc/6f93e93af6f823948cc13d2938957757c6486d88/src/libcore/ptr.rs:739
  init<webrender::picture::PicturePrimitive> webrender/src/util.rs:33
  pop_stacking_context webrender/src/display_list_flattener.rs:1158
memcpy of 1592 in _ZN9webrender22display_list_flattener20DisplayListFlattener12flatten_item17hfd2b2318f8c75547E @
  write<webrender::picture::PicturePrimitive> /rustc/6f93e93af6f823948cc13d2938957757c6486d88/src/libcore/ptr.rs:739
  init<webrender::picture::PicturePrimitive> webrender/src/util.rs:33
  pop_all_shadows webrender/src/display_list_flattener.rs:1564
  flatten_item webrender/src/display_list_flattener.rs:789
memcpy of 1592 in _ZN9webrender22display_list_flattener20DisplayListFlattener20pop_stacking_context17hc7e8ce6ec929e9eaE @
  write<webrender::picture::PicturePrimitive> /rustc/6f93e93af6f823948cc13d2938957757c6486d88/src/libcore/ptr.rs:739
  init<webrender::picture::PicturePrimitive> webrender/src/util.rs:33
  pop_stacking_context webrender/src/display_list_flattener.rs:1117

I wonder if llvm gets sad the enum variant is too big.

@jrmuizel
Copy link
Contributor

jrmuizel commented Nov 29, 2018

It looks like SmallVec is contributing to the sadness:

#![crate_type = "lib"]

extern crate smallvec;

use smallvec::SmallVec;

#[derive(Default)]
pub struct L {
    a: SmallVec<[f64; 16]>,
    b: SmallVec<[f64; 16]>,
    c: SmallVec<[f64; 16]>
}

pub struct Allocation<T> {
    f: *mut T,
}
use std::ptr;

impl<T> Allocation<T> {
    pub fn init(self, value: T) {
        unsafe { ptr::write(self.f, value) };
    }
}

#[inline(never)]
pub fn bar(a: Allocation<L>) {
    a.init(L{a: SmallVec::new(), b: SmallVec::new(), c: SmallVec::new()});
}

compiles to

playground::bar:
	pushq	%rbx
	subq	$720, %rsp
	movq	%rdi, %rbx
	xorps	%xmm0, %xmm0
	movaps	%xmm0, 288(%rsp)
	movaps	%xmm0, (%rsp)
	movaps	%xmm0, 144(%rsp)
	leaq	432(%rsp), %rdi
	movq	%rsp, %rsi
	movl	$144, %edx
	callq	memcpy@PLT
	leaq	576(%rsp), %rdi
	leaq	144(%rsp), %rsi
	movl	$144, %edx
	callq	memcpy@PLT
	leaq	288(%rsp), %rsi
	movl	$432, %edx
	movq	%rbx, %rdi
	callq	memcpy@PLT
	addq	$720, %rsp
	popq	%rbx
	retq
moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this pull request Nov 29, 2018
…4c9170a70ef7 (WR PR #3362). r=kats

servo/webrender#3362

Differential Revision: https://phabricator.services.mozilla.com/D13355

--HG--
extra : moz-landing-system : lando
jankeromnes pushed a commit to jankeromnes/gecko that referenced this pull request Nov 29, 2018
bors-servo added a commit that referenced this pull request Nov 30, 2018
Reduce data copies during internation

Excessive copying in `fn intern()` is something we noticed yesterday with @jrmuizel .
This PR attempts to improve them in two ways (each in a separate commit):
  1. reduce the `Update` enum size by moving the data out
  2. adopt entry-like API to accommodate the common pattern of `if index == v.len() { v.push(xxx) } else { v[index] = xxx; }` without panics between element construction and actual assignment. It builds upon the `VecHelper` work of #3362.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/webrender/3366)
<!-- Reviewable:end -->
@jrmuizel
Copy link
Contributor

jrmuizel commented Feb 2, 2019

The smallvec example now generates better code with Nightly:

playground::bar: # @playground::bar
# %bb.0:
	subq	$440, %rsp              # imm = 0x1B8
	xorps	%xmm0, %xmm0
	movaps	%xmm0, (%rsp)
	movaps	%xmm0, 144(%rsp)
	movaps	%xmm0, 288(%rsp)
	movq	%rsp, %rsi
	movl	$432, %edx              # imm = 0x1B0
	callq	*memcpy@GOTPCREL(%rip)
	addq	$440, %rsp              # imm = 0x1B8
	retq

However it looks like there's still an extra memcpy

@jrmuizel
Copy link
Contributor

jrmuizel commented Feb 2, 2019

I filed a follow up at rust-lang/rust#58082

gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified that referenced this pull request Oct 3, 2019
…4c9170a70ef7 (WR PR #3362). r=kats

servo/webrender#3362

Differential Revision: https://phabricator.services.mozilla.com/D13355

UltraBlame original commit: 54d56b9a77fb9d41eff3ce1662ba5fff538365d2
gecko-dev-updater pushed a commit to marco-c/gecko-dev-comments-removed that referenced this pull request Oct 3, 2019
…4c9170a70ef7 (WR PR #3362). r=kats

servo/webrender#3362

Differential Revision: https://phabricator.services.mozilla.com/D13355

UltraBlame original commit: 54d56b9a77fb9d41eff3ce1662ba5fff538365d2
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified-and-comments-removed that referenced this pull request Oct 3, 2019
…4c9170a70ef7 (WR PR #3362). r=kats

servo/webrender#3362

Differential Revision: https://phabricator.services.mozilla.com/D13355

UltraBlame original commit: 54d56b9a77fb9d41eff3ce1662ba5fff538365d2
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

4 participants
You can’t perform that action at this time.