Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add profiling for IPC overhead #1123

Merged
merged 1 commit into from Apr 17, 2017
Merged
Changes from all commits
Commits
File filter...
Filter file types
Jump to…
Jump to file
Failed to load files.

Always

Just for now

Add profiling for IPC overhead

Note: this adds a dependency on `time` to `webrender_traits`.

The purpose of this is to evaluate the overhead of different IPC implementations and serialization strategies. In particular, we need to replace the current implementation, as it is unsound; it just blindly transmutes, leaving room for a malicious process to create Undefined representations of e.g. enums.

We take 4 time stamps:

* before serialization to `Vec<u8>` happens
* after serialization
* before deserialization
* after deserialization

Because some of these time stamps happen in webrender_traits and cross-process, we need to take them manually and send the first two over IPC. Currently I'm just stuffing them in one of the descriptors, since that's easy and minimally invasive (but Firefox's headers will need to be updated, I think).

We consider the time to perform the message send cross-process to be the time between serialization ending and deserialization starting. Overall IPC overhead is the whole process. We also record how large the two sent buffers are, as this may change significantly with a Proper Serialization Mechanism.

In my preliminary testing basically all overhead is currently in the actual message send process (about 0.5ms on Mac in Servo). There's sometimes some overhead from deserialization, which is probably stuffing the auxlist in a map. About 200k is being sent for the display list, and 100k for the aux list.
  • Loading branch information
Gankra committed Apr 17, 2017
commit cb0bc286962ce8785a18ef7367ce869e4a334a90
@@ -287,19 +287,58 @@ impl TextureCacheProfileCounters {

#[derive(Clone)]
pub struct BackendProfileCounters {
pub total_time: TimeProfileCounter,
pub resources: ResourceProfileCounters,
pub ipc: IpcProfileCounters,
}

#[derive(Clone)]
pub struct ResourceProfileCounters {
pub font_templates: ResourceProfileCounter,
pub image_templates: ResourceProfileCounter,
pub total_time: TimeProfileCounter,
pub texture_cache: TextureCacheProfileCounters,
}

#[derive(Clone)]
pub struct IpcProfileCounters {
pub serialize_time: TimeProfileCounter,
pub deserialize_time: TimeProfileCounter,
pub send_time: TimeProfileCounter,
pub total_time: TimeProfileCounter,
pub display_len: IntProfileCounter,
pub aux_len: IntProfileCounter,
}

impl IpcProfileCounters {
pub fn set(&mut self, serial_start: u64, serial_end: u64,
deserial_start: u64, deserial_end: u64,
display_len: usize, aux_len: usize) {
self.serialize_time.set(serial_end - serial_start);
self.deserialize_time.set(deserial_end - deserial_start);
self.send_time.set(deserial_start - serial_end);
self.total_time.set(deserial_end - serial_start);
self.display_len.set(display_len);
self.aux_len.set(aux_len);
}
}

impl BackendProfileCounters {
pub fn new() -> BackendProfileCounters {
BackendProfileCounters {
font_templates: ResourceProfileCounter::new("Font Templates"),
image_templates: ResourceProfileCounter::new("Image Templates"),
total_time: TimeProfileCounter::new("Backend CPU Time", false),
texture_cache: TextureCacheProfileCounters::new(),
resources: ResourceProfileCounters {
font_templates: ResourceProfileCounter::new("Font Templates"),
image_templates: ResourceProfileCounter::new("Image Templates"),
texture_cache: TextureCacheProfileCounters::new(),
},
ipc: IpcProfileCounters {
serialize_time: TimeProfileCounter::new("IPC Serialize Time", false),
deserialize_time: TimeProfileCounter::new("IPC Deserialize Time", false),
send_time: TimeProfileCounter::new("IPC Send Time", false),
total_time: TimeProfileCounter::new("IPC Time", false),
display_len: IntProfileCounter::new("IPC Display List Len"),
aux_len: IntProfileCounter::new("IPC Aux List Len"),
}
}
}

@@ -562,6 +601,7 @@ pub struct Profiler {
compositor_time: ProfileGraph,
gpu_time: ProfileGraph,
gpu_frames: GpuFrameCollection,
ipc_time: ProfileGraph,
}

impl Profiler {
@@ -575,6 +615,7 @@ impl Profiler {
compositor_time: ProfileGraph::new(600),
gpu_time: ProfileGraph::new(600),
gpu_frames: GpuFrameCollection::new(),
ipc_time: ProfileGraph::new(600),
}
}

@@ -677,15 +718,24 @@ impl Profiler {
], debug_renderer, true);

self.draw_counters(&[
&backend_profile.font_templates,
&backend_profile.image_templates,
&backend_profile.resources.font_templates,
&backend_profile.resources.image_templates,
], debug_renderer, true);

self.draw_counters(&[
&backend_profile.texture_cache.pages_a8,
&backend_profile.texture_cache.pages_rgb8,
&backend_profile.texture_cache.pages_rgba8,
&backend_profile.texture_cache.pages_rg8,
&backend_profile.resources.texture_cache.pages_a8,
&backend_profile.resources.texture_cache.pages_rgb8,
&backend_profile.resources.texture_cache.pages_rgba8,
&backend_profile.resources.texture_cache.pages_rg8,
], debug_renderer, true);

self.draw_counters(&[
&backend_profile.ipc.serialize_time,
&backend_profile.ipc.send_time,
&backend_profile.ipc.deserialize_time,
&backend_profile.ipc.total_time,
&backend_profile.ipc.display_len,
&backend_profile.ipc.aux_len,
], debug_renderer, true);

self.draw_counters(&[
@@ -698,17 +748,23 @@ impl Profiler {
&renderer_timers.cpu_time,
&renderer_timers.gpu_time,
], debug_renderer, false);




self.backend_time.push(backend_profile.total_time.nanoseconds);
self.compositor_time.push(renderer_timers.cpu_time.nanoseconds);
self.ipc_time.push(backend_profile.ipc.total_time.nanoseconds);
self.gpu_time.push(gpu_time);
self.gpu_frames.push(gpu_time, gpu_samples);


let rect = self.backend_time.draw_graph(self.x_left, self.y_left, "CPU (backend)", debug_renderer);
self.y_left += rect.size.height + PROFILE_PADDING;
let rect = self.compositor_time.draw_graph(self.x_left, self.y_left, "CPU (compositor)", debug_renderer);
self.y_left += rect.size.height + PROFILE_PADDING;
let rect = self.ipc_time.draw_graph(self.x_left, self.y_left, "DisplayList IPC", debug_renderer);
self.y_left += rect.size.height + PROFILE_PADDING;
let rect = self.gpu_time.draw_graph(self.x_left, self.y_left, "GPU", debug_renderer);
self.y_left += rect.size.height + PROFILE_PADDING;
let rect = self.gpu_frames.draw(self.x_left,
@@ -123,7 +123,7 @@ impl RenderBackend {
}
match msg {
ApiMsg::AddRawFont(id, bytes) => {
profile_counters.font_templates.inc(bytes.len());
profile_counters.resources.font_templates.inc(bytes.len());
self.resource_cache
.add_font_template(id, FontTemplate::Raw(Arc::new(bytes)));
}
@@ -144,7 +144,7 @@ impl RenderBackend {
}
ApiMsg::AddImage(id, descriptor, data, tiling) => {
if let ImageData::Raw(ref bytes) = data {
profile_counters.image_templates.inc(bytes.len());
profile_counters.resources.image_templates.inc(bytes.len());
}
self.resource_cache.add_image_template(id, descriptor, data, tiling);
}
@@ -212,15 +212,18 @@ impl RenderBackend {
if !preserve_frame_state {
self.discard_frame_state_for_pipeline(pipeline_id);
}

let counters = &mut profile_counters.ipc;
profile_counters.total_time.profile(|| {
self.scene.set_display_list(pipeline_id,
epoch,
built_display_list,
background_color,
viewport_size,
auxiliary_lists);
auxiliary_lists,
counters);
self.build_scene();
})
});
}
ApiMsg::SetRootPipeline(pipeline_id) => {
profile_scope!("SetRootPipeline");
@@ -237,7 +240,7 @@ impl RenderBackend {
ApiMsg::Scroll(delta, cursor, move_phase) => {
profile_scope!("Scroll");
let frame = {
let counters = &mut profile_counters.texture_cache;
let counters = &mut profile_counters.resources.texture_cache;
profile_counters.total_time.profile(|| {
if self.frame.scroll(delta, cursor, move_phase) {
Some(self.render(counters))
@@ -258,7 +261,7 @@ impl RenderBackend {
ApiMsg::ScrollNodeWithId(origin, id) => {
profile_scope!("ScrollNodeWithScrollId");
let frame = {
let counters = &mut profile_counters.texture_cache;
let counters = &mut profile_counters.resources.texture_cache;
profile_counters.total_time.profile(|| {
if self.frame.scroll_nodes(origin, id) {
Some(self.render(counters))
@@ -280,7 +283,7 @@ impl RenderBackend {
ApiMsg::TickScrollingBounce => {
profile_scope!("TickScrollingBounce");
let frame = {
let counters = &mut profile_counters.texture_cache;
let counters = &mut profile_counters.resources.texture_cache;
profile_counters.total_time.profile(|| {
self.frame.tick_scrolling_bounce_animations();
self.render(counters)
@@ -380,7 +383,7 @@ impl RenderBackend {
}

let frame = {
let counters = &mut profile_counters.texture_cache;
let counters = &mut profile_counters.resources.texture_cache;
profile_counters.total_time.profile(|| {
self.render(counters)
})
@@ -805,14 +805,14 @@ impl Renderer {
ImageDescriptor::new(2, 2, ImageFormat::RGBA8, false),
TextureFilter::Linear,
ImageData::Raw(Arc::new(white_pixels)),
&mut backend_profile_counters.texture_cache);
&mut backend_profile_counters.resources.texture_cache);

let dummy_mask_image_id = texture_cache.new_item_id();
texture_cache.insert(dummy_mask_image_id,
ImageDescriptor::new(2, 2, ImageFormat::A8, false),
TextureFilter::Linear,
ImageData::Raw(Arc::new(mask_pixels)),
&mut backend_profile_counters.texture_cache);
&mut backend_profile_counters.resources.texture_cache);

let dummy_cache_texture_id = device.create_texture_ids(1, TextureTarget::Array)[0];
device.init_texture(dummy_cache_texture_id,
@@ -3,9 +3,11 @@
* file, You can obtain one at http://mozilla.org/MPL/2.0/. */

use fnv::FnvHasher;
use profiler::IpcProfileCounters;
use std::collections::HashMap;
use std::hash::BuildHasherDefault;
use tiling::AuxiliaryListsMap;
use time::precise_time_ns;
use webrender_traits::{AuxiliaryLists, BuiltDisplayList, PipelineId, Epoch, ColorF};
use webrender_traits::{DisplayItem, DynamicProperties, LayerSize, LayoutTransform};
use webrender_traits::{PropertyBinding, PropertyBindingId};
@@ -119,10 +121,24 @@ impl Scene {
built_display_list: BuiltDisplayList,
background_color: Option<ColorF>,
viewport_size: LayerSize,
auxiliary_lists: AuxiliaryLists) {
auxiliary_lists: AuxiliaryLists,
profile_counters: &mut IpcProfileCounters) {

let display_list_len = built_display_list.data().len();
let aux_list_len = auxiliary_lists.data().len();
let (serial_start_time, serial_end_time) = built_display_list.serialization_times();

let deserial_start_time = precise_time_ns();

self.pipeline_auxiliary_lists.insert(pipeline_id, auxiliary_lists);
self.display_lists.insert(pipeline_id, built_display_list.into_display_items());

let deserial_end_time = precise_time_ns();

profile_counters.set(serial_start_time, serial_end_time,
deserial_start_time, deserial_end_time,
display_list_len, aux_list_len);

let new_pipeline = ScenePipeline {
pipeline_id: pipeline_id,
epoch: epoch,
@@ -20,6 +20,7 @@ ipc-channel = {version = "0.7", optional = true}
offscreen_gl_context = {version = "0.8", features = ["serde"], optional = true}
serde = "0.9"
serde_derive = "0.9"
time = "0.1"

[target.'cfg(target_os = "macos")'.dependencies]
core-graphics = "0.7"
@@ -5,6 +5,7 @@
use app_units::Au;
use std::mem;
use std::slice;
use time::precise_time_ns;
use {BorderDetails, BorderDisplayItem, BorderWidths, BoxShadowClipMode, BoxShadowDisplayItem};
use {ClipDisplayItem, ClipId, ClipRegion, ColorF, ComplexClipRegion, DisplayItem, ExtendMode};
use {FilterOp, FontKey, GlyphInstance, GlyphOptions, Gradient, GradientDisplayItem, GradientStop};
@@ -51,6 +52,10 @@ pub struct BuiltDisplayList {
pub struct BuiltDisplayListDescriptor {
/// The size in bytes of the display list items in this display list.
display_list_items_size: usize,
/// The first IPC time stamp: before any work has been done
serialization_start_time: u64,
/// The second IPC time stamp: after serialization
serialization_end_time: u64,
}

impl BuiltDisplayListDescriptor {
@@ -91,6 +96,9 @@ impl BuiltDisplayList {
}
}

pub fn serialization_times(&self) -> (u64, u64) {
(self.descriptor.serialization_start_time, self.descriptor.serialization_end_time)
}
}

#[derive(Clone)]
@@ -542,17 +550,23 @@ impl DisplayListBuilder {

pub fn finalize(self) -> (PipelineId, BuiltDisplayList, AuxiliaryLists) {
unsafe {
let serialization_start_time = precise_time_ns();

let blob = convert_vec_pod_to_blob(self.list);
let display_list_items_size = blob.len();
let aux_list = self.auxiliary_lists_builder.finalize();

let serialization_end_time = precise_time_ns();

(self.pipeline_id,
BuiltDisplayList {
descriptor: BuiltDisplayListDescriptor {
display_list_items_size: display_list_items_size,
display_list_items_size: blob.len(),
serialization_start_time: serialization_start_time,
serialization_end_time: serialization_end_time,
},
data: blob,
},
self.auxiliary_lists_builder.finalize())
aux_list)
}
}
}
@@ -20,6 +20,7 @@ extern crate offscreen_gl_context;
extern crate serde;
#[macro_use]
extern crate serde_derive;
extern crate time;

#[cfg(target_os = "macos")]
extern crate core_graphics;
ProTip! Use n and p to navigate between commits in a pull request.
You can’t perform that action at this time.