Backend Communication Profiling #3382

MichaelMauderer · 2022-04-06T12:19:04Z

Pull Request Description

Expands the profiling tooling:

- adds Enso-specific metadata functionality
- adds profiling to backend communication
- enables visualization of async tasks that are not active
- enables profiling of FRP outputs (important for tracking user interactions that trigger long-running computations)
- extends the sample demo scene that renders profiling data so that custom profiling data can be rendered easily

This shows the updated sample data. The grey bars show an async task that is not currently active. The vertical bars show a "Mark" that has been set in the timeline.

Peek.2022-04-06.15-01.mp4

[ci no changelog needed]

Checklist

Please include the following checklist in your PR:

The documentation has been updated if necessary.
All code conforms to the Scala, Java, and Rust style guides.
All code has been tested:
- Unit tests have been written where possible.
- If GUI codebase was changed: Enso GUI was tested when built using BOTH ./run dist and ./run watch.

wdanilo · 2022-04-08T04:30:40Z

@kazcw could you review it pls? I would just do super fast check after your review.

kazcw · 2022-04-08T15:04:29Z

lib/rust/ensogl/component/flame-graph/src/lib.rs

+    component
+}
+
+/// Instantiate a `Block` shape for the given block data from the profiler.


Comment is from the other function.

kazcw · 2022-04-08T15:05:08Z

lib/rust/ensogl/component/flame-graph/src/lib.rs

    component
 }

+const MIN_INTERVALL_TIME: f64 = 0.0;


"INTERVAL". Also, units?

kazcw · 2022-04-08T15:15:00Z

lib/rust/ensogl/component/flame-graph/src/lib.rs

+
+        let blocks_marks_aligned = marks.into_iter().map(|mut mark| {
+            mark.position -= origin_x as f64;
+            mark.position += X_SCALE;


kazcw · 2022-04-08T15:15:27Z

lib/rust/ensogl/component/flame-graph/src/lib.rs

    }

    /// Return a reference to the blocks that make up the flame graph.
    pub fn blocks(&self) -> &[Block] {
        &self.blocks
    }
+    /// Return a reference to the blocks that make up the flame graph.


Copied comment.

kazcw · 2022-04-08T15:15:44Z

lib/rust/ensogl/component/flame-graph/src/mark.rs

@@ -0,0 +1,190 @@
+//! A single block component that is used to build up a flame graph.


kazcw · 2022-04-08T15:25:02Z

lib/rust/ensogl/core/src/profiler.rs

+
+/// Log an RPC Event to the profiling framework.
+pub fn log_rpc_event(event_name: &'static str) {
+    let event_logger = enso_profiler::MetadataLogger::new("RpcEvent");


Each MetadataLogger registers itself separately with the logging framework. For performance we should create one per type, not one per event. It can be done with lazy_static.

Good point. lazy_static seems not to be enough as MetadataLogger is not Sync, but thread_local does the trick.
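The exchange above can be illustrated with a minimal sketch. The `MetadataLogger` below is a hypothetical stand-in, not the real `enso_profiler` type: it only mimics the two properties mentioned in the thread (it registers itself on construction, and it is not `Sync`). The event names are made up for illustration.

```rust
use std::cell::{Cell, RefCell};
use std::rc::Rc;

thread_local! {
    /// Counts logger registrations (stand-in for the profiling framework's registry).
    static REGISTRATIONS: Cell<usize> = Cell::new(0);
    /// One logger per metadata type, created lazily on first use in each thread.
    static RPC_EVENT_LOGGER: MetadataLogger = MetadataLogger::new("RpcEvent");
}

/// Hypothetical stand-in for `enso_profiler::MetadataLogger`. The `Rc` makes it `!Sync`,
/// which is what rules out a plain `lazy_static` in this sketch.
struct MetadataLogger {
    name:    &'static str,
    entries: Rc<RefCell<Vec<String>>>,
}

impl MetadataLogger {
    /// Creating a logger registers it; this is the cost we want to pay only once per type.
    fn new(name: &'static str) -> Self {
        REGISTRATIONS.with(|count| count.set(count.get() + 1));
        Self { name, entries: Rc::new(RefCell::new(Vec::new())) }
    }

    fn log(&self, event: &str) {
        self.entries.borrow_mut().push(format!("{}: {}", self.name, event));
    }

    fn entry_count(&self) -> usize {
        self.entries.borrow().len()
    }
}

/// Log an RPC event, reusing the thread-local logger instead of creating one per event.
pub fn log_rpc_event(event_name: &'static str) {
    RPC_EVENT_LOGGER.with(|logger| logger.log(event_name));
}

fn registrations() -> usize {
    REGISTRATIONS.with(|count| count.get())
}

fn logged_events() -> usize {
    RPC_EVENT_LOGGER.with(|logger| logger.entry_count())
}

fn main() {
    log_rpc_event("example/first");
    log_rpc_event("example/second");
    println!("registrations: {}, events: {}", registrations(), logged_events());
}
```

However many events are logged on a thread, the logger in this sketch is constructed, and therefore registered, only once on that thread.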

kazcw · 2022-04-08T15:29:14Z

lib/rust/ensogl/core/src/profiler.rs

+#[derive(Debug, Clone, serde::Serialize, serde::Deserialize, Eq, PartialEq)]
+pub enum Metadata {
+    /// An RPC event that was received from the backend.
+    RpcEvent(String),


Why should an ensogl module know anything about RPC? profiler supports arbitrarily-many dependency-injected metadata types so that we can separate concerns--a metadata type defined here should know only about EnsoGL metadata, and we can have one somewhere else that's appropriate for RPC messages.

To clarify this, the idea of this API design is that we can use a big enum that depends on everything when we're analyzing the data (because data consumers will need access to data type definitions from all over the app), but we shouldn't try to use such a cross-cutting enum when logging metadata from all over the codebase (it mixes concerns, and the dependencies all go circular). That's why MetadataLogger doesn't rely on an enum of all metadata types; instead it supports defining different variants of the enum at different places in the code. They come together in the serialized log, and then we can use a big all-knowing enum only for deserializing.

We will need a place to define the all-inclusive Metadata enum--a crate for Enso-specific profile-interpreting code, that is very high level; it will have dependencies on all the app's runtime crates that define metadata types. We could call it something like enso-profiler-data-tools (it defines metadata and tools for building on enso-profiler-data). When I implement the tool that integrates data from the backend I'll merge that into the same crate.

Okay. But that means we need to rely on a naming convention that is distributed across the codebase. Reading and writing happen in quite different places, but the names used need to stay in sync. Still, I agree that any other solution will eventually run into dependency issues, so I don't see another nice solution.

We could gather the names all in one place that defines them as consts for use when registering each MetadataLogger--serialization and deserialization would still be decoupled, but at least it would be easy to compare the metadata enum with the list of names.

That place could just be the independent crate, which would be imported in many places. But that crate would only need some 3rd party libs and the profiling crate. So, no circular dependencies.

I think I like that better.

Well no, wait. If we want to log custom data, we would either need to define that data in that crate, or we would end up with cyclic dependencies. But if we define the data in that crate, we would probably end up duplicating the data structures we would like to log and need to implement conversion to those.

I'm thinking especially of the cases like the EnsoGL stats, where we already have a data struct like

pub struct StatsData {
    pub frame_time: f64,
    pub fps: f64,
    pub wasm_memory_usage: u32,
    pub gpu_memory_usage: u32,
    pub draw_call_count: usize,
    pub buffer_count: usize,
    pub data_upload_count: usize,
    pub data_upload_size: u32,
    pub sprite_system_count: usize,
    pub sprite_count: usize,
    pub symbol_count: usize,
    pub mesh_count: usize,
    pub shader_count: usize,
    pub shader_compile_count: usize,
}

Which is generated by a macro, and can be easily serialized. Creating a duplicate and defining a conversion seems unnecessary boilerplate.

But we also need those structs for deserialization anyway. So, there really is no good place to put this if we don't use our own structs. I guess we do need to duplicate those or we get into dependency issues eventually. Even putting this into Enso core would break as soon as we want to log some struct from some other crate that depends on Enso core.

Or am I missing something here?

We can't put the data definitions in a common crate; data consumers will need to depend on various application crates so that they can access types like StatsData. But I'm thinking it would be useful to define all the loggable types' names (for use with MetadataLogger::new) in a common crate: just a bunch of declarations like const STATS_DATA: &str = "StatsData"; (we could encourage this pattern with a newtype used here and as the parameter type of MetadataLogger::new). This would be enough so that we could see the list of names of serializable types, and compare it to the types a Metadata enum can deserialize.

The type definitions used for serialization and deserialization are only linked informally (nothing statically ensures they are compatible), but with serialization this is kind of true anyway; even if we used one type definition for both sides in a particular build, we may be deserializing a file that was created by a build of the app with different definitions. It's outside the realm of static guarantees, but that's what RecoverableError is for.
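A rough sketch of the pattern proposed in this comment, under several assumptions: `MetadataName`, `LogEntry`, `log`, `read`, and `ReadError` are hypothetical names invented for illustration, the payload is a plain string rather than real serialized data, and `ReadError` merely plays the role the comment assigns to `RecoverableError`.

```rust
/// Hypothetical newtype for registered metadata names (not an existing API).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct MetadataName(pub &'static str);

/// All loggable type names gathered in one place, as suggested above.
pub mod names {
    use super::MetadataName;
    pub const STATS_DATA: MetadataName = MetadataName("StatsData");
    pub const RPC_EVENT: MetadataName = MetadataName("RpcEvent");
}

/// A serialized log entry: the registered name plus an opaque payload.
#[derive(Clone, Debug, PartialEq)]
pub struct LogEntry {
    pub name:    String,
    pub payload: String,
}

/// Producer side: each call site knows only its own name constant, not the big enum.
pub fn log(name: MetadataName, payload: impl Into<String>) -> LogEntry {
    LogEntry { name: name.0.to_owned(), payload: payload.into() }
}

/// Consumer side: the all-knowing enum exists only where profiles are read.
#[derive(Debug, PartialEq)]
pub enum Metadata {
    RpcEvent(String),
    /// Stand-in for `StatsData`; carries only an FPS value in this sketch.
    Fps(f64),
}

/// Errors a reader can recover from, e.g. a profile written by a different build.
#[derive(Debug, PartialEq)]
pub enum ReadError {
    UnknownName(String),
    BadPayload(String),
}

/// Interpret one log entry by comparing its name against the shared constants.
pub fn read(entry: &LogEntry) -> Result<Metadata, ReadError> {
    if entry.name == names::RPC_EVENT.0 {
        Ok(Metadata::RpcEvent(entry.payload.clone()))
    } else if entry.name == names::STATS_DATA.0 {
        entry
            .payload
            .parse::<f64>()
            .map(Metadata::Fps)
            .map_err(|_| ReadError::BadPayload(entry.payload.clone()))
    } else {
        Err(ReadError::UnknownName(entry.name.clone()))
    }
}

fn main() {
    let entries = vec![log(names::RPC_EVENT, "example/ping"), log(names::STATS_DATA, "60.0")];
    for entry in &entries {
        println!("{:?}", read(entry));
    }
}
```

Writer and reader stay decoupled here: they share only the name constants, and a mismatch shows up as an error value at read time rather than as a compile-time failure.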

I'm not sure if I see a big benefit to that list. The usages will be all over the place anyway.
I restructured it now, but did not include this list of names as a separate crate. Let me know if you see this as super useful, and I’ll include it.

kazcw · 2022-04-08T15:34:11Z

lib/rust/ensogl/example/profiling-run-graph/src/lib.rs

@@ -1,4 +1,6 @@
-//! Demo scene showing a sample flame graph.
+//! Demo scene showing a sample flame graph. Can be used to display a log file, if you have one.
+//! To do so, set the `PROFILER_LOG_DATA` to contain the profiling log via `include_str`, and it


Why set a profile at compile time? We currently render a file obtained at runtime from dist/content/profile.json

Loading the data at runtime is a bit more useful, just a bit more work. I changed it to load the data at runtime instead of hard-coding the content.

lib/rust/json-rpc/Cargo.toml

lib/rust/profiler/flame-graph/src/lib.rs

kazcw · 2022-04-12T14:50:29Z

lib/rust/profiler/flame-graph/src/lib.rs

-                const DURATION_FLOOR_MS: f64 = 3.0;
-                if end < start + DURATION_FLOOR_MS {
-                    end = start + DURATION_FLOOR_MS;
+            for window in measurement.intervals.windows(2) {


Doesn't create any Active block for the last interval of each measurement.

Well spotted! Fixed.
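To illustrate the kind of bug described here: the sketch below assumes that each measurement carries a sorted list of active intervals and that the loop emits an Active block for the first element of each pair plus a Paused block for the gap between the pair. The `Block`, `Activity`, and `Interval` types are hypothetical and not taken from the PR.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Activity {
    Active,
    Paused,
}

#[derive(Clone, Copy, Debug, PartialEq)]
struct Block {
    start:    f64,
    end:      f64,
    activity: Activity,
}

/// An active interval as a `(start, end)` pair in milliseconds (assumed representation).
type Interval = (f64, f64);

/// We look at each interval together with its successor, hence windows of two.
const CURRENT_AND_NEXT: usize = 2;

/// Pairwise iteration only: the last interval never appears as `current`.
fn blocks_from_pairs(intervals: &[Interval]) -> Vec<Block> {
    let mut blocks = Vec::new();
    for window in intervals.windows(CURRENT_AND_NEXT) {
        let (current, next) = (window[0], window[1]);
        blocks.push(Block { start: current.0, end: current.1, activity: Activity::Active });
        blocks.push(Block { start: current.1, end: next.0, activity: Activity::Paused });
    }
    blocks
}

/// Same as above, plus the Active block for the final interval.
fn blocks_complete(intervals: &[Interval]) -> Vec<Block> {
    let mut blocks = blocks_from_pairs(intervals);
    if let Some(&(start, end)) = intervals.last() {
        blocks.push(Block { start, end, activity: Activity::Active });
    }
    blocks
}

fn count_active(blocks: &[Block]) -> usize {
    blocks.iter().filter(|block| block.activity == Activity::Active).count()
}

fn main() {
    let intervals = [(0.0, 2.0), (5.0, 6.0), (9.0, 12.0)];
    println!("pairs only: {} active", count_active(&blocks_from_pairs(&intervals)));
    println!("complete:   {} active", count_active(&blocks_complete(&intervals)));
}
```

A measurement with a single interval makes the problem most visible: `windows(2)` yields nothing for it, so without the extra step it produces no blocks at all.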

wdanilo

@kazcw you did an amazing review, thanks for it!

wdanilo · 2022-04-14T02:06:33Z

app/gui/enso-profiler-metadata/Cargo.toml

+version = "0.1.0"
+edition = "2021"
+
+# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html


We are normally not keeping this comment in Cargo tomls I think.

wdanilo · 2022-04-14T02:06:41Z

app/gui/enso-profiler-metadata/src/lib.rs

+use std::fmt::Display;
+use std::fmt::Formatter;
+
+


sections missing

wdanilo · 2022-04-14T02:08:14Z

app/gui/view/graph-editor/src/lib.rs

+        frp.private.output.default_x_gap_between_nodes <+ default_x_gap;
+        frp.private.output.default_y_gap_between_nodes <+ default_y_gap;
+        frp.private.output.min_x_spacing_for_new_nodes <+ min_x_spacing;
    }
-    frp.source.default_x_gap_between_nodes.emit(default_x_gap.value());
-    frp.source.default_y_gap_between_nodes.emit(default_y_gap.value());
-    frp.source.min_x_spacing_for_new_nodes.emit(min_x_spacing.value());
+    frp.private.output.default_x_gap_between_nodes.emit(default_x_gap.value());
+    frp.private.output.default_y_gap_between_nodes.emit(default_y_gap.value());
+    frp.private.output.min_x_spacing_for_new_nodes.emit(min_x_spacing.value());


Wow, you moved the graph editor to the new FRP definition, that's huge! ❤️

wdanilo · 2022-04-14T02:09:35Z

lib/rust/ensogl/component/flame-graph/src/mark.rs

+
+
+            (shape + hover_area).into()


Please remove at least one empty line here. We can definitely use empty lines to group related lines together, but two empty lines is too much IMO. If you don't agree, let's talk about it! :)

wdanilo · 2022-04-14T02:09:50Z

lib/rust/ensogl/example/profiling-run-graph/Cargo.toml

+    'RequestMode',
+    'Response',
+    'Window',
+]


wdanilo · 2022-04-14T02:10:34Z

lib/rust/profiler/flame-graph/src/lib.rs

-                const DURATION_FLOOR_MS: f64 = 3.0;
-                if end < start + DURATION_FLOOR_MS {
-                    end = start + DURATION_FLOOR_MS;
+            for window in measurement.intervals.windows(2) {


This 2 looks strange and I don't understand where it comes from. It would be much clearer if you refactored it into a named variable.

kazcw

There's one more thing missing in visit_measurement: Each measurement should have an inactive interval from its creation time (Measurement.created) to the start of its first active interval.
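The missing piece described here could look roughly like the following. Only the `created` field name comes from the comment; the `Measurement` layout and the `leading_inactive` helper are assumptions made for this sketch.

```rust
/// Simplified stand-in for a profiler measurement (assumed layout).
struct Measurement {
    /// Time at which the measurement was created, in milliseconds.
    created:   f64,
    /// Active intervals as `(start, end)` pairs, sorted by start time.
    intervals: Vec<(f64, f64)>,
}

/// The inactive span between creation and the start of the first active interval, if any.
fn leading_inactive(measurement: &Measurement) -> Option<(f64, f64)> {
    let first_start = measurement.intervals.first()?.0;
    // No span to draw when the measurement started running at creation time.
    (first_start > measurement.created).then(|| (measurement.created, first_start))
}

fn main() {
    let delayed = Measurement { created: 0.0, intervals: vec![(5.0, 8.0), (10.0, 11.0)] };
    let immediate = Measurement { created: 5.0, intervals: vec![(5.0, 8.0)] };
    println!("{:?}", leading_inactive(&delayed));
    println!("{:?}", leading_inactive(&immediate));
}
```

An async task that is created long before it first runs would then show a grey bar covering the time it spent waiting.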

Add logging of EnsoGL performance stats to the profiling framework. Also extends the visualization in the debug scene to show an overview of the performance stats. We now render a timeline of blocks that indicate by their colour the rough FPS range we are in: https://user-images.githubusercontent.com/1428930/162433094-57fbb61a-b502-43bb-8815-b7fc992d3862.mp4

Important Notes

[ci no changelog needed]

Needs to be merged after #3382 as it requires some changes about metadata logging from there. That is why this PR is currently still in draft mode and based on that branch.

See: [#181837344](https://www.pivotaltracker.com/story/show/181837344).

I've separated this PR from some deeper changes I'm making to the profile format, because the changeset was getting too complex. The new APIs and tools in this PR are fully-implemented, except the profile format is too simplistic--it doesn't currently support headers that are needed to determine the relative timings of events from different processes.

- Adds basic support for profile files containing data collected by multiple processes.
- Implements `api_events_to_profile`, a tool for converting backend message logs (#3392) to the `profiler` format so they can be merged with frontend profiles (currently they can be merged with `cat`, but the next PR will introduce a merge tool).
- Introduces `message_beanpoles`, a simple tool that diagrams timing relationships between frontend and backend messages.

Important Notes

- All TODOs introduced here will be addressed in the next PR that defines the new format.
- Introduced a new crate, `enso_profiler_enso_data`, to be used by profile consumers that need to refer to Enso application datatypes to interpret metadata.
- Introduced a `ProfileBuilder` abstraction for writing the JSON profile format; partially decouples the runtime event log structures from the format definition.
- Introducing the conversion performed for `ProfilerBuilder` uncovered that the `.._with_same_start!` low-level `profiler` APIs don't currently work; they return `Started<_>` profilers, but that is inconsistent with the stricter data model that I introduced when I implemented `profiler_data`; they need to return profilers in a created, unstarted state. Low-level async profilers have not been a priority, but once #3382 merges we'll have a way to render their data, which will be really useful because async profilers capture *why* we're doing things. I'll bring up scheduling this in the next performance meeting.

MichaelMauderer self-assigned this Apr 6, 2022

MichaelMauderer force-pushed the wip/michaelmauderer/backend_communication_profiling branch from 51f143f to c7e49d8 Compare April 6, 2022 12:24

MichaelMauderer marked this pull request as ready for review April 6, 2022 13:03

MichaelMauderer requested review from wdanilo, farmaazon, mwu-tow and 4e6 as code owners April 6, 2022 13:03

MichaelMauderer and others added 4 commits April 6, 2022 14:19

Update GraphEditor FRP API.

0dcae64

Add ability to render marks in visualisation.

31c2968

Expand profiling. Update visualisation to show rpc events.

c7e49d8

Merge branch 'develop' into wip/michaelmauderer/backend_communication…

a264a77

…_profiling

kazcw mentioned this pull request Apr 7, 2022

Multi-frame shader compilation #3378

Merged

4 tasks

MichaelMauderer requested a review from kazcw April 8, 2022 08:23

MichaelMauderer mentioned this pull request Apr 8, 2022

Integrate Ensogl stats with profiling framework #3388

Merged

4 tasks

Merge branch 'develop' into wip/michaelmauderer/backend_communication…

6118d83

…_profiling

kazcw reviewed Apr 8, 2022

MichaelMauderer added 7 commits April 11, 2022 11:02

Implement easy PR feedback.

7ba61cb

Move metadata profiler functionality out of enso_core.

bb827e7

Read log data at runtime.

822b0c3

Read log data at runtime.

b9f4262

Add missing files.

0516cad

Read log data at runtime.

74351c8

Fix merge issues.

9ae601a

kazcw reviewed Apr 12, 2022

Restructure metadata definitions. Fix iteration bug.

ae2886e

wdanilo approved these changes Apr 14, 2022

kazcw mentioned this pull request Apr 15, 2022

Multi-process profiles. #3395

Merged

4 tasks

kazcw reviewed Apr 15, 2022

MichaelMauderer added 3 commits April 19, 2022 10:17

Merge branch 'develop' into wip/michaelmauderer/backend_communication…

6c6348e

…_profiling

Formatting.

a04bce0

Add missing intervall.

7023c9a

MichaelMauderer added the CI: Ready to merge This PR is eligible for automatic merge label Apr 19, 2022

Add @wdanilo to code owners for main Cargo.toml.

fe261ea

MichaelMauderer requested review from PabloBuchu and jdunkerley as code owners April 19, 2022 10:10

Merge branch 'develop' into wip/michaelmauderer/backend_communication…

338642a

…_profiling

MichaelMauderer merged commit 24e0f33 into develop Apr 19, 2022

MichaelMauderer deleted the wip/michaelmauderer/backend_communication_profiling branch April 19, 2022 11:30

This was referenced Feb 6, 2023

Developers should have a visualization that can show information from the backend-profiling. #4311

Closed

Profiling: Integrate event logs from application processes besides the IDE #4300

Closed
