Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stream scope information only once #169

Merged
merged 43 commits into from
Jan 8, 2024
Merged

Conversation

TimonPost
Copy link
Member

@TimonPost TimonPost commented Nov 24, 2023

This PR optimizes puffin bandwidth and scope times. This is achieved by only sending scope details once and via a separate stream. In short this PR optimizes bandwidth, reduces string cloning for file name, function name, and scope name, and also improves the performance of a profile scope.

Migration guide.

Libraries using the profiling macros profile_function and profile_scope! don't have to change anything. This change is relevant and impactful for external libraries that register custom scopes manually. For example, timings gathered from GPU.

A scope exists out of:

  • metadata: function name, file path, line number, scope name
  • Time information: start ms, end ms

Before this PR we streamed all information every time we recorded a scope to the puffin GlobalProfiler. After this PR, we only stream the metadata once the scope is first seen. Conceptually this makes a lot of sense but has some implications too users working with custom scopes not generated by the macros. Namely, a user library needs to register a scope once before reporting it to puffin.

The main approach right now is the following:

  • The sink is supposed to have a FrameView, a frame view stores a complete collection of all scope details that has been registered.
  • Scope meta-data can be requested via ScopeId (unique scope id for every scope). Scope meta-data is stored inside the ScopeCollection which every FrameView owns.
fn print_scopes(frame_view: &FrameView) {
    let scope_collection = frame_view.scope_collection();

    for (name, scope_id) in scope_collection.scopes_by_name() {
        println!("{name:?} {scope_id:?}");
    }

    for (scope_id, scope_details) in scope_collection.scopes_by_id() {
        println!("{scope_id:?} {scope_details:?}");
    }
}

fn register_scopes_once(frame_view: &puffin::FrameView) {
    // This stores all scope details - can be used to find if scopes are already registered.
    let scope_collection = frame_view.scope_collection();

    // Those are the new custom scope details.
    let mut custom_scope_details = vec![
        puffin::ScopeDetails::from_scope_name("gpu")
                .with_file("gpu_file.rs")
                .with_line_nr(1),
        puffin::ScopeDetails::from_scope_name("gpu1")
                .with_file("gpu_file1.rs")
                .with_line_nr(1),
    ];

    // Only cal once for each new scope.
    // Use `scope_collection` to see if the scopes are new before registering.
    let new_scope_ids = puffin::GlobalProfiler::lock().register_user_scopes(&custom_scope_details);
}

fn report_custom_scopes() {
    // This is your frame view, can be the global frame view which will hook into the global profiler.
    // But can also be your own frame view with your own sink.
    let frame_view = puffin::GlobalFrameView::default();

    // Scopes should ONLY be registered once.
    // There are various ways to do this but is specific to your application.
    register_scopes_once(&frame_view.lock());

    // Custom assembled streams that capture time information for the registered scopes.
    let stream: puffin::Stream = _;

    // Report the custom user scopes.
    GlobalProfiler::lock().report_user_scopes(
        puffin::ThreadInfo {
            start_time_ns: None,
            name: "gpu".to_owned(),
        },
        &puffin::StreamInfo {
            num_scopes: 10,
            stream,
            depth: 5,
            range_ns: (0.0..10.0),
        }
        .as_stream_into_ref(),
    );

    // Iterate through all scopes, scopes are propagated from `GlobalProfiler` to the sink which writes them to the `FrameView`.
    print_scopes(&frame_view.lock());
}

Benchmarks

In general a 8% speed up compared to the main on my mac with M1 max, about 10% on my computer AMD Ryzen 9 5900X. It uses roughtly about 70% less bandwidth.

profile_function        time:   [67.497 ns 67.811 ns 68.202 ns]
                        change: [-12.856% -10.578% -8.4281%] (p = 0.00 < 0.05)
profile_function_data   time:   [67.549 ns 68.003 ns 68.670 ns]
                        change: [-13.471% -11.663% -9.9536%] (p = 0.00 < 0.05)
profile_scope           time:   [67.110 ns 67.284 ns 67.473 ns]
                        change: [-7.5609% -6.5684% -5.4889%] (p = 0.00 < 0.05)
profile_scope_data      time:   [68.673 ns 69.036 ns 69.492 ns]
                        change: [-7.5932% -6.6513% -5.6345%] (p = 0.00 < 0.05)
profile_function_off    time:   [2.6392 ns 2.6413 ns 2.6438 ns]
                        change: [+11.028% +11.602% +12.444%] (p = 0.00 < 0.05)
profile_function_data_off
                        time:   [2.4242 ns 2.4272 ns 2.4318 ns]
                        change: [-7.5747% -7.0304% -6.4890%] (p = 0.00 < 0.05)
profile_scope_off       time:   [2.3711 ns 2.3782 ns 2.3856 ns]
                        change: [+0.0496% +0.5844% +1.1478%] (p = 0.04 < 0.05)
profile_scope_data_off  time:   [2.3753 ns 2.3835 ns 2.3920 ns]
                        change: [-9.3472% -8.5543% -7.9134%] (p = 0.00 < 0.05)

Profile Locks

atomic set/get (criterion can not measure time)
thread_local_set        time:   [946.26 ps 950.81 ps 956.17 ps]
rw_lock_get             time:   [13.754 ns 13.788 ns 13.826 ns]
rw_lock_set             time:   [12.923 ns 12.962 ns 13.000 ns]
once_cell               time:   [315.48 ps 316.55 ps 317.65 ps]
once_lock               time:   [690.91 ps 694.27 ps 697.63 ps]

Ideally we want to use once_cell in the profiler macros however this seems to cause some deadlocks when running the profiler and I'm not sure why. It seems to run fine when just profiling things. We should be able to have a cell here as its thread local right?

Changes

  • Remove location and id from the profile scope.
  • Remove string parsing in macros, instead, this is now done when creating a new frame for the GlobalProfiler.
  • Put ScopeCollection into FrameView such that each view contains its own collection of scopes. Its possible to get a full snapshot of all seen scope details to each sink.
  • How to pass the scope details to the GlobalProfiler
    I pass the profiler scope details via type array to the GlobalProfiler, I started off with encoding it into a separate stream, but this complicates parsing/reading logic and is not really needed. For the HTTP server we can just serialize the new scopes and send them to the user.
  • Change scope id to be the incremental monolithic integer instead of a string of name.
  • Add new HTTP server/client protocol header
  • Upgrade protocol version

Fixes: #168

@TimonPost TimonPost changed the title Timon/stream scope info once Stream scope information only once Nov 24, 2023
@TimonPost TimonPost force-pushed the timon/stream-scope-info-once branch 6 times, most recently from 2dd7719 to a7ec5ef Compare November 28, 2023 14:33
puffin/src/data.rs Outdated Show resolved Hide resolved
puffin/src/lib.rs Outdated Show resolved Hide resolved
Copy link
Collaborator

@emilk emilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very exciting!

The new integer id:s should also speed up the scope merging in the UI

@TimonPost TimonPost marked this pull request as ready for review November 28, 2023 14:52
@TimonPost
Copy link
Member Author

TimonPost commented Nov 28, 2023

I opened for review, feel free to start commenting, I'm not completely done yet with streaming the scope details via HTTP server, but most logic is in place. And there is some thinking needed on how to and where to add the sync primitives.

Also I'm not entirely sure what to do with the 'location' in the merge note, think I'm cloning it more then necessary at the moment.

puffin_http/src/server.rs Outdated Show resolved Hide resolved
@TimonPost TimonPost removed the request for review from NiklasNummelin November 28, 2023 21:05
puffin/src/frame_data.rs Outdated Show resolved Hide resolved
puffin/src/lib.rs Outdated Show resolved Hide resolved
puffin/src/lib.rs Outdated Show resolved Hide resolved
puffin/src/lib.rs Outdated Show resolved Hide resolved
puffin/src/lib.rs Outdated Show resolved Hide resolved
puffin/src/frame_data.rs Outdated Show resolved Hide resolved
puffin/src/frame_data.rs Outdated Show resolved Hide resolved
puffin/src/lib.rs Outdated Show resolved Hide resolved
Copy link
Collaborator

@emilk emilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good start, but I find the five different kinds of ScopeDetails quite confusing, including their fields.

puffin/src/frame_data.rs Outdated Show resolved Hide resolved
puffin/src/frame_data.rs Outdated Show resolved Hide resolved
puffin/src/frame_data.rs Outdated Show resolved Hide resolved
puffin/src/frame_data.rs Outdated Show resolved Hide resolved
puffin/src/scope_details.rs Outdated Show resolved Hide resolved
puffin/src/scope_details.rs Outdated Show resolved Hide resolved
puffin/src/scope_details.rs Outdated Show resolved Hide resolved
puffin/src/scope_details.rs Outdated Show resolved Hide resolved
puffin/src/scope_details.rs Outdated Show resolved Hide resolved
puffin/src/scope_details.rs Outdated Show resolved Hide resolved
@TimonPost
Copy link
Member Author

TimonPost commented Dec 1, 2023

Thanks for the detailed review! I took a closer look at the 4 detailed scopes, and it seemed like it was indeed possible to boil it down to just one.

I did the following:

  • Remove CustomScopeDetails, SerdeScopedDetails, ScopeDetailsRef and made ScopeDetailsOwned to be ScopeDetails which now also has a builder functions for the external users and uses Cow to preserve static refs.
  • Remove the raw fields from ScopeDetailsOwned as they were not really used and caused confusion with dynamic_xyz fields which are now have no prefix e.g function_name, file_path.
  • Improved some consistency in naming

puffin/src/frame_data.rs Outdated Show resolved Hide resolved
puffin/src/lib.rs Outdated Show resolved Hide resolved
puffin/src/lib.rs Outdated Show resolved Hide resolved
@TimonPost
Copy link
Member Author

TimonPost commented Dec 19, 2023

Thanks for all reviews. I will follow up sometime after this PR to update and add lots of clippy checks so we can automize that. I processed your commend and improved the popup windows a bit. I'll try make the width smaller.

image image image

@TimonPost TimonPost requested a review from emilk December 19, 2023 13:31
Copy link
Collaborator

@emilk emilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please take care to test the ui thoroughly

puffin/src/frame_data.rs Outdated Show resolved Hide resolved
puffin/src/frame_data.rs Outdated Show resolved Hide resolved
puffin/src/scope_details.rs Outdated Show resolved Hide resolved
puffin_egui/src/flamegraph.rs Outdated Show resolved Hide resolved
puffin_egui/src/flamegraph.rs Outdated Show resolved Hide resolved
puffin_egui/src/flamegraph.rs Outdated Show resolved Hide resolved
puffin_egui/src/flamegraph.rs Outdated Show resolved Hide resolved
puffin/src/frame_data.rs Outdated Show resolved Hide resolved
puffin/src/lib.rs Outdated Show resolved Hide resolved
puffin/src/lib.rs Outdated Show resolved Hide resolved
puffin/src/scope_details.rs Show resolved Hide resolved
puffin_egui/src/flamegraph.rs Outdated Show resolved Hide resolved
puffin_egui/src/flamegraph.rs Outdated Show resolved Hide resolved
puffin_egui/src/flamegraph.rs Outdated Show resolved Hide resolved
puffin_egui/src/flamegraph.rs Outdated Show resolved Hide resolved
puffin_egui/src/lib.rs Outdated Show resolved Hide resolved
@TimonPost TimonPost requested review from repi and removed request for repi December 22, 2023 12:06
Copy link
Collaborator

@emilk emilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

puffin/src/scope_details.rs Outdated Show resolved Hide resolved
puffin/src/scope_details.rs Outdated Show resolved Hide resolved
puffin/src/scope_details.rs Outdated Show resolved Hide resolved
puffin/src/scope_details.rs Outdated Show resolved Hide resolved
puffin/src/scope_details.rs Show resolved Hide resolved
puffin_egui/src/flamegraph.rs Outdated Show resolved Hide resolved
puffin/src/scope_details.rs Outdated Show resolved Hide resolved
Because scope collection is more of a data structure that users might use them selves. And allocating id's is more of a puffin specific thing. This was feedback from Traverse.
Copy link
Contributor

@repi repi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sweet

@TimonPost
Copy link
Member Author

Thanks @emilk for all the review comments! I'll merge this now, but won't do a release for about a week. Will maybe do some follow-ups and also update my own branch for Embarks internal project. Also, @MarijnS95 was working on migration and progressed some feedback from him in this PR. We can take it easy and see if any additional feature requests come in from concrete implementations.

I also updated the migration help in the description, the changelog and the protocol version number.

@TimonPost TimonPost merged commit 2015dfc into main Jan 8, 2024
4 of 6 checks passed
@TimonPost TimonPost deleted the timon/stream-scope-info-once branch January 8, 2024 12:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Puffin scopes can no longer be used in crates with #[forbid(unsafe_code)]
4 participants