Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for inline stacks #3556

Merged
merged 10 commits into from Nov 30, 2021
Merged

Conversation

mstange
Copy link
Contributor

@mstange mstange commented Sep 16, 2021

Fixes #2121.

Firefox profile: Without inline stacks / With inline stacks
Rust profile: Without inline stacks / With inline stacks

To try this out, you'll need to use profiler-symbol-server at the moment - inline stacks don't work yet with the Mozilla symbolication API or when profiling local builds. Here's how you can try it out:

  1. Make sure profiler-symbol-server is installed, or run cargo install profiler-symbol-server.
  2. In Firefox, start the profiler, and record for a bit.
  3. Open the browser console and execute await Services.profiler.dumpProfileToFileAsync("/Users/mstange/Desktop/gfx-profile.json") (with an appropriate path)
  4. Now you have a raw, unsymbolicated gecko profile.
  5. To see what happens with today's profiler.firefox.com, run profiler-symbol-server path/to/profile.json. Then stop profiler-symbol-server with Ctrl+C.
  6. To see what happens with the deploy proview, run PROFILER_URL="https://deploy-preview-3556--perf-html.netlify.app" profiler-symbol-server path/to/profile.json.

Overview

This PR consists of three big pieces:

  1. The profile format change which lets us represent inline frames in the profile.
  2. The symbolication change to accept a new inlines array per address from the symbolication API and to take this information into account during symbolication.
  3. A visual annotation in the call tree, to differentiate inlined function calls from non-inlined function calls.

The format

The following is true today, before this PR:

  • A single code address (e.g. 0x1234) maps to a single function.
  • For native code, for each library, every address corresponds to a single frame in the frameTable.
  • Multiple frames from the same function share the same func, but they'll all have different addresses. A frame also has a line number - the line in the function that generated the instruction at the frame's address.

But with inline stacks, a single address can now map to more than one function, and more than one line number. Should we still have a single frame per address? If we want to keep a single frame per address, where would we store the list of functions?

Alternative 1 (unimplemented): Keep one frame per address, add side-table for inline information

This is not the format I chose to implement. I'm describing it here anyway because it's an alternative I considered at one point during development, and I think it's valuable to record for posterity.

We could replace the frameTable's func column with an inlineStackInfo column, which would have an index into a new inlineStackInfoTable. This new table would have the following fields:

  • parentInlineStackInfo
  • func
  • line

This would mean that symbolication just needs to create the inlineStackInfoTable and can keep the frameTable mostly unchanged, just updating each frame's inlineStackInfo field. (Today it just updates the frame's func field.) And it could leave the stackTable unchanged, too. And then the actual expansion into multiple funcs would happen whenever we convert from a "stack" to a "call node".

One place where this expansion would need to happen would be when we create the CallNodeTable for the call tree. At the moment, we create the call tree by collapsing sibling "stack" nodes into the same "call" node, if those "stack" nodes share the same "func". With this extra table, creating the call tree would become a fair bit more complicated: Each frame would need to be expanded into its set of funcs with the help of the inlineStackInfoTable, and then call nodes would need to be created based on those expanded funcs.

Furthermore, all of our call tree transforms would need to be updated: For example, some of them have "call paths", but they operate on the stackTable, and matching the combination of stackTable + inlineStackInfoTable against a call path might be complicated. Furthermore, we have some transforms that change the tree structure of the stack table, for example the "Merge node" transform. This transform would now need to be able to operate both on the stack table and on the inlineStackInfoTable. This seems hard.

In general, we probably have a ton of places that make use of the frameTable's func column; all of them would need to be updated. This might be doable but would be rather invasive.

I'm also not sure if it would be a good format. The tree structure of the call tree is now in two places: In the stack table and the inlineStackInfoTable. It might be harder to understand during debugging.

I was too afraid to do this, so I chose a simpler implementation.

Alternative 2 (this is the one I chose): Have multiple frames per address, add an inlineDepth column to the frameTable

This is the approach I chose. Rather than insisting on only having one frame per address, I am now creating multiple frames for the same address: One frame for each "inline depth". Frames at the outer level have an inlineDepth of zero.

For example, let's say you have a frame with address 0x123. We look up 0x123 with the symbolication API, and get the following result:

{
  "inlines": [
    {
      "function": "profiler_thread_is_being_profiled_for_markers",
      "file": "ProfilerMarkers.h"
      "line": 129
    },
    {
      "function": "profiler_add_marker",
      "file": "ProfilerMarkers.h"
      "line": 149
    }
  ],
  "function": "BackgroundHangThread::NotifyActivity",
  "file": "toolkit/components/backgroundhangmonitor/BackgroundHangMonitor.cpp",
  "line": 255
}

There will be three funcs: "BackgroundHangThread::NotifyActivity @ toolkit/components/backgroundhangmonitor/BackgroundHangMonitor.cpp", "profiler_add_marker @ ProfilerMarkers.h", and "profiler_thread_is_being_profiled_for_markers @ ProfilerMarkers.h".
Now we will create three frames:

  1. address: 0x123, inlineDepth: 0, line: 255, func: <"BackgroundHangThread::NotifyActivity">
  2. address: 0x123, inlineDepth: 1, line: 149, func: <"profiler_add_marker">
  3. address: 0x123, inlineDepth: 2, line: 129, func: <"profiler_thread_is_being_profiled_for_markers">

And the stackTable gets new stack nodes to connect these new frames up in the right ways.

And the samples table is updated to point to the "inner" stack nodes. Same for any other references to stacks elsewhere in the thread.

Comparison

Compared to the first alternative, this has the following advantages:

  1. The format is much simpler.
  2. A ton of existing code can stay the way it is, because the stack tree and the call node tree keep the same structure.

These advantages come at the expense of the simplicity of the symbolication code:

  1. Symbolication is more complicated. Symbolication now needs to create new frames and change the tree structure in the stack table. It also needs to replace any stack references in the rest of the thread to be valid with the updated stack table.

I think this is the right trade-off.

Other challenges

There are a few other things that I needed to get right.

One of them is re-symbolication. Re-symbolication is a rare scenario, but it's worth getting right anyway. In this scenario, we want to throw out all the old information because it might have bad symbols and bad inlining data, and start from scratch based on the frame addresses. To do this, I'm throwing out all frames with inlineDepth > 0 at the start of symbolication, and I'm creating a "flattened" stack table that has the inline frames removed. This isn't very pretty, and it's necessary because symbolication now permanently mutates the structure of the stack table (but in a somewhat straightforward and non-destructive way).

Another challenge was about dealing with call paths in the redux state. The selected node, and the right-clicked node, and the expanded nodes are all stored as call paths, i.e. arrays of func indexes from the root. Once symbolication completes, some of the nodes in the call tree might be replaced by multiple functions. This means that the stored call paths need to be modified to take this into account. To avoid any call tree nodes from closing, we need to add some of the new nodes to the set of expanded nodes, if any of their descendant nodes are expanded.

All right, that's all I can think of for now.

Happy reviewing! Please review each commit individually.

@mstange mstange self-assigned this Sep 16, 2021
@codecov
Copy link

codecov bot commented Sep 16, 2021

Codecov Report

Merging #3556 (8607bb5) into main (aba7879) will decrease coverage by 0.12%.
The diff coverage is 79.18%.

❗ Current head 8607bb5 differs from pull request most recent head 02e1fad. Consider uploading reports for the commit 02e1fad to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##             main    #3556      +/-   ##
==========================================
- Coverage   88.81%   88.68%   -0.13%     
==========================================
  Files         261      259       -2     
  Lines       21745    21905     +160     
  Branches     5568     5598      +30     
==========================================
+ Hits        19313    19427     +114     
- Misses       2254     2294      +40     
- Partials      178      184       +6     
Impacted Files Coverage Δ
src/profile-logic/data-structures.js 95.45% <ø> (ø)
src/profile-logic/mozilla-symbolication-api.js 21.68% <0.00%> (-4.41%) ⬇️
src/profile-logic/process-profile.js 91.22% <ø> (ø)
src/profile-logic/symbol-store.js 93.18% <ø> (ø)
src/test/fixtures/profiles/call-nodes.js 100.00% <ø> (ø)
src/components/shared/TreeView.js 77.63% <50.00%> (-0.24%) ⬇️
src/profile-logic/profile-data.js 90.48% <69.23%> (-0.08%) ⬇️
src/reducers/profile-view.js 95.08% <75.00%> (+0.18%) ⬆️
src/profile-logic/symbolication.js 87.27% <85.03%> (-3.79%) ⬇️
src/profile-logic/call-tree.js 93.30% <90.90%> (-0.12%) ⬇️
... and 36 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update aba7879...02e1fad. Read the comment docs.

@mstange mstange force-pushed the inline-stacks branch 5 times, most recently from 8ecea2d to eb6c2d4 Compare November 23, 2021 03:12
@mstange mstange requested a review from canova November 23, 2021 03:12
@mstange mstange marked this pull request as ready for review November 23, 2021 03:12
@mstange mstange requested a review from a team as a code owner November 23, 2021 03:12
locales/en-US/app.ftl Outdated Show resolved Hide resolved
Copy link
Member

@canova canova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, that's some solid work! Thanks for the PR, I like the new inline badge and it's great to have inline stacks!
I added some comments but I don't think there is any major blockers, they are mostly nits and some small things.

Also sorry about the delay in the reviews, it took me some time to concentrate and finish it.

One of them is re-symbolication. Re-symbolication is a rare scenario, but it's worth getting right anyway. In this scenario, we want to throw out all the old information because it might have bad symbols and bad inlining data, and start from scratch based on the frame addresses. To do this, I'm throwing out all frames with inlineDepth > 0 at the start of symbolication, and I'm creating a "flattened" stack table that has the inline frames removed. This isn't very pretty, and it's necessary because symbolication now permanently mutates the structure of the stack table (but in a somewhat straightforward and non-destructive way).

It's unclear to me how symbols server could return bad inlining data. Do you have anything specific in your mind or is this to mostly make sure (since it's new addition)?

src/types/profile.js Show resolved Hide resolved
src/actions/receive-profile.js Outdated Show resolved Hide resolved
src/reducers/profile-view.js Outdated Show resolved Hide resolved
src/reducers/profile-view.js Show resolved Hide resolved
locales/en-US/app.ftl Show resolved Hide resolved
res/img/svg/inlined-icon.svg Outdated Show resolved Hide resolved
src/components/shared/TreeView.js Outdated Show resolved Hide resolved
src/profile-logic/call-tree.js Show resolved Hide resolved
src/test/store/symbolication.test.js Show resolved Hide resolved
@mstange
Copy link
Contributor Author

mstange commented Nov 30, 2021

It's unclear to me how symbols server could return bad inlining data. Do you have anything specific in your mind or is this to mostly make sure (since it's new addition)?

"Bugs" :)

For example, there could be a bug in profiler-get-symbols where it returns incorrect filenames for some inlined frames. And then we fix the bug and want to see the same profile again with correct filenames, so we re-symbolicate.

Or we could have correct inlining data for most functions, but we might be missing function names for some addresses. Then we plug that hole so that we get function names for those addresses. If we want to re-symbolicate to pick up those new function names, we always re-symbolicate all addresses of that library, so the functions with inline information will be re-symbolicated, too. In that case we don't want to end up with duplicated inline frames.

@mstange
Copy link
Contributor Author

mstange commented Nov 30, 2021

Thanks for the review!

@canova
Copy link
Member

canova commented Nov 30, 2021

"Bugs" :)

For example, there could be a bug in profiler-get-symbols where it returns incorrect filenames for some inlined frames. And then we fix the bug and want to see the same profile again with correct filenames, so we re-symbolicate.

Or we could have correct inlining data for most functions, but we might be missing function names for some addresses. Then we plug that hole so that we get function names for those addresses. If we want to re-symbolicate to pick up those new function names, we always re-symbolicate all addresses of that library, so the functions with inline information will be re-symbolicated, too. In that case we don't want to end up with duplicated inline frames.

I see, thanks :)

This answers the question "Why wasn't this inlined!?" which some
people who are used to the old inline-less call tree might have.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support showing inline frames in profiles
3 participants