Stop treating import calls as import system frames #329

godlygeek · 2023-03-03T22:25:24Z

Up until now, our code has been treating the last location before the
import system is entered as an import system location as well, in order
to hide the import statement itself when the user clicks the "Hide
Import System Frames" checkbox.

There's a few problems with that approach, though. For one, there may be
other calls on the same line that lead to other, non-import system
allocations. For instance, the user may do something like:

import mmap; mmap.mmap(-1, 4096)

We don't want to hide that line on the flame graph, or users will lose
visibility into the mmap.mmap call when they try to hide imports.

A subtler problem with this approach is that we only handled the frame
with the import statement as an import system frame if the first child
we saw from it was an import system frame. Assuming it had a mix of some
descendants that are in the import system and some that aren't, our
behavior would change based on the order that the reporter received
those in (and the order that the reporter receives allocations in is
arbitrary and not guaranteed to be stable across runs).

Fix this by never hiding the import statement itself.

Also, add some micro optimizations here. These optimizations speed up
flamegraph generation by about 15% (for the nbody example, native mode).

Our existing flamegraph reporting code is subtly dependent upon the order in which allocations are passed to it. Add a test that illustrates this bug. This test is currently expected to fail, and will be made to pass in the next commit. Signed-off-by: Matt Wozniski <mwozniski@bloomberg.net>

Up until now, our code has been treating the last location before the import system is entered as an import system location as well, in order to hide the `import` statement itself when the user clicks the "Hide Import System Frames" checkbox. There's a few problems with that approach, though. For one, there may be other calls on the same line that lead to other, non-import system allocations. For instance, the user may do something like: import mmap; mmap.mmap(-1, 4096) We don't want to hide that line on the flame graph, or users will lose visibility into the `mmap.mmap` call when they try to hide imports. A subtler problem with this approach is that we only handled the frame with the `import` statement as an import system frame if the first child we saw from it was an import system frame. Assuming it had a mix of some descendants that are in the import system and some that aren't, our behavior would change based on the order that the reporter received those in (and the order that the reporter receives allocations in is arbitrary and not guaranteed to be stable across runs). Signed-off-by: Matt Wozniski <mwozniski@bloomberg.net>

Once we've found an import system frame, we don't need to check any other frames in that call stack, since everything below an `import` call is part of that import. Signed-off-by: Matt Wozniski <mwozniski@bloomberg.net>

Rather than calling `is_frame_from_import_system` every time we encounter a node, call it only when we first encounter a new node and add it to the tree. When a call stack reuses a previously seen node, also reuse its `is_import_system` flag. Signed-off-by: Matt Wozniski <mwozniski@bloomberg.net>

If a node has previously been added to the tree, we know that it isn't CPython internal (or else we would have skipped adding it). Signed-off-by: Matt Wozniski <mwozniski@bloomberg.net>

This is gonna sound stupid, but: `py-spy` tells me that around 6% of our flamegraph generation runtime in Python 3.10 is saved if we don't use star unpacking on this hot path, and instead depend upon `frame` having exactly 3 fields. Signed-off-by: Matt Wozniski <mwozniski@bloomberg.net>

godlygeek added the bug Something isn't working label Mar 3, 2023

godlygeek added 5 commits March 3, 2023 17:28

Avoid unneeded is_frame_from_import_system calls

b91139e

Once we've found an import system frame, we don't need to check any other frames in that call stack, since everything below an `import` call is part of that import. Signed-off-by: Matt Wozniski <mwozniski@bloomberg.net>

Never check if existing nodes are CPython internal

7dd8af1

If a node has previously been added to the tree, we know that it isn't CPython internal (or else we would have skipped adding it). Signed-off-by: Matt Wozniski <mwozniski@bloomberg.net>

godlygeek force-pushed the fix_import_system_hiding branch from 725c40d to de70269 Compare March 3, 2023 22:28

pablogsal approved these changes Mar 6, 2023

View reviewed changes

pablogsal merged commit 769b7d0 into bloomberg:main Mar 6, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Stop treating import calls as import system frames #329

Stop treating import calls as import system frames #329

godlygeek commented Mar 3, 2023

Stop treating import calls as import system frames #329

Stop treating import calls as import system frames #329

Conversation

godlygeek commented Mar 3, 2023