New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: pprof truncation #2754
feat: pprof truncation #2754
Conversation
@@ -13,193 +10,114 @@ type pprofProtoSymbols struct { | |||
profile googlev1.Profile | |||
symbols *Symbols | |||
samples *schemav1.Samples | |||
tree *model.StacktraceTree | |||
lut []uint32 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does it stands for ? Locations.... ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, that's yet another lookup table
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
The change is aimed at making location information such as file names and line numbers available in the frontend. Currently, we only preserve function names.
Internally we have two APIs for fetching profiling data:
SelectMergeStacktraces
returns data in the flamebearer (flamegraph) format containing function names. This method is used by frontend to built flamegraphs, top table, etc.SelectMergeProfile
returns data in pprof format, including file names, line numbers, etc. This method is used byprofilecli
and frontend for exporting data in pprof format.Use of
SelectMergeProfile
(pprof format) for building flamegraphs is complicated due to the fact that the data model is more sophisticated compared to flamegraph. Moreover, when we build a flamegraph, we only keep a specific number of "top nodes", to minimize the amount of work. The PR adds pprof truncation that is fully aligned with the approach we use inSelectMergeStacktraces
API (and thus with the resulting flamegraph):SelectMergeStacktraces
, therefore the PR has no immediate impact. Next step is to switch from theSelectMergeStacktraces
endpoint toSelectMergeProfile
in the query-frontend.From the performance point of view, use of pprof internally is beneficial for medium and large profiles (2-3 times faster), and might be more expensive in case of small ones (max_nodes goes after slash):
Please note that the method we're using currently (
Benchmark_block_Resolver_ResolveTree
) does not support truncation at symbolication, we can only drop insignificant nodes after the complete tree is built.Note about pprof symbolication w/o truncation: it is 4-5 times faster than tree, but allocates twice more memory (on average).
After certain point, truncation becomes pretty much senseless, however we still need to make the profile lighter for the frontend.
Performance aspect is crucial here, because the symbolication process is synchronous and is not parallelized (which is another issue). In pathological cases this stage can make a noticeable contribution to the duration of the query: