
Does imprecise sample frequency affect correctness of flame graph analysis? #209

Open
open-richard opened this issue Jul 7, 2019 · 1 comment


Hi Brendan,

I am a huge fan of FlameGraph and all your blogs and talks, but I have a question about the correctness of flame graphs.
According to the source code, flame graph generation relies entirely on the sample counts of the stack traces. However, experiments show that when the system is relatively idle, the sampling frequency does not match what is specified on the command line (-F 99).
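(For reference, the folded-stack input that flamegraph.pl consumes is just a sample count per unique stack, something like the following, with illustrative numbers; frame widths in the SVG are proportional to these counts.)

```
swapper;start_kernel;cpu_idle 612
mysqld;main;mysqld_main;do_command 187
```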

I checked the perf tutorial and learned that -F means the kernel dynamically adjusts the sampling period to achieve the target average rate. But looking at the data, the actual sample frequency averages only around 60–70 Hz when I specify a 99 Hz sample rate. Each sample also covers a different length of period: some long-lasting samples have a period longer than 30 ms* even though I am asking for one sample every ~10 ms, while others are shorter than 1 ms.
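Here is roughly how I measured the achieved rate (a sketch; the exact `perf script` field names and output format may vary by version):

```
# Record at a requested 99 Hz for 10 s, then bucket the sample timestamps
# by whole second to see how many samples actually landed in each second.
perf record -F 99 -a -g -- sleep 10
perf script -F time | awk -F. '{ count[$1]++ } END { for (s in count) print s, count[s] }' | sort -n
```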

With this kind of variation in sample duration, how can I know whether counting samples correctly reflects how much time a function takes?

In practice, the resulting flame graph does show approximately the right proportions. But I am confused about why such an imprecise sample rate can still be reliable, and I am curious how you think about this potentially inaccurate approximation.

And what would be the best way to guarantee a precise sample rate with perf? Apparently, neither -F nor -c can ensure one sample per 10 ms. Is there a way?

Thank you very much!

best,
Richard

p.s.

  • * 30 ms: here I am using the CPU frequency to estimate on average how much time each cycle takes (e.g., 1/2.4 GHz), and multiplying by the cycle count recorded in the sample's period; see the sketch after this list. From the documentation, I know that "cycles" is not constant with respect to time, but I think I can use the nominal cycle time as a lower bound on how long the sample lasted (please correct me if I am wrong here).
  • The sample count was obtained by running `perf report -D | grep RECORD_SAMPLE | wc -l`.
  • The experiment was performed in a single-core virtual machine running Ubuntu 16.04 with kernel 4.13.0.
  • When the system is under heavy load, the sample frequency is correct, i.e., -F 99 yields 99 samples per second.
  • I wanted to send this as a personal email, but I think GitHub will make this question searchable for others who have the same question.
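To make the cycles-to-time conversion above concrete, here is a rough sketch of how I estimated per-sample durations (the `period` field selection and the 2.4 GHz figure are assumptions; adjust for your CPU):

```
# Print each sample's recorded period (cycle count) and convert it to an
# approximate duration at a nominal 2.4 GHz clock. For example,
# 72,000,000 cycles / 2.4e9 Hz = 0.030 s = 30 ms.
perf script -F period | awk '{ printf "%.2f ms\n", $1 / 2.4e9 * 1000 }' | sort -n | tail
```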

Knio commented Sep 21, 2020

This looks related to #165. Using cpu-clock instead of cycles should give you a more consistent sampling period.
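For example, something like this records on wall-clock time rather than CPU cycles (a sketch; the workload is a placeholder):

```
# Sample the software cpu-clock event at 99 Hz instead of hardware cycles;
# cpu-clock ticks in wall time, so the sampling period does not stretch
# when the CPU is idle or clocked down.
perf record -F 99 -e cpu-clock -a -g -- sleep 10
```

And if I remember correctly, the period for the software clock events is specified in nanoseconds, so `-e cpu-clock -c 10000000` should request exactly one sample every 10 ms.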
