forked from brendangregg/FlameGraph
/
README
60 lines (35 loc) · 2.14 KB
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
Flame Graphs visualize hot-CPU code-paths.
See: http://dtrace.org/blogs/brendan/2011/12/16/flame-graphs/
These can be created in three steps:
1. Capture stacks
2. Fold stacks
3. flamegraph.pl
1. Capture stacks
Stack samples can be captured using DTrace, perf_events or SystemTap.
Using DTrace to capture 60 seconds of kernel stacks at 997 Hertz:
# dtrace -x stackframes=100 -n 'profile-997 /arg0/ { @[stack()] = count(); } tick-60s { exit(0); }' -o out.kern_stacks
Using DTrace to capture 60 seconds of user-level stacks for PID 12345 at 997 Hertz:
# dtrace -x ustackframes=100 -n 'profile-997 /PID == 12345/ { @[ustack()] = count(); } tick-60s { exit(0); }' -o out.user_stacks
You can include "arg1" in the predicate if you just want to study user-level CPU time only; otherwise it includes kernel time in the user-level flame graph (eg, syscall time). (Make sure you understand what that means!)
2. Fold stacks
Use the stackcollapse programs to fold stack samples into single lines. The programs provided are:
- stackcollapse.pl: for DTrace stacks
- stackcollapse-perf.pl: for perf_events "perf script" output
- stackcollapse-stap.pl: for SystemTap stacks
Usage example:
$ ./stackcollapse.pl out.kern_stacks > out.kern_folded
The output looks like this:
unix`_sys_sysenter_post_swapgs 1401
unix`_sys_sysenter_post_swapgs,genunix`close 5
unix`_sys_sysenter_post_swapgs,genunix`close,genunix`closeandsetf 85
unix`_sys_sysenter_post_swapgs,genunix`close,genunix`closeandsetf,c2audit`audit_closef 26
unix`_sys_sysenter_post_swapgs,genunix`close,genunix`closeandsetf,c2audit`audit_setf 5
unix`_sys_sysenter_post_swapgs,genunix`close,genunix`closeandsetf,genunix`audit_getstate 6
unix`_sys_sysenter_post_swapgs,genunix`close,genunix`closeandsetf,genunix`audit_unfalloc 2
unix`_sys_sysenter_post_swapgs,genunix`close,genunix`closeandsetf,genunix`closef 48
[...]
3. flamegraph.pl
Use flamegraph.pl to render a SVG.
$ ./flamegraph.pl out.kern_folded > kernel.svg
An advantage of having the folded input file (and why this is separate to flamegraph.pl) is that you can use grep for functions of interest. Eg:
$ grep cpuid out.kern_folded | ./flamegraph.pl > cpuid.svg