example, and some development code

1 parent 5b8b413 commit a8935829dfd2d57720e32e96b9d2806551d29f16 Brendan Gregg committed Mar 28, 2013
Showing with 43,563 additions and 4 deletions.
  1. +15 −4 README
  2. BIN dev/.README.txt.swp
  3. +8 −0 dev/README
  4. +32 −0 dev/gatherhc-kern.d
  5. +87 −0 dev/hcstackcollapse.pl
  6. +267 −0 dev/hotcoldgraph.pl
  7. +41,913 −0 example-stacks.txt
  8. +1,241 −0 example.svg
@@ -12,7 +12,7 @@ These can be created in three steps:
1. Capture stacks
-
+=================
Stack samples can be captured using DTrace, perf_events or SystemTap.
Using DTrace to capture 60 seconds of kernel stacks at 997 Hertz:
@@ -30,7 +30,7 @@ Using DTrace to capture 60 seconds of user-level stacks, including while time is
Switch ustack() for jstack() if the application has a ustack helper to include translated frames (eg, node.js frames; see: http://dtrace.org/blogs/dap/2012/01/05/where-does-your-node-program-spend-its-time/). The rate for user-level stack collection is deliberately slower than kernel, which is especially important when using jstack() as it performs additional work to translate frames.
2. Fold stacks
-
+==============
Use the stackcollapse programs to fold stack samples into single lines. The programs provided are:
- stackcollapse.pl: for DTrace stacks
@@ -54,13 +54,24 @@ unix`_sys_sysenter_post_swapgs;genunix`close;genunix`closeandsetf;genunix`audit_
unix`_sys_sysenter_post_swapgs;genunix`close;genunix`closeandsetf;genunix`closef 48
[...]
-
3. flamegraph.pl
-
+================
Use flamegraph.pl to render an SVG.
$ ./flamegraph.pl out.kern_folded > kernel.svg
An advantage of having the folded input file (and why this is separate to flamegraph.pl) is that you can use grep for functions of interest. Eg:
$ grep cpuid out.kern_folded | ./flamegraph.pl > cpuid.svg
+
+
+Provided Example
+================
+An example output from DTrace is included, both the captured stacks and
+the resulting Flame Graph. You can generate it yourself using:
+
+$ ./stackcollapse.pl example-stacks.txt | ./flamegraph.pl > example.svg
+
+This was from a particular performance investigation: the Flame Graph
+identified that CPU time was spent in the lofs module, and quantified
+that time.
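
To illustrate how a folded file lets you quantify the time spent in one module, the sketch below builds a tiny folded file and computes the share of samples whose stack mentions a pattern. The stacks, counts and temporary file are made up for illustration; they are not taken from example-stacks.txt.

```shell
#!/bin/sh
# Illustrative folded input: "frame;frame;... count" per line (values are made up).
folded=$(mktemp)
cat > "$folded" <<'EOF'
unix`sys_syscall;genunix`read;lofs`lo_read;zfs`zfs_read 30
unix`sys_syscall;genunix`close;genunix`closeandsetf 50
unix`sys_syscall;genunix`write;lofs`lo_write 20
EOF

# Sum the trailing sample count of every line, and of lines matching the pattern.
total=$(awk '{ sum += $NF } END { print sum + 0 }' "$folded")
matched=$(grep 'lofs' "$folded" | awk '{ sum += $NF } END { print sum + 0 }')

# Integer percentage of samples that include a lofs frame.
pct=$((matched * 100 / total))
echo "lofs: $matched of $total samples (${pct}%)"
rm -f "$folded"
```

The same grep can instead be piped into flamegraph.pl, as in step 3, to render only the matching stacks.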
Binary file not shown.
@@ -0,0 +1,8 @@
+EXPERIMENTAL: This includes some work in progress code, which may not work
+properly.
+
+Hot Cold Graphs
+===============
+These show both on-CPU time (in shades of red) and off-CPU time (blocked time;
+in shades of blue) in a similar style to the Flame Graph.
+
@@ -0,0 +1,32 @@
+#!/usr/sbin/dtrace -s
+
+#pragma D option stackframes=100
+#pragma D option defaultargs
+
+profile:::profile-999
+/arg0/
+{
+ @[stack(), 1] = sum(1000);
+}
+
+sched:::off-cpu
+{
+ self->start = timestamp;
+}
+
+sched:::on-cpu
+/(this->start = self->start)/
+{
+ this->delta = (timestamp - this->start) / 1000;
+ @[stack(), 0] = sum(this->delta);
+ self->start = 0;
+}
+
+profile:::tick-60s,
+dtrace:::END
+{
+ normalize(@, 1000);
+ printa("%koncpu:%d ms:%@d\n", @);
+ trunc(@);
+ exit(0);
+}
@@ -0,0 +1,87 @@
+#!/usr/bin/perl -w
+#
+# hcstackcollapse.pl collapse hot/cold multiline stacks into single lines.
+#
+# EXPERIMENTAL: This is a work in progress, and may not work properly.
+#
+# Parses a multiline stack followed by oncpu status and ms on a separate line
+# (see example below) and outputs a comma separated stack followed by the
+# oncpu status and the ms total. If memory addresses (+0xd) are present, they
+# are stripped, and resulting identical stacks are collapsed with their ms
+# totals summed.
+#
+# USAGE: ./hcstackcollapse.pl infile > outfile
+#
+# Example input:
+#
+# mysqld`_Z10do_commandP3THD+0xd4
+# mysqld`handle_one_connection+0x1a6
+# libc.so.1`_thrp_setup+0x8d
+# libc.so.1`_lwp_start
+# oncpu:1 ms:2664
+#
+# Example output:
+#
+# libc.so.1`_lwp_start,libc.so.1`_thrp_setup,mysqld`handle_one_connection,mysqld`_Z10do_commandP3THD 1 2664
+#
+# Input may contain many stacks, and can be generated using DTrace. The
+# first few lines of input are skipped (see $headerlines).
+#
+# Copyright 2013 Joyent, Inc. All rights reserved.
+# Copyright 2013 Brendan Gregg. All rights reserved.
+#
+# CDDL HEADER START
+#
+# The contents of this file are subject to the terms of the
+# Common Development and Distribution License (the "License").
+# You may not use this file except in compliance with the License.
+#
+# You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
+# or http://opensource.org/licenses/CDDL-1.0.
+# See the License for the specific language governing permissions
+# and limitations under the License.
+#
+# When distributing Covered Code, include this CDDL HEADER in each
+# file and include the License file at usr/src/OPENSOLARIS.LICENSE.
+# If applicable, add the following below this CDDL HEADER, with the
+# fields enclosed by brackets "[]" replaced with your own identifying
+# information: Portions Copyright [yyyy] [name of copyright owner]
+#
+# CDDL HEADER END
+#
+# 14-Aug-2011 Brendan Gregg Created this.
+
+use strict;
+
+my %collapsed;
+my $headerlines = 2;
+
+sub remember_stack {
+ my ($stack, $oncpu, $count) = @_;
+ $collapsed{"$stack $oncpu"} += $count;
+}
+
+my $nr = 0;
+my @stack;
+
+foreach (<>) {
+ next if $nr++ < $headerlines;
+ chomp;
+
+ if (m/^oncpu:(\d+) ms:(\d+)$/) {
+ remember_stack(join(",", @stack), $1, $2) unless $2 == 0;
+ @stack = ();
+ next;
+ }
+
+ next if (m/^\s*$/);
+
+ my $frame = $_;
+ $frame =~ s/^\s*//;
+ $frame =~ s/\+.*$//;
+ $frame = "-" if $frame eq "";
+ unshift @stack, $frame;
+}
+
+foreach my $k (sort { $a cmp $b } keys %collapsed) {
+ print "$k $collapsed{$k}\n";
+}
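
As a sketch of consuming this output: assuming each line ends with the on-CPU flag and the millisecond total as its last two space-separated fields (which is what the final loop above prints), on-CPU and off-CPU time can be totalled with awk. The sample lines and temporary file below are made up for illustration.

```shell
#!/bin/sh
# Illustrative collapsed output: "stack oncpu ms" per line (values are made up).
collapsed=$(mktemp)
cat > "$collapsed" <<'EOF'
libc.so.1`_lwp_start,libc.so.1`_thrp_setup,mysqld`handle_one_connection 0 1500
libc.so.1`_lwp_start,libc.so.1`_thrp_setup,mysqld`handle_one_connection 1 2664
libc.so.1`_lwp_start,libc.so.1`_thrp_setup,mysqld`pfs_spawn_thread 0 500
EOF

# The last field is milliseconds; the one before it is 1 (on-CPU) or 0 (off-CPU).
on_ms=$(awk '$(NF-1) == 1 { sum += $NF } END { print sum + 0 }' "$collapsed")
off_ms=$(awk '$(NF-1) == 0 { sum += $NF } END { print sum + 0 }' "$collapsed")

echo "on-CPU: ${on_ms} ms, off-CPU: ${off_ms} ms"
rm -f "$collapsed"
```

Reading the fields from the end of the line keeps the totals correct even if a frame name itself contains a space.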