Skip to content


Subversion checkout URL

You can clone with
Download ZIP
tree: baf2dc64c1
Fetching contributors…

Cannot retrieve contributors at this time

704 lines (510 sloc) 21.065 kb
XHProf Documentation: Rough Draft (Work in progress)
<h3>XHProf Documentation (Rough Draft/Work in progress)</h3>
<li><a href="#introduction">Introduction</a></li>
<li><a href="#overview">XHProf Overview</a></li>
<li><a href="#installation">Installing the XHProf extension</a></li>
<li><a href="#using_extension">Profiling using XHProf</a></li>
<li><a href="#ui_setup">Setting up the XHProf UI</a></li>
<li><a href="#sampling_mode">Lightweight Sampling Mode</a>
<li><a href="#misc">Additional features</a></li>
<li><a href="#dependencies">Dependencies</a></li>
<li><a href="#credits">Acknowledgements</a></li>
<li><a name="introduction"><h2>Introduction</h2></a>
<p>XHProf is a function-level hierarchical profiler for PHP. It is
capable of reporting function-level <a href="#inclusive">inclusive</a>
and <a href="#exclusive">exclusive</a> wall (elapsed) times, memory
usage and CPU times as well as number of calls to each function.
<p>The raw data collection component is implemented in C as a PHP Zend
extension called <code><b>xhprof</b></code>.
<p>The UI layer is implemented in PHP. It is a simple HTML based
navigational interface. The browser based UI for viewing profiler
results makes it easy to view results or to share results with peers.
A callgraph image view is also supported.
<p>XHProf keeps track of 1-level of function calling context during
the raw data gathering phase, and in the analysis/reporting phase is
able to compute function "exclusive" times in addition to break down
of cost by parent/children functions (e.g, what was the cost of bar()
when called from foo(), or who all the callers of bar() are, and so
<p>XHProf supports ability to compare two runs (a.k.a. "diff"
reports). Diff reports, much like single run reports, offer "flat" as
well as "hierarchical" views.
<p>It correctly accounts for recursive function calls. It also tracks
"include", "include_once", "require", "require_once" operations as if
they were functions.
<p>XHProf can often be helpful in understanding the structure of the
code being executed. The hierarchical nature of the reports can be
used to determine, for example, what chain of calls led to a
particular function getting called.
<p> XHProf also supports ability to aggregate raw data from multiple
runs into a single run. This aggregation capability is quite useful
for building system-wide performance monitoring tools.
<p>XHProfLive (not part of the open source kit), for example, is a
system-wide performance monitoring system in use at Facebook that is
built on top of XHProf. XHProfLive continually gathers function-level
profiler data from production tier by running a sample of page
requests under XHProf. XHProfLive then aggregates the profile data
corresponding to individual requests by various dimensions such a
time, page type, and can help answer a variety of questions such as:
What is the function-level profile for a specific page? How expensive
is function "foo" across all pages, or on a specific page? What
functions regressed most since 7 days ago? What is the historical
trend for execution time of a page/function? and so on.
<p>Originally developed at Facebook, XHProf was open sourced in Mar, 2009.</p>
<li><a name="overview"><h2>XHProf Overview</h2></a>
<p>XHProf provides:
<li><b>Flat profile</b> (<a href="sample-flat-view.jpg" >screenshot</a>)
<p>Provides function-level summary information such number of calls,
inclusive/exclusive wall time, memory usage, and CPU time.
<li><b>Hierarchical profile (Parent/Child View)</b>
(<a href="sample-parent-child-view.jpg" >screenshot</a>)
<p>For each function, it provides a breakdown of calls and times per
parent (caller) & child (callee), such as:
<li> what functions call a particular function and how many times?
<li> what functions does a particular function call?
<li> The total time spent under a function when called from a particular parent.
<li><b>Diff Reports</b>
<p>You may want to compare data from two XHProf runs for various
reasons-- to figure out what's causing a regression between one
version of the code base to another, to evaluate the performance
improvement of a code change you are making, and so on.
<p>A diff report takes two runs as input and provides both flat
function-level diff information, and hierarchical information
(breakdown of diff by parent/children functions) for each function.
<p>The "flat" view (<a href="sample-diff-report-flat-view.jpg"
>sample screenshot</a>) in the diff report points out the top
regressions & improvements.
<p>Clicking on functions in the "flat" view of the diff report, leads
to the "hierarchical" (or parent/child) diff view of a function (<a href="sample-diff-report-parent-child-view.jpg"
>sample screenshot</a>). We can get a
breakdown of the diff by parent/children functions.
<li><b>Callgraph View</b> (<a href="sample-callgraph-image.jpg"
>sample screenshot</a>)
<p>The profile data can also be viewed as a callgraph. The callgraph
view highlights the critical path of the program.
<li><b>Memory Profile</b>
<p>XHProf's memory profile mode helps track functions that
allocate lots of memory.
<p>It is worth clarifying that that XHProf doesn't strictly track each
allocation/free operation. Rather it uses a more simplistic
scheme. It tracks the increase/decrease in the amount of memory
allocated to PHP between each function's entry and exit. It also
tracks increase/decrease in the amount of <b>peak</b> memory allocated to
PHP for each function.
<a name="Terminology"></a><h2>Terminology</h2>
<a name="inclusive"></a><li><b>Inclusive Time (or Subtree Time)</b>:
Includes time spent in the function as well as in descendant functions
called from a given function.
<a name="exclusive"></a><li><b>Exclusive Time/Self Time</b>: Measures
time spent in the function itself. Does not include time in descendant
<li><b>Wall Time</b>: a.k.a. Elapsed time or wall clock time.
<li><b>CPU Time</b>: CPU time in user space + CPU time in kernel space
<a name="Naming_convention_for_special_functions"></a><h2>Naming convention for special functions</h2>
<p><li><code><b>main()</b></code>: a fictitious function that is at the root of the call graph.
<p><li><code><b>load::&lt;filename&gt;</b></code> and <code><b>run_init::&lt;filename&gt;</b></code>:
<p>XHProf treats PHP <code>include/require</code> operations treated
as function calls.
<p>For example, an <b>include "lib/common.php";</b> operation will
result in two XHProf function entries:
<li> <code><b>load::lib/common.php</b></code> - This represents the work done by the
interpreter to compile/load the file. [Note: If you are using a PHP
opcode cache like APC, then the compile only happens on a cache miss
in APC.]
<li> <code><b>run_init::lib/common.php</b></code> - This represents
initialization code executed at the file scope as a result of the
include operation.
<p><li><code><b>foo@&lt;n&gt;</b></code>: Implies that this is a
recursive invokation of <code>foo()</code>, where <code>&lt;n&gt;</code> represents
the recursion depth. The recursion may be direct (such as due to
<code>foo()</code> --&gt; <code>foo()</code>), or indirect (such as
due to </code>foo()</code> --&gt; <code>goo()</code> --&gt; foo()).
<a name="Limitations"></a><h2>Limitations</h2>
<p>True hierarchical profilers keep track of a full call stack at
every data gathering point, and are later able to answer questions
like: what was the cost of the 3rd invokation of foo()? or what was
the cost of bar() when the call stack looked like
<p>XHProf keeps track of only 1-level of calling context and is
therefore only able to answer questions about a function looking
either 1-level up or 1-level down. It turns out that in practice this
is sufficient for most use cases.
<p>To make this more concrete, take for instance the following
Say you have:
1 call from a() --&gt; c()
1 call from b() --&gt; c()
50 calls from c() --&gt; d()
<p>While XHProf can tell you that d() was called from c() 50 times, it
cannot tell you how many of those calls were triggered due to a()
vs. b(). [We could speculate that perhaps 25 were due to a() and 25
due to b(), but that's not necessarily true.]
<p>In practice however, this isn't a very big limitation. And, for
development mode, we can always enhance XHProf to keep more and more
of the stack if this is deemed to be a severe restriction.
<li><a name="installation"><h2>Installing the XHProf Extension</h2></a>
<p> The extension lives in the "extension/" sub-directory. The steps
below should work for Linux/Unix environments.
% cd &lt;xhprof_source_directory&gt;/extension/
% phpize
% ./configure --with-php-config=&lt;path to php-config&gt;
% make
% make install
% make test
<p><b>Note:</b> XHProf uses an RDTSC (time stamp counter) based
implementation. So at the moment, it only works on x86 architecture.
<p><a name="ini_file"></a><b>php.ini file</b>: You can update your
php.ini file to automatically load your extension. Add the following
to your php.ini file.
# directory used by default implementation of the iXHProfRuns
# interface (namely, the XHProfRuns_Default class) for storing
# XHProf runs.
<li><a name="using_extension"><h2>Profiling using XHProf</h2></a>
<p>Test generating raw profiler data using a sample test program like:
function bar($x) {
if ($x > 0) {
bar($x - 1);
function foo() {
for ($idx = 0; $idx < 2; $idx++) {
$x = strlen("abc");
// start profiling
// run program
// stop profiler
<b>$xhprof_data = xhprof_disable();</b>
// display raw xhprof data for the profiler run
<p><b>Run the above test program:</b>
% php foo.php
<p><b>You should get an output like:</b>
[foo==>bar] => Array
[ct] => 2 # 2 calls to bar() from
[wt] => 27 # inclusive time in bar() when called from foo()
[foo==>strlen] => Array
[ct] => 2
[wt] => 2
[bar==>bar@1] => Array # a recursive call to bar()
[ct] => 1
[wt] => 2
[main()==>foo] => Array
[ct] => 1
[wt] => 74
[main()==>xhprof_disable] => Array
[ct] => 1
[wt] => 0
[main()] => Array # fake symbol representing root
[ct] => 1
[wt] => 83
<b>Note:</b> The raw data only contains "inclusive" metrics. For
example, the wall time metric in the raw data represents inclusive
time in microsecs. Exclusive times for any function are computed
during the analysis/reporting phase.
<b>Note:</b> By default only call counts & elapsed time is profiled.
You can optionally also profile CPU time and/or memory usage. Replace,
in the above program with, for example:
<p><b>You should now get an output like:</b>
[foo==>bar] => Array
[ct] => 2 # number of calls to bar() from foo()
[wt] => 37 # time in bar() when called from foo()
[cpu] => 0 # cpu time in bar() when called from foo()
[mu] => 2208 # change in PHP memory usage in bar() when called from foo()
[pmu] => 0 # change in PHP peak memory usage in bar() when called from foo()
[foo==>strlen] => Array
[ct] => 2
[wt] => 3
[cpu] => 0
[mu] => 624
[pmu] => 0
[bar==>bar@1] => Array
[ct] => 1
[wt] => 2
[cpu] => 0
[mu] => 856
[pmu] => 0
[main()==>foo] => Array
[ct] => 1
[wt] => 104
[cpu] => 0
[mu] => 4168
[pmu] => 0
[main()==>xhprof_disable] => Array
[ct] => 1
[wt] => 1
[cpu] => 0
[mu] => 344
[pmu] => 0
[main()] => Array
[ct] => 1
[wt] => 139
[cpu] => 0
[mu] => 5936
[pmu] => 0
<p>By default PHP builtin functions (such as <code>strlen</code>) are
profiled. If you do not want to profile builtin functions (to either
reduce the overhead of profiling further or size of generated raw
data), you can use the <code><b>XHPROF_FLAGS_NO_BUILTINS</b></code>
flag as in for example:
<li><a name="ui_setup"><h2>Setting up XHProf UI</h2></a>
<li><b>PHP source structure</b>
<p>The XHProf UI is implemented in PHP. The code resides in two
subdirectories, <code>xhprof_html/</code> and <code>xhprof_lib/</code>.
<p>The <code>xhprof_html</code> directory contains the 3 top-level PHP pages.
<li><code>index.php</code>: For view a XHProf run or diff report.
<li><code>callgraph.php</code>: For viewing a callgraph of a XHProf run as an image.
<li><code>typeahead.php</code>: Used implicitly for the function typeahead form
on a XHProf report.
<p>The <code>xhprof_lib</code> directory contains supporting code for
display as well as analysis (computing flat profile info, computing
diffs, aggregating data from multiple runs, etc.).
<li><p><b>Web server config: </b> You'll need to make sure that the
<code>xhprof_html/</code> directory is accessible from your web server, and that
your web server is setup to serve PHP scripts.
<li><p><b>Managing XHProf Runs</b>
<p>Clients have flexibility in how they save the XHProf raw data
obtained from an XHProf run. The XHProf UI layer exposes an interface
iXHProfRuns (see xhprof_lib/utils/xhprof_runs.php) that clients can
implement. This allows the clients to tell the UI layer how to fetch
the data corresponding to a XHProf run.
<p>The XHProf UI libaries come with a default file based
implementation of the iXHProfRuns interface, namely
"XHProfRuns_Default" (also in xhprof_lib/utils/xhprof_runs.php).
This default implementation stores runs in the directory specified by
<a href="#ini_file"><b>xhprof.output_dir</b></a> INI parameter.
<p>A XHProf run must be uniquely identified by a "namespace/type" and a run
<p><b>a) Saving XHProf data persistently</b>:
<p>Assuming you are using the default implementation
<code><b>XHProfRuns_Default</b></code> of the
<code><b>iXHProfRuns</b></code> interface, a typical XHProf run
followed by the save step might look something like:
// start profiling
// run program
// stop profiler
$xhprof_data = xhprof_disable();
// Saving the XHProf run
// using the default implementation of iXHProfRuns.
include_once $XHPROF_ROOT . "/xhprof_lib/utils/xhprof_lib.php";
include_once $XHPROF_ROOT . "/xhprof_lib/utils/xhprof_runs.php";
$xhprof_runs = new <b>XHProfRuns_Default()</b>;
// Save the run under a namespace "xhprof_foo".
// **NOTE**:
// By default save_run() will automatically generate a unique
// run id for you. But you can optionally pass it a unique run
// to the save_run() method.
<b>$run_id = $xhprof_runs->save_run($xhprof_data, "xhprof_foo");</b>
echo "---------------\n".
"Assuming you have set up the http based UI for \n".
"XHProf at some address, you can view run at \n".
<p>The above should save the run as a file in the directory specified
by the <code><b>xhprof.output_dir</b></code> INI parameter. The file's
name might be something like
<b><code>49bafaa3a3f66.xhprof_foo</code></b>; the two parts being the
run id ("49bafaa3a3f66") and the namespace ("xhprof_foo"). [If you
want to create/assign run ids yourself (such as a database sequence
number, or a timestamp), you can explicitly pass in the run id to the
<code>save_run</code> method.
<p><b>b) Using your own implementation of iXHProfRuns</b>
<p> If you decide you want your XHProf runs to be stored differently
(either in a compressed format, in an alternate place such as DB,
etc.) database, you'll need to implement a class that implements the
iXHProfRuns() interface.
<p> You'll also need to modify the 3 main PHP entry pages (index.php,
callgraph.php, typeahead.php) in the "xhprof_html/" directory to use
the new class instead of the default class <code>XHProfRuns_Default</code>.
Change this line in the 3 files.
$xhprof_runs_impl = new XHProfRuns_Default();
<p>You'll also need to "include" the file that implements your class in
the above files.
<li><p><b>Accessing runs from UI</b>
<p><b>a) Viewing a Single Run Report</b>
<p>To view the report for run id say &lt;run_id&gt; and namespace
&lt;namespace&gt; use a URL of the form:
<p>For example,
<p><b>b) Viewing a Diff Report</b>
<p>To view the report for run ids say &lt;run_id1&gt; and
&lt;run_id2&gt; in namespace &lt;namespace&gt; use a URL of the form:
<p><b>c) Aggregate Report</b>
<p>You can also specify a set of run ids for which you want an aggregated view/report.
<p>Say you have three XHProf runs with ids 1, 2 & 3 in namespace
"benchmark". Further suppose that these represent runs three types of
programs p1.php, p2.php and p3.php that typically occur in a mix of
20%, 30%, 50% respectively. To view an aggregate report that
corresponds to a weighted average of these runs using:
<li><a name="sampling_mode"><h2>Lightweight Sampling Mode</h2></a>
<p>The xhprof extension also provides a very light weight <b>sampling
mode</b>. The sampling interval is 0.1 secs. Samples record the full
function call stack. The sampling mode can be useful if an extremely
low overhead means of doing performance monitoring and diagnostics is
<p> The relevant functions exposed by the extension for using the
sampling mode are <code><b>xhprof_sample_enable()</b></code> and
<p>[<b>TBD</b>: more detailed documentation on sampling mode.]
<li><a name="misc"><h2>Additional Features</h2></a></li>
<p>The <code><b>xhprof_lib/utils/xhprof_lib.php</b></code> file contains
additional library functions that can be used for manipulating/
aggregating XHProf runs.
<p>For example:
<p><li><code><b>xhprof_aggregate_runs()</b></code>: To aggreated multiple
XHProf runs into a single run. This can be helpful for building a
system-wide "function-level" performance monitoring tool using
XHProf. [For example, you might to roll up XHProf runs sampled from
production periodically to generate hourly, daily, reports.]
<p><li><code><b>xhprof_prune_run()</b></code>: Aggregating large number of
XHProf runs (especially if they correspond to different types of
programs) can result in the callgraph size becoming too large. You can
use <code>xhprof_prune_run</code> function to prune the callgraph data
by editing out subtrees that account for a very small portion of the
total time.
<li><a name="dependencies"><h2>Dependencies</h2></a></li>
<li><b>JQuery Javascript</b>: For tooltips and function name
typeahead, we make use of JQuery's javascript libraries. JQuery is
available under both a MIT and GPL licencse
( The relevant JQuery code, used by
XHProf, is in the <code>xhprof_html/jquery</code> subdirectory.
<li><b>dot (image generation utility):</b> The callgraph image
visualization ([View Callgraph]) feature relies on the presence of
Graphviz "dot" utility in your path. "dot" is a utility to
draw/generate an image for a directed graph.
<li><a name="usage"><h2>Acknowledgements</h2></a>
<p>The HTML-based navigational interface for browsing profiler results
is inspired by that of a similar tool that exists for Oracle's stored
procedure language, PL/SQL. But that's where the similarity ends; the
internals of the profiler itself are quite different.
Jump to Line
Something went wrong with that request. Please try again.