HHVM has several options for profiling your PHP (and Hack!) code, figuring out where it's spending its time, and optimizing it.
You may also want to look into tuning HHVM itself.
XHProf / Hotprofiler
HHVM natively supports the xhprof extension. This is good for producing parent-child pairs of function calls. The data collected are inclusive wall and CPU times. There are some libraries available for manipulating this data.
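For example, a minimal collection flow looks roughly like this (`xhprof_enable`/`xhprof_disable` are the extension's entry points; the endpoint function and output path are made up for illustration):

```php
<?php
// Hypothetical endpoint standing in for real request code.
function my_endpoint() {
  usleep(1000);
}

// XHPROF_FLAGS_CPU adds inclusive CPU time to the default wall times.
xhprof_enable(XHPROF_FLAGS_CPU);

my_endpoint();

// Returns an array keyed by "parent==>child" pairs, with call counts
// and inclusive times for each pair.
$data = xhprof_disable();

// Sample 1 in N requests for offline aggregation (path is illustrative).
if (mt_rand(1, 100) === 1) {
  file_put_contents('/tmp/xhprof-' . getmypid() . '.json', json_encode($data));
}
```

The sampled files can then be fed to an offline aggregation script, as described below.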
Although the earliest and least fancy of the tools in Facebook's historical performance-tuning story, this got us an extremely long way. We'd record 1 in N requests, throw them into a DB, and have offline scripts (similar to those provided in the library above) aggregate those into 10-minute rollups per endpoint, then aggregate those into 1-hour rollups, 1-day rollups, etc. The only reason it's not in wide use at FB to this day is that its notion of "function call" doesn't deal well with Hack's async functions.
perf

You can use the Linux `perf` tool to look at HHVM performance. HHVM can spit out metadata for `perf` to use so that its stack traces can look into even JITted code. This is extremely good for profiling HHVM itself (as opposed to the code running on it), as well as for looking into complex systems interactions. However, since it's kernel-based, getting any sort of PHP metadata, such as even which endpoint a trace is running, is very difficult.

This works with the standard Linux `perf` tool. For a quick intro, assuming `$PID` is the pid of HHVM:
- To collect data, run `perf record -g --call-graph dwarf -p $PID`. This works best on a running, warmed-up, release version of HHVM. It will write to a file `perf.data` in the current directory, and keep adding samples until you press Control-C. (The `--call-graph dwarf` option in particular can cause this file to balloon very quickly, but we need that option since HHVM is built with `-fomit-frame-pointer`.)
- To view the data, run `perf report -g` from the same directory. This will show a list of the functions where the most CPU time was spent, and you can press enter on any function to look at its callers, and the proportion of the time that came from each caller.
- PHP functions should show up with symbolic names in the backtraces. If they do not, i.e., if you see a bunch of hex addresses instead, there can be a few reasons for this.
  - Make sure there is a file `/tmp/perf-$PID.map`. This is written by HHVM to tell `perf report` what JIT'd code corresponds to what source code. If the file is missing, either you're on a different machine than the one where you ran `perf record`, or HHVM cleaned up the file. The INI option `hhvm.keep_perf_pid_map=1` will tell HHVM not to clean up this file.
  - If the file is there but `perf report` doesn't seem to be respecting it, some versions of the `perf` tool have a bug where they do the wrong thing with the maps for code JIT'd into the heap (i.e., where HHVM puts it). The version of `perf` that ships with Ubuntu 14.04 (`3.13.11-ckt18`) is known to be broken in this way. You may have to upgrade your version of the tool; to build it yourself, it can be found in the Linux kernel sources under `tools/perf`. `4.0.2` is known to work on Ubuntu 14.04, and newer versions will probably work too. You can try using http://dl.hhvm.com/resources/perf.gz, which should work on Ubuntu 14.04; YMMV on other OSes.
- There's a ton more this tool can do; see the various bits of documentation and examples for the Linux perf tools available online.
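The steps above can be wrapped in a small sanity-check script (hypothetical; the `perf` flags and the `/tmp/perf-$PID.map` convention come from the steps above):

```shell
#!/bin/sh
# profile-hhvm.sh (hypothetical name): check the JIT symbol map for a
# given HHVM pid and print the perf commands from the steps above.
pid="${1:-$$}"                 # default to our own pid, just for demo
map="/tmp/perf-${pid}.map"

if [ -f "$map" ]; then
    echo "found JIT symbol map: $map"
else
    # Without this file, perf report shows hex addresses for PHP frames.
    echo "missing $map; set hhvm.keep_perf_pid_map=1 and profile on this machine"
fi

echo "collect: perf record -g --call-graph dwarf -p $pid"
echo "view:    perf report -g"
```

Run it as `profile-hhvm.sh $(pgrep -o hhvm)` before collecting, so you notice a missing map file before spending time on a recording.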
Xenon

The newest of the tools. A configuration option controls how often a snapshot of the current execution backtrace is taken (usually every few minutes). PHP code, usually a shutdown function, can then check whether that request had a profile taken, and if so write it to a DB. In aggregate, these "flashes" of profiling information tell you which functions execution time is most likely to be spent in, i.e., which functions are more expensive.
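As a sketch, the snapshot frequency is set via server configuration along these lines (the option name follows HHVM's Xenon extension; the period value here is illustrative):

```ini
; Take a Xenon snapshot of in-flight request backtraces every 600 seconds.
hhvm.xenon.period = 600
```

A shutdown function can then check whether the current request was sampled (HHVM exposes the per-request snapshot data to PHP; check your version's Xenon extension docs for the exact function) and log the trace along with whatever request metadata you need.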
There are a bunch of things you can do with this data, especially since it gives a full backtrace, and the PHP shutdown function logging it can attach arbitrary metadata to the trace. Facebook has some fancy visualization tools to walk up and down aggregated traces, which are sadly not open source. Wikimedia also has some tooling built around Xenon to aggregate its traces into flame graphs.
Because it's a flash profiler, the data collected are all effectively about wall time. You might want to filter out IO-heavy functions.