Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tests: add utility to easily profile node performance with perf #14519

Merged
merged 2 commits into from Feb 5, 2019
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
46 changes: 46 additions & 0 deletions doc/developer-notes.md
Expand Up @@ -17,6 +17,7 @@ Developer Notes
- [DEBUG_LOCKORDER](#debug_lockorder)
- [Valgrind suppressions file](#valgrind-suppressions-file)
- [Compiling for test coverage](#compiling-for-test-coverage)
- [Performance profiling with perf](#performance-profiling-with-perf)
- [Locking/mutex usage notes](#lockingmutex-usage-notes)
- [Threads](#threads)
- [Ignoring IDE/editor files](#ignoring-ideeditor-files)
Expand Down Expand Up @@ -257,6 +258,51 @@ make cov
# A coverage report will now be accessible at `./test_bitcoin.coverage/index.html`.
```

### Performance profiling with perf

Profiling is a good way to get a precise idea of where time is being spent in
code. One tool for doing profiling on Linux platforms is called
[`perf`](http://www.brendangregg.com/perf.html), and has been integrated into
the functional test framework. Perf can observe a running process and sample
(at some frequency) where its execution is.

Perf installation is contingent on which kernel version you're running; see
[this StackExchange
thread](https://askubuntu.com/questions/50145/how-to-install-perf-monitoring-tool)
for specific instructions.

Certain kernel parameters may need to be set for perf to be able to inspect the
running process' stack.

```sh
$ sudo sysctl -w kernel.perf_event_paranoid=-1
$ sudo sysctl -w kernel.kptr_restrict=0
```

Make sure you [understand the security
trade-offs](https://lwn.net/Articles/420403/) of setting these kernel
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "docs: add perf section to developer docs" (27a895d72a385bcaa861e08b03bb5c817f8e253f)

Slightly strange the link text is "understand the security tradeoffs" instead of just "security tradeoffs."

parameters.

To profile a running bitcoind process for 60 seconds, you could use an
invocation of `perf record` like this:

```sh
$ perf record \
-g --call-graph dwarf --per-thread -F 140 \
-p `pgrep bitcoind` -- sleep 60
```

You could then analyze the results by running

```sh
perf report --stdio | c++filt | less
```

or using a graphical tool like [Hotspot](https://github.com/KDAB/hotspot).

See the functional test documentation for how to invoke perf within tests.


**Sanitizers**

Bitcoin Core can be compiled with various "sanitizers" enabled, which add
Expand Down
20 changes: 20 additions & 0 deletions test/README.md
Expand Up @@ -176,6 +176,26 @@ gdb /home/example/bitcoind <pid>

Note: gdb attach step may require `sudo`

##### Profiling

An easy way to profile node performance during functional tests is provided
for Linux platforms using `perf`.

Perf will sample the running node and will generate profile data in the node's
datadir. The profile data can then be presented using `perf report` or a graphical
tool like [hotspot](https://github.com/KDAB/hotspot).

To generate a profile during test suite runs, use the `--perf` flag.

To see render the output to text, run

```sh
perf report -i /path/to/datadir/send-big-msgs.perf.data.xxxx --stdio | c++filt | less
```

For ways to generate more granular profiles, see the README in
[test/functional](/test/functional).

### Util tests

Util tests can be run locally by running `test/util/bitcoin-util-test.py`.
Expand Down
33 changes: 33 additions & 0 deletions test/functional/README.md
Expand Up @@ -118,3 +118,36 @@ Helpers for script.py

#### [test_framework/blocktools.py](test_framework/blocktools.py)
Helper functions for creating blocks and transactions.

### Benchmarking with perf
jamesob marked this conversation as resolved.
Show resolved Hide resolved

An easy way to profile node performance during functional tests is provided
for Linux platforms using `perf`.

Perf will sample the running node and will generate profile data in the node's
datadir. The profile data can then be presented using `perf report` or a graphical
tool like [hotspot](https://github.com/KDAB/hotspot).

There are two ways of invoking perf: one is to use the `--perf` flag when
running tests, which will profile each node during the entire test run: perf
begins to profile when the node starts and ends when it shuts down. The other
way is the use the `profile_with_perf` context manager, e.g.

```python
with node.profile_with_perf("send-big-msgs"):
# Perform activity on the node you're interested in profiling, e.g.:
for _ in range(10000):
node.p2p.send_message(some_large_message)
```

To see useful textual output, run

```sh
perf report -i /path/to/datadir/send-big-msgs.perf.data.xxxx --stdio | c++filt | less
```

#### See also:

- [Installing perf](https://askubuntu.com/q/50145)
- [Perf examples](http://www.brendangregg.com/perf.html)
- [Hotspot](https://github.com/KDAB/hotspot): a GUI for perf output analysis
16 changes: 14 additions & 2 deletions test/functional/test_framework/test_framework.py
Expand Up @@ -128,6 +128,8 @@ def main(self):
help="Attach a python debugger if test fails")
parser.add_argument("--usecli", dest="usecli", default=False, action="store_true",
help="use bitcoin-cli instead of RPC for all commands")
parser.add_argument("--perf", dest="perf", default=False, action="store_true",
help="profile running nodes with perf for the duration of the test")
self.add_options(parser)
self.options = parser.parse_args()

Expand Down Expand Up @@ -201,11 +203,20 @@ def main(self):
node.cleanup_on_exit = False
self.log.info("Note: bitcoinds were not stopped and may still be running")

if not self.options.nocleanup and not self.options.noshutdown and success != TestStatus.FAILED:
should_clean_up = (
not self.options.nocleanup and
not self.options.noshutdown and
success != TestStatus.FAILED and
not self.options.perf
)
if should_clean_up:
self.log.info("Cleaning up {} on exit".format(self.options.tmpdir))
cleanup_tree_on_exit = True
elif self.options.perf:
self.log.warning("Not cleaning up dir {} due to perf data".format(self.options.tmpdir))
cleanup_tree_on_exit = False
else:
self.log.warning("Not cleaning up dir %s" % self.options.tmpdir)
self.log.warning("Not cleaning up dir {}".format(self.options.tmpdir))
cleanup_tree_on_exit = False

if success == TestStatus.PASSED:
Expand Down Expand Up @@ -309,6 +320,7 @@ def add_nodes(self, num_nodes, extra_args=None, *, rpchost=None, binary=None):
extra_conf=extra_confs[i],
extra_args=extra_args[i],
use_cli=self.options.usecli,
start_perf=self.options.perf,
))

def start_node(self, i, *args, **kwargs):
Expand Down
98 changes: 97 additions & 1 deletion test/functional/test_framework/test_node.py
Expand Up @@ -18,6 +18,8 @@
import time
import urllib.parse
import collections
import shlex
jamesob marked this conversation as resolved.
Show resolved Hide resolved
import sys

from .authproxy import JSONRPCException
from .util import (
Expand Down Expand Up @@ -59,7 +61,13 @@ class TestNode():
To make things easier for the test writer, any unrecognised messages will
be dispatched to the RPC connection."""

def __init__(self, i, datadir, *, rpchost, timewait, bitcoind, bitcoin_cli, mocktime, coverage_dir, extra_conf=None, extra_args=None, use_cli=False):
def __init__(self, i, datadir, *, rpchost, timewait, bitcoind, bitcoin_cli, mocktime, coverage_dir, extra_conf=None, extra_args=None, use_cli=False, start_perf=False):
"""
Kwargs:
start_perf (bool): If True, begin profiling the node with `perf` as soon as
the node starts.
"""

self.index = i
self.datadir = datadir
self.stdout_dir = os.path.join(self.datadir, "stdout")
Expand Down Expand Up @@ -87,6 +95,7 @@ def __init__(self, i, datadir, *, rpchost, timewait, bitcoind, bitcoin_cli, mock

self.cli = TestNodeCLI(bitcoin_cli, self.datadir)
self.use_cli = use_cli
self.start_perf = start_perf

self.running = False
self.process = None
Expand All @@ -95,6 +104,8 @@ def __init__(self, i, datadir, *, rpchost, timewait, bitcoind, bitcoin_cli, mock
self.url = None
self.log = logging.getLogger('TestFramework.node%d' % i)
self.cleanup_on_exit = True # Whether to kill the node when this object goes away
# Cache perf subprocesses here by their data output filename.
self.perf_subprocesses = {}
jamesob marked this conversation as resolved.
Show resolved Hide resolved

self.p2ps = []

Expand Down Expand Up @@ -186,6 +197,9 @@ def start(self, extra_args=None, *, stdout=None, stderr=None, **kwargs):
self.running = True
self.log.debug("bitcoind started, waiting for RPC to come up")

if self.start_perf:
self._start_perf()

def wait_for_rpc_connection(self):
"""Sets up an RPC connection to the bitcoind process. Returns False if unable to connect."""
# Poll at a rate of four times per second
Expand Down Expand Up @@ -238,6 +252,10 @@ def stop_node(self, expected_stderr='', wait=0):
except http.client.CannotSendRequest:
self.log.exception("Unable to stop node.")

# If there are any running perf processes, stop them.
for profile_name in tuple(self.perf_subprocesses.keys()):
self._stop_perf(profile_name)

# Check that stderr is as expected
self.stderr.seek(0)
stderr = self.stderr.read().decode('utf-8').strip()
Expand Down Expand Up @@ -317,6 +335,84 @@ def assert_memory_usage_stable(self, *, increase_allowed=0.03):
increase_allowed * 100, before_memory_usage, after_memory_usage,
perc_increase_memory_usage * 100))

@contextlib.contextmanager
def profile_with_perf(self, profile_name):
"""
Context manager that allows easy profiling of node activity using `perf`.

See `test/functional/README.md` for details on perf usage.

Args:
profile_name (str): This string will be appended to the
profile data filename generated by perf.
"""
subp = self._start_perf(profile_name)

yield

if subp:
self._stop_perf(profile_name)

def _start_perf(self, profile_name=None):
"""Start a perf process to profile this node.

Returns the subprocess running perf."""
subp = None

def test_success(cmd):
return subprocess.call(
# shell=True required for pipe use below
cmd, shell=True,
stderr=subprocess.DEVNULL, stdout=subprocess.DEVNULL) == 0

if not sys.platform.startswith('linux'):
self.log.warning("Can't profile with perf; only availabe on Linux platforms")
return None

if not test_success('which perf'):
self.log.warning("Can't profile with perf; must install perf-tools")
return None

if not test_success('readelf -S {} | grep .debug_str'.format(shlex.quote(self.binary))):
jamesob marked this conversation as resolved.
Show resolved Hide resolved
self.log.warning(
"perf output won't be very useful without debug symbols compiled into bitcoind")

output_path = tempfile.NamedTemporaryFile(
dir=self.datadir,
prefix="{}.perf.data.".format(profile_name or 'test'),
delete=False,
).name

cmd = [
'perf', 'record',
'-g', # Record the callgraph.
'--call-graph', 'dwarf', # Compatibility for gcc's --fomit-frame-pointer.
'-F', '101', # Sampling frequency in Hz.
'-p', str(self.process.pid),
'-o', output_path,
]
subp = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
self.perf_subprocesses[profile_name] = subp

return subp

def _stop_perf(self, profile_name):
"""Stop (and pop) a perf subprocess."""
subp = self.perf_subprocesses.pop(profile_name)
output_path = subp.args[subp.args.index('-o') + 1]
jamesob marked this conversation as resolved.
Show resolved Hide resolved

subp.terminate()
subp.wait(timeout=10)

stderr = subp.stderr.read().decode()
if 'Consider tweaking /proc/sys/kernel/perf_event_paranoid' in stderr:
self.log.warning(
"perf couldn't collect data! Try "
jamesob marked this conversation as resolved.
Show resolved Hide resolved
"'sudo sysctl -w kernel.perf_event_paranoid=-1'")
else:
report_cmd = "perf report -i {}".format(output_path)
self.log.info("See perf output by running '{}'".format(report_cmd))

def assert_start_raises_init_error(self, extra_args=None, expected_msg=None, match=ErrorMatch.FULL_TEXT, *args, **kwargs):
"""Attempt to start the node and expect it to raise an error.

Expand Down
2 changes: 1 addition & 1 deletion test/lint/lint-python-dead-code.sh
Expand Up @@ -15,5 +15,5 @@ fi

vulture \
--min-confidence 60 \
--ignore-names "argtypes,connection_lost,connection_made,converter,data_received,daemon,errcheck,get_ecdh_key,get_privkey,is_compressed,is_fullyvalid,msg_generic,on_*,optionxform,restype,set_privkey" \
--ignore-names "argtypes,connection_lost,connection_made,converter,data_received,daemon,errcheck,get_ecdh_key,get_privkey,is_compressed,is_fullyvalid,msg_generic,on_*,optionxform,restype,set_privkey,profile_with_perf" \
$(git ls-files -- "*.py" ":(exclude)contrib/" ":(exclude)test/functional/data/invalid_txs.py")