
ArcticDB Performance Profiling


This page aims to describe different methods of profiling the performance of ArcticDB.

ASV Benchmarks

Using the existing benchmarks and adding new ones is a great way to track the performance of various parts of ArcticDB over time. For more information regarding ASV benchmarking, please refer to this page in the Wiki.

Py-Spy

Py-spy is a great tool for profiling Python code/libraries that rely on C++ extensions. To install it, run:

pip install py-spy

To access the full suite of options, it is recommended to run it on Linux and to produce a speedscope file as output. You should also compile the library in debug mode for the most representative profiling results.

To profile the code, attach py-spy to the running Python process by pid, like so:

python test_script.py & # this will spawn python in a new process and will output the pid of that process

py-spy record --format speedscope \
              --output test.json \
              --native \
              --idle \
              --pid 1111

  • --format speedscope - write the output in speedscope format
  • --output test.json - the name of the output file with the profiling info
  • --native - also collect profiling data from the native (i.e. C++) function calls
  • --idle - capture data for idle threads
  • --pid 1111 - the pid of the python process that you want to profile
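If you launch the script from the same shell, bash stores the pid of the most recently backgrounded process in $!, so you can pass it straight through instead of copying the number by hand (plain bash behaviour, not py-spy specific):

python test_script.py &
py-spy record --format speedscope --output test.json --native --idle --pid $!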

To examine the output, simply pass the resulting json file to speedscope (BEWARE: don't upload profiles to the website if they contain sensitive data; you can run speedscope locally instead, see here and the sketch after the list below). You can use this example json to play around with speedscope. You can view the output in different ways to get a different picture of the performance:

  • Time Order - shows the function calls over time
  • Left Heavy - groups similar call stacks together
  • Sandwich - shows the most time-consuming functions
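If you would rather not upload anything, speedscope can also be installed and run locally. A minimal sketch, assuming Node.js/npm are available and test.json is the file produced above:

npm install -g speedscope
speedscope test.json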

Enable Performance Tracing

We have functionality that samples various parts of the C++ portion of ArcticDB. Those samples produce logs to stdout detailing how much time various functions took.

By default, this functionality is disabled and enabling it requires a recompilation: add add_compile_definitions(ARCTICDB_LOG_PERFORMANCE) to cpp/CMakeLists.txt and recompile. On the next run, logs from the samples will be printed to stdout.
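Concretely, the line to add is just the compile definition itself; placing it before the targets are defined ensures the definition applies to them:

add_compile_definitions(ARCTICDB_LOG_PERFORMANCE)  # enables the performance sampling logs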

To add new samples, refer to how ARCTICDB_SAMPLE and ARCTICDB_SUBSAMPLE are used through the code.
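As a rough illustration of the pattern (the header path and the two-argument name/flags form below are assumptions, so copy the exact signature and include from an existing call site rather than from this sketch):

#include <arcticdb/util/trace.hpp>           // assumed header; check where existing call sites get the macros from

void do_expensive_work() {
    ARCTICDB_SAMPLE(DoExpensiveWork, 0)      // scoped sample covering the whole function body
    // ... setup work ...
    {
        ARCTICDB_SUBSAMPLE(InnerStep, 0)     // nested sample timing just this block
        // ... the part you want reported separately ...
    }
}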

Perf Flamegraph

This section details how to profile the code in the Linux dev container (IDE-agnostic).

  • Enable the linux-release - build-release CMake profile, and build the arcticdb_ext target with this profile.
  • Ensure the symlink in the python directory points to the release .so file: ln -s ../cpp/out/linux-release-build/arcticdb/arcticdb_ext.cpython-36m-x86_64-linux-gnu.so
  • Clone the FlameGraph project into /opt
  • From the python directory, run perf record -g --call-graph="dwarf" python <path to Python script to profile> && perf script | /opt/FlameGraph/stackcollapse-perf.pl > /tmp/out.perf-folded && /opt/FlameGraph/flamegraph.pl /tmp/out.perf-folded > flamegraph-$(date -Iseconds).svg (the same pipeline is broken out line by line after this list)
  • Open the resulting SVG file
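For readability, the same pipeline split into separate commands (same paths as above; adjust the FlameGraph location if you cloned it elsewhere):

perf record -g --call-graph="dwarf" python <path to Python script to profile>
perf script | /opt/FlameGraph/stackcollapse-perf.pl > /tmp/out.perf-folded
/opt/FlameGraph/flamegraph.pl /tmp/out.perf-folded > flamegraph-$(date -Iseconds).svg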

Notes:

  • --call-graph="fp" gives less detail, but does result in much smaller (GB → MB) perf.data files.
  • The x-axis in the resulting SVG is not time. When profiling time spent in either the CPU or IO thread pools, each thread in the pool will have its own section.