
Statistical profiling with Linux perf on HSDK board

Alexey Brodkin edited this page Jun 13, 2019 · 1 revision

Unfortunately, the ARC HS38 ASIC used in the HSDK board was configured so that its hardware performance counters do not generate an interrupt when they reach a preset limit. perf relies on that overflow interrupt to take samples of hardware events, so in theory statistical profiling with perf is impossible:

# perf record -a -e cpu-cycles sleep 5
Error:
cpu-cycles: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat'
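
The error message itself points at a fallback: counting mode needs no overflow interrupts, so perf stat can still report event totals for a run. A minimal sketch, assuming the kernel's ARC PMU driver exposes the generic cpu-cycles and instructions events on this board (not verified here):

```shell
# Events to count; both names are generic perf aliases and their
# availability on HSDK is an assumption.
events="cpu-cycles,instructions"

if command -v perf >/dev/null 2>&1; then
    # Count (not sample) the events system-wide for 5 seconds.
    perf stat -a -e "$events" sleep 5 || echo "perf stat failed" >&2
else
    echo "perf not found in PATH" >&2
fi
```

This gives totals only, with no attribution to functions, which is why the sampling workaround is still needed.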

However, instead of a real hardware event we can sample on any of the available software events. The result may be less precise, but given a long enough run the statistical data should still be relevant.

So this is what we can use for statistical profiling of executed "instructions" (strictly speaking, of where CPU time is spent):

# perf record -a -e cpu-clock sleep 5
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.560 MB perf.data (11996 samples) ]
# perf report
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 11K of event 'cpu-clock'
# Event count (approx.): 2999000000
#
# Overhead  Command  Shared Object        Symbol
# ........  .......  ...................  ...........................
#
    99.94%  swapper  [kernel.kallsyms]    [k] arch_cpu_idle
     0.01%  perf     [kernel.kallsyms]    [k] __get_user_pages.part.7
     0.01%  sleep    [kernel.kallsyms]    [k] filemap_map_pages
     0.01%  sleep    [kernel.kallsyms]    [k] memcpy
     0.01%  sleep    [kernel.kallsyms]    [k] perf_event_mmap
     0.01%  sleep    [kernel.kallsyms]    [k] wp_page_copy
     0.01%  sleep    ld-uClibc-1.0.31.so  [.] 0x000012de
     0.01%  swapper  [kernel.kallsyms]    [k] _raw_spin_unlock_irq
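
The report header also shows how often samples were taken. Assuming the cpu-clock event count is expressed in nanoseconds (which is how this software event normally counts), dividing it by the number of samples gives the average sampling period. The arithmetic, using only the numbers printed above:

```shell
# Numbers copied from the perf output above.
event_count_ns=2999000000   # "Event count (approx.)", assumed to be nanoseconds
samples=11996               # from the "perf record" summary line

period_ns=$(( event_count_ns / samples ))   # average time between samples
freq_hz=$(( 1000000000 / period_ns ))       # equivalent sampling frequency

echo "period: ${period_ns} ns, frequency: ${freq_hz} Hz"
# prints "period: 250000 ns, frequency: 4000 Hz"
```

So each sample stands for roughly 250 microseconds of CPU time, and functions that run for much less than that in total may not show up at all.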

Note that without IRQ support for the performance counters it is still impossible to profile on events such as cache hits or misses, so the corresponding hot spots remain out of reach. But as we can see, at least basic performance bottlenecks (i.e. where we spend too much time) can still be found.
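
To look for such time-based bottlenecks in one program rather than system-wide, the same software event can be attached to a single command. A sketch only: profile_cmd, the 1000 Hz frequency and ./my_app are illustrative names and values, not something taken from this page.

```shell
# Illustrative settings; adjust to taste.
freq_hz=1000      # samples per second, passed to perf record -F
out=perf.data     # output file, passed to perf record -o

# profile_cmd is a hypothetical helper, not part of perf:
# it samples the given command on the cpu-clock software event.
profile_cmd() {
    perf record -F "$freq_hz" -e cpu-clock -o "$out" -- "$@"
}

# Example, to be run on the board (./my_app is a placeholder workload):
#   profile_cmd ./my_app
#   perf report -i perf.data --stdio --sort comm,dso,symbol
```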
