In addition to efficient tracing of method execution, Nanoscope now supports gathering additional metrics about the executing thread (and process). To enable collection of these metrics, the Nanoscope ROM must be built from source (see here for instructions).
The additional metrics are collected via the sampling mechanism described below. The following metrics are currently collected:
- CPU utilization for a given thread
- number of major and minor page faults for a given thread
- number of context switches for a given thread
- memory usage (in terms of number of bytes and objects allocated) for a given process
- memory usage (in terms of bytes allocated) for a given thread
The design and implementation of Nanoscope's sampling framework were inspired by profilers built for "regular" (server or desktop) Java applications, such as async-profiler and honest-profiler. The main idea is to use a system call to schedule periodic signal generation by the OS, with each signal delivered to a given process or, in our case, a given thread.
Currently, Nanoscope supports two sampling modes:
- perf_timer mode: uses the perf_event_open system call to generate sampling signals
- cpu_timer mode: uses the timer_settime system call to generate sampling signals
The perf_timer mode carries an apparent risk: a thread may receive a signal while it is in the middle of another system call, which could crash the application, although we have never observed this in practice. The cpu_timer mode, on the other hand, is supposed to interrupt a thread only in user-level code. This increased safety comes at the cost of signal delivery fidelity, which is significantly lower than in perf_timer mode: in some cases, when the observed thread's activity is low, no trace data may be generated in cpu_timer mode for a few seconds.
Sampling works only on a real device, not on simulators. On some devices, you may need to change the kernel.perf_event_paranoid setting (this has to be redone each time the device is rebooted):
adb shell "echo -1 >/proc/sys/kernel/perf_event_paranoid"
Collection of additional metrics can be enabled in the following ways:
To start collection with sampling in perf_timer mode:
adb shell setprop dev.nanoscope com.example:data.txt:perf_timer
To start collection with sampling in cpu_timer mode:
adb shell setprop dev.nanoscope com.example:data.txt:cpu_timer
To start collection without sampling:
adb shell setprop dev.nanoscope com.example:data.txt
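The three commands differ only in the optional third component of the property value (package:file or package:file:mode). For scripted runs, that value could be assembled as in the following minimal sketch. The helper names nanoscope_property and setprop_command are hypothetical and not part of Nanoscope; only the dev.nanoscope property and the two mode names come from the commands above.

```python
from typing import List, Optional

# Sampling modes named in this document; None means "no sampling".
SAMPLING_MODES = ("perf_timer", "cpu_timer")


def nanoscope_property(package: str, out_file: str, mode: Optional[str] = None) -> str:
    """Build the dev.nanoscope value: package:file or package:file:mode."""
    if mode is not None and mode not in SAMPLING_MODES:
        raise ValueError(f"unknown sampling mode: {mode!r}")
    parts = [package, out_file]
    if mode is not None:
        parts.append(mode)
    return ":".join(parts)


def setprop_command(package: str, out_file: str, mode: Optional[str] = None) -> List[str]:
    """Return the adb argument list; the caller decides how to run it."""
    return ["adb", "shell", "setprop", "dev.nanoscope",
            nanoscope_property(package, out_file, mode)]
```

The returned list can be passed to subprocess.run(..., check=True) on a machine with adb on its PATH and a device attached.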
The default sampling interval is 1 ms. In perf_timer mode, sampling is based on wall-clock time; in cpu_timer mode, it is based on CPU time.
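The difference between wall-clock and CPU-time sampling can be seen with ordinary POSIX interval timers. The sketch below is only an analogy and not Nanoscope's implementation: it uses Python's signal.setitimer (process-wide, Unix only, main thread only) as a stand-in for perf_event_open and timer_settime, and the names count_ticks and idle are hypothetical. A wall-clock timer keeps firing while the code is blocked, whereas a CPU-time timer barely advances, which mirrors why cpu_timer mode can go quiet when the observed thread is mostly idle.

```python
import signal
import time


def count_ticks(timer: int, signum: int, interval_s: float, workload) -> int:
    """Run workload() while a periodic timer signal fires; return the tick count."""
    ticks = 0

    def on_tick(_signum, _frame):
        nonlocal ticks
        ticks += 1

    previous = signal.signal(signum, on_tick)
    signal.setitimer(timer, interval_s, interval_s)  # first expiry, then period
    try:
        workload()
    finally:
        signal.setitimer(timer, 0.0)     # disarm the timer
        signal.signal(signum, previous)  # restore the previous handler
    return ticks


def idle() -> None:
    time.sleep(0.3)  # blocked in the kernel, consuming almost no CPU time


# ITIMER_REAL counts wall-clock time; ITIMER_PROF counts CPU time used by the process.
wall_ticks = count_ticks(signal.ITIMER_REAL, signal.SIGALRM, 0.01, idle)
cpu_ticks = count_ticks(signal.ITIMER_PROF, signal.SIGPROF, 0.01, idle)
print(f"wall-clock ticks: {wall_ticks}, CPU-time ticks: {cpu_ticks}")
```

Replacing idle with a busy loop should bring the two counts much closer together, since CPU time then advances at roughly the same rate as wall-clock time.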
In addition to the data.ext file, two additional files are generated: data.ext.timer, which contains all sample data, and data.ext.state, which contains the state transition trace. These two files are consumed by the new version of Nanoscope Visualizer. If sampling is not enabled, both files will be empty.
Each row of data.ext.timer represents one sample, in the following format:
wall clock ts, cpu ts, # of major page faults, # of minor page faults, context switches, process memory usage (bytes), process memory usage (objects), memory ever allocated by traced thread (bytes), memory ever freed by traced thread
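A reader for this file might look like the sketch below. It rests on assumptions the format description does not confirm: fields are comma-separated, every value is an integer, and the two timestamps share the same unit. The Sample class, its field names, and the cpu_utilization helper are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, fields
from typing import Iterable, List


@dataclass(frozen=True)
class Sample:
    # Field order follows the row format listed above.
    wall_ts: int
    cpu_ts: int
    major_faults: int
    minor_faults: int
    context_switches: int
    process_bytes: int
    process_objects: int
    thread_allocated: int
    thread_freed: int


def parse_samples(lines: Iterable[str]) -> List[Sample]:
    """Parse rows of a .timer file (assumed comma-separated integers)."""
    expected = len(fields(Sample))
    samples = []
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        values = [v.strip() for v in line.split(",")]
        if len(values) != expected:
            raise ValueError(f"line {line_no}: expected {expected} fields, got {len(values)}")
        samples.append(Sample(*map(int, values)))
    return samples


def cpu_utilization(prev: Sample, cur: Sample) -> float:
    """CPU time used per unit of wall-clock time between two samples.

    Assumes both timestamps use the same unit.
    """
    wall_delta = cur.wall_ts - prev.wall_ts
    if wall_delta <= 0:
        raise ValueError("samples must be ordered by increasing wall-clock timestamp")
    return (cur.cpu_ts - prev.cpu_ts) / wall_delta
```

With a real trace, the input would come from an open file, for example parse_samples(open("data.ext.timer")), and consecutive samples can be paired to chart utilization, page faults, or allocation deltas over time.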