argdist, trace, and tplist support for USDT probes #451

goldshtn · 2016-03-28T12:38:32Z

These tools now support USDT probes with the 'u:provider:probe' syntax.
Probes in a library or process can be listed with 'tplist -l LIB' or 'tplist -p PID'.
Probe arguments are also parsed and available in both argdist and trace as arg1,
arg2, etc., regardless of the probe attach location.

The same USDT probe can be used at multiple locations, which means the attach
infrastructure must probe all of these locations. argdist and trace register thunk
probes at each location; each thunk calls a central, static inline probe function
with the location id (__loc_id). The central probe function checks the location id
to determine how the arguments should be retrieved, because this is location-dependent.
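
The thunk arrangement can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in this PR: the helper name generate_probe_text and the generated C function names (NAME_main, NAME_thunk_N) are hypothetical, and only __loc_id is taken from the description above.

```python
def generate_probe_text(probe_name, num_locations):
    """Return C source with one central function and one thunk per location.

    Every generated identifier except __loc_id is a hypothetical name
    chosen for this sketch.
    """
    central = (
        "static inline int %s_main(struct pt_regs *ctx, int __loc_id)\n"
        "{\n"
        "    /* argument retrieval would switch on __loc_id here */\n"
        "    return 0;\n"
        "}\n" % probe_name
    )
    thunks = []
    for loc_id in range(num_locations):
        # One small entry point per attach location; it only forwards
        # the context together with its own location id.
        thunks.append(
            "int %s_thunk_%d(struct pt_regs *ctx)\n"
            "{\n"
            "    return %s_main(ctx, %d);\n"
            "}\n" % (probe_name, loc_id, probe_name, loc_id)
        )
    return central + "\n" + "\n".join(thunks)
```

Calling generate_probe_text("loop_iter", 2) would yield one central function and two thunks, each of which would then be attached at its own location.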

Finally, some USDT probes must be enabled first by writing a value to a memory
location (this is called a "semaphore"). This value is per-process, so we require a
process id for this kind of probe. WARNING: This whole business of writing a value
to another process's memory is inherently unsafe, and needs more thorough testing,
especially when the semaphore is in a shared library whose load address can depend
on the pid.

Along with trace and argdist tool support, this commit also introduces new classes
in the bcc module: ProcStat handles pid-wrap detection, whereas USDTReader,
USDTProbe, USDTProbeLocation, and USDTArgument are the shared USDT-related
infrastructure that enables enumeration, attachment, and argument retrieval for
USDT probes.
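
For readers unfamiliar with pid-wrap detection, here is a minimal sketch of one way it could work: remember the start time of the process when the pid is first seen, and treat the pid as stale if that start time later changes or the process disappears. The names parse_start_time and PidSnapshot are hypothetical and this is not the ProcStat class from this commit; it assumes the Linux /proc/PID/stat layout, where field 22 is the start time.

```python
def parse_start_time(stat_line):
    """Extract the process start time (field 22) from a /proc/PID/stat line.

    The comm field (field 2) is parenthesized and may contain spaces or
    parentheses, so only the text after the last ')' is split.
    """
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 22 sits at index 22 - 3 = 19.
    return int(rest[19])


class PidSnapshot(object):
    """Hypothetical stand-in for pid-wrap detection (not the PR's ProcStat)."""

    def __init__(self, pid, read_stat=None):
        self.pid = pid
        # read_stat is injectable so the logic can be exercised without /proc.
        self._read_stat = read_stat or self._read_proc_stat
        self.start_time = parse_start_time(self._read_stat(pid))

    @staticmethod
    def _read_proc_stat(pid):
        with open("/proc/%d/stat" % pid) as f:
            return f.read()

    def is_stale(self):
        """Return True if the pid is gone or now names a different process."""
        try:
            current = parse_start_time(self._read_stat(self.pid))
        except (IOError, OSError, IndexError, ValueError):
            return True
        return current != self.start_time
```

A tool would take the snapshot when the user passes -p PID and check is_stale() before touching that process again.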

goldshtn · 2016-03-28T13:31:16Z

Updated this PR with somewhat-tested support for semaphores in shared objects. I used /proc/PID/maps to find the .so's load address, and then added the semaphore address (from the stapsdt note) to the load address. It seems to be working 👏

# ./trace.py 'u:/home/vagrant/libusdt_shared.so:loop_iter "i=%d msg=%s", arg1, arg2' -p $(pidof usdt_main)
TIME     PID    COMM         FUNC             -
06:30:24 26059  usdt_main    loop_iter        i=42 msg=All right
06:30:24 26059  usdt_main    loop_iter        i=0 msg=Hello!
06:30:24 26059  usdt_main    loop_iter        i=1 msg=Hello!
06:30:24 26059  usdt_main    loop_iter        i=2 msg=Hello!
06:30:24 26059  usdt_main    loop_iter        i=3 msg=Hello!
06:30:24 26059  usdt_main    loop_iter        i=4 msg=Hello!
06:30:24 26059  usdt_main    loop_iter        i=5 msg=Hello!
06:30:24 26059  usdt_main    loop_iter        i=6 msg=Hello!
06:30:24 26059  usdt_main    loop_iter        i=7 msg=Hello!
06:30:24 26059  usdt_main    loop_iter        i=8 msg=Hello!
06:30:24 26059  usdt_main    loop_iter        i=9 msg=Hello!
^C
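
The address computation described in this comment can be sketched as follows. It is a simplified illustration, not the PR's code: the function names are hypothetical, the maps text is passed in as a string rather than read from /proc/PID/maps, and it assumes the semaphore address in the stapsdt note can be added directly to the lowest mapping address of the library.

```python
def find_load_address(maps_text, lib_path):
    """Return the lowest start address at which lib_path is mapped, or None."""
    starts = []
    for line in maps_text.splitlines():
        # Each line is: address-range perms offset dev inode [pathname]
        parts = line.split(None, 5)
        if len(parts) < 6 or parts[5].strip() != lib_path:
            continue
        starts.append(int(parts[0].split("-")[0], 16))
    return min(starts) if starts else None


def semaphore_runtime_address(maps_text, lib_path, semaphore_offset):
    """Add the note's semaphore address to the library's load address."""
    base = find_load_address(maps_text, lib_path)
    if base is None:
        raise ValueError("%s is not mapped in this process" % lib_path)
    return base + semaphore_offset
```

In practice maps_text would be the contents of /proc/PID/maps for the pid given with -p, which is why a process id is required for probes that use a semaphore.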

4ast · 2016-03-29T01:59:28Z

src/python/bcc/usdt.py

+                for arg in self.raw_args.split():
+                        self._parse_arg(arg.strip())
+
+        def _parse_arg(self, arg):


Where do you get these text args to feed into this func? Are they embedded as strings in the notes? 'readelf -n' output matched with a regex for stap, right? Is the notes section preserved in binaries? It would be good to describe how the whole thing works.

goldshtn · 2016-03-29

Yes, it's coming from readelf (see USDTReader below). By "describe", do you mean in code comments or a separate documentation file?

4ast · 2016-03-29

Something like a 'usdt prerequisites' or user-focused doc is needed, or an FAQ that describes what should be in the binary for this mechanism to work. Maybe include 'readelf -n|grep magic', so that users can do a quick sanity test that stap-style USDT probes are there.

goldshtn · 2016-03-29

I agree. I'll work on docs and submit a separate PR.
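
Until those docs exist, the flow discussed in this thread can be sketched as below: take the text printed by 'readelf -n', pick out the stapsdt notes, and split each Arguments string on whitespace as the reviewed snippet does. This is an illustrative sketch, not the USDTReader code from this PR. The function names are hypothetical, the field labels (Provider, Name, Location, Base, Semaphore, Arguments) are assumed to match what binutils readelf prints, and the SIZE@OPERAND argument form with a negative size meaning "signed" is assumed from the SystemTap SDT convention.

```python
import re

ADDR_FIELD = re.compile(r"(Location|Base|Semaphore):\s*(0x[0-9a-fA-F]+)")
TEXT_FIELD = re.compile(r"^\s*(Provider|Name|Arguments):\s*(.*)$")
ARG_SPEC = re.compile(r"^(-?\d+)@(.+)$")


def parse_arg(arg):
    """Split 'SIZE@OPERAND' into (size_in_bytes, is_signed, operand)."""
    match = ARG_SPEC.match(arg)
    if match is None:
        raise ValueError("unrecognized USDT argument: %r" % arg)
    size = int(match.group(1))
    return abs(size), size < 0, match.group(2)


def parse_stapsdt_notes(readelf_output):
    """Collect one dict per stapsdt note found in `readelf -n` text."""
    probes = []
    for line in readelf_output.splitlines():
        if "NT_STAPSDT" in line:
            # A new note starts; the following lines describe this probe.
            probes.append({"arguments": []})
            continue
        if not probes:
            continue
        current = probes[-1]
        for name, value in ADDR_FIELD.findall(line):
            current[name.lower()] = int(value, 16)
        match = TEXT_FIELD.match(line)
        if match and match.group(1) == "Arguments":
            current["arguments"] = [parse_arg(a) for a in match.group(2).split()]
        elif match:
            current[match.group(1).lower()] = match.group(2).strip()
    return probes
```

An empty result from parse_stapsdt_notes would suggest that the binary carries no stap-style USDT notes, which is the quick sanity test suggested above.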

goldshtn mentioned this pull request Mar 28, 2016

USDT probes #327

Closed

goldshtn force-pushed the usdt branch from 3287dc1 to 7ba26d3 Compare March 28, 2016 13:27

goldshtn force-pushed the usdt branch from 7ba26d3 to 3e39a08 Compare March 28, 2016 13:30

4ast reviewed Mar 29, 2016

4ast merged commit 354ee0c into iovisor:master Mar 29, 2016
