argdist, trace, and tplist support for USDT probes #451

Merged on Mar 29, 2016 (1 commit)

goldshtn
Collaborator

These tools now support USDT probes with the 'u:provider:probe' syntax.
Probes in a library or process can be listed with 'tplist -l LIB' or 'tplist -p PID'.
Probe arguments are also parsed and available in both argdist and trace as arg1,
arg2, etc., regardless of the probe attach location.
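
To illustrate what "parsed" means here: stapsdt notes describe each probe argument as SIZE@OPERAND, where a negative size marks a signed value and the operand is an assembler-style register, constant, or memory reference. The sketch below is a minimal parser for the common x86-64 forms only; parse_usdt_arg and the dictionary layout are hypothetical names for illustration, not the USDTArgument class from this PR.

```python
import re

# Hypothetical helper for illustration; not the USDTArgument class in this PR.
# Handles three common operand forms: %reg, $constant, and offset(%reg).
_ARG_RE = re.compile(
    r"^(?P<size>-?\d+)@"
    r"(?:\$(?P<const>-?\d+)"
    r"|%(?P<reg>\w+)"
    r"|(?P<offset>-?\d+)?\(%(?P<base>\w+)\))$")


def parse_usdt_arg(spec):
    """Parse one descriptor such as '-4@%esi' or '8@-24(%rbp)'."""
    m = _ARG_RE.match(spec)
    if m is None:
        raise ValueError("unsupported argument descriptor: %r" % spec)
    size = int(m.group("size"))
    # A negative size means the value is signed; the width is abs(size) bytes.
    result = {"size": abs(size), "signed": size < 0}
    if m.group("const") is not None:
        result.update(kind="constant", value=int(m.group("const")))
    elif m.group("reg") is not None:
        result.update(kind="register", register=m.group("reg"))
    else:
        result.update(kind="memory", base=m.group("base"),
                      offset=int(m.group("offset") or 0))
    return result
```

Descriptors outside these three forms (segment prefixes, indexed addressing) raise ValueError in this sketch rather than being guessed at.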

The same USDT probe can appear at multiple locations, so the attach infrastructure
must probe every one of them. argdist and trace register a thunk probe at each
location; each thunk calls a central probe function (which is static inline) with
the location id (__loc_id). The central probe function checks the location id to
determine how the arguments should be retrieved -- this is location-dependent.
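
The thunk arrangement described above can be sketched as a small generator that emits one C entry point per location, each forwarding its location id to the shared probe function. This is a rough illustration, not the code in this PR: generate_thunks, the "_thunk_N" naming, and the exact C signature are assumptions.

```python
def generate_thunks(probe_func, num_locations):
    """Return (C source text, thunk names), with one thunk per probe location.

    Assumes the central function has the shape
        static inline int <probe_func>(struct pt_regs *ctx, int __loc_id)
    and is defined elsewhere in the generated BPF program.
    """
    names = []
    chunks = []
    for loc_id in range(num_locations):
        thunk = "%s_thunk_%d" % (probe_func, loc_id)
        names.append(thunk)
        # Each thunk only forwards the register context and its location id.
        chunks.append(
            "int %s(struct pt_regs *ctx) {\n"
            "    return %s(ctx, %d);\n"
            "}\n" % (thunk, probe_func, loc_id))
    return "\n".join(chunks), names
```

Each returned name would then be attached at the address of the matching location, while the central function switches on __loc_id to pick the right register or stack slot for arg1, arg2, and so on.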

Finally, some USDT probes must be enabled first by writing a value to a memory
location (this is called a "semaphore"). This value is per-process, so we require a
process id for this kind of probe. WARNING: This whole business of writing a value
to another process's memory is inherently unsafe, and needs more thorough testing,
especially when the semaphore is in a shared library whose load address can depend
on the pid.
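
As a minimal sketch of what "writing a value" involves, the helper below reads a 16-bit counter at a given address, adjusts it, and writes it back. It assumes the semaphore is an unsigned 16-bit value in native byte order (as in SystemTap's sdt.h convention) and works on any seekable binary file object; adjust_semaphore is a hypothetical name, and the read-modify-write is not atomic, which is part of why the warning above applies.

```python
import io
import struct


def adjust_semaphore(mem, address, delta):
    """Add delta to the 16-bit counter at address; return the new value.

    mem is any seekable binary file object. For a live process it could be
    /proc/PID/mem opened with mode "r+b" and buffering=0 (needs privileges).
    """
    mem.seek(address)
    raw = mem.read(2)
    if len(raw) != 2:
        raise IOError("short read at 0x%x" % address)
    (value,) = struct.unpack("=H", raw)
    new_value = value + delta
    if not 0 <= new_value <= 0xFFFF:
        raise ValueError("semaphore would leave the 16-bit range")
    mem.seek(address)
    mem.write(struct.pack("=H", new_value))
    return new_value


# In-memory stand-in for process memory, so the sketch runs without a target.
fake_mem = io.BytesIO(bytes(16))
enabled = adjust_semaphore(fake_mem, 8, +1)    # enable the probe
disabled = adjust_semaphore(fake_mem, 8, -1)   # disable it again on detach
```

Incrementing on attach and decrementing on detach, rather than writing a fixed value, keeps the counter meaningful if more than one tracer is attached.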

Along with trace and argdist tool support, this commit also introduces new classes
in the bcc module: ProcStat handles pid-wrap detection, whereas USDTReader,
USDTProbe, USDTProbeLocation, and USDTArgument are the shared USDT-related
infrastructure that enables enumeration, attachment, and argument retrieval for
USDT probes.
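
How pid-wrap detection works is not spelled out above. One plausible approach, assumed for this sketch, is to remember the process start time (field 22 of /proc/PID/stat) and treat the pid as reused if that value later changes or the process is gone. ProcessIdentity and parse_start_time are hypothetical names, not the ProcStat class from this PR.

```python
def parse_start_time(stat_text):
    """Return field 22 (starttime) from the contents of /proc/PID/stat."""
    # Field 2 (comm) is parenthesized and may contain spaces, so split only
    # the part after the last ')'; that part starts at field 3.
    rest = stat_text[stat_text.rindex(")") + 1:].split()
    return int(rest[22 - 3])


def _read_proc_stat(pid):
    with open("/proc/%d/stat" % pid) as f:
        return f.read()


class ProcessIdentity(object):
    """Remembers a pid's start time so that later reuse can be noticed."""

    def __init__(self, pid, read_stat=_read_proc_stat):
        self.pid = pid
        self._read_stat = read_stat
        self.start_time = parse_start_time(read_stat(pid))

    def is_stale(self):
        try:
            current = parse_start_time(self._read_stat(self.pid))
        except (IOError, OSError, ValueError, IndexError):
            return True   # process is gone or the stat data is unreadable
        return current != self.start_time
```

A tool that writes to a semaphore could call is_stale() before touching the target again, so it does not write into an unrelated process that inherited the pid.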

@goldshtn
Collaborator Author

Updated this PR with somewhat-tested support for semaphores in shared objects. I used /proc/PID/maps to find the .so's load address, and then added the semaphore address (from the stapsdt note) to the load address. It seems to be working 👏

# ./trace.py 'u:/home/vagrant/libusdt_shared.so:loop_iter "i=%d msg=%s", arg1, arg2' -p $(pidof usdt_main)
TIME     PID    COMM         FUNC             -
06:30:24 26059  usdt_main    loop_iter        i=42 msg=All right
06:30:24 26059  usdt_main    loop_iter        i=0 msg=Hello!
06:30:24 26059  usdt_main    loop_iter        i=1 msg=Hello!
06:30:24 26059  usdt_main    loop_iter        i=2 msg=Hello!
06:30:24 26059  usdt_main    loop_iter        i=3 msg=Hello!
06:30:24 26059  usdt_main    loop_iter        i=4 msg=Hello!
06:30:24 26059  usdt_main    loop_iter        i=5 msg=Hello!
06:30:24 26059  usdt_main    loop_iter        i=6 msg=Hello!
06:30:24 26059  usdt_main    loop_iter        i=7 msg=Hello!
06:30:24 26059  usdt_main    loop_iter        i=8 msg=Hello!
06:30:24 26059  usdt_main    loop_iter        i=9 msg=Hello!
^C
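
The address arithmetic described in this comment can be sketched as follows. This is an illustration under assumptions, not the PR's code: it takes the contents of /proc/PID/maps as a string, treats the lowest mapping of the library as its load address (reasonable when the library's first segment has virtual address 0), and adds the semaphore address from the stapsdt note. Both function names are hypothetical.

```python
def find_load_address(maps_text, library_path):
    """Return the lowest start address at which library_path is mapped."""
    starts = []
    for line in maps_text.splitlines():
        # Columns: address range, perms, offset, dev, inode, pathname.
        parts = line.split(None, 5)
        if len(parts) == 6 and parts[5].strip() == library_path:
            starts.append(int(parts[0].split("-")[0], 16))
    if not starts:
        raise LookupError("%s is not mapped in this process" % library_path)
    return min(starts)


def semaphore_runtime_address(maps_text, library_path, semaphore_address):
    """Load address of the shared object plus the note's semaphore address."""
    return find_load_address(maps_text, library_path) + semaphore_address
```

For a live process, maps_text would come from reading /proc/PID/maps, and library_path must match the pathname column exactly, so a symlinked path should be resolved first.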

        for arg in self.raw_args.split():
            self._parse_arg(arg.strip())

    def _parse_arg(self, arg):
Member

Where do you get these text args to feed into this func? Are they embedded as strings in the notes? 'readelf -n' piped through a regex for stap, right? Is the notes section preserved in binaries? It would be good to describe how the whole thing works.

Collaborator Author

Yes, it's coming from readelf (see below in USDTReader). By "describe" do you mean in code comments or a separate documentation file?
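
For readers following along, here is a rough sketch of pulling probe descriptions out of 'readelf -n' output. It assumes the usual layout of stapsdt notes (Provider, Name, Location/Base/Semaphore, Arguments lines) and is not the USDTReader code itself; parse_stapsdt_notes and the dictionary keys are hypothetical.

```python
import re

_LOCATION_RE = re.compile(
    r"Location:\s*0x([0-9a-fA-F]+),\s*Base:\s*0x([0-9a-fA-F]+),"
    r"\s*Semaphore:\s*0x([0-9a-fA-F]+)")


def parse_stapsdt_notes(readelf_output):
    """Return one dict per stapsdt note found in `readelf -n` output text."""
    probes = []
    current = None
    for line in readelf_output.splitlines():
        line = line.strip()
        if line.startswith("Provider:"):
            # A Provider line starts a new note; one note is one location.
            current = {"provider": line.split(":", 1)[1].strip(), "args": []}
            probes.append(current)
        elif current is None:
            continue
        elif line.startswith("Name:"):
            current["name"] = line.split(":", 1)[1].strip()
        elif line.startswith("Location:"):
            m = _LOCATION_RE.match(line)
            if m:
                loc, base, sem = (int(g, 16) for g in m.groups())
                current.update(location=loc, base=base, semaphore=sem)
        elif line.startswith("Arguments:"):
            # Raw descriptors such as "8@%rdi"; these are the text args
            # that end up being fed to a per-argument parser.
            current["args"] = line.split(":", 1)[1].split()
    return probes
```

The input text could come from subprocess.check_output(["readelf", "-n", path]). A probe used at several locations shows up as several notes with the same provider and name, and a semaphore of 0 means the probe needs no enabling.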

Member

Something like a 'USDT prerequisites' page or another user-focused doc is needed, or an FAQ that describes what should be in the binary for this mechanism to work. Maybe include 'readelf -n|grep magic', so that users can run a quick sanity test that stap-style USDT probes are there.

Collaborator Author

I agree. I'll work on docs and submit a separate PR.

@4ast 4ast merged commit 354ee0c into iovisor:master Mar 29, 2016