Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

USDT tools for higher level languages #774

Merged
merged 29 commits into from Dec 19, 2016

Conversation

@goldshtn
Copy link
Collaborator

commented Oct 20, 2016

This is a work-in-progress: please do not merge yet. I'm putting it out there for review and feedback at this time. Some of the tools are not done yet, and none have example and man files at this time.

Inspired by tools from the DTrace toolkit, here are a couple of tools for high-level languages (currently Java, Python, and Ruby) using their USDT probes when available.

The tools I have planned are:

  • ucalls (in this PR) -- summarize method call numbers and total latency
  • ugc (in this PR) -- monitor and trace GC events
  • uflow (planned) -- print method call flow
  • ustat (planned) -- print top-like statistics on method calls, class loads, GC events, and so on

On top of these, profile should already work with languages that can emit /tmp/perf-PID.map files by resolving stack traces, so there's probably no need for a custom uprofile.

I think it should be possible to avoid forking each tool for each language, but rather keep a single set of tools with a language parameter. Examples (some of this should already work at this patchset):

# top 10 slowest Python methods, times in milliseconds
ucalls python 148 -mLT 10

# Java GC events
ugc java 8004

# print flow graph for java.io methods
uflow java 8004 java.io.*

# print stats on method calls, object creation, GC events, etc. for Ruby process 1103
ustat ruby 1103

Any thoughts or feedback on tool naming, options, and so on would be appreciated. I am mostly trying to mimic the per-language tools from the DTrace toolkit, while recognizing that some things are not currently/yet possible (e.g. syscall tracing in addition to method tracing).

/cc @brendangregg

@brendangregg

This comment has been minimized.

Copy link
Member

commented Oct 20, 2016

Some general comments before I get into the tools...

Not a lot was done with USDT in runtimes. Here's the three problems:

  • GC tracing: via USDT works great, but most runtimes already have a GC log, and those logs can have a ton more detail. There may be times when we want a bcc tool to do it, but just remember that it won't be the primary tool many people use.
  • Function tracing: The overhead of int 3 trapping made these fairly prohibitive (@4ast might fix it one day :). In the DTraceToolkit, I did some awesome looking tools with them, and they were useful for debugging, but high overhead for production. I should have warned about overhead more, and we can here in bcc, so people know they are a function-equivalent of running strace!
  • Sun was on the way out: There's more we can do with USDT probes in runtimes, like looking into memory allocation in more detail, but around the time we should have been doing those kinds of things, people were switching to Linux. So I think there's untapped opportunity here, which we'll discover now that USDT can be a thing on Linux, and we can go back to adding more runtime USDT probes.

Java had some interesting ones: like class allocations, thread creations. I think those interest me the most. Sure, I'll take GC scripts and function tracing scripts like existed in the DTraceToolkit, but the latter are more debug-only until we lower overheads.

Another interesting use is mixing USDT probes with other user and kernel events, which is something other analyzers can't do. You can imagine doing this for MySQL USDT probes: casting kernel events in the context of a query. Need to think about opportunities in runtimes too.

As for naming: I'd go with javagc, javacalls, javaclassstat, nodegc, nodecalls, pythoncalls, etc. So we can list java* for Java developers. Findability. Now if the tools are so close, then symlinks or light 2-line shell wrappers to a common multi-tool?

You know what I'd do: I'd start with separate files for each, and consider refactoring later. Think about this: if a Java developer wants to enhance separate Java tools, they can go to town. But if they are all multi-tools, then we'll be telling them: please test your changes with Node.js, Python, etc, etc. I can imagine how that'd go!

@goldshtn

This comment has been minimized.

Copy link
Collaborator Author

commented Oct 21, 2016

Thanks for the feedback! I realize the method tracing tools are going to be for debug use, mostly. Re GC stats, Java sure has a slew of logs, but for other runtimes it can be quite useful. For example, a quick search reveals that Node has a --trace-gc and no one understands its output. In Ruby, you need to add a gem that collects GC statistics. And so on.

Re kernel events in the context of a query: this can be done today with trace. The tool is harder to use, sure, but it's general-purpose. I don't know if we can anticipate exactly which kernel events people would want along with the USDT traces. Or maybe we will be able to, in the future, when BPF + USDT get more traction. What I really would like is syscall support, and as you recall (#748) we don't have that yet. If there's anything we can do to move that forward, I'd love to help.

@brendangregg

This comment has been minimized.

Copy link
Member

commented Oct 21, 2016

Yes, good points about Node.js and Ruby.

There's a lot of languages out there. I've been thinking about it, and maybe it does make sense to begin with the multi-tools (and multiple symlinks/shell scripts to them). Because: A) accommodating Java developers with individual scripts is a nice idea, but what's the chance none contribute and we end up doing the work anyway?, and B) bcc tools are long. With DTrace, it was no big deal to have 4 tools that had 95% common code, when the tools were 20 lines each. With bcc, we have 100-300 line tools, so duplicating that is 10x more work to maintain.

@goldshtn

This comment has been minimized.

Copy link
Collaborator Author

commented Oct 26, 2016

@brendangregg So I've added a couple of tools and changed some of the tools I pushed originally. What we have right now (not 100% done, and no documentation yet, so not mergeable):

  • ugc for tracing GC events (Java, Node, Python, Ruby)
  • ucalls for tracing method calls and/or syscall stats (Java, Python, Ruby, syscalls-only)
  • uthreads for tracing thread start/stop events (Java, pthreads)
  • uobjnew for summarizing object allocations by type (Java, Ruby, malloc)

The last remaining tool I haven't touched yet is uflow, which is going to be super-crazy-high overhead but probably still useful as a light debugger replacement for tracing method flow.

I was also thinking of building systrace -- an strace equivalent that uses BPF. Don't know if that'd be interesting or useful.

@brendangregg

This comment has been minimized.

Copy link
Member

commented Oct 26, 2016

May as well do a quick systrace. I did dtruss once, just to show DTrace could do syscalls. Its only feature over Solaris truss was being able to match multiple process names with -n, instead of one PID only. Next thing dtruss is the only syscall tracer for OS X.

I say quick, as I suspect strace has a lot of translations in that would be redundant to copy and hard to maintain. So I wouldn't invest too much time in the experiment.

What would be also useful (in addition) would be a syscount, like my https://github.com/brendangregg/perf-tools/blob/master/examples/syscount_example.txt . Yes, that's really a more featured version of strace -c. I'd be inclined not to add it to systrace, since that would be mixing a per-event output tool with a summary tool (I know strace has a precedent for that, but I'm not sure it's a good precedent). I like the argdist / trace split.

Your choice of USDT tools makes sense. I still don't know whether these should all become symlinks (java_, node_) or just keep them as u*, or both.

I took a quick look at the code -- maybe because they are new you just haven't done this yet, but I'd have put big comments above each language block. Like:

    probes.append(Probe("gc__start", "gc__done",
                        begin_save, end_save, formatter))
#
# Ruby
#
elif args.language == "ruby":
    # Ruby GC probes do not have any additional information available.
    probes.append(Probe("gc__mark__begin", "gc__mark__end",
                        "", "", lambda _: "GC mark stage"))
    probes.append(Probe("gc__sweep__begin", "gc__sweep__end",
                        "", "", lambda _: "GC sweep stage"))
#
# Node.js
#
elif args.language == "node":
    end_save = """
@goldshtn

This comment has been minimized.

Copy link
Collaborator Author

commented Oct 26, 2016

syscount is pretty much there with ucalls -S, except I built the tool to require a pid (this can be changed specifically for -S); or with funccount -r '^[Ss]y[Ss]_*'. I guess it won't hurt to have a separate tool with some better grouping/filtering/sorting options like your syscount from perf-tools.

Re systrace, I didn't mean to provide the same level of detail (with the fd parsing, string extraction and so on). Just a basic trace of each syscall with 6 arguments. More advanced parsing would be possible with trace.

@goldshtn goldshtn force-pushed the goldshtn:usdt-tools branch 2 times, most recently from c681cdc to 1207f28 Oct 26, 2016

@goldshtn

This comment has been minimized.

Copy link
Collaborator Author

commented Oct 27, 2016

Just forced-push after rebasing. Added a new tool, ustat (which I forgot about) that prints high-level stats on GCs, method calls, object allocations, and so on -- in all high-level processes it can put its hands on.

Example output:

# ustat -C 5 3
Tracing... Output every 5 secs. Hit Ctrl-C to end                               

 06:44:43 loadavg: 0.07 0.23 0.15 5/463 19945                                    

 PID    CMDLINE          METHOD/s   GC/s   OBJNEW/s   CLOAD/s  EXCP/s   THREAD/s 
 16109  /home/vagrant/no 0          5      0          0        0        0        
 17595  irb              0          0      0          0        0        0
 17815  ./python         17995      0      0          0        0        0       
 19928  java -XX:+Extend 0          0      0          0        0        0       

 06:44:49 loadavg: 0.15 0.24 0.15 1/464 19947                                    

 PID    CMDLINE          METHOD/s   GC/s   OBJNEW/s   CLOAD/s  EXCP/s   THREAD/s 
 16109  /home/vagrant/no 0          0      0          0        0        0        
 17595  irb              130        1      109        0        0        0
 17815  ./python         0          0      0          0        0        0       
 19928  java -XX:+Extend 0          0      0          0        0        0       

 06:44:54 loadavg: 0.28 0.27 0.16 3/465 19948                                    

 PID    CMDLINE          METHOD/s   GC/s   OBJNEW/s   CLOAD/s  EXCP/s   THREAD/s 
 16109  /home/vagrant/no 0          0      0          0        0        0        
 17595  irb              0          0      0          0        0        0
 17815  ./python         0          0      0          0        0        0       
 19928  java -XX:+Extend 762207     0      0          0        0        0       

Also, added uflow for flow tracing.

@goldshtn goldshtn force-pushed the goldshtn:usdt-tools branch from b5fc267 to 382ec80 Oct 29, 2016

@goldshtn goldshtn force-pushed the goldshtn:usdt-tools branch from 382ec80 to 542888c Nov 7, 2016

@goldshtn

This comment has been minimized.

Copy link
Collaborator Author

commented Nov 7, 2016

Added man pages and example files. This is ready for review and merging. /cc @brendangregg

@goldshtn goldshtn changed the title WIP [do not merge]: USDT tools for higher level languages USDT tools for higher level languages Nov 7, 2016

@dark

This comment has been minimized.

Copy link
Member

commented Nov 8, 2016

@goldshtn I tried ustat because it looks very interesting. Thanks!
I have noticed one thing I'd like to report, though: it looks like ustat leaks file descriptors for bpf-map, and eventually errors out with error: could not open bpf map: Too many open files.
By following along with lsof I noticed a leak of around 20-30 bpf-map FDs per second.

Maybe it's a problem of my setups? My two machines have oldish kernels (4.4.26 and 4.7.4) and no packaged configured with --enable-dtrace, so there is no activity to report actually. I will try with dtrace-enabled software sometime in the future.

@goldshtn

This comment has been minimized.

Copy link
Collaborator Author

commented Nov 9, 2016

@dark Thanks, it's leaking indeed on my 4.9-rc as well. I'll track these fds down. I'm explicitly cleaning up the BPF object, but something must be sticking around. I believe this is the only tool we have that reloads the BPF program with new maps instead of just initializing everything upfront, which is probably why we have only now run into it.

@goldshtn

This comment has been minimized.

Copy link
Collaborator Author

commented Nov 9, 2016

@dark Should be fixed by #808. When it gets merged, I'll rebase.

@goldshtn goldshtn force-pushed the goldshtn:usdt-tools branch from 542888c to 397ea4c Nov 9, 2016

@goldshtn

This comment has been minimized.

Copy link
Collaborator Author

commented Nov 9, 2016

Rebased. We should get this merged at some point :)

@brendangregg

This comment has been minimized.

Copy link
Member

commented Nov 18, 2016

I'm back... I like the tools. They seem to work. :)

uobjnew.py is missing a PID filter, at least for the "c" match (the following python program is idle):

# ./uobjnew.py c `pgrep -n python` -v
#include <uapi/linux/ptrace.h>


#include <linux/ptrace.h>

struct key_t {
#if 1
    u64 size;
#else
    char name[50];
#endif
};

struct val_t {
    u64 total_size;
    u64 num_allocs;
};

BPF_HASH(allocs, struct key_t, struct val_t);

int alloc_entry(struct pt_regs *ctx, size_t size) {
    struct key_t key = {};
    struct val_t *valp, zero = {};
    key.size = size;
    valp = allocs.lookup_or_init(&key, &zero);
    valp->total_size += size;
    valp->num_allocs += 1;
    return 0;
}

Tracing allocations in process 26230 (language: c)... Ctrl-C to quit.
^C
TYPE                           # ALLOCS      # BYTES
block size 0                         32            0
block size 50                         8          400
block size 6                         76          456
block size 57                         8          456
[...]
block size 224                    15411      3452064
block size 552                     7162      3953424
block size 32816                    242      7941472
block size 4097                    4992     20452224

uobjnew examples have a "-l", but that should be omitted (will need to update the tool and example file) (update: read ahead as I have a different suggestion about -l):

examples:
    ./uobjnew -l java 145         # summarize Java allocations in process 145
    ./uobjnew -l c 2020 1         # grab malloc() sizes and print every second
    ./uobjnew -l ruby 6712 -C 10  # top 10 Ruby types by number of allocations
    ./uobjnew -l ruby 6712 -S 10  # top 10 Ruby types by total size

In the future, it would be nice if these tools could pick a default language if one wasn't specified. Eg, if the process name is python or java, then go with that. There will be times you'll want to override it: examining C calls for a Java program, for example. But most of the time I'm using these tools like this:

# ./ucalls.py -l python `pgrep -n python`

ustat.py, ucalls.py, and uthreads.py want the -l. Should that be dropped for consistency? ... It might be a bit tidier, if these tools could pick the java/python/etc default, to then have the -l as the optional override. In which case, all tools try to pick a default, and all tools have an optional -l.

uflow.py:

  -C CLAZZ, --class CLAZZ

BTW, uflow.py works well, at least for something simple:

# ./uflow.py python `pgrep -n python`
Tracing method calls in python process 26230... Ctrl-C to quit.
CPU PID    TID    TIME(us) METHOD
6   26230  26230  1.846    -> <stdin>.<module>                     
6   26230  26230  1.846      -> <stdin>.func_a                     
6   26230  26230  2.847        -> <stdin>.func_b                   
6   26230  26230  3.848          -> <stdin>.func_c                 
6   26230  26230  4.849          <- <stdin>.func_c                 
6   26230  26230  4.849        <- <stdin>.func_b                   
6   26230  26230  4.849      <- <stdin>.func_a                     
6   26230  26230  4.849    <- <stdin>.<module> 

Those PID/TID columns should be handy for filtering multi-threaded apps.

@4ast

This comment has been minimized.

Copy link
Member

commented Nov 29, 2016

@goldshtn are you planning one more update before it can be merged?

@goldshtn

This comment has been minimized.

Copy link
Collaborator Author

commented Nov 29, 2016

Yes - will fix the immediate issues and rebase, and leave non-pressing stuff for a separate PR. Hope to wrap this up this week.

goldshtn added some commits Oct 17, 2016

ucalls: Summarize method calls with USDT
`ucalls` is a new tool that uses the USDT probes in high-
level languages (Java, Python, Ruby) to summarize the
number of method calls and optionally their latency. This
is similar to `funccount` and `funclatency` but for methods
in Java, Python, Ruby, and other languages that might have
USDT probes for method entry and exit.

`ucalls` uses the `method__entry` and `method__return`
probes in Java and Ruby, and the `function__entry` and
`function__return` probes in Python. The Python probes are
not as accurate because they do not report the class name,
but are still useful.

The optional `-L` switch indicates whether latency numbers
are required. By default, only the number of calls to each
method is printed. The `-T` switch filters out the top
methods (by number of calls or total latency).
ugc: Monitor GC events in high-level languages
`ugc` monitors GC start and end events in Java, Python,
and Ruby, printing out a message when a GC cycle completes
along with the duration of the GC and an optional
additional description that depends on the target language.

NOTE: This WIP does not have any interesting additional
descriptions yet.

Usage example:

```
ugc java 1448	# monitor Java GCs in process 1448
```
ugc: Add description messages for GC events
For runtimes that make it available, we now extract GC
event information such as heap sizes and print this
information in addition to the GC duration and timestamp.
tools: Script skeletons
Skeletons for method flow, object allocations, general
event statistics, and thread statistics.
ugc: Node support
Add Node support for gc__start and gc__done probes, which
includes GC type flag parsing.
ucalls: Trace syscalls
Add `-S` switch for tracing syscalls with or without high-
level language method calls. When `-S` is provided, attach
to all `sys_` and `SyS_` entry points in the kernel and
trace the number of calls and execution latency (if `-L`
was also specified).

This is now also a useful general-purpose tool for syscall
count/latency summarizing when a language is not specified.
uthreads: Trace thread creation/destruction events
`uthreads` attaches to thread start and stop USDT probes
and traces out thread creation and destruction events.
Currently, only Java is supported.
uthreads: Add pthread support
Trace `pthread_start` calls and get the symbol for the
newly-started thread for printing. Also allow no-Java runs
with no language specified for pthread-only tracing.

goldshtn added some commits Oct 25, 2016

uobjnew: Summarize object allocations
Use various USDT probes in Java and Ruby, and `malloc()`
uprobe in C, to summarize object allocations by type or
by size. Print statistics at the end or at intervals, with
optional trimming of top types by number of instances
allocated or number of bytes.
python: Expose active probes from USDT object
Allow callers to obtain the list of active (enabled) USDT
probes from the USDT object, if they need to enable the
uprobes by hand without using the bcc attach support. This
is required when the same uprobe needs to be enabled in
more than one process.
ustat: Activity stats from high-level languages
This tool gathers statistics (by pid) on method calls,
class loads, garbage collections, thread creations, object
allocations, and exceptions raised and prints them in a
top-like style. Supported languages currently include
Java, Python, Ruby, and Node. Not all languages provide
the necessary probes for all event types.

Usage examples:

```
ustat -C
ustat -l java 1
ustat -l ruby -S objnew
```
uflow: Trace method execution flow
This tool traces method executions and prints out messages
with a nesting depth indicator that allows relatively easy
visualization of program flow. To reduce overhead, `-C` and
`-M` switches take class and method prefixes to filter the
output by. The filtering takes place in the BPF program, so
the unnecessary data never ends up in user space.

Usage examples:

```
uflow java 185
uflow -M index python 180
uflow -C 'IRB::Irb' ruby 10041
```
ustat: Replace \0 with spaces in cmdline
The /proc/PID/cmdline file has \0 for spaces in the command
line, and there may be trailing \0 characters as well.
Replace them all with spaces.
ustat: Graceful error handling
If the process in question doesn't have the USDT probes we
expect, handle the failure gracefully. After all, it could
be a binary compiled without USDT support, the pid could be
recycled to some other application, or -- the process name
could match, but by accident, and really belong to some
other runtime.
uobjnew: Attach uprobe only to the requested process
When C allocations are traced, attach the `malloc`
uprobe only to the requested process -- otherwise, we
get allocation information from the entire system.

@goldshtn goldshtn force-pushed the goldshtn:usdt-tools branch from 397ea4c to d4a4248 Dec 19, 2016

@goldshtn

This comment has been minimized.

Copy link
Collaborator Author

commented Dec 19, 2016

@4ast @brendangregg

Rebased and fixed the issues found above -- I think this is ready for merging. I will open a separate issue for automatically detecting the language, I don't know if we always want it. In any case I think it belongs in a separate PR.

Re the -l inconsistency in some tools: the reason the -l is required in ustat, ucalls, and uthreads is that it is an optional argument. If it had been positional and optional, it couldn't have been first in the list, and I think that would be kind of ugly, e.g.: ustat 4482 1 java reads a bit weird.

Let's get this merged (I also have a blog post I need to write about these :)) and then iterate?

@ceeaspb

This comment has been minimized.

Copy link
Contributor

commented Dec 19, 2016

great stuff
@goldshtn likely a separate ticket . I tested with a jvm where usdts are not built in (eg. Oracle). An error message is provided about not specifying a pid. I did specify a pid so this could be confusing. In this set of scripts it would be useful to indicate to the user that the usdts (uprobes) were not found in the process likely because the binary was not built with it enabled (as it's mentioned in the man pages etc)

@4ast 4ast merged commit b2fe72f into iovisor:master Dec 19, 2016

5 checks passed

default Build finished.
Details
jenkins/fc23
Details
jenkins/fc24
Details
jenkins/ubuntu1404
Details
jenkins/ubuntu1604
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.