Generalize stackcount to support uprobes and tracepoints #580

goldshtn · 2016-06-26T13:03:42Z

Currently, stackcount supports only kernel functions. It is fairly low-hanging fruit to add user functions by using uprobes instead of kprobes. It should also be quite easy to add tracepoint support (using the current hacky approach or waiting for the 4.7 native support) and USDT probes as well.

I think in a lot of cases, stackcount on one of these data sources can replace more specialised tools. For example, if I want to know where my threads are blocked for mutexes, I'd use the pthread_mutex_lock USDT probe (or a uprobe on that function). If I want to know where threads are issuing a lot of block I/Os, I'd use the block:block_rq_issue tracepoint. And so on.

There's the more-or-less uniform syntax currently used by argdist and trace that I propose to use here as well. Something like this:

# stackcount submit_bio
# stackcount -p 285 p:c:malloc
# stackcount -p 285 u:/opt/node/node:gc__start
# stackcount t:sched:sched_switch

What do you think?

/cc @brendangregg
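
A minimal sketch of how the four specifier forms in the examples above could be split into a probe kind, a target, and a name. This is not stackcount's actual parser: the `Probe` tuple, the `parse_probe` helper, and the rule that `p` with an empty library means a kernel function are assumptions made for illustration, loosely following the argdist/trace convention the proposal refers to.

```python
from collections import namedtuple

# Hypothetical container for a parsed specifier (names are illustrative only).
Probe = namedtuple("Probe", ["kind", "target", "name"])


def parse_probe(spec):
    """Split a specifier such as 'p:c:malloc' into (kind, target, name)."""
    parts = spec.split(":")
    if len(parts) == 1:
        # Bare name, e.g. 'submit_bio': treated as a kernel function.
        return Probe("kprobe", None, parts[0])
    if len(parts) != 3:
        raise ValueError("expected NAME or PREFIX:TARGET:NAME, got %r" % spec)

    prefix, target, name = parts
    if prefix == "p":
        # Assumption: an empty library ('p::func') means a kernel function.
        return Probe("uprobe" if target else "kprobe", target or None, name)
    if prefix == "u":
        # target is the binary or library that contains the USDT probe.
        return Probe("usdt", target, name)
    if prefix == "t":
        # target is the tracepoint category, e.g. 'sched'.
        return Probe("tracepoint", target, name)
    raise ValueError("unknown probe prefix %r in %r" % (prefix, spec))


if __name__ == "__main__":
    for spec in ("submit_bio", "p:c:malloc",
                 "u:/opt/node/node:gc__start", "t:sched:sched_switch"):
        print(spec, "->", parse_probe(spec))
```

Keeping the kind in the prefix means one positional argument can describe any of the four data sources, which is the uniformity the proposal is after.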


brendangregg · 2016-06-27T17:52:35Z

Looks good to me.

At some point we'll have the BPF_PROG_TYPE_TRACEPOINT support in bcc, so tracepoints should get easier.

goldshtn · 2016-07-10T08:18:44Z

Now that we really have tracepoint support, it's actually harder than it seems because we don't have the struct pt_regs *ctx in tracepoints -- which means we can't get the stack trace unless we use the "old" approach for attaching to tracepoints. I have a WIP implementation that supports uprobes (in addition to the original implementation for kprobes); I wonder if we should just leave it at that.

By the way, the same considerations apply to funccount and funclatency as well -- user-space support would be great to have there, too. I think a reasonable approach if we only want uprobes + kprobes is adding a [-l LIBRARY] switch, e.g.:

# stackcount -l c -p 1952 malloc

@brendangregg: What do you think?

4ast · 2016-07-11T22:15:32Z

wait a sec. get_stackid() should be working as-is for tracepoints.
the kernel does its trick with pt_regs and supplies them correctly into bpf_get_stackid().
Just need to pass whatever tracepoint 'ctx' into bpf_get_stackid().
Is that suddenly broken or just hypothetical issue?

brendangregg · 2016-07-12T01:20:58Z

Check it out: stacks from tracepoints:

# ./urandomread-test.py 
TIME(s)            COMM             PID    GOTBITS
23459954.705288000 dd               20943  8192
23459954.705396000 dd               20943  8192
23459954.705497999 dd               20943  8192
23459954.705635000 dd               20943  8192
23459954.705745999 dd               20943  8192
^C  urandom_read
  __vfs_read
  vfs_read
  sys_read
  entry_SYSCALL_64_fastpath
    5

Program

#!/usr/bin/python
#
# urandomread-stacks  Example of instrumenting a kernel tracepoint.
#                     For Linux, uses BCC, BPF. Embedded C.
#
# REQUIRES: Linux 4.7+ (BPF_PROG_TYPE_TRACEPOINT support).
#
# Test by running this, then in another shell, run:
#     dd if=/dev/urandom of=/dev/null bs=1k count=5
#
# Copyright 2016 Netflix, Inc.
# Licensed under the Apache License, Version 2.0 (the "License")

from __future__ import print_function
from bcc import BPF
import signal

# load BPF program
b = BPF(text="""
BPF_HASH(counts, int);
BPF_STACK_TRACE(stack_traces, 1024);

TRACEPOINT_PROBE(random, urandom_read) {
    // args is from /sys/kernel/debug/tracing/events/random/urandom_read/format
    int key = stack_traces.get_stackid(args, BPF_F_REUSE_STACKID);
    u64 zero = 0;
    u64 *val = counts.lookup_or_init(&key, &zero);
    (*val)++;

    bpf_trace_printk("%d\\n", args->got_bits);
    return 0;
};
""")

# header
print("%-18s %-16s %-6s %s" % ("TIME(s)", "COMM", "PID", "GOTBITS"))

# print stacks on Ctrl-C
def print_stacks(signum, frame):
    counts = b["counts"]
    stack_traces = b["stack_traces"]
    # least frequent stacks first
    for k, v in sorted(counts.items(), key=lambda kv: kv[1].value):
        for addr in stack_traces.walk(k.value):
            print("  %s" % b.ksym(addr))
        print("    %d\n" % v.value)
    exit()
signal.signal(signal.SIGINT, print_stacks)

# format output
while 1:
    try:
        (task, pid, cpu, flags, ts, msg) = b.trace_fields()
    except ValueError:
        continue
    print("%-18.9f %-16s %-6d %s" % (ts, task, pid, msg))

So I just used args as ctx, and it worked.

@goldshtn not sure about library switches (-l), since the precedent is to use prefixes on the probe name directly.

goldshtn · 2016-07-12T06:02:21Z

@4ast: I just assumed it wouldn't work. Very happy to learn that it does. Sorry :)

@brendangregg: OK, sounds reasonable. There are a couple of prerequisites to make it happen, namely the need to support uprobe, tracepoint, and USDT attach with regex.

wenbinzeng · 2020-05-20T03:12:06Z

dd if=/dev/urandom of=/dev/null bs=1k count=5

@brendangregg @4ast
I wonder if this feature depends on certain kernel versions? This program worked well on my 5.6.6-300.fc32.x86_64 kernel, but didn't work on CentOS 8 kernel version 4.18.0-147.8.1.el8_1.x86_64, get_stackid returned -14, output looks like:

./urand.py

TIME(s) COMM PID GOTBITS
3565.830200000 b'dd' 12745 b'8192'
3565.830220000 b'dd' 12745 b'stackid: -14'
3565.830242000 b'dd' 12745 b'8192'
3565.830243000 b'dd' 12745 b'stackid: -14'
3565.830262000 b'dd' 12745 b'8192'
3565.830271000 b'dd' 12745 b'stackid: -14'
3565.830291000 b'dd' 12745 b'8192'
3565.830293000 b'dd' 12745 b'stackid: -14'
3565.830313000 b'dd' 12745 b'8192'
3565.830315000 b'dd' 12745 b'stackid: -14'
^CTraceback (most recent call last):
  File "./urand.py", line 54, in <module>
    (task, pid, cpu, flags, ts, msg) = b.trace_fields()
  File "/usr/lib/python3.6/site-packages/bcc/__init__.py", line 1221, in trace_fields
    line = self.trace_readline(nonblocking)
  File "/usr/lib/python3.6/site-packages/bcc/__init__.py", line 1253, in trace_readline
    line = trace.readline(1024).rstrip()
  File "./urand.py", line 45, in print_stacks
    for addr in stack_traces.walk(k.value):
  File "/usr/lib/python3.6/site-packages/bcc/table.py", line 896, in walk
    return StackTrace.StackWalker(self[self.Key(stack_id)], self.flags, resolve)
  File "/usr/lib/python3.6/site-packages/bcc/table.py", line 250, in __getitem__
    raise KeyError
KeyError
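
The KeyError above appears consistent with print_stacks walking a key of -14: the program stores whatever get_stackid returns, so a negative error code ends up in counts and is later passed to stack_traces.walk(). On Linux, 14 is EFAULT. The following is a minimal sketch of separating walkable stack ids from failed ones before printing. The helper names describe_stackid and split_counts are hypothetical and not part of bcc, and the sketch does not explain why this particular kernel returns the error.

```python
import errno


def describe_stackid(stack_id):
    """Return None for a usable stack id, or an errno name for a failure."""
    if stack_id >= 0:
        return None
    # get_stackid reports failures as negative errno values, e.g. -14.
    return errno.errorcode.get(-stack_id, "errno %d" % -stack_id)


def split_counts(items):
    """Split (stack_id, count) pairs into walkable stacks and failures.

    Returns the walkable pairs sorted by ascending count, plus a dict
    mapping each errno name to the number of events that failed with it.
    """
    valid, failed = [], {}
    for stack_id, count in items:
        reason = describe_stackid(stack_id)
        if reason is None:
            valid.append((stack_id, count))
        else:
            failed[reason] = failed.get(reason, 0) + count
    return sorted(valid, key=lambda kv: kv[1]), failed


if __name__ == "__main__":
    # Plain integers stand in for the k.value / v.value pairs from counts.
    stacks, failures = split_counts([(7, 2), (-14, 5), (3, 1)])
    print(stacks)
    print(failures)
```

In the program above, this would mean building the list from (k.value, v.value) pairs and calling stack_traces.walk() only on the ids in the first return value, while reporting the failure counts separately.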

goldshtn mentioned this issue Jul 14, 2016

USDT probes #327

Closed

goldshtn mentioned this issue Oct 5, 2016

stackcount: Support uprobes, tracepoints, and USDT #730

Merged

4ast closed this as completed in #730 Oct 5, 2016
