Generalize stackcount to support uprobes and tracepoints #580

Closed
goldshtn opened this issue Jun 26, 2016 · 6 comments · Fixed by #730

Comments

@goldshtn
Collaborator

Currently, stackcount supports only kernel functions. Adding user functions is fairly low-hanging fruit: use uprobes instead of kprobes. It should also be quite easy to add tracepoint support (using the current hacky approach or waiting for the native support in 4.7) and USDT probes as well.

I think in a lot of cases, stackcount on one of these data sources can replace more specialised tools. For example, if I want to know where my threads are blocked on mutexes, I'd use the pthread_mutex_lock USDT probe (or a uprobe on that function). If I want to know where threads are issuing a lot of block I/Os, I'd use the block:block_rq_issue tracepoint. And so on.

There's the more-or-less uniform syntax currently used by argdist and trace that I propose to use here as well. Something like this:

# stackcount submit_bio
# stackcount -p 285 p:c:malloc
# stackcount -p 285 u:/opt/node/node:gc__start
# stackcount t:sched:sched_switch
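
To make the proposed syntax concrete, here is a minimal sketch of how such specifiers could be split into a probe kind, a target, and a name. It covers only the four forms shown above. `ProbeSpec` and `parse_probe` are hypothetical names for illustration, not part of bcc, and the mapping of `p`, `u`, and `t` to uprobe, USDT, and tracepoint is read off the examples rather than taken from any implementation.

```python
from collections import namedtuple

# Hypothetical container for a parsed probe; not part of bcc.
ProbeSpec = namedtuple("ProbeSpec", ["kind", "target", "name"])

# Assumed prefix meanings, inferred from the examples in this comment.
KINDS = {"p": "uprobe", "u": "usdt", "t": "tracepoint"}


def parse_probe(spec):
    """Split a probe specifier into (kind, target, name).

    Handled forms:
      func              kernel function (kprobe)
      p:lib:func        user function in a library (uprobe)
      u:path:probe      USDT probe in a binary or library
      t:category:event  kernel tracepoint
    """
    parts = spec.split(":")
    if len(parts) == 1 and parts[0]:
        # A bare name is treated as a kernel function.
        return ProbeSpec("kprobe", None, parts[0])
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected [p|u|t]:target:name, got %r" % spec)
    prefix, target, name = parts
    if prefix not in KINDS:
        raise ValueError("unknown probe type %r" % prefix)
    return ProbeSpec(KINDS[prefix], target, name)


for spec in ("submit_bio", "p:c:malloc",
             "u:/opt/node/node:gc__start", "t:sched:sched_switch"):
    print(spec, "->", parse_probe(spec))
```

A real tool would also need to resolve the library name (for example `c`) to a path and decide how `-p PID` filters each probe kind; the sketch leaves both out.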

What do you think?

/cc @brendangregg

@brendangregg
Member

Looks good to me.

At some point we'll have the BPF_PROG_TYPE_TRACEPOINT support in bcc, so tracepoints should get easier.

@goldshtn
Collaborator Author

Now that we really have tracepoint support, it's actually harder than it seems because we don't have the struct pt_regs * ctx in tracepoints -- which means we can't get the stack trace unless we use the "old" approach for attaching to tracepoints. I have a WIP implementation that supports uprobes (in addition to the original implementation for kprobes), I wonder if we should just leave it at that.

By the way, the same considerations apply to funccount and funclatency as well -- user-space support would be great to have there, too. I think a reasonable approach if we only want uprobes + kprobes is adding a [-l LIBRARY] switch, e.g.:

# stackcount -l c -p 1952 malloc

@brendangregg: What do you think?

@4ast
Member

4ast commented Jul 11, 2016

Wait a sec. get_stackid() should work as-is for tracepoints.
The kernel does its trick with pt_regs and supplies them correctly to bpf_get_stackid().
You just need to pass whatever the tracepoint 'ctx' is into bpf_get_stackid().
Is that suddenly broken, or is this just a hypothetical issue?

@brendangregg
Member

Check it out: stacks from tracepoints:

# ./urandomread-test.py 
TIME(s)            COMM             PID    GOTBITS
23459954.705288000 dd               20943  8192
23459954.705396000 dd               20943  8192
23459954.705497999 dd               20943  8192
23459954.705635000 dd               20943  8192
23459954.705745999 dd               20943  8192
^C  urandom_read
  __vfs_read
  vfs_read
  sys_read
  entry_SYSCALL_64_fastpath
    5

Program

#!/usr/bin/python
#
# urandomread-stacks  Example of instrumenting a kernel tracepoint.
#                     For Linux, uses BCC, BPF. Embedded C.
#
# REQUIRES: Linux 4.7+ (BPF_PROG_TYPE_TRACEPOINT support).
#
# Test by running this, then in another shell, run:
#     dd if=/dev/urandom of=/dev/null bs=1k count=5
#
# Copyright 2016 Netflix, Inc.
# Licensed under the Apache License, Version 2.0 (the "License")

from __future__ import print_function
from bcc import BPF
import signal

# load BPF program
b = BPF(text="""
BPF_HASH(counts, int);
BPF_STACK_TRACE(stack_traces, 1024);

TRACEPOINT_PROBE(random, urandom_read) {
    // args is from /sys/kernel/debug/tracing/events/random/urandom_read/format
    int key = stack_traces.get_stackid(args, BPF_F_REUSE_STACKID);
    u64 zero = 0;
    u64 *val = counts.lookup_or_init(&key, &zero);
    (*val)++;

    bpf_trace_printk("%d\\n", args->got_bits);
    return 0;
};
""")

# header
print("%-18s %-16s %-6s %s" % ("TIME(s)", "COMM", "PID", "GOTBITS"))

# print stacks on Ctrl-C
def print_stacks(signum, frame):
    counts = b["counts"]
    stack_traces = b["stack_traces"]
    # sort by count, ascending, so the most frequent stack prints last
    for k, v in sorted(counts.items(), key=lambda kv: kv[1].value):
        for addr in stack_traces.walk(k.value):
            print("  %s" % b.ksym(addr))
        print("    %d\n" % v.value)
    exit()
signal.signal(signal.SIGINT, print_stacks)

# format output
while 1:
    try:
        (task, pid, cpu, flags, ts, msg) = b.trace_fields()
    except ValueError:
        continue
    print("%-18.9f %-16s %-6d %s" % (ts, task, pid, msg))

So I just used args as ctx, and it worked.

@goldshtn not sure about library switches (-l), since the precedent is to use prefixes on the probe name directly.

@goldshtn
Collaborator Author

@4ast: I just assumed it wouldn't work. Very happy to learn that it does. Sorry :)

@brendangregg: OK, sounds reasonable. There are a couple of prerequisites to make it happen: namely, we need to support attaching uprobes, tracepoints, and USDT probes by regex.
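
As a rough illustration of what attaching by regex would involve, the sketch below selects probe names from a candidate list using a regular expression. `match_probes` and the sample list are hypothetical; how the candidates would actually be enumerated (tracepoints, library symbols, USDT notes) is outside what this thread specifies.

```python
import re


def match_probes(pattern, candidates):
    """Return the candidate names that the regex matches in full."""
    regex = re.compile(pattern)
    return [name for name in candidates if regex.fullmatch(name)]


# Sample names only; a real tool would enumerate them from the system.
tracepoints = [
    "sched:sched_switch",
    "sched:sched_wakeup",
    "block:block_rq_issue",
]

print(match_probes(r"sched:sched_.*", tracepoints))
# prints ['sched:sched_switch', 'sched:sched_wakeup']
```

Each matched name would then get its own attach call, with all of them feeding the same counts map.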

@wenbinzeng

dd if=/dev/urandom of=/dev/null bs=1k count=5

@brendangregg @4ast
Does this feature depend on the kernel version? This program worked well on my 5.6.6-300.fc32.x86_64 kernel, but did not work on the CentOS 8 kernel 4.18.0-147.8.1.el8_1.x86_64: get_stackid returned -14, and the output looks like this:

./urand.py

TIME(s) COMM PID GOTBITS
3565.830200000 b'dd' 12745 b'8192'
3565.830220000 b'dd' 12745 b'stackid: -14'
3565.830242000 b'dd' 12745 b'8192'
3565.830243000 b'dd' 12745 b'stackid: -14'
3565.830262000 b'dd' 12745 b'8192'
3565.830271000 b'dd' 12745 b'stackid: -14'
3565.830291000 b'dd' 12745 b'8192'
3565.830293000 b'dd' 12745 b'stackid: -14'
3565.830313000 b'dd' 12745 b'8192'
3565.830315000 b'dd' 12745 b'stackid: -14'
^CTraceback (most recent call last):
  File "./urand.py", line 54, in <module>
    (task, pid, cpu, flags, ts, msg) = b.trace_fields()
  File "/usr/lib/python3.6/site-packages/bcc/__init__.py", line 1221, in trace_fields
    line = self.trace_readline(nonblocking)
  File "/usr/lib/python3.6/site-packages/bcc/__init__.py", line 1253, in trace_readline
    line = trace.readline(1024).rstrip()
  File "./urand.py", line 45, in print_stacks
    for addr in stack_traces.walk(k.value):
  File "/usr/lib/python3.6/site-packages/bcc/table.py", line 896, in walk
    return StackTrace.StackWalker(self[self.Key(stack_id)], self.flags, resolve)
  File "/usr/lib/python3.6/site-packages/bcc/table.py", line 250, in __getitem__
    raise KeyError
KeyError
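
For readers hitting the same output: a negative return from get_stackid is a negated errno, and 14 is EFAULT in the Linux errno numbering. The KeyError follows from walking a stack id that was never stored. The sketch below decodes such a value and filters failed ids out before walking. It uses plain Python integers and a dict in place of the bcc table objects, and `describe_stackid` and `valid_counts` are hypothetical helper names. It does not explain why the 4.18 kernel returned the error; the thread leaves that open.

```python
import errno
import os


def describe_stackid(stackid):
    """Return None for a valid stack id, or a readable error for a negative one."""
    if stackid >= 0:
        return None
    code = -stackid
    name = errno.errorcode.get(code, "unknown")
    return "%s (%s)" % (name, os.strerror(code))


def valid_counts(counts):
    """Drop entries whose key is a negative (failed) stack id.

    Assumes keys are plain ints; with bcc tables the comparison
    would use k.value instead.
    """
    return {k: v for k, v in counts.items() if k >= 0}


print(describe_stackid(-14))

# Simulated counts map: stack id 7 seen 5 times, failed lookups under -14.
counts = {7: 5, -14: 5}
print(valid_counts(counts))
```

Skipping negative keys in the print loop, or not storing them in the BPF program in the first place, avoids the KeyError while keeping the valid stacks.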
