Permalink
Cannot retrieve contributors at this time
963 lines (836 sloc)
21.5 KB
Name already in use
A tag already exists with the provided branch name. Many Git commands accept both tag and branch names, so creating this branch may cause unexpected behavior. Are you sure you want to create this branch?
bcc/tools/stackcount_example.txt
Go to fileThis commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Demonstrations of stackcount, the Linux eBPF/bcc version. | |
| This program traces functions and frequency counts them with their entire | |
| stack trace, summarized in-kernel for efficiency. For example, counting | |
| stack traces that led to the submit_bio() kernel function, which creates | |
| block device I/O: | |
| # ./stackcount submit_bio | |
| Tracing 1 functions for "submit_bio"... Hit Ctrl-C to end. | |
| ^C | |
| submit_bio | |
| submit_bh | |
| journal_submit_commit_record.isra.13 | |
| jbd2_journal_commit_transaction | |
| kjournald2 | |
| kthread | |
| ret_from_fork | |
| mb_cache_list | |
| 1 | |
| submit_bio | |
| __block_write_full_page.constprop.39 | |
| block_write_full_page | |
| blkdev_writepage | |
| __writepage | |
| write_cache_pages | |
| generic_writepages | |
| do_writepages | |
| __writeback_single_inode | |
| writeback_sb_inodes | |
| __writeback_inodes_wb | |
| 2 | |
| submit_bio | |
| __block_write_full_page.constprop.39 | |
| block_write_full_page | |
| blkdev_writepage | |
| __writepage | |
| write_cache_pages | |
| generic_writepages | |
| do_writepages | |
| __filemap_fdatawrite_range | |
| filemap_fdatawrite | |
| fdatawrite_one_bdev | |
| 36 | |
| submit_bio | |
| submit_bh | |
| jbd2_journal_commit_transaction | |
| kjournald2 | |
| kthread | |
| ret_from_fork | |
| mb_cache_list | |
| 38 | |
| submit_bio | |
| ext4_writepages | |
| do_writepages | |
| __filemap_fdatawrite_range | |
| filemap_flush | |
| ext4_alloc_da_blocks | |
| ext4_rename | |
| ext4_rename2 | |
| vfs_rename | |
| sys_rename | |
| entry_SYSCALL_64_fastpath | |
| 79 | |
| Detaching... | |
| The output shows unique stack traces, in order from leaf (on-CPU) to root, | |
| followed by their occurrence count. The last stack trace in the above output | |
| shows syscall handling, ext4_rename(), and filemap_flush(): looks like an | |
| application issued file rename has caused back end disk I/O due to ext4 | |
| block allocation and a filemap_flush(). | |
| Now adding the -P option to display stacks separately for each process: | |
| # ./stackcount -P submit_bio | |
| Tracing 1 functions for "submit_bio"... Hit Ctrl-C to end. | |
| ^C | |
| submit_bio | |
| ext4_writepages | |
| do_writepages | |
| __filemap_fdatawrite_range | |
| filemap_flush | |
| ext4_alloc_da_blocks | |
| ext4_release_file | |
| __fput | |
| ____fput | |
| task_work_run | |
| exit_to_usermode_loop | |
| syscall_return_slowpath | |
| entry_SYSCALL_64_fastpath | |
| [unknown] | |
| [unknown] | |
| tar [15069] | |
| 5 | |
| submit_bio | |
| ext4_bio_write_page | |
| mpage_submit_page | |
| mpage_map_and_submit_buffers | |
| ext4_writepages | |
| do_writepages | |
| __filemap_fdatawrite_range | |
| filemap_flush | |
| ext4_alloc_da_blocks | |
| ext4_release_file | |
| __fput | |
| ____fput | |
| task_work_run | |
| exit_to_usermode_loop | |
| syscall_return_slowpath | |
| entry_SYSCALL_64_fastpath | |
| [unknown] | |
| [unknown] | |
| tar [15069] | |
| 15 | |
| submit_bio | |
| ext4_readpages | |
| __do_page_cache_readahead | |
| ondemand_readahead | |
| page_cache_async_readahead | |
| generic_file_read_iter | |
| __vfs_read | |
| vfs_read | |
| sys_read | |
| entry_SYSCALL_64_fastpath | |
| [unknown] | |
| tar [15069] | |
| 113 | |
| Detaching... | |
| The last stack trace in the above output shows syscall handling, sys_read(), | |
| vfs_read(), and then "readahead" functions: looks like an application issued | |
| file read has triggered read ahead. With "-P", the application can be seen | |
| after the stack trace, in this case, "tar [15069]" for the "tar" command, | |
| PID 15069. | |
| The order of printed stack traces is from least to most frequent. The most | |
| frequent in this case, the ext4_readpages() stack, was taken 113 times during | |
| tracing. | |
| The "[unknown]" frames are from user-level, since this simple workload is | |
| the tar command, which apparently has been compiled without frame pointers. | |
| It's a common compiler optimization, but it breaks frame pointer-based stack | |
| walkers. Similar broken stacks will be seen by other profilers and debuggers | |
| that use frame pointers. Hopefully your application preserves them so that | |
| the user-level stack trace is visible. So how does one get frame pointers, if | |
| your application doesn't have them to start with? For the current bcc (until | |
| it supports other stack walkers), you need to be running an application binaries | |
| that preserves frame pointers, eg, using gcc's -fno-omit-frame-pointer. That's | |
| about all I'll say here: this is a big topic that is not bcc/BPF specific. | |
| It can be useful to trace the path to submit_bio to explain unusual rates of | |
| disk IOPS. These could have in-kernel origins (eg, background scrub). | |
| Now adding the -d option to delimit kernel and user stacks: | |
| # ./stackcount -P -d submit_bio | |
| Tracing 1 functions for "submit_bio"... Hit Ctrl-C to end. | |
| ^C | |
| submit_bio | |
| submit_bh | |
| journal_submit_commit_record | |
| jbd2_journal_commit_transaction | |
| kjournald2 | |
| kthread | |
| ret_from_fork | |
| -- | |
| jbd2/xvda1-8 [405] | |
| 1 | |
| submit_bio | |
| submit_bh | |
| jbd2_journal_commit_transaction | |
| kjournald2 | |
| kthread | |
| ret_from_fork | |
| -- | |
| jbd2/xvda1-8 [405] | |
| 2 | |
| submit_bio | |
| ext4_writepages | |
| do_writepages | |
| __filemap_fdatawrite_range | |
| filemap_flush | |
| ext4_alloc_da_blocks | |
| ext4_release_file | |
| __fput | |
| ____fput | |
| task_work_run | |
| exit_to_usermode_loop | |
| syscall_return_slowpath | |
| entry_SYSCALL_64_fastpath | |
| -- | |
| [unknown] | |
| [unknown] | |
| tar [15187] | |
| 5 | |
| submit_bio | |
| ext4_bio_write_page | |
| mpage_submit_page | |
| mpage_map_and_submit_buffers | |
| ext4_writepages | |
| do_writepages | |
| __filemap_fdatawrite_range | |
| filemap_flush | |
| ext4_alloc_da_blocks | |
| ext4_release_file | |
| __fput | |
| ____fput | |
| task_work_run | |
| exit_to_usermode_loop | |
| syscall_return_slowpath | |
| entry_SYSCALL_64_fastpath | |
| -- | |
| [unknown] | |
| [unknown] | |
| tar [15187] | |
| 15 | |
| submit_bio | |
| ext4_readpages | |
| __do_page_cache_readahead | |
| ondemand_readahead | |
| page_cache_async_readahead | |
| generic_file_read_iter | |
| __vfs_read | |
| vfs_read | |
| sys_read | |
| entry_SYSCALL_64_fastpath | |
| -- | |
| [unknown] | |
| [unknown] | |
| [unknown] | |
| tar [15187] | |
| 171 | |
| Detaching... | |
| A "--" is printed between the kernel and user stacks. | |
| As a different example, here is the kernel function hrtimer_init_sleeper(): | |
| # ./stackcount.py -P -d hrtimer_init_sleeper | |
| Tracing 1 functions for "hrtimer_init_sleeper"... Hit Ctrl-C to end. | |
| ^C | |
| hrtimer_init_sleeper | |
| do_futex | |
| SyS_futex | |
| entry_SYSCALL_64_fastpath | |
| -- | |
| [unknown] | |
| containerd [16020] | |
| 1 | |
| hrtimer_init_sleeper | |
| do_futex | |
| SyS_futex | |
| entry_SYSCALL_64_fastpath | |
| -- | |
| __pthread_cond_timedwait | |
| Monitor::IWait(Thread*, long) | |
| Monitor::wait(bool, long, bool) | |
| CompileQueue::get() | |
| CompileBroker::compiler_thread_loop() | |
| JavaThread::thread_main_inner() | |
| JavaThread::run() | |
| java_start(Thread*) | |
| start_thread | |
| java [4996] | |
| 1 | |
| hrtimer_init_sleeper | |
| do_futex | |
| SyS_futex | |
| entry_SYSCALL_64_fastpath | |
| -- | |
| [unknown] | |
| [unknown] | |
| containerd [16020] | |
| 1 | |
| hrtimer_init_sleeper | |
| do_futex | |
| SyS_futex | |
| entry_SYSCALL_64_fastpath | |
| -- | |
| __pthread_cond_timedwait | |
| VMThread::loop() | |
| VMThread::run() | |
| java_start(Thread*) | |
| start_thread | |
| java [4996] | |
| 3 | |
| hrtimer_init_sleeper | |
| do_futex | |
| SyS_futex | |
| entry_SYSCALL_64_fastpath | |
| -- | |
| [unknown] | |
| dockerd [16008] | |
| 4 | |
| hrtimer_init_sleeper | |
| do_futex | |
| SyS_futex | |
| entry_SYSCALL_64_fastpath | |
| -- | |
| [unknown] | |
| [unknown] | |
| dockerd [16008] | |
| 4 | |
| hrtimer_init_sleeper | |
| do_futex | |
| SyS_futex | |
| entry_SYSCALL_64_fastpath | |
| -- | |
| __pthread_cond_timedwait | |
| Lio/netty/util/ThreadDeathWatcher$Watcher;::run | |
| Interpreter | |
| Interpreter | |
| call_stub | |
| JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*) | |
| JavaCalls::call_virtual(JavaValue*, KlassHandle, Symbol*, Symbol*, JavaCallArguments*, Thread*) | |
| JavaCalls::call_virtual(JavaValue*, Handle, KlassHandle, Symbol*, Symbol*, Thread*) | |
| thread_entry(JavaThread*, Thread*) | |
| JavaThread::thread_main_inner() | |
| JavaThread::run() | |
| java_start(Thread*) | |
| start_thread | |
| java [4996] | |
| 4 | |
| hrtimer_init_sleeper | |
| do_futex | |
| SyS_futex | |
| entry_SYSCALL_64_fastpath | |
| -- | |
| __pthread_cond_timedwait | |
| clock_gettime | |
| [unknown] | |
| java [4996] | |
| 79 | |
| Detaching... | |
| I was just trying to find a more interesting example. This output includes | |
| some Java stacks where user-level has been walked correctly (even includes a | |
| JIT symbol translation). dockerd and containerd don't have frame pointers | |
| (grumble), but Java does (which is running with -XX:+PreserveFramePointer). | |
| Here's another kernel function, ip_output(): | |
| # ./stackcount.py -P -d ip_output | |
| Tracing 1 functions for "ip_output"... Hit Ctrl-C to end. | |
| ^C | |
| ip_output | |
| ip_queue_xmit | |
| tcp_transmit_skb | |
| tcp_write_xmit | |
| __tcp_push_pending_frames | |
| tcp_push | |
| tcp_sendmsg | |
| inet_sendmsg | |
| sock_sendmsg | |
| sock_write_iter | |
| __vfs_write | |
| vfs_write | |
| SyS_write | |
| entry_SYSCALL_64_fastpath | |
| -- | |
| __write_nocancel | |
| [unknown] | |
| sshd [15015] | |
| 5 | |
| ip_output | |
| ip_queue_xmit | |
| tcp_transmit_skb | |
| tcp_write_xmit | |
| __tcp_push_pending_frames | |
| tcp_push | |
| tcp_sendmsg | |
| inet_sendmsg | |
| sock_sendmsg | |
| sock_write_iter | |
| __vfs_write | |
| vfs_write | |
| SyS_write | |
| entry_SYSCALL_64_fastpath | |
| -- | |
| __write_nocancel | |
| [unknown] | |
| [unknown] | |
| sshd [8234] | |
| 5 | |
| ip_output | |
| ip_queue_xmit | |
| tcp_transmit_skb | |
| tcp_write_xmit | |
| __tcp_push_pending_frames | |
| tcp_push | |
| tcp_sendmsg | |
| inet_sendmsg | |
| sock_sendmsg | |
| sock_write_iter | |
| __vfs_write | |
| vfs_write | |
| SyS_write | |
| entry_SYSCALL_64_fastpath | |
| -- | |
| __write_nocancel | |
| sshd [15015] | |
| 7 | |
| Detaching... | |
| This time just sshd is triggering ip_output() calls. | |
| Watch what happens if I filter on kernel stacks only (-K) for ip_output(): | |
| # ./stackcount.py -K ip_output | |
| Tracing 1 functions for "ip_output"... Hit Ctrl-C to end. | |
| ^C | |
| ip_output | |
| ip_queue_xmit | |
| tcp_transmit_skb | |
| tcp_write_xmit | |
| __tcp_push_pending_frames | |
| tcp_push | |
| tcp_sendmsg | |
| inet_sendmsg | |
| sock_sendmsg | |
| sock_write_iter | |
| __vfs_write | |
| vfs_write | |
| SyS_write | |
| entry_SYSCALL_64_fastpath | |
| 13 | |
| Detaching... | |
| They have grouped together as a single unique stack, since the kernel part | |
| was the same. | |
| Here is just the user stacks, fetched during the kernel function ip_output(): | |
| # ./stackcount.py -P -U ip_output | |
| Tracing 1 functions for "ip_output"... Hit Ctrl-C to end. | |
| ^C | |
| [unknown] | |
| snmpd [1645] | |
| 1 | |
| __write_nocancel | |
| [unknown] | |
| [unknown] | |
| sshd [8234] | |
| 3 | |
| __write_nocancel | |
| sshd [15015] | |
| 4 | |
| I should really run a custom sshd with frame pointers so we can see its | |
| stack trace... | |
| User-space functions can also be traced if a library name is provided. For | |
| example, to quickly identify code locations that allocate heap memory for | |
| PID 4902 (using -p), by tracing malloc from libc ("c:malloc"): | |
| # ./stackcount -p 4902 c:malloc | |
| Tracing 1 functions for "malloc"... Hit Ctrl-C to end. | |
| ^C | |
| malloc | |
| rbtree_new | |
| main | |
| [unknown] | |
| 12 | |
| malloc | |
| _rbtree_node_new_internal | |
| _rbtree_node_insert | |
| rbtree_insert | |
| main | |
| [unknown] | |
| 1189 | |
| Detaching... | |
| Kernel stacks are absent as this didn't enter kernel code. | |
| Note that user-space uses of stackcount can be somewhat more limited because | |
| a lot of user-space libraries and binaries are compiled without frame-pointers | |
| as discussed earlier (the -fomit-frame-pointer compiler default) or are used | |
| without debuginfo. | |
| In addition to kernel and user-space functions, kernel tracepoints and USDT | |
| tracepoints are also supported. | |
| For example, to determine where threads are being created in a particular | |
| process, use the pthread_create USDT tracepoint: | |
| # ./stackcount -P -p $(pidof parprimes) u:pthread:pthread_create | |
| Tracing 1 functions for "u:pthread:pthread_create"... Hit Ctrl-C to end. | |
| ^C | |
| parprimes [11923] | |
| pthread_create@@GLIBC_2.2.5 | |
| main | |
| __libc_start_main | |
| [unknown] | |
| 7 | |
| You can use "readelf -n file" to see if it has USDT tracepoints. | |
| Similarly, to determine where context switching is happening in the kernel, | |
| use the sched:sched_switch kernel tracepoint: | |
| # ./stackcount -P t:sched:sched_switch | |
| __schedule | |
| schedule | |
| worker_thread | |
| kthread | |
| ret_from_fork | |
| kworker/0:2 [25482] | |
| 1 | |
| __schedule | |
| schedule | |
| schedule_hrtimeout_range_clock | |
| schedule_hrtimeout_range | |
| ep_poll | |
| SyS_epoll_wait | |
| entry_SYSCALL_64_fastpath | |
| epoll_wait | |
| Lsun/nio/ch/SelectorImpl;::lockAndDoSelect | |
| Lsun/nio/ch/SelectorImpl;::select | |
| Lio/netty/channel/nio/NioEventLoop;::select | |
| Lio/netty/channel/nio/NioEventLoop;::run | |
| Interpreter | |
| Interpreter | |
| call_stub | |
| JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*) | |
| JavaCalls::call_virtual(JavaValue*, KlassHandle, Symbol*, Symbol*, JavaCallArguments*, Thread*) | |
| JavaCalls::call_virtual(JavaValue*, Handle, KlassHandle, Symbol*, Symbol*, Thread*) | |
| thread_entry(JavaThread*, Thread*) | |
| JavaThread::thread_main_inner() | |
| JavaThread::run() | |
| java_start(Thread*) | |
| start_thread | |
| java [4996] | |
| 1 | |
| ... (omitted for brevity) | |
| __schedule | |
| schedule | |
| schedule_preempt_disabled | |
| cpu_startup_entry | |
| xen_play_dead | |
| arch_cpu_idle_dead | |
| cpu_startup_entry | |
| cpu_bringup_and_idle | |
| swapper/1 [0] | |
| 289 | |
| A -i option can be used to set an output interval, and -T to include a | |
| timestamp. For example: | |
| # ./stackcount.py -P -Tdi 1 submit_bio | |
| Tracing 1 functions for "submit_bio"... Hit Ctrl-C to end. | |
| 06:05:13 | |
| 06:05:14 | |
| submit_bio | |
| xfs_do_writepage | |
| write_cache_pages | |
| xfs_vm_writepages | |
| do_writepages | |
| __writeback_single_inode | |
| writeback_sb_inodes | |
| __writeback_inodes_wb | |
| wb_writeback | |
| wb_workfn | |
| process_one_work | |
| worker_thread | |
| kthread | |
| ret_from_fork | |
| -- | |
| kworker/u16:1 [15686] | |
| 1 | |
| submit_bio | |
| process_one_work | |
| worker_thread | |
| kthread | |
| ret_from_fork | |
| -- | |
| kworker/u16:0 [16007] | |
| 1 | |
| submit_bio | |
| xfs_buf_submit | |
| xlog_bdstrat | |
| xlog_sync | |
| xlog_state_release_iclog | |
| _xfs_log_force | |
| xfs_log_force | |
| xfs_fs_sync_fs | |
| sync_fs_one_sb | |
| iterate_supers | |
| sys_sync | |
| entry_SYSCALL_64_fastpath | |
| -- | |
| [unknown] | |
| sync [16039] | |
| 1 | |
| submit_bio | |
| submit_bh | |
| journal_submit_commit_record | |
| jbd2_journal_commit_transaction | |
| kjournald2 | |
| kthread | |
| ret_from_fork | |
| -- | |
| jbd2/xvda1-8 [405] | |
| 1 | |
| submit_bio | |
| process_one_work | |
| worker_thread | |
| kthread | |
| ret_from_fork | |
| -- | |
| kworker/0:2 [25482] | |
| 2 | |
| submit_bio | |
| ext4_writepages | |
| do_writepages | |
| __writeback_single_inode | |
| writeback_sb_inodes | |
| __writeback_inodes_wb | |
| wb_writeback | |
| wb_workfn | |
| process_one_work | |
| worker_thread | |
| kthread | |
| ret_from_fork | |
| -- | |
| kworker/u16:0 [16007] | |
| 4 | |
| submit_bio | |
| xfs_vm_writepages | |
| do_writepages | |
| __writeback_single_inode | |
| writeback_sb_inodes | |
| __writeback_inodes_wb | |
| wb_writeback | |
| wb_workfn | |
| process_one_work | |
| worker_thread | |
| kthread | |
| ret_from_fork | |
| -- | |
| kworker/u16:1 [15686] | |
| 5 | |
| submit_bio | |
| __block_write_full_page | |
| block_write_full_page | |
| blkdev_writepage | |
| __writepage | |
| write_cache_pages | |
| generic_writepages | |
| blkdev_writepages | |
| do_writepages | |
| __filemap_fdatawrite_range | |
| filemap_fdatawrite | |
| fdatawrite_one_bdev | |
| iterate_bdevs | |
| sys_sync | |
| entry_SYSCALL_64_fastpath | |
| -- | |
| [unknown] | |
| sync [16039] | |
| 7 | |
| submit_bio | |
| submit_bh | |
| jbd2_journal_commit_transaction | |
| kjournald2 | |
| kthread | |
| ret_from_fork | |
| -- | |
| jbd2/xvda1-8 [405] | |
| 8 | |
| submit_bio | |
| ext4_bio_write_page | |
| mpage_submit_page | |
| mpage_map_and_submit_buffers | |
| ext4_writepages | |
| do_writepages | |
| __writeback_single_inode | |
| writeback_sb_inodes | |
| __writeback_inodes_wb | |
| wb_writeback | |
| wb_workfn | |
| process_one_work | |
| worker_thread | |
| kthread | |
| ret_from_fork | |
| -- | |
| kworker/u16:0 [16007] | |
| 8 | |
| submit_bio | |
| __block_write_full_page | |
| block_write_full_page | |
| blkdev_writepage | |
| __writepage | |
| write_cache_pages | |
| generic_writepages | |
| blkdev_writepages | |
| do_writepages | |
| __writeback_single_inode | |
| writeback_sb_inodes | |
| __writeback_inodes_wb | |
| wb_writeback | |
| wb_workfn | |
| process_one_work | |
| worker_thread | |
| kthread | |
| ret_from_fork | |
| -- | |
| kworker/u16:0 [16007] | |
| 60 | |
| 06:05:15 | |
| 06:05:16 | |
| Detaching... | |
| This only included output for the 06:05:14 interval. The other internals | |
| did not span block device I/O. | |
| The -s output prints the return instruction offset for each function (aka | |
| symbol offset). Eg: | |
| # ./stackcount.py -P -s tcp_sendmsg | |
| Tracing 1 functions for "tcp_sendmsg"... Hit Ctrl-C to end. | |
| ^C | |
| tcp_sendmsg+0x1 | |
| sock_sendmsg+0x38 | |
| sock_write_iter+0x85 | |
| __vfs_write+0xe3 | |
| vfs_write+0xb8 | |
| SyS_write+0x55 | |
| entry_SYSCALL_64_fastpath+0x1e | |
| __write_nocancel+0x7 | |
| sshd [15015] | |
| 3 | |
| tcp_sendmsg+0x1 | |
| sock_sendmsg+0x38 | |
| sock_write_iter+0x85 | |
| __vfs_write+0xe3 | |
| vfs_write+0xb8 | |
| SyS_write+0x55 | |
| entry_SYSCALL_64_fastpath+0x1e | |
| __write_nocancel+0x7 | |
| sshd [8234] | |
| 3 | |
| Detaching... | |
| If it wasn't clear how one function called another, knowing the instruction | |
| offset can help you locate the lines of code from a disassembly dump. | |
| The -v output is verbose, and shows raw addresses: | |
| ./stackcount.py -P -v tcp_sendmsg | |
| Tracing 1 functions for "tcp_sendmsg"... Hit Ctrl-C to end. | |
| ^C | |
| ffffffff817b05c1 tcp_sendmsg | |
| ffffffff8173ea48 sock_sendmsg | |
| ffffffff8173eae5 sock_write_iter | |
| ffffffff81232b33 __vfs_write | |
| ffffffff812331b8 vfs_write | |
| ffffffff81234625 SyS_write | |
| ffffffff818739bb entry_SYSCALL_64_fastpath | |
| 7f120511e6e0 __write_nocancel | |
| sshd [8234] | |
| 3 | |
| ffffffff817b05c1 tcp_sendmsg | |
| ffffffff8173ea48 sock_sendmsg | |
| ffffffff8173eae5 sock_write_iter | |
| ffffffff81232b33 __vfs_write | |
| ffffffff812331b8 vfs_write | |
| ffffffff81234625 SyS_write | |
| ffffffff818739bb entry_SYSCALL_64_fastpath | |
| 7f919c5a26e0 __write_nocancel | |
| sshd [15015] | |
| 11 | |
| Detaching... | |
| A wildcard can also be used. Eg, all functions beginning with "tcp_send", | |
| kernel stacks only (-K) with offsets (-s): | |
| # ./stackcount -Ks 'tcp_send*' | |
| Tracing 14 functions for "tcp_send*"... Hit Ctrl-C to end. | |
| ^C | |
| tcp_send_delayed_ack0x1 | |
| tcp_rcv_established0x3b1 | |
| tcp_v4_do_rcv0x130 | |
| tcp_v4_rcv0x8e0 | |
| ip_local_deliver_finish0x9f | |
| ip_local_deliver0x51 | |
| ip_rcv_finish0x8a | |
| ip_rcv0x29d | |
| __netif_receive_skb_core0x637 | |
| __netif_receive_skb0x18 | |
| netif_receive_skb_internal0x23 | |
| 1 | |
| tcp_send_delayed_ack0x1 | |
| tcp_rcv_established0x222 | |
| tcp_v4_do_rcv0x130 | |
| tcp_v4_rcv0x8e0 | |
| ip_local_deliver_finish0x9f | |
| ip_local_deliver0x51 | |
| ip_rcv_finish0x8a | |
| ip_rcv0x29d | |
| __netif_receive_skb_core0x637 | |
| __netif_receive_skb0x18 | |
| netif_receive_skb_internal0x23 | |
| 4 | |
| tcp_send_mss0x1 | |
| inet_sendmsg0x67 | |
| sock_sendmsg0x38 | |
| sock_write_iter0x78 | |
| __vfs_write0xaa | |
| vfs_write0xa9 | |
| sys_write0x46 | |
| entry_SYSCALL_64_fastpath0x16 | |
| 7 | |
| tcp_sendmsg0x1 | |
| sock_sendmsg0x38 | |
| sock_write_iter0x78 | |
| __vfs_write0xaa | |
| vfs_write0xa9 | |
| sys_write0x46 | |
| entry_SYSCALL_64_fastpath0x16 | |
| 7 | |
| Detaching... | |
| Use -r to allow regular expressions. | |
| The -f option will emit folded output, which can be used as input to other | |
| tools including flame graphs. For example, with delimiters as well: | |
| # ./stackcount.py -P -df t:sched:sched_switch | |
| ^Csnmp-pass;[unknown];[unknown];[unknown];[unknown];[unknown];-;entry_SYSCALL_64_fastpath;SyS_select;core_sys_select;do_select;poll_schedule_timeout;schedule_hrtimeout_range;schedule_hrtimeout_range_clock;schedule;__schedule 1 | |
| kworker/7:0;-;ret_from_fork;kthread;worker_thread;schedule;__schedule 1 | |
| watchdog/0;-;ret_from_fork;kthread;smpboot_thread_fn;schedule;__schedule 1 | |
| snmp-pass;[unknown];[unknown];[unknown];[unknown];[unknown];-;entry_SYSCALL_64_fastpath;SyS_select;core_sys_select;do_select;poll_schedule_timeout;schedule_hrtimeout_range;schedule_hrtimeout_range_clock;schedule;__schedule 1 | |
| svscan;[unknown];-;entry_SYSCALL_64_fastpath;SyS_nanosleep;hrtimer_nanosleep;do_nanosleep;schedule;__schedule 1 | |
| python;[unknown];__select_nocancel;-;entry_SYSCALL_64_fastpath;SyS_select;core_sys_select;do_select;poll_schedule_timeout;schedule_hrtimeout_range;schedule_hrtimeout_range_clock;schedule;__schedule 1 | |
| kworker/2:0;-;ret_from_fork;kthread;worker_thread;schedule;__schedule 1 | |
| [...] | |
| Flame graphs visualize stack traces. For information about them and links | |
| to open source software, see http://www.brendangregg.com/flamegraphs.html . | |
| This folded output can be piped directly into flamegraph.pl (the Perl version). | |
| USAGE message: | |
| # ./stackcount -h | |
| usage: stackcount [-h] [-p PID] [-c CPU] [-i INTERVAL] [-D DURATION] [-T] [-r] | |
| [-s] [-P] [-K] [-U] [-v] [-d] [-f] [--debug] | |
| pattern | |
| Count events and their stack traces | |
| positional arguments: | |
| pattern search expression for events | |
| optional arguments: | |
| -h, --help show this help message and exit | |
| -p PID, --pid PID trace this PID only | |
| -c CPU, --cpu CPU trace this CPU only | |
| -i INTERVAL, --interval INTERVAL | |
| summary interval, seconds | |
| -D DURATION, --duration DURATION | |
| total duration of trace, seconds | |
| -T, --timestamp include timestamp on output | |
| -r, --regexp use regular expressions. Default is "*" wildcards | |
| only. | |
| -s, --offset show address offsets | |
| -P, --perpid display stacks separately for each process | |
| -K, --kernel-stacks-only | |
| kernel stack only | |
| -U, --user-stacks-only | |
| user stack only | |
| -v, --verbose show raw addresses | |
| -d, --delimited insert delimiter between kernel/user stacks | |
| -f, --folded output folded format | |
| --debug print BPF program before starting (for debugging | |
| purposes) | |
| examples: | |
| ./stackcount submit_bio # count kernel stack traces for submit_bio | |
| ./stackcount -d ip_output # include a user/kernel stack delimiter | |
| ./stackcount -s ip_output # show symbol offsets | |
| ./stackcount -sv ip_output # show offsets and raw addresses (verbose) | |
| ./stackcount 'tcp_send*' # count stacks for funcs matching tcp_send* | |
| ./stackcount -r '^tcp_send.*' # same as above, using regular expressions | |
| ./stackcount -Ti 5 ip_output # output every 5 seconds, with timestamps | |
| ./stackcount -p 185 ip_output # count ip_output stacks for PID 185 only | |
| ./stackcount -p 185 c:malloc # count stacks for malloc in PID 185 | |
| ./stackcount t:sched:sched_fork # count stacks for sched_fork tracepoint | |
| ./stackcount -p 185 u:node:* # count stacks for all USDT probes in node | |
| ./stackcount -c 1 put_prev_entity # count put_prev_entity stacks for CPU 1 only | |
| ./stackcount -K t:sched:sched_switch # kernel stacks only | |
| ./stackcount -U t:sched:sched_switch # user stacks only |