Skip to content

Commit

Permalink
biostacks.bt: use tracepoints instead of kprobes
Browse files Browse the repository at this point in the history
because since v5.17-rc1~116^2~83 function blk_account_io_start
became static and can't be used as a kprobe.
  • Loading branch information
makelinux committed Dec 26, 2023
1 parent b191111 commit 6cf2cba
Show file tree
Hide file tree
Showing 3 changed files with 52 additions and 64 deletions.
7 changes: 1 addition & 6 deletions man/man8/biostacks.bt.8
Original file line number Diff line number Diff line change
Expand Up @@ -8,12 +8,7 @@ This tool shows disk I/O latency histograms for each block I/O initialization
path. This can help reveal the reason for different latencies, as some may
be created by log flushing, others by application reads, etc.

This works by tracing the blk_account_io_start() and the
blk_start_request() or blk_mq_start_request() functions using dynamic
instrumentation.
Linux 5.0 removed the classic I/O scheduler, so the blk_start_request()
probe can be removed from the tool (just delete it). This tool may need other
maintenance to keep working if these functions change in later kernels.
This works by attaching to block_io_start and block_rq_issue tracepoints.

Since this uses BPF, only the root user can use this tool.
.SH REQUIREMENTS
Expand Down
18 changes: 8 additions & 10 deletions tools/biostacks.bt
Original file line number Diff line number Diff line change
Expand Up @@ -18,20 +18,18 @@ BEGIN
printf("Tracing block I/O with init stacks. Hit Ctrl-C to end.\n");
}

kprobe:blk_account_io_start,
kprobe:__blk_account_io_start
tracepoint:block:block_io_start
{
@reqstack[arg0] = kstack;
@reqts[arg0] = nsecs;
@reqstack[args.dev, args.sector] = kstack;
@reqts[args.dev, args.sector] = nsecs;
}

kprobe:blk_start_request,
kprobe:blk_mq_start_request
/@reqts[arg0]/
tracepoint:block:block_rq_issue
/@reqts[args.dev, args.sector]/
{
@usecs[@reqstack[arg0]] = hist(nsecs - @reqts[arg0]);
delete(@reqstack[arg0]);
delete(@reqts[arg0]);
@usecs[@reqstack[args.dev, args.sector]] = hist(nsecs - @reqts[args.dev, args.sector]);
delete(@reqstack[args.dev, args.sector]);
delete(@reqts[args.dev, args.sector]);
}

END
Expand Down
91 changes: 43 additions & 48 deletions tools/biostacks_example.txt
Original file line number Diff line number Diff line change
Expand Up @@ -6,58 +6,53 @@ that initiated the I/O. This can help explain disk I/O that is not directly
requested by applications (eg, metadata reads on writes, resilvering, etc).
For example:

# ./biostacks.bt
Attaching 5 probes...
# ./biostacks.bt
Attaching 4 probes...
Tracing block I/O with init stacks. Hit Ctrl-C to end.
^C

@usecs[
blk_account_io_start+1
blk_mq_make_request+1102
generic_make_request+292
submit_bio+115
_xfs_buf_ioapply+798
xfs_buf_submit+101
xlog_bdstrat+43
xlog_sync+705
xlog_state_release_iclog+108
_xfs_log_force+542
xfs_log_force+44
xfsaild+428
kthread+289
ret_from_fork+53
]:
[64K, 128K) 1 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|

[...]
blk_mq_submit_bio+1166
blk_mq_submit_bio+1166
__submit_bio+139
submit_bio_noacct_nocheck+345
iomap_readahead+510
read_pages+133
page_cache_ra_order+665
filemap_get_pages+266
filemap_read+217
xfs_file_buffered_read+82
xfs_file_read_iter+119
vfs_read+510
__x64_sys_pread64+152
do_syscall_64+93
entry_SYSCALL_64_after_hwframe+110
]:
[2K, 4K) 1 |@@@@@@@@@@@@@ |
[4K, 8K) 4 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|

@usecs[
blk_account_io_start+1
blk_mq_make_request+707
generic_make_request+292
submit_bio+115
xfs_add_to_ioend+455
xfs_do_writepage+758
write_cache_pages+524
xfs_vm_writepages+190
do_writepages+75
__writeback_single_inode+69
writeback_sb_inodes+481
__writeback_inodes_wb+103
wb_writeback+625
wb_workfn+384
process_one_work+478
worker_thread+50
kthread+289
ret_from_fork+53
]:
[8K, 16K) 560 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[16K, 32K) 218 |@@@@@@@@@@@@@@@@@@@@ |
[32K, 64K) 26 |@@ |
[64K, 128K) 2 | |
[128K, 256K) 53 |@@@@ |
[256K, 512K) 60 |@@@@@ |
blk_mq_submit_bio+1166
blk_mq_submit_bio+1166
__submit_bio+139
submit_bio_noacct_nocheck+653
dmcrypt_write+282
kthread+229
ret_from_fork+49
ret_from_fork_asm+27
]:
[256, 512) 1 | |
[512, 1K) 10 |@@@@@@@ |
[1K, 2K) 16 |@@@@@@@@@@@@ |
[2K, 4K) 20 |@@@@@@@@@@@@@@@ |
[4K, 8K) 46 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ |
[8K, 16K) 48 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ |
[16K, 32K) 63 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ |
[32K, 64K) 66 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[64K, 128K) 51 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ |
[128K, 256K) 47 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ |
[256K, 512K) 1 | |

This output shows the most frequent stack was XFS writeback, with latencies
between 8 and 512 microseconds. The other stack included here shows an XFS
log sync.
This output shows the most frequent stack was dmcrypt_write,
with latencies between 256 nanoseconds and 512 microseconds.
The other stack included here shows a pread system call.

0 comments on commit 6cf2cba

Please sign in to comment.