@tperkins-perforce
Contributor

Background

There's a set of bpftrace files that fail because their kprobe on blk_mq_end_request does not fire reliably. This fix replaces the kprobes with raw tracepoints on higher-level block I/O events.

Problem

After upgrading to 24.04, the stbtrace io command fails to generate any output.

Solution

The probes now use raw tracepoints instead of kprobes, and the traced events changed from blk_mq_start_request to block_io_start and from blk_mq_end_request to block_io_done. The underlying kernel functions did not always call blk_mq_end_request, so a kprobe hit on blk_mq_start_request was not always followed by a matching hit on the blk_mq_end_request kprobe.

The block_io_start and block_io_done raw tracepoints are always called, so every traced start has a matching completion.
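
To illustrate why an unreliable completion probe leads to no output, here is a minimal Python model of start/completion pairing. It is not the stbtrace or bpftrace code from this PR; `PairingTracer`, its methods, and the request IDs and timestamps are hypothetical names and values chosen only for illustration.

```python
from typing import Dict, List


class PairingTracer:
    """Toy model of a tracer that pairs I/O start and completion events."""

    def __init__(self) -> None:
        self.start_ns: Dict[int, int] = {}   # request id -> start timestamp
        self.latencies_ns: List[int] = []    # one entry per matched pair

    def on_start(self, req_id: int, now_ns: int) -> None:
        # Start probe: remember when the request was issued.
        self.start_ns[req_id] = now_ns

    def on_done(self, req_id: int, now_ns: int) -> None:
        # Completion probe: emit a latency only if a start was recorded.
        started = self.start_ns.pop(req_id, None)
        if started is not None:
            self.latencies_ns.append(now_ns - started)


# Case 1: the completion probe never fires (the failure described above
# for the blk_mq_end_request kprobe). Starts pile up, nothing is emitted.
broken = PairingTracer()
broken.on_start(1, 100)
broken.on_start(2, 150)

# Case 2: every start has a matching completion event.
fixed = PairingTracer()
fixed.on_start(1, 100)
fixed.on_done(1, 400)
fixed.on_start(2, 150)
fixed.on_done(2, 900)

print(broken.latencies_ns, len(broken.start_ns))  # [] 2
print(fixed.latencies_ns, len(fixed.start_ns))    # [300, 750] 0
```

In the first case the tracer has nothing to report and its map of pending starts only grows, which matches the symptom of a tool that runs but prints no data.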

Testing Done

Ran fio, iostat, and the estat-related tools.

Implementation

Changed kprobes to raw tracepoints.

@tperkins-perforce tperkins-perforce force-pushed the dlpx/pr/tperkins-perforce/2f98ff32-584c-4690-8d24-3c87fe9b958b branch from 46a8eaf to 17a57e1 Compare June 10, 2025 18:09
@tperkins-perforce tperkins-perforce changed the title DLPX-99371: test_stbtrace_io failed because it stbtrace did not produce any output DLPX-99371 test_stbtrace_io failed because it stbtrace did not produce any output Jun 10, 2025
Contributor

@prakashsurya prakashsurya left a comment

LGTM, thanks!

@tperkins-perforce tperkins-perforce requested a review from sebroy June 10, 2025 18:12
@tperkins-perforce tperkins-perforce marked this pull request as ready for review June 10, 2025 18:40
@sebroy
Contributor

sebroy commented Jun 11, 2025

This looks great, thanks for persevering through this one. I had a thought while reviewing this for us to consider (perhaps as a next step following the integration of this PR):

It looks like we may have all of the metadata and metrics needed in the disk_io_done raw tracepoint itself without the need to attach a probe to disk_io_start. In the disk_io_done tracepoint, we have the full request structure, and that structure also contains the timestamp when the I/O request started. If we have all we need in the disk_io_done probe, then we could eliminate the overhead of having to attach anything to disk_io_start...
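
A rough Python sketch of that idea, for illustration only: if the request carries its own start timestamp, the completion handler alone can compute latency, with no start-side probe and no per-request map. The `Request` class and its field names (`io_start_time_ns`, `nr_bytes`) are assumptions made for this sketch and are not taken from the PR.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Toy stand-in for the kernel request structure (hypothetical fields)."""
    req_id: int
    io_start_time_ns: int  # assumed: start timestamp carried by the request
    nr_bytes: int


def on_done(rq: Request, now_ns: int) -> dict:
    # Everything needed comes from the request itself, so no start-side
    # probe or per-request map is required.
    return {
        "req_id": rq.req_id,
        "latency_ns": now_ns - rq.io_start_time_ns,
        "bytes": rq.nr_bytes,
    }


record = on_done(Request(req_id=7, io_start_time_ns=1_000, nr_bytes=4096), now_ns=1_450)
print(record)  # {'req_id': 7, 'latency_ns': 450, 'bytes': 4096}
```

The trade-off is that the completion-only approach depends on the request structure's layout, whereas the paired approach only needs a stable request identity.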

@sebroy
Contributor

sebroy commented Jun 11, 2025

It looks like there's a typo in the PR summary: the Jira issue referenced doesn't exist. This will have to be fixed so that the PR checks can pass.

@tperkins-perforce tperkins-perforce enabled auto-merge (squash) June 11, 2025 17:53
@tperkins-perforce tperkins-perforce changed the title DLPX-99371 test_stbtrace_io failed because it stbtrace did not produce any output DLPX-93371 test_stbtrace_io failed because it stbtrace did not produce any output Jun 11, 2025
@tperkins-perforce tperkins-perforce merged commit 341f590 into develop Jun 11, 2025
6 of 8 checks passed
@tperkins-perforce tperkins-perforce deleted the dlpx/pr/tperkins-perforce/2f98ff32-584c-4690-8d24-3c87fe9b958b branch June 11, 2025 18:33

4 participants