uprobe: uretprobe speed up #6662
Adding uretprobe syscall instead of trap to speed up return probe.

At the moment the uretprobe setup/path is:

  - install entry uprobe

  - when the uprobe is hit, it overwrites the probed function's return
    address on the stack with the address of the trampoline that contains
    a breakpoint instruction

  - the breakpoint trap code handles the uretprobe consumers' execution
    and jumps back to the original return address

This patch replaces the above trampoline's breakpoint instruction with a
new uretprobe syscall call. This syscall does exactly the same job as the
trap, with some extra work:

  - the syscall trampoline must save the original values of the
    rax/r11/rcx registers on the stack: rax is set to the syscall number,
    and r11/rcx are changed and used by the syscall instruction

  - the syscall code reads the original values of those registers and
    restores them in the task's pt_regs area

Even with the extra register handling code, having uretprobes handled by
syscalls shows a speed improvement.

On Intel (11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz)

  current:

    base           :   15.888 ± 0.033M/s
    uprobe-nop     :    3.016 ± 0.000M/s
    uprobe-push    :    2.832 ± 0.005M/s
    uprobe-ret     :    1.104 ± 0.000M/s
    uretprobe-nop  :    1.487 ± 0.000M/s
    uretprobe-push :    1.456 ± 0.000M/s
    uretprobe-ret  :    0.816 ± 0.001M/s

  with the fix:

    base           :   15.116 ± 0.045M/s
    uprobe-nop     :    3.001 ± 0.045M/s
    uprobe-push    :    2.831 ± 0.004M/s
    uprobe-ret     :    1.102 ± 0.001M/s
    uretprobe-nop  :    1.969 ± 0.001M/s  < 32% speedup
    uretprobe-push :    1.905 ± 0.004M/s  < 30% speedup
    uretprobe-ret  :    0.933 ± 0.002M/s  < 14% speedup

On AMD (AMD Ryzen 7 5700U)

  current:

    base           :    5.105 ± 0.003M/s
    uprobe-nop     :    1.552 ± 0.002M/s
    uprobe-push    :    1.408 ± 0.003M/s
    uprobe-ret     :    0.827 ± 0.001M/s
    uretprobe-nop  :    0.779 ± 0.001M/s
    uretprobe-push :    0.750 ± 0.001M/s
    uretprobe-ret  :    0.539 ± 0.001M/s

  with the fix:

    base           :    5.119 ± 0.002M/s
    uprobe-nop     :    1.523 ± 0.003M/s
    uprobe-push    :    1.384 ± 0.003M/s
    uprobe-ret     :    0.826 ± 0.002M/s
    uretprobe-nop  :    0.866 ± 0.002M/s  < 11% speedup
    uretprobe-push :    0.826 ± 0.002M/s  < 10% speedup
    uretprobe-ret  :    0.581 ± 0.001M/s  < 7% speedup

Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
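
The register handling described in this commit message can be illustrated with a small user-space toy model. This is only a sketch, not the kernel code: the `toy_*` names, the `TOY_NR_URETPROBE` placeholder value, and the stack layout are all hypothetical. It models only the three registers the message names: the trampoline saves rax/rcx/r11, the syscall instruction clobbers rcx (return address) and r11 (flags), and the syscall handler copies the saved originals back into the register area that will be restored on return.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: only the three registers the commit message mentions. */
struct toy_regs {
	uint64_t rax, rcx, r11;
};

/* Toy user stack with room for the three saved values. */
struct toy_stack {
	uint64_t slot[3];
	int top;
};

/* Placeholder syscall number, NOT taken from the patch. */
#define TOY_NR_URETPROBE 999

static void toy_push(struct toy_stack *s, uint64_t v)
{
	assert(s->top < 3);
	s->slot[s->top++] = v;
}

/* Trampoline side: save the originals, then load the syscall number. */
static void toy_trampoline_enter(struct toy_regs *r, struct toy_stack *s)
{
	toy_push(s, r->rax);
	toy_push(s, r->rcx);
	toy_push(s, r->r11);
	r->rax = TOY_NR_URETPROBE;
}

/* The x86_64 syscall instruction stores the return rip in rcx and rflags in r11. */
static void toy_syscall_insn(struct toy_regs *r, uint64_t ret_ip, uint64_t rflags)
{
	r->rcx = ret_ip;
	r->r11 = rflags;
}

/* Syscall side: read the saved originals and put them back into the regs area. */
static void toy_sys_uretprobe(struct toy_regs *pt_regs, const struct toy_stack *s)
{
	pt_regs->rax = s->slot[0];
	pt_regs->rcx = s->slot[1];
	pt_regs->r11 = s->slot[2];
}

/* Full round trip; returns 1 when all three registers survive unchanged. */
static int toy_round_trip(uint64_t rax, uint64_t rcx, uint64_t r11)
{
	struct toy_regs r = { rax, rcx, r11 };
	struct toy_stack s = { { 0 }, 0 };

	toy_trampoline_enter(&r, &s);
	toy_syscall_insn(&r, 0x401000, 0x246);	/* arbitrary example values */
	toy_sys_uretprobe(&r, &s);
	return r.rax == rax && r.rcx == rcx && r.r11 == r11;
}
```

Calling `toy_round_trip(1, 2, 3)` returns 1 in this model, which is the property the extra save/restore work exists to guarantee: the probed program sees the same rax/rcx/r11 values it had before entering the trampoline.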
Add uretprobe test that compares register values before and after the
uretprobe is hit. It also compares the register values seen from the
attached bpf program.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Adding test that creates a uprobe consumer on a uretprobe which changes
some of the registers, making sure the changed registers are propagated
to user space from the uretprobe trampoline on x86_64.

To be able to do this, adding support to bpf_testmod to create a uprobe
via a new attribute file:

  /sys/kernel/bpf_testmod_uprobe

This file expects a file offset. It creates the related uprobe on the
current process's exe file, and removes the existing uprobe if the offset
is 0. There can be only a single uprobe at any time.

The uprobe has a specific consumer that changes the registers used in the
uretprobe trampoline, which are later checked in the test.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
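
As a rough illustration of how a test might drive such an attribute file, here is a minimal sketch. The helper name `toy_write_uprobe_offset` is hypothetical, and writing the offset as decimal text is an assumption; the commit message only says the file expects a file offset. The path is a parameter so the helper can be tried against an ordinary file.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Write `offset` as decimal text (an assumed format) to the attribute
 * file at `path`. Per the commit message, offset 0 asks the module to
 * remove the existing uprobe. Returns 0 on success, -1 on failure.
 */
static int toy_write_uprobe_offset(const char *path, unsigned long offset)
{
	char buf[32];
	ssize_t written;
	int len, fd;

	len = snprintf(buf, sizeof(buf), "%lu", offset);
	if (len < 0 || (size_t)len >= sizeof(buf))
		return -1;

	/* The attribute file must already exist; do not create it. */
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;

	written = write(fd, buf, (size_t)len);
	close(fd);

	return written == (ssize_t)len ? 0 : -1;
}
```

With the test module loaded, a caller would pass `/sys/kernel/bpf_testmod_uprobe` as `path`, a nonzero offset to create the uprobe, and 0 afterwards to remove it.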
At least one diff in series https://patchwork.kernel.org/project/netdevbpf/list/?series=840535 expired. Closing PR.
Pull request for series with
subject: uprobe: uretprobe speed up
version: 1
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=838809