New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
riscv: initial ebpf support #4085
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need to mention the kernel patch here anymore, which won't be accepted by kernel upstream,
besides it's a different problem that we need to fix
you really need to add slightly more messages to explain what's the PR aim for, and what's the status?
Thank you for your suggestion, @dlan17 , has been modified accordingly. Kindly verify. |
Add basic support for RISC-V64. Fix the issue iovisor#4081 . With this patch, running examples/hello_world.py can get the expected result. Signed-off-by: Mingzheng Xing <mingzheng.xing@gmail.com> Co-authored-by: Yixun Lan <dlan@gentoo.org>
Thanks for looking into this! I have a question concerning bpf_probe_read(). You wrote that
execsnoop tool seems to use only
I observed this in a VM (riscv64) with Ubuntu 22.04 and kernel 5.15.0-1014-generic. bcc was built from the sources with your patch included. So, I guess, the BPF program still tries to use bpf_probe_read() rather than bpf_probe_read_user() for some reason. Some fallback in cc/frontends/clang/b_frontend_action.cc, perhaps? I'll re-check this on a clean build of bcc with your patch, just in case I messed something up locally, but still. |
@dlan17 By the way, how did you enable debug like in #4081 (comment)? In some of my experiments with this fix, I saw a warning when trying to load certain BPF programs: |
Thanks for your comment. @euspectre This PR solves the basic support problem of BCC on RISCV64. It seems that we should open a new issue. |
the error is expected, but shouldn't block this PR
Yes, it still use bpf_probe_read(), you can enable debug, and dump the bpf code
there is an attempt to try to resolve in kernel side but people don't like it, see [0] [0] https://lore.kernel.org/bpf/YsUy8jBpt11zoc5E@infradead.org/T/ |
add debug in python code, for example you can edit /usr/lib/python-exec/python3.9/execsnoop,
the error come from llvm/clang, I'm no llvm expert, and haven't really looked into it.. |
just trace this down, it's problem in file : execsnoop int do_ret_sys_execve(struct pt_regs *ctx) any idea where this expansion come from? and if we can fix it, then will solve the problem.. |
Looks like it is here:
|
How about smth like this:
Not tested yet. |
hey, I've tested, it actually works (does fix the bpf_probe_read() problem), and there are two code lines which need to be fixed. but the variable 'has_overlap_kuaddr_' here is confusing, as my knowledge, the virtual address space in riscv64 does not have overlap address space, see kernel doc [0], and please correct me if I'm wrong here? there is already the logic to probe whether the system support separate k/u address, so how about we reusing it? static std::string check_bpf_probe_read_kernel(void) { so the logic I'd propose: ProbeVisitor::ProbeVisitor(ASTContext &C, Rewriter &rewriter, has_probe_read_separated = has_overlap_kuaddr_ || is_probe_read_kernel; [0] https://www.kernel.org/doc/html/latest/riscv/vm-layout.html |
@euspectre would you like to pursue the bpf_probe_read_{kernel/user}() implementation in bcc? and send a PR here? I would more than happy to help testing/reviewing the changes thanks |
Indeed.
Yes, I'll try it in a separate PR. However, I am still new to bcc and need some time to better understand the code. I read the discussion you mentioned (https://lore.kernel.org/bpf/YsUzX2IeNb%2Fu9VmN@infradead.org/) and, it seems, they suggest using bpf_probe_read_(user|kernel) unconditionally if they are available, and only try bpf_probe_read() as a fallback. This way, if I understand it correctly, all checks for overlapping memory areas could be removed along with the confusion they cause. Instead, we could check if bpf_probe_read_(user|kernel)[str] are available in the kernel. This should be possible to implement. |
I tried to examine into it. This is a mcjit riscv support issue. ExecutionEngine and EngineBuilder is the old mcjit api, which bcc seems to be using. It has no problem supporting jit into bpf code, but doesn't support jit into riscv native code and will never do so because it is being superceded by orc jit. A grep into llvm/lib/ExexutionEngine shows this warning come from https://github.com/llvm/llvm-project/blob/355532a1499aa9b13a89fb5b5caaba2344d57cd7/llvm/lib/ExecutionEngine/ExecutionEngine.cpp#L529. This confirms riscv has no mcjit support. So in Line 162 in 13b5563
I guess disabling ENABLE_LLVM_NATIVECODEGEN should work around this warning for now. In a long run if debug is needed, rw_engine may need to be migrated to orcji v2, which fully support riscv already. |
@chenhengqi and @dlan17 thanks for code reviews for this pull request and #4088. I don't have an environment for riscv, so I will not merge the patch unless one of you gave an approval. It is not necessary to have all support in one patch. But hopefully the first patch can provide some basic support at the least. |
@yonghong-song Yes, this PR provide some basic support for riscv64 platform. You can add a "Tested-by: Yixun Lan dlan@gentoo.org" if that work for you. The hardware I'm testing is a sifive unmatched board which is a riscv64 (rv64gc) platform, With this PR, examples/hello_world.py from bcc is just able to run. b' tmux: server-2642 [001] d...1 718598.584658: bpf_trace_printk: Hello, World!' If we apply extra patch either from [0] or [1] to fix the bpf_probe_read_{kernel,user}() issue, then we are able to run more bpf tools: like execsnoop, runqlen, and more.. [0] https://lore.kernel.org/20220703130924.57240-1-dlan@gentoo.org thanks |
@dlan17 I have prepared the fix for bpf_probe_read() issue in a separate PR: #4109. |
@xmzzz @dlan17 @chenhengqi With latest llvm-project release/15.x branch or main branch (llvm16), and the following bcc cmake
I got the following compilation error:
I haven't done any investigation yet. Could any of you help take a look? Thanks! |
clang::RISCV::RVVType::computeTypes is changed, see https://reviews.llvm.org/D126742 not sure why you got this error, but it sounds bcc is trying to link old llvm lib? btw, I'm here using the released version of llvm-14.0.6 |
@dlan17 I did some bisecting, the following llvm (llvm15 release branch or llvm16) triggered the failure:
The patch is merged on July 13, 2022. I think bcc/cmake/clang_libs.cmake needs updated with additional clang libs. Please help take a look. |
Okay, the patch #4154 should fix the problem. |
Add basic support for RISC-V64. Fix the issue iovisor#4081 . With this patch, running examples/hello_world.py can get the expected result. Tested-by: Yixun Lan dlan@gentoo.org Signed-off-by: Mingzheng Xing <mingzheng.xing@gmail.com> Co-authored-by: Yixun Lan <dlan@gentoo.org>
In order to fix the issue #4081 , some basic support for RISC-V64 has been added. With this patch, running examples/hello_world.py can get the expected result.
However, BCC tool using bpf_probe_read() helper, which will report an error on RISC-V64 because the ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE config option in kernel is not set. This will be resolved by updating the BCC tools and BCC itself with the appropriate bpf_probe_read_{kernel,user}[_str()] helpers.
This PR currently only supports RISCV64.
Signed-off-by: Mingzheng Xing mingzheng.xing@gmail.com
Co-authored-by: Yixun Lan dlan@gentoo.org