Userspace stack is not unwinded in most samples with offcputime.py #1641
Comments
I'd do a normal perf on-CPU profile and see if that works:
If the stacks are broken there, then it's not a problem with bcc/BPF, and would be other problems (like glibc). |
I think I should recompile the kernel with frame pointers explicitly enabled. Currently config says
Seems like I need to say |
Okay, I've recompiled the kernel:
But nothing changed, stacks are still not resolved. Any ideas? |
How about testing the same off-CPU events? Eg:
|
Here [1] is the results of
It shows that a lot of context switches happened inside
There is no such backtrace at [1]. There is one relatively large stack (taking As an experiment, I tried to generate off-cpu graph using The moral is that this is indeed not bcc/bpf-specific problem. Stacks caught during off-cpu time analysis are almost always not unwinded, while stacks caught with usual on-cpu profiling (e.g. [1] https://oc.postgrespro.ru/index.php/s/v6zNMbHxHOyMXqe |
@arssher maybe you could help by creating a simple representative application which can reproduce your stack tracing behavior and folks here can help? Maybe indeed it is kernel related. |
Ok, here is a simple example which hopefully reproduces the same problem. All the program at the bottom does is repeat the following 3 times: 1) do some pointless number crunching, 2) sleep for 3 seconds. I run it with
If I invoke
It gives wrong stacks:
(See the flamegraph at [1]). On the other hand, on-cpu profiling with perf with
Gives the proper stack
Here is the program:
|
Can you try the following patch:
Basically, disable |
Unfortunately, seems like nothing changed:
Still bogus |
Fix issue #1641

The bcc user space stack is not printed out properly.

From https://en.wikipedia.org/wiki/Name_mangling, all mangled symbols begin with _Z, and an identifier beginning with an underscore followed by a capital is a reserved identifier in C, so conflict with user identifiers is avoided.

Further, from the llvm demangle code https://github.com/llvm-mirror/libcxxabi/blob/master/src/cxa_demangle.cpp
The demangled name has the following specification:

    <mangled-name> ::= _Z <encoding>
                   ::= <type>
    extension      ::= ___Z <encoding> _block_invoke
    extension      ::= ___Z <encoding> _block_invoke<decimal-digit>+
    extension      ::= ___Z <encoding> _block_invoke_<decimal-digit>+

In the issue #1641, the function name are "f" and "g", which is demangled to type "float" and "__float128", according to the above implementation. In bcc case, we only care about functions, so only do demangling for symbols starting with _Z or ___Z.

Signed-off-by: Yonghong Song <yhs@fb.com>
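The prefix check described in this commit message can be sketched in a few lines of Python. This is only an illustration of the rule, not the actual bcc patch: `should_demangle`, `display_name`, and the stand-in `fake_demangle` (which imitates the "f" to "float" behaviour reported above) are hypothetical names introduced here.

```python
def should_demangle(symbol: str) -> bool:
    """Return True only for names that look like mangled C++ functions."""
    return symbol.startswith("_Z") or symbol.startswith("___Z")


def display_name(symbol: str, demangle) -> str:
    """Demangle only when the prefix check passes; otherwise keep the raw name."""
    return demangle(symbol) if should_demangle(symbol) else symbol


def fake_demangle(symbol: str) -> str:
    """Stand-in demangler (an assumption for this sketch, not a real library call).

    It imitates a demangler that also accepts bare type encodings,
    which is how "f" and "g" ended up printed as types.
    """
    table = {"f": "float", "g": "__float128", "_Z3foov": "foo()"}
    return table.get(symbol, symbol)


if __name__ == "__main__":
    for name in ("f", "g", "_Z3foov"):
        print(name, "->", display_name(name, fake_demangle))
```

With the check in place, plain C function names such as `f` and `g` pass through unchanged, while names with the `_Z` prefix are still handed to the demangler.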
@arssher I just pushed a pull request which should fix your issue. |
The issue @palmtenor pointed to is actually a real issue since in
For example, initially, |
Commit
Gives me the following:
Actual sleep is the last line, with 3000082 us. This stack contains a bunch of
Things get even stranger if I make the program only slightly more complex by adding recursion to the cycle-wasting code:
Now, running it with
While profiling with
Gives me, beside warning, the following (main, larger than 500 us) stacks:
|
As for |
@arssher I cannot reproduce the above issues for the below incorrect stack
Looks like the issue starts from resolving libc.so. Maybe libc is not compiled with frame pointers on, so kernel unwinding has an issue here?
This looks like offcputime.py tries to resolve user space stack and the user space process has exited. So it cannot even resolve the libc address. You could use the following hack to print out the user address to make sure it is correct.
|
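One way to sanity-check a raw user address once it has been printed is to see which mapping of the process it falls into. The sketch below is a hedged illustration, not the hack suggested above: `parse_maps`, `find_mapping`, and the sample text are made up for the example, and the only real-world assumption is the usual `/proc/<pid>/maps` line layout (address range, permissions, offset, device, inode, optional path).

```python
from typing import List, NamedTuple, Optional


class Mapping(NamedTuple):
    start: int
    end: int
    perms: str
    path: str


def parse_maps(text: str) -> List[Mapping]:
    """Parse text in /proc/<pid>/maps format into Mapping records."""
    mappings = []
    for line in text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        start_s, end_s = fields[0].split("-")
        path = fields[5].strip() if len(fields) > 5 else ""
        mappings.append(Mapping(int(start_s, 16), int(end_s, 16), fields[1], path))
    return mappings


def find_mapping(mappings: List[Mapping], addr: int) -> Optional[Mapping]:
    """Return the mapping containing addr, or None if the address is unmapped."""
    for m in mappings:
        if m.start <= addr < m.end:
            return m
    return None


# Made-up sample data for illustration only; not taken from the issue.
SAMPLE_MAPS = """\
00400000-00401000 r-xp 00000000 08:01 1234 /tmp/a.out
7f0000000000-7f0000100000 r-xp 00000000 08:01 5678 /lib/libc.so
7ffd00001000-7ffd00022000 rw-p 00000000 00:00 0 [stack]
"""

if __name__ == "__main__":
    maps = parse_maps(SAMPLE_MAPS)
    for addr in (0x400123, 0x7F0000000ABC, 0x1):
        hit = find_mapping(maps, addr)
        print(hex(addr), "->", hit.path if hit else "unmapped")
```

An address that lands in no mapping, or in a non-executable one such as the stack, is a strong hint that the unwinder followed a bogus frame pointer rather than that symbol resolution failed.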
Well, I have stepped with the debugger until
Here are the addresses of wrong stacktrace, printed as you hinted:
And here is the map of process memory space:
Let's see how
Note that
Again,
Here
But again it handles
Here a small piece of asm is executed:
It does not save & restore
(Disregard that here What we see in the walked stack trace instead? The first unknown symbol is
This is actually
is wrong, it should be |
Looks like it could be kernel related. I am not able to reproduce the issue with the latest net-next. |
Spent quite some time to debug but did not find the root cause.
Could this be related to CONFIG_PREEMPT? I did not have CONFIG_PREEMPT but could still reproduce the issue earlier. @arssher could you share your kernel config, if that makes it easier to reproduce the issue? |
Sure, here it is: |
@arssher, after some debugging I was able to root-cause the issue. The file is By studying the code, I found one workaround, enable kernel tracepoint
This will add TIF_SYSCALL_TRACEPOINT to every task in the system and effectively force slowpath for syscall processing in The issue has been fixed in upcoming 4.16 release and backported to stable release 4.14 and 4.15. So if you upgrade your kernel version from 4.14.13 to some later 4.14.x version, the problem will get fixed. At the same time, I am studying whether we have easy way to fix the older kernels (4.13 or 4.9), or we could request the same patch (as in 4.14/4.15 stable release) back ported to 4.13/4.9. |
Looks like The config you provided has For example, I change the audit rule from
to
and the user space will be correct. |
The issue comes up when I investigated issue #1641.

A func symbol defined in assembly code will be size of 0, e.g.,
http://git.musl-libc.org/cgit/musl/tree/src/thread/x86_64/syscall_cp.s
symbol __cp_begin.

    .text
    .global __cp_begin
    .hidden __cp_begin
    ...
    __syscall_cp_asm:
    __cp_begin:
        mov (%rdi),%eax
        test %eax,%eax

and __cp_begin cannot be traced through bcc since symbol resolution rejects any func symbol with size 0. This patch removed size-must-not-zero restriction so that the symbol like __cp_begin can be traced.

Command line:
    trace.py -p <pid> -U '<binary_path>:__cp_begin'

Signed-off-by: Yonghong Song <yhs@fb.com>
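The effect of dropping the size restriction can be illustrated with a toy symbol lookup. This is a sketch under stated assumptions, not bcc's resolver: the `Symbol` record, `find_function`, and its `allow_zero_size` switch are hypothetical, and the address used for `__cp_begin` is invented. The only external fact relied on is that ELF marks function symbols with type `STT_FUNC` (value 2).

```python
from typing import Iterable, NamedTuple, Optional

STT_FUNC = 2  # ELF symbol type for functions


class Symbol(NamedTuple):
    name: str
    type: int
    value: int  # symbol address
    size: int   # 0 for labels defined in hand-written assembly


def find_function(symbols: Iterable[Symbol], name: str,
                  allow_zero_size: bool = True) -> Optional[int]:
    """Return the address of a function symbol, or None if it is rejected."""
    for sym in symbols:
        if sym.type != STT_FUNC or sym.name != name:
            continue
        if sym.size == 0 and not allow_zero_size:
            continue  # old behaviour: zero-sized function symbols are skipped
        return sym.value
    return None


if __name__ == "__main__":
    # Invented table: one ordinary function and one zero-sized assembly label.
    table = [
        Symbol("main", STT_FUNC, 0x401000, 64),
        Symbol("__cp_begin", STT_FUNC, 0x402000, 0),
    ]
    print(find_function(table, "__cp_begin", allow_zero_size=False))
    print(find_function(table, "__cp_begin", allow_zero_size=True))
```

With the restriction in place the lookup returns `None` for `__cp_begin`; once zero-sized symbols are accepted, the address is found and the symbol can be used as a probe target.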
Sorry for the delay. Indeed, I still don't quite understand where exactly
What should I have done to investigate this myself? Thank you. Should I close the issue, or will you do that once a solution is found for older kernels? |
The related source code in 4.14.13, arch/x86/entry/entry_64.S.
Basically, when the syscall goes to fastpath, bp register is not saved So rbp is not clobbered, but merely not saved so its value still points to somewhere I think you can close the issue. For old long term supported kernels, 4.9, 4.13, 4.15, |
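Why a stale frame-pointer value produces bogus stacks can be shown with a toy model of frame-pointer unwinding. This is a simplified sketch, not the kernel's unwinder: the memory layout, addresses, and the `walk_frame_pointers` helper are all invented for illustration. It assumes the usual x86-64 convention that `[rbp]` holds the caller's saved `rbp` and `[rbp+8]` holds the return address.

```python
from typing import Dict, List


def walk_frame_pointers(memory: Dict[int, int], rbp: int,
                        max_depth: int = 16) -> List[int]:
    """Follow saved-rbp links and collect return addresses."""
    return_addresses = []
    while rbp in memory and rbp + 8 in memory and len(return_addresses) < max_depth:
        return_addresses.append(memory[rbp + 8])
        next_rbp = memory[rbp]
        if next_rbp <= rbp:
            break  # caller frames live at higher addresses; stop at chain end
        rbp = next_rbp
    return return_addresses


# Invented stack contents: three well-formed frames plus unrelated data.
MEMORY = {
    0x7000: 0x0,      0x7008: 0x400100,  # outermost frame, returns to startup code
    0x6F00: 0x7000,   0x6F08: 0x400200,  # middle frame, returns into main
    0x6E00: 0x6F00,   0x6E08: 0x400300,  # innermost frame, returns into f
    0x5000: 0xDEAD00, 0x5008: 0x1234,    # unrelated data a stale rbp may point at
}

if __name__ == "__main__":
    good = walk_frame_pointers(MEMORY, 0x6E00)  # rbp captured correctly
    bad = walk_frame_pointers(MEMORY, 0x5000)   # stale value used instead
    print([hex(a) for a in good])
    print([hex(a) for a in bad])
```

Starting from the correct register value yields the full chain of return addresses, while starting from a stale value yields a single meaningless address and then stops, which resembles the short, unresolvable stacks reported in this thread.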
Hi,
I'm experimenting with tracing postgres using offcputime.py. However, the userspace stack is not properly unwound in most samples, as shown in the attached flamegraph [1], making the tool basically unusable. Postgres is compiled with -fno-omit-frame-pointer (and this is confirmed by a bunch of correctly unwound stacks). I thought that one reason for this might be glibc compiled with -fomit-frame-pointer, because the last resolved symbol is epoll_pwait from libc.so. I took musl libc and compiled it with -fno-omit-frame-pointer, but the result is still the same. What other reasons could there be for this, and is there anything I can do about it?
Kernel version is 4.14.13.
[1] https://oc.postgrespro.ru/index.php/s/qDINPrRnlNqxXBd