Segfault after failing BPF_MAP_CREATE #515

Closed
jvnn opened this issue Apr 2, 2019 · 4 comments
jvnn commented Apr 2, 2019

I ran into a segmentation fault after all my bpftrace invocations suddenly started failing with "Error creating map: '@'". The failure doesn't seem to stop execution, and at some later point bpftrace segfaults. I then rebuilt with debug symbols to analyze the crash, but after that I could no longer reproduce it. I also rebuilt the release binary, and that too worked without issues. The notes below are therefore everything I managed to capture before recompiling; unfortunately I don't have a core dump. I don't know what provoked the initial error, but the segfault that follows it (and the BPF_MAP_UPDATE_ELEM calls with an invalid fd) is at least something that can be taken care of.

strace of the crash:

> sudo strace -febpf bpftrace -e 'tracepoint:syscalls:sys_enter_* { @[probe] = count(); }'
strace: Process 22711 attached
[pid 22711] +++ exited with 0 +++
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_HASH, key_size=1974495776, value_size=8, max_entries=8, map_flags=0x80 /* BPF_F_??? */, inner_map_fd=0}, 112) = -1 EINVAL (Invalid argument)
Error creating map: '@'
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERF_EVENT_ARRAY, key_size=1974495424, value_size=4, max_entries=4, map_flags=0x8 /* BPF_F_??? */, inner_map_fd=0}, 112) = -1 EINVAL (Invalid argument)
Error creating perf event map (-1)
Attaching 316 probes...
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=-1, key=0x7ffc75b06990, value=0x7ffc75b06994, flags=BPF_ANY}, 112) = -1 EBADF (Bad file descriptor)
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=-1, key=0x7ffc75b06990, value=0x7ffc75b06994, flags=BPF_ANY}, 112) = -1 EBADF (Bad file descriptor)
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=-1, key=0x7ffc75b06990, value=0x7ffc75b06994, flags=BPF_ANY}, 112) = -1 EBADF (Bad file descriptor)
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=-1, key=0x7ffc75b06990, value=0x7ffc75b06994, flags=BPF_ANY}, 112) = -1 EBADF (Bad file descriptor)
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=-1, key=0x7ffc75b06990, value=0x7ffc75b06994, flags=BPF_ANY}, 112) = -1 EBADF (Bad file descriptor)
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=-1, key=0x7ffc75b06990, value=0x7ffc75b06994, flags=BPF_ANY}, 112) = -1 EBADF (Bad file descriptor)
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=-1, key=0x7ffc75b06990, value=0x7ffc75b06994, flags=BPF_ANY}, 112) = -1 EBADF (Bad file descriptor)
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=-1, key=0x7ffc75b06990, value=0x7ffc75b06994, flags=BPF_ANY}, 112) = -1 EBADF (Bad file descriptor)
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x5} ---
--- SIGSEGV {si_signo=SIGSEGV, si_code=SI_TKILL, si_pid=22710, si_uid=0} ---
+++ killed by SIGSEGV (core dumped) +++
Segmentation fault
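
In the trace above, the fd stays at -1 after the failed BPF_MAP_CREATE and is still passed to BPF_MAP_UPDATE_ELEM. A minimal sketch of guarding against that follows. It assumes a hypothetical `MapHandle` wrapper and `update_elem` helper; neither is bpftrace's actual code, and the real syscall is left out:

```cpp
#include <cerrno>
#include <iostream>
#include <string>

// Hypothetical wrapper around the fd returned by a map-creation call.
struct MapHandle {
  std::string name;
  int fd;  // negative when creation failed, as with map_fd=-1 in the trace

  bool valid() const { return fd >= 0; }
};

// Hypothetical stand-in for a map update. It refuses to use an invalid fd
// instead of issuing a call that can only fail with EBADF.
int update_elem(const MapHandle &map, int key, int value) {
  if (!map.valid()) {
    std::cerr << "refusing to update map '" << map.name
              << "': invalid fd " << map.fd << "\n";
    return -EBADF;
  }
  // A real implementation would issue BPF_MAP_UPDATE_ELEM here.
  (void)key;
  (void)value;
  return 0;
}
```

With such a guard, the caller sees the error at the first use of the map rather than through repeated failing syscalls.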

debugging session with gdb:

Error creating map: '@'
Error creating perf event map (-1)
Attaching 316 probes...

Thread 1 "bpftrace" received signal SIGSEGV, Segmentation fault.
__strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:62
62	../sysdeps/x86_64/multiarch/strlen-avx2.S: No such file or directory.
(gdb) bt
#0  __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:62
#1  0x00007ffff4a6f03e in __bpf_object__open_xattr () from /usr/lib/x86_64-linux-gnu/libbcc.so.0
#2  0x00007ffff4a7070c in bpf_prog_load_xattr () from /usr/lib/x86_64-linux-gnu/libbcc.so.0
#3  0x00007ffff4a70930 in bpf_prog_load () from /usr/lib/x86_64-linux-gnu/libbcc.so.0
#4  0x00005555556b8bb0 in bpftrace::AttachedProbe::load_prog() ()
#5  0x00005555556b765c in bpftrace::AttachedProbe::AttachedProbe(bpftrace::Probe&, std::tuple<unsigned char*, unsigned long>)
    ()
#6  0x000055555570292c in std::_MakeUniq<bpftrace::AttachedProbe>::__single_object std::make_unique<bpftrace::AttachedProbe, bpftrace::Probe&, std::tuple<unsigned char*, unsigned long> const&>(bpftrace::Probe&, std::tuple<unsigned char*, unsigned long> const&) ()
#7  0x00005555556ee249 in bpftrace::BPFtrace::attach_probe(bpftrace::Probe&, bpftrace::BpfOrc const&) ()
#8  0x00005555556ee699 in bpftrace::BPFtrace::run(std::unique_ptr<bpftrace::BpfOrc, std::default_delete<bpftrace::BpfOrc> >)
    ()
#9  0x000055555571f212 in main ()

(gdb) disass
Dump of assembler code for function __strlen_avx2:
   0x00007fffee5b6590 <+0>:	mov    ecx,edi
   0x00007fffee5b6592 <+2>:	mov    rdx,rdi
   0x00007fffee5b6595 <+5>:	vpxor  xmm0,xmm0,xmm0
   0x00007fffee5b6599 <+9>:	and    ecx,0x3f
   0x00007fffee5b659c <+12>:	cmp    ecx,0x20
   0x00007fffee5b659f <+15>:	ja     0x7fffee5b65c0 <__strlen_avx2+48>
=> 0x00007fffee5b65a1 <+17>:	vpcmpeqb ymm1,ymm0,YMMWORD PTR [rdi]

(gdb) info registers 
rax            0x1	1
rbx            0x7ffffffe4540	140737488241984
rcx            0x5	5
rdx            0x5	5
rsi            0x7ffff6fd62ba	140737337189050
rdi            0x5	5
rbp            0x0	0x0
rsp            0x7ffffffe44d8	0x7ffffffe44d8
r8             0x7ffff7fec630	140737354057264
r9             0x40f12	266002
r10            0x12	18
r11            0x7fffee210ee0	140737188531936
r12            0x5	5
r13            0x1	1
r14            0x0	0
r15            0x0	0
rip            0x7fffee5b65a1	0x7fffee5b65a1 <__strlen_avx2+17>
eflags         0x10283	[ CF SF IF RF ]
cs             0x33	51
ss             0x2b	43
ds             0x0	0
es             0x0	0
fs             0x0	0
gs             0x0	0

It seems that strlen gets called with a pointer to address "5". For further context (in case there are multiple strlens in the calling function), here's the caller:

   0x00007ffff4a6f026 <+86>:	mov    r12,QWORD PTR [rbx]
   0x00007ffff4a6f029 <+89>:	call   0x7ffff48a6b00 <elf_version@plt>
   0x00007ffff4a6f02e <+94>:	test   eax,eax
   0x00007ffff4a6f030 <+96>:	je     0x7ffff4a6f128 <__bpf_object__open_xattr+344>
   0x00007ffff4a6f036 <+102>:	mov    rdi,r12
   0x00007ffff4a6f039 <+105>:	call   0x7ffff489d620 <strlen@plt>
=> 0x00007ffff4a6f03e <+110>:	lea    rsi,[rax+0x131]

As I said, those are the notes I made before rebuilding and before not being able to reproduce the issue anymore. Feel free to ask questions, but I won't be able to provide more details about the stack itself.

brendangregg (Contributor) commented

I've run into something similar when mixing iovisor's packaged libbcc with Debian/Canonical's packaged bpftrace (which is built against Debian/Canonical's packaged libbpfcc, not the iovisor version). The fix was either to use Debian/Canonical's libbpfcc library instead, or to build bpftrace locally against the iovisor libbcc. Can you say where you got both bpftrace and libbcc/libbpfcc?

jvnn (Author) commented Apr 3, 2019

I built both from source: bcc from tag v0.8.0 (because of build problems on master) and bpftrace from the latest code (roughly two weeks ago, I guess). However, I was experimenting with a Docker container at the time, which had its own builds of both packages, made at a different time. The container also uses bcc v0.8.0, but with whatever state of bpftrace was current when it was built. The crash happened only with bpftrace on my host, with a version that had worked correctly before the container tests, so perhaps the different binary in the container caused some changes in the (shared) debug filesystem? I'm not familiar with debugfs, so I have no clue whether something like that can happen.

jvnn (Author) commented Apr 3, 2019

I just tried again on the host, and I can reproduce the crash. This time both the container and my host version of bpftrace are built with tag v0.9 (bcc still at v0.8.0). I'll leave it like this for now, please let me know if you'd like to get some more info from the current crashing release binary. Otherwise I can build a debug version and try to provoke the crash again somehow.

mmarchini added the bug label Apr 15, 2019
mmarchini added this to the 1.0 milestone Apr 15, 2019

fbs (Contributor) commented Jan 18, 2020

We now terminate when a map cannot be created, which should prevent this segfault:

Error creating map: '@': Operation not permitted
Error creating printf map: Operation not permitted
Creation of the required BPF maps has failed.
Make sure you have all the required permissions and are not confined (e.g. like
snapcraft does). `dmesg` will likely have useful output for further troubleshooting
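
The fail-fast behaviour described above can be sketched roughly as follows. This is an illustration under assumptions, not the actual bpftrace change: `MapSpec` and `create_required_maps` are hypothetical names, and each `create` callback stands in for a real map-creation call that returns an fd or a negative errno value.

```cpp
#include <cstring>
#include <functional>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical description of one required map. `create` returns an fd,
// or a negative errno value on failure.
struct MapSpec {
  std::string name;
  std::function<int()> create;
};

// Tries to create every map and reports each failure. Returns true only
// if all maps were created, so the caller can stop before attaching probes.
bool create_required_maps(const std::vector<MapSpec> &specs,
                          std::vector<int> &fds) {
  bool ok = true;
  for (const auto &spec : specs) {
    int fd = spec.create();
    if (fd < 0) {
      std::cerr << "Error creating map: '" << spec.name
                << "': " << std::strerror(-fd) << "\n";
      ok = false;
      continue;
    }
    fds.push_back(fd);
  }
  return ok;
}
```

A caller would check the return value and exit with a non-zero status before any probe is attached, so no later code path ever sees an fd of -1.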
