Falco 0.36 and later segfaults on startup when metrics are enabled #2850

Closed
stevenbrz opened this issue Oct 3, 2023 · 10 comments · Fixed by #2851

@stevenbrz

Describe the bug

Falco fails to start with a segfault when the metrics option is enabled in the config. Disabling the individual metrics sub-options makes no difference.

How to reproduce it

  • Install Falco 0.36.0 or later
  • Enable metrics in the config and run Falco

Example output:

# falco --modern-bpf -o log_level=debug -o log_stderr=true -o libs_logger.enabled=true -o libs_logger.severity=debug
Tue Oct  3 14:32:28 2023: Falco version: 0.37.0-55+29d2406 (x86_64)
Tue Oct  3 14:32:28 2023: CLI args: falco --modern-bpf -o log_level=debug -o log_stderr=true -o libs_logger.enabled=true -o libs_logger.severity=debug
Tue Oct  3 14:32:28 2023: Falco initialized with configuration file: /etc/falco/falco.yaml
Tue Oct  3 14:32:28 2023: Configured rules filenames:
Tue Oct  3 14:32:28 2023:    /etc/falco/falco_rules.yaml
Tue Oct  3 14:32:28 2023:    /etc/falco/falco_rules.local.yaml
Tue Oct  3 14:32:28 2023:    /etc/falco/rules.d
Tue Oct  3 14:32:28 2023: Loading rules from file /etc/falco/falco_rules.yaml
Tue Oct  3 14:32:28 2023: Loading rules from file /etc/falco/falco_rules.local.yaml
Tue Oct  3 14:32:28 2023: Watching file '/etc/falco/falco.yaml'
Tue Oct  3 14:32:28 2023: Watching file '/etc/falco/falco_rules.yaml'
Tue Oct  3 14:32:28 2023: Watching file '/etc/falco/falco_rules.local.yaml'
Tue Oct  3 14:32:28 2023: Watching directory '/etc/falco/rules.d'
Tue Oct  3 14:32:28 2023: Setting metadata download max size to 100 MB
Tue Oct  3 14:32:28 2023: Setting metadata download chunk wait time to 1000 μs
Tue Oct  3 14:32:28 2023: Setting metadata download watch frequency to 1 seconds
Tue Oct  3 14:32:28 2023: (19) syscalls in rules: connect, dup, dup2, dup3, execve, execveat, finit_module, init_module, link, linkat, open, openat, openat2, ptrace, sendmsg, sendto, socket, symlink, symlinkat
Tue Oct  3 14:32:28 2023: +(49) syscalls (Falco's state engine set of syscalls): accept, accept4, bind, capset, chdir, chroot, clone, clone3, close, creat, epoll_create, epoll_create1, eventfd, eventfd2, fchdir, fcntl, fork, getsockopt, inotify_init, inotify_init1, io_uring_setup, memfd_create, mount, open_by_handle_at, pidfd_getfd, pidfd_open, pipe, pipe2, prctl, prlimit, procexit, recvfrom, recvmsg, setgid, setpgid, setresgid, setresuid, setrlimit, setsid, setuid, shutdown, signalfd, signalfd4, socketpair, timerfd_create, umount, umount2, userfaultfd, vfork
Tue Oct  3 14:32:28 2023: (68) syscalls selected in total (final set): accept, accept4, bind, capset, chdir, chroot, clone, clone3, close, connect, creat, dup, dup2, dup3, epoll_create, epoll_create1, eventfd, eventfd2, execve, execveat, fchdir, fcntl, finit_module, fork, getsockopt, init_module, inotify_init, inotify_init1, io_uring_setup, link, linkat, memfd_create, mount, open, open_by_handle_at, openat, openat2, pidfd_getfd, pidfd_open, pipe, pipe2, prctl, prlimit, procexit, ptrace, recvfrom, recvmsg, sendmsg, sendto, setgid, setpgid, setresgid, setresuid, setrlimit, setsid, setuid, shutdown, signalfd, signalfd4, socket, socketpair, symlink, symlinkat, timerfd_create, umount, umount2, userfaultfd, vfork
Tue Oct  3 14:32:28 2023: The chosen syscall buffer dimension is: 8388608 bytes (8 MBs)
Tue Oct  3 14:32:28 2023: Setting metrics interval to 1h, equivalent to 3600000 (ms)
Segmentation fault (core dumped)

Expected behaviour

Falco should start without crashing. The crash first appeared after upgrading from 0.35.1 to 0.36.0, and it also occurs on the most recent dev build of 0.37.

Environment

  • Falco version: 0.36.0 through 0.37.0-55+29d2406 (most recent dev build)
  • System info: Architecture: x86_64
  • Cloud provider or hardware configuration: AWS
  • OS: CentOS Stream
  • Kernel: 5.15.86
  • Installation method: RPM

Additional context

Core dump stacktrace:

Message: Process 3088612 (falco) of user 0 dumped core.

                Stack trace of thread 3088612:
                #0  0x00007fb5e334f5c2 timer_delete@@GLIBC_2.3.3 (librt.so.1)
                #1  0x000000000057eff6 n/a (falco)
                #2  0x0000000000530f71 n/a (falco)
                #3  0x00000000004f1402 n/a (falco)
                #4  0x00000000004efe4a n/a (falco)
                #5  0x00000000004f1397 n/a (falco)
                #6  0x00000000004aa8a6 n/a (falco)
                #7  0x00007fb5e2289d85 __libc_start_main (libc.so.6)
                #8  0x00000000004bb555 n/a (falco)

                Stack trace of thread 3088616:
                #0  0x00007fb5e22889bd syscall (libc.so.6)
                #1  0x000000000106d395 n/a (falco)
                #2  0x00000000005778e4 n/a (falco)
                #3  0x0000000000570eff n/a (falco)
                #4  0x00000000012c8910 n/a (falco)
                #5  0x00007fb5e355b1ca start_thread (libpthread.so.0)
                #6  0x00007fb5e2288e73 __clone (libc.so.6)

                Stack trace of thread 3088617:
                #0  0x00007fb5e3565180 __nanosleep (libpthread.so.0)
                #1  0x0000000000575cc3 n/a (falco)
                #2  0x00000000012c8910 n/a (falco)
                #3  0x00007fb5e355b1ca start_thread (libpthread.so.0)
                #4  0x00007fb5e2288e73 __clone (libc.so.6)

                Stack trace of thread 3088618:
                #0  0x00007fb5e237567f __select (libc.so.6)
                #1  0x00000000005cf956 n/a (falco)
                #2  0x00000000012c8910 n/a (falco)
                #3  0x00007fb5e355b1ca start_thread (libpthread.so.0)
                #4  0x00007fb5e2288e73 __clone (libc.so.6)

                Stack trace of thread 3088619:
                #0  0x00007fb5e22889bd syscall (libc.so.6)
                #1  0x000000000106d395 n/a (falco)
                #2  0x0000000000584ad7 n/a (falco)
                #3  0x00000000012c8910 n/a (falco)
                #4  0x00007fb5e355b1ca start_thread (libpthread.so.0)
                #5  0x00007fb5e2288e73 __clone (libc.so.6)

Please let me know if you need any more information.

@FedeDP
Contributor

FedeDP commented Oct 3, 2023

Hi! Thanks for opening this issue and for digging into it in depth. We will look into this ASAP!

Unfortunately, I cannot reproduce it locally on my machine (using the Falco 0.36.0 tar.gz). I think it might depend on the glibc version (#0 0x00007fb5e334f5c2 timer_delete@@GLIBC_2.3.3 (librt.so.1)). Let's see if we can come up with some ideas!

@FedeDP
Contributor

FedeDP commented Oct 3, 2023

Would you be open to testing a small patch if I send you the diff?

@stevenbrz
Author

> Would you be open to testing a small patch if I send you the diff?

Sure thing!

@FedeDP
Contributor

FedeDP commented Oct 3, 2023

I think you are being hit by: https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/1940296
Here is a patch to test:
patch.txt
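
For readers following along: the contents of patch.txt are not reproduced in this thread, so the following is not that patch. It is a minimal C sketch of one defensive pattern consistent with the stack trace above, where the main thread crashes inside timer_delete: remember whether timer_create() succeeded and call timer_delete() only in that case. The names metrics_timer, metrics_timer_arm, and metrics_timer_disarm are hypothetical and do not come from the Falco sources.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <string.h>
#include <time.h>

/* Hypothetical wrapper (not Falco code): tracks whether timer_create()
 * succeeded, so timer_delete() is never called on an uninitialized timer_t. */
struct metrics_timer {
    timer_t id;
    bool created;
};

/* Delete the timer only if it was actually created; otherwise a no-op. */
static void metrics_timer_disarm(struct metrics_timer *t)
{
    assert(t != NULL);
    if (t->created) {
        timer_delete(t->id);
        t->created = false;
    }
}

/* (Re)arm a periodic SIGALRM timer. Returns 0 on success, -1 on failure. */
static int metrics_timer_arm(struct metrics_timer *t, long interval_ms)
{
    struct sigevent sev;
    struct itimerspec spec;

    assert(t != NULL);

    /* Drop a previously created timer, if any, before creating a new one. */
    metrics_timer_disarm(t);

    memset(&sev, 0, sizeof(sev));
    sev.sigev_notify = SIGEV_SIGNAL;
    sev.sigev_signo = SIGALRM;
    if (timer_create(CLOCK_MONOTONIC, &sev, &t->id) == -1)
        return -1;
    t->created = true;

    /* First expiration and period are both set to interval_ms. */
    memset(&spec, 0, sizeof(spec));
    spec.it_value.tv_sec = interval_ms / 1000;
    spec.it_value.tv_nsec = (interval_ms % 1000) * 1000000L;
    spec.it_interval = spec.it_value;
    if (timer_settime(t->id, 0, &spec, NULL) == -1) {
        metrics_timer_disarm(t);
        return -1;
    }
    return 0;
}
```

A caller would zero-initialize the struct, call metrics_timer_arm(&t, 3600000) for the 1h interval shown in the log, and call metrics_timer_disarm(&t) on shutdown. On glibc versions where the timer functions live in librt, as in the trace above, linking may need -lrt.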

@stevenbrz
Author

This patch seemed to do the trick, thank you for such a quick response!

@FedeDP
Contributor

FedeDP commented Oct 3, 2023

Yay! I'll open the PR tomorrow then!
Thank you very much for being super responsive in reporting the bug and testing the fix!

@incertum
Contributor

incertum commented Oct 4, 2023

Hi @stevenbrz, sorry, I missed this issue while I was working on the same problem yesterday. We will likely publish a patch release after assessing what else might need an urgent fix.

Thank you!

@leogr
Member

leogr commented Oct 6, 2023

/milestone 0.36.1

poiana modified the milestones: 0.37.0, 0.36.1 Oct 6, 2023

@Andreagit97
Member

@stevenbrz Falco 0.36.1-rc1 is out if you want to give it a try! Let us know if it solves your issue.

@Andreagit97
Member

Falco 0.36.1 is out! It should solve the issue.
