
Falco pods crashing with "free(): corrupted unsorted chunks" #1656

Closed
ecology-chris opened this issue May 18, 2021 · 12 comments
@ecology-chris

Pods are scheduled, start up, then suddenly crash with the error: free(): corrupted unsorted chunks

This is currently happening in only one cluster.

Expected behaviour

We have the same Falco chart installed on a similar cluster with no issues.

Environment

  • Falco version: 0.27.0 (helm chart 1.7.10)
  • Cloud provider or hardware configuration: EKS
  • OS/Kernel: falco_amazonlinux2_4.14.219-161.340.amzn2.x86_64_1.ko
  • Installation method: helm

This seems to be a problem with this one particular cluster, but the error isn't giving us much to work with. Any advice or guidance is appreciated.
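Since the pods die almost immediately, the most useful lines are often in the previous container instance rather than the current one. A minimal sketch (the pod name and label selector are placeholders to adjust for your chart; `--previous` is a standard kubectl flag):

```shell
# Hedged sketch: fetch logs from the crashed (previous) Falco container.
# "falco-xxxxx" and the "app=falco" selector are placeholders; list pods first.
if command -v kubectl >/dev/null 2>&1; then
  kubectl get pods -l app=falco || true        # find the crashing pod's name
  kubectl logs falco-xxxxx --previous || true  # logs from the instance that crashed
  have_kubectl=yes
else
  echo "kubectl not available on this host"
  have_kubectl=no
fi
```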

@leogr
Member

leogr commented May 18, 2021

Have you tried 0.28.1?

Could you also provide the relevant part of the log, please?

@ecology-chris
Author

I just installed 0.28.1 with helm. I see the driver load, the web server come up, and a few findings on different containers, then an error like this before the pods enter a crash backoff loop and try to restart.

02:04:13.350191969: Debug Falco internal: syscall event drop. 66960 system calls dropped in last second. (ebpf_enabled=0 n_drops=66960 n_drops_buffer=66960 n_drops_bug=0 n_drops_pf=0 n_evts=281105) free(): corrupted unsorted chunks
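As a side note, the Debug line carries raw counters (`n_drops`, `n_evts`), so the drop ratio can be pulled out with standard text tools. A sketch using the exact message above:

```shell
# Parse the n_drops / n_evts counters out of a Falco "syscall event drop" line
# and compute the percentage of syscalls dropped in that second.
line='02:04:13.350191969: Debug Falco internal: syscall event drop. 66960 system calls dropped in last second. (ebpf_enabled=0 n_drops=66960 n_drops_buffer=66960 n_drops_bug=0 n_drops_pf=0 n_evts=281105)'
n_drops=$(echo "$line" | grep -o 'n_drops=[0-9]*' | cut -d= -f2)
n_evts=$(echo "$line" | grep -o 'n_evts=[0-9]*' | cut -d= -f2)
ratio=$(LC_ALL=C awk -v d="$n_drops" -v e="$n_evts" 'BEGIN { printf "%.1f", 100 * d / e }')
echo "dropped ${ratio}% of syscalls in the last second"   # -> dropped 23.8% ...
```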

@poiana

poiana commented Aug 17, 2021

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

@leogr
Member

leogr commented Aug 18, 2021

Hey @ecology-chris

Sorry for the late reply. Anyway, I was not able to reproduce this issue. Could you provide more details or a reproducible setup?

@poiana

poiana commented Sep 17, 2021

Stale issues rot after 30d of inactivity.

Mark the issue as fresh with /remove-lifecycle rotten.

Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle rotten

@jcdecaux-oss

Same here, but only when k8s audit logs are enabled.

Environment

Falco version: 0.29.1 (helm chart 1.15.7)
Cloud provider or hardware configuration: EKS v1.18.9-eks-d1db3c
OS/Kernel: falco_amazonlinux2_4.14.238-182.421.amzn2.x86_64_1.ko
Installation method: helm

$ k logs falco-4t6ln

  • Setting up /usr/src links from host
  • Running falco-driver-loader for: falco version=0.29.1, driver version=17f5df52a7d9ed6bb12d3b1768460def8439936d
  • Running falco-driver-loader with: driver=module, compile=yes, download=yes
  • Unloading falco module, if present
  • Trying to load a system falco module, if present
  • Looking for a falco module locally (kernel 4.14.238-182.421.amzn2.x86_64)
  • Trying to download a prebuilt falco module from https://download.falco.org/driver/17f5df52a7d9ed6bb12d3b1768460def8439936d/falco_amazonlinux2_4.14.238-182.421.amzn2.x86_64_1.ko
  • Download succeeded
  • Success: falco module found and inserted
    Mon Sep 20 15:55:06 2021: Falco version 0.29.1 (driver version 17f5df52a7d9ed6bb12d3b1768460def8439936d)
    Mon Sep 20 15:55:06 2021: Falco initialized with configuration file /etc/falco/falco.yaml
    Mon Sep 20 15:55:06 2021: Loading rules from file /etc/falco/falco_rules.yaml:
    Mon Sep 20 15:55:06 2021: Loading rules from file /etc/falco/falco_rules.local.yaml:
    Mon Sep 20 15:55:06 2021: Loading rules from file /etc/falco/k8s_audit_rules.yaml:
    Mon Sep 20 15:55:06 2021: Loading rules from file /etc/falco/rules.d/custom-lists.yaml:
    Mon Sep 20 15:55:06 2021: Loading rules from file /etc/falco/rules.d/custom-macros.yaml:
    Mon Sep 20 15:55:06 2021: Loading rules from file /etc/falco/rules.d/custom-rules.yaml:
    Mon Sep 20 15:55:07 2021: Starting internal webserver, listening on port 8765
    free(): corrupted unsorted chunks

/var/log/messages

Sep 20 15:54:15 ip-10-235-221-131 kernel: traps: falco[28701] general protection ip:7f859b294611 sp:7ffd0db3d650 error:0 in libc-2.28.so[7f859b294000+148000]
Sep 20 15:54:15 ip-10-235-221-131 kernel: falco: deallocating consumer ffff888aff4f0000
Sep 20 15:54:15 ip-10-235-221-131 kernel: falco: no more consumers, stopping capture
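The trap line pairs the faulting instruction pointer with libc's load base (the bracketed range), so the offset inside libc can be computed and later symbolized. A sketch using the exact values above (the addr2line path in the comment is an assumption about the node's layout):

```shell
# From: "general protection ip:7f859b294611 ... in libc-2.28.so[7f859b294000+148000]"
ip=0x7f859b294611      # faulting instruction pointer
base=0x7f859b294000    # libc load base, start of the bracketed range
offset=$(printf '0x%x' $(( ip - base )))
echo "faulting offset inside libc-2.28.so: $offset"   # -> 0x611
# On the node (path is an assumption), the offset can be symbolized with e.g.:
#   addr2line -e /lib64/libc-2.28.so "$offset"
```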

@FedeDP
Contributor

FedeDP commented Sep 22, 2021

Hi @jcdecaux-oss !
Are you able to share a coredump? (if your node runs systemd, coredumpctl can help!)
Thank you very much for your efforts :)
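On a systemd node, the commands for the suggestion above might look like the following sketch (the `/tmp/falco.core` output path is a placeholder; `coredumpctl` must be present on the host):

```shell
# Hedged sketch: locate and export a Falco coredump via systemd-coredump.
if command -v coredumpctl >/dev/null 2>&1; then
  coredumpctl list falco || true                     # recent falco crashes
  coredumpctl info falco || true                     # signal, PID, stack summary
  coredumpctl dump falco -o /tmp/falco.core || true  # export for sharing or gdb
  have_coredumpctl=yes
else
  echo "coredumpctl not available on this host"
  have_coredumpctl=no
fi
```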

@poiana

poiana commented Oct 22, 2021

Rotten issues close after 30d of inactivity.

Reopen the issue with /reopen.

Mark the issue as fresh with /remove-lifecycle rotten.

Provide feedback via https://github.com/falcosecurity/community.
/close

@poiana poiana closed this as completed Oct 22, 2021
@poiana

poiana commented Oct 22, 2021

@poiana: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.

Reopen the issue with /reopen.

Mark the issue as fresh with /remove-lifecycle rotten.

Provide feedback via https://github.com/falcosecurity/community.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@lzaldivarkt

/reopen
Hi @FedeDP, I'm seeing the same issue on a high activity cluster, I managed to get a coredump, you can find it here:
https://motive-shared-public-files.s3.amazonaws.com/falco.coredump.gz
This doesn't happen on any other cluster with the same configuration, and we can consistently reproduce it; it's currently happening on half of this cluster's nodes.

Specs:

Version: Falco 0.30 on Kubernetes 1.20 (deployed via kOps)
OS: Debian GNU/Linux 9.13 (stretch)
Kernel: Linux ip-10-0-97-98 4.9.0-14-amd64 #1 SMP Debian 4.9.246-2 (2020-12-17) x86_64 GNU/Linux
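To get a backtrace out of a dump like the one shared above, gdb can be pointed at the matching falco binary. A sketch where the file paths are placeholders, and where the binary and glibc must match the node that produced the dump:

```shell
# Hedged sketch: batch-mode backtrace from a coredump. Paths are placeholders.
core=falco.coredump
bin=/usr/bin/falco
if [ -f "$core" ] && [ -f "$bin" ] && command -v gdb >/dev/null 2>&1; then
  gdb -batch -ex 'thread apply all bt' "$bin" "$core"
  analyzed=yes
else
  echo "coredump, falco binary, or gdb missing; skipping"
  analyzed=no
fi
```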

@poiana

poiana commented Jul 23, 2022

@lzaldivarkt: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/reopen
Hi @FedeDP, I'm seeing the same issue on a high activity cluster, I managed to get a coredump, you can find it here:
https://motive-shared-public-files.s3.amazonaws.com/falco.coredump.gz
This doesn't happen on any other cluster with the same configuration, and we can consistently reproduce it; it's currently happening on half of this cluster's nodes.

Specs:

Version: Falco 0.30 on Kubernetes 1.20 (deployed via kOps)
OS: Debian GNU/Linux 9.13 (stretch)
Kernel: Linux ip-10-0-97-98 4.9.0-14-amd64 #1 SMP Debian 4.9.246-2 (2020-12-17) x86_64 GNU/Linux

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@lzaldivarkt

Welp, I can't reopen issues, I'll open a new one.

6 participants