Falco won't start on 0.36 - /sys/devices/system/cpu/cpu8/online: No such file or directory #2843
Comments
Hi! Thanks for reporting this issue. |
Thanks for picking it up @FedeDP |
Can you share the number of CPUs on your node? By running |
Also, would you mind sharing the full Falco output? Thank you! |
That error came from a node with 8 cores (vcores). Log looks like this:
|
Thanks! Can you also share output of I mean, if the node has 8 cores, we should go from 0 to 7; moreover, the code has not been touched between Falco 0.35 and 0.36. |
It's a virtual machine with CPU hotplug enabled, could that have something to do with the possible value being at 127? |
Yep, that is a really weird value (i mean |
Nothing I'm aware of (homelab, it's all under my control), Falco is just kept up to date via Flux and as soon as the update was installed yesterday it went into the crash loop. |
That would be great, thank you! If, in the same env, Falco 0.35 worked fine, we got a real bug :) |
Btw here is the blame: https://github.com/falcosecurity/libs/blame/762c23b98bd5bcdc5d680939d5b44cc2d92fb850/userspace/libscap/engine/bpf/scap_bpf.c#L1543 As you can see, the code hasn't been touched for months now. |
Rolled back to chart version 3.6.2 and everything is running happily... |
Very interesting, since the code is the same in both versions! |
Also, are you still using eBPF now? |
Okay, my previous output was from the node, not the container... Possible is the same in the working deployment (inside the container). And yes, still eBPF. Modern eBPF also fails to start, but with a different error.
I tried using the legacy driver loader and it's still not happy. |
I don't understand, so the real issue is:
|
Yes, when using modern eBPF When using old eBPF, the error is
|
So, 0.35.1 was working fine, while 0.36.0 is broken.
outputs on the node where Falco is running? Again, that code has not been touched in this release cycle: https://github.com/falcosecurity/libs/blame/master/userspace/libscap/engine/bpf/scap_bpf.c#L1945 |
I think so; it seems to be a way for the VM to increase the number of online CPUs (i.e. CPUs made available to the VM) without a reboot. I think Falco cannot handle this situation correctly at the moment. |
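To make the possible-versus-online distinction concrete, here is a minimal Python sketch that parses the kernel's CPU list format (for example `0-127`) and reports which possible CPUs are not currently online. It is only an illustration and not the libscap code; the function names `parse_cpu_list` and `offline_cpus` are hypothetical, and the online range `0-7` is an assumption based on the 8-core node mentioned in this thread.

```python
def parse_cpu_list(text):
    """Parse a kernel CPU list such as "0-3,8,10-11" into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # tolerate empty input or a stray comma
        if "-" in part:
            start, end = part.split("-", 1)
            cpus.update(range(int(start), int(end) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def offline_cpus(possible_text, online_text):
    """Return CPU ids that are listed as possible but are not online."""
    online = set(parse_cpu_list(online_text))
    return [cpu for cpu in parse_cpu_list(possible_text) if cpu not in online]


if __name__ == "__main__":
    # On a real node these strings would be read from
    # /sys/devices/system/cpu/possible and /sys/devices/system/cpu/online.
    # "0-7" is an assumed value for an 8-core node.
    missing = offline_cpus("0-127", "0-7")
    print(len(missing), missing[0])
```

Under these assumed values the first possible-but-not-online CPU is 8, which is consistent with the `cpu8` path in the reported error. Code that iterates over every possible CPU would need to skip such entries.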
Here it is
It is not starting on any node, and they've not had any core count changed lately. |
The 2
Before, you said that
returned 0-127; i'd expect |
🤯 At least if it didn't work in the previous version I'd suspect something else, kernel upgrade for example. Let me know what else I could try. I'm just rolling back to the previous version for now |
Exactly, that's so weird. (The fact is, I know what needs to be fixed, but until we actually understand why it was working on 0.35, I won't push any PR!) |
The patch is ready: falcosecurity/libs#1373 |
It's the same - cat /sys/devices/system/cpu/possible is 0-127 🤯 |
Got a build/image I can try? |
/milestone 0.36.1 |
@tks98 @mateuszdrab Falco 0.36.1-rc1 is out if you want to give it a try! Let us know if it solves your issue |
Hey @Andreagit97, thank you for letting me know. Using the latest chart with a substituted image tag, I get the below when starting:
falco Mon Oct 16 08:28:10 2023: Falco version: 0.36.1-rc1 (x86_64) |
That seems strange, since we didn't touch the drivers at all. I will take a look 👀 Thank you for reporting! |
@mateuszdrab it seems like Falco is trying to use an old driver version. Could you try a fresh installation with the following changes?
diff --git a/falco/values.yaml b/falco/values.yaml
index bbf6a5f..5e5980d 100644
--- a/falco/values.yaml
+++ b/falco/values.yaml
@@ -12,7 +12,7 @@ image:
# -- The image repository to pull from
repository: falcosecurity/falco-no-driver
# -- The image tag to pull. Overrides the image tag whose default is the chart appVersion.
- tag: ""
+ tag: "0.36.1-rc1"
# -- Secrets containing credentials when pulling from private/secure registries.
imagePullSecrets: []
@@ -179,7 +179,7 @@ driver:
# Always set it to false when using Falco with plugins.
enabled: true
# -- Tell Falco which driver to use. Available options: module (kernel driver), ebpf (eBPF probe), modern-bpf (modern eBPF probe).
- kind: module
+ kind: ebpf
# -- Configuration section for ebpf driver.
ebpf:
# -- Path where the eBPF probe is located. It comes handy when the probe have been installed in the nodes using tools other than the init
@@ -215,7 +215,7 @@ driver:
# -- The image repository to pull from.
repository: falcosecurity/falco-driver-loader
# -- Overrides the image tag whose default is the chart appVersion.
- tag: ""
+ tag: "0.36.1-rc1"
# -- Extra environment variables that will be pass onto Falco driver loader init container.
env: []
# -- Arguments to pass to the Falco driver loader init container.
If I remember correctly, you were using Falco 0.35.1. Is it possible that you are running Falco 0.36.1 with the eBPF probe from Falco 0.35.1? |
Wohoo, it's running... forgot to override the driver installer image
|
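Since the fix here was overriding the driver loader image tag as well as the Falco image tag, a small check like the following can catch the mismatch before deploying. This is a hedged sketch, not part of the chart: the key path `driver.loader.initContainer.image.tag` is an assumption about the chart layout, the helper names are hypothetical, and the `app_version` value in the example is made up. The fallback to the chart appVersion follows the comments in the values file quoted above.

```python
def effective_tag(tag, app_version):
    """An empty tag falls back to the chart appVersion, per the values comments."""
    return tag if tag else app_version


def image_tags(values, app_version):
    """Return (falco_tag, loader_tag) after applying the appVersion default."""
    falco = values.get("image", {}).get("tag", "")
    # Assumed key path for the driver loader image; adjust to your chart version.
    loader = (
        values.get("driver", {})
        .get("loader", {})
        .get("initContainer", {})
        .get("image", {})
        .get("tag", "")
    )
    return effective_tag(falco, app_version), effective_tag(loader, app_version)


def tags_match(values, app_version):
    """True when Falco and the driver loader resolve to the same image tag."""
    falco, loader = image_tags(values, app_version)
    return falco == loader


if __name__ == "__main__":
    # Hypothetical override: only the Falco image tag is set.
    values = {"image": {"tag": "0.36.1-rc1"}}
    print(tags_match(values, app_version="0.36.0"))
```

With only the Falco image tag overridden, the loader falls back to the chart default and the two tags differ, which is the situation described in the comment above.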
Thank you very much for testing it! |
You're welcome, thanks for fixing the issue so quickly |
Falco 0.36.1 is out! It should solve the issue. I will close it, feel free to reopen it if you face other issues |
Awesome, I'll remove the image override from the Flux HelmRelease when I get home and report back if there are any issues. |
Hi, I am facing the same issue. I installed Falco through Helm. |
Hey @ShaikhMJAM, which Falco version are you using? |
I am using version 0.36.2. |
uhm interesting, could you provide the full Falco logs with the error? |
$ kubectl logs ds/falco -n falco |
Hmm, this does not seem related to this issue; it looks related to #2792 instead. |
$ uname -r |
Yes, apart from the patch version, it is the same kernel version as in the other issue... The best solution here is not to use the modern eBPF probe at all, since we don't know which helpers have been backported. The suggestion is to switch to a recent kernel version ASAP or, if possible, to use one of the other 2 drivers (legacy_bpf, kernel module). |
OK, can you help me change the driver? How do we change it? |
If you are using the official Falco Helm chart to deploy Falco, you need to update the driver settings:
# Driver settings (scenario requirement)
driver:
  # -- Set it to false if you want to deploy Falco without the drivers.
  # Always set it to false when using Falco with plugins.
  enabled: true
  # -- Tell Falco which driver to use. Available options: module (kernel driver), ebpf (eBPF probe), modern-bpf (modern eBPF probe).
  kind: ebpf
In this example I've used the legacy eBPF probe. |
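Because the chart only lists a few driver kinds, it can help to validate the value before passing it to Helm. The sketch below is hypothetical and not part of the chart: the allowed names are copied from the comment in the snippet above and may differ in other chart versions, and `check_driver_kind` is an illustrative helper name.

```python
# Options copied from the chart comment quoted above; other chart versions may differ.
ALLOWED_DRIVER_KINDS = ("module", "ebpf", "modern-bpf")


def check_driver_kind(kind):
    """Return kind unchanged if it is a listed option, otherwise raise ValueError."""
    if kind not in ALLOWED_DRIVER_KINDS:
        options = ", ".join(ALLOWED_DRIVER_KINDS)
        raise ValueError(f"unknown driver kind {kind!r}; expected one of: {options}")
    return kind


if __name__ == "__main__":
    print(check_driver_kind("ebpf"))
    try:
        # Not one of the options listed in the values comment.
        check_driver_kind("legacy_bpf")
    except ValueError as err:
        print(err)
```

A value such as `legacy_bpf` is rejected here because it is not among the options listed in the values comment, even though the driver is informally called the legacy eBPF probe.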
Hey, I have reinstalled Falco using this command: "helm install falco falcosecurity/falco --set driver.kind=legacy_bpf", but it is still not working |
the right command is |
No, it's not working |
could you log the error please? |
[root@kubernet-master-09 ~]# kubectl logs ds/falco -n falco |
Hmm, another verifier error... what about the kernel module? |
No, it's not working, but now it gives a different error |
Describe the bug
Since upgrading to 0.36 today - Falco won't start with the following:
/sys/devices/system/cpu/cpu8/online: No such file or directory
How to reproduce it
Upgrade to release 0.36 from 0.35
Expected behaviour
Falco starts
Screenshots
N/A
Environment
0.36
5.4.0-163-generic #180-Ubuntu SMP Tue Sep 5 13:21:23 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Additional context