[Legacy eBPF] Falco doesn't work on GKE cluster(v1.24) #2874

Closed
Tracked by #2873
Andreagit97 opened this issue Oct 16, 2023 · 7 comments

@Andreagit97
Member

@EdikAndriasyan reposting your initial question #2869 (comment)

Hey, I am deploying Falco in a GKE cluster (v1.24) with the Helm chart (3.7.1), using the ebpf driver and deploying Falco as a DaemonSet. I am getting this error in the Falco logs:

`-- BEGIN PROG LOAD LOG --
processed 43798 insns (limit 1000000) max_states_per_insn 1 total_states 4061 peak_states 4061 mark_read 1921

-- END PROG LOAD LOG --
Mon Oct 16 09:06:37 2023: An error occurred in an event source, forcing termination...
Mon Oct 16 09:06:37 2023: Closing event source 'syscall'
Events detected: 0
Rule counts by severity:
Triggered rules by rule name:
Error: libscap: bpf_load_program() event=raw_tracepoint/filler/sys_procexit_e: Operation not permitted`

@Andreagit97
Member Author

Could you be more specific about the GKE version?
In the stable channel, I currently see these versions:

- channel: STABLE
  defaultVersion: 1.27.3-gke.100
  validVersions:
  - 1.27.4-gke.900
  - 1.27.3-gke.100
  - 1.26.7-gke.500
  - 1.26.5-gke.2700
  - 1.25.12-gke.500
  - 1.25.10-gke.2700
  - 1.24.16-gke.500
  - 1.24.15-gke.1700
  - 1.24.14-gke.2700

BTW, apart from the version, the issue here is that the legacy probe does not pass the BPF verifier on this kernel. I don't think we can fix this at the moment. If the kernel supports it, a good option could be to use the modern-bpf driver 🤔

If you really need to use the legacy eBPF probe, probably the best solution at the moment is to bump the GKE version. For example, this morning I tried 1.27.3-gke.100 with the legacy probe and Falco 0.36.1, and it worked well.

@Andreagit97 Andreagit97 added this to the TBD milestone Oct 16, 2023
@Andreagit97 Andreagit97 self-assigned this Oct 16, 2023
@EdikAndriasyan

I guess the GKE version is not the reason, because I tried to deploy it in another cluster with GKE version v1.27.5-gke.200 and kernel version 5.15.120 and hit the same issue. But I successfully deployed it in a third cluster with GKE version v1.25.5-gke.1500 and kernel version 5.15.65. Let me test with modern-bpf.

@Andreagit97
Member Author

> I guess the GKE version is not the reason, because I tried to deploy it in another cluster with GKE version v1.27.5-gke.200 and kernel version 5.15.120 and hit the same issue. But I successfully deployed it in a third cluster with GKE version v1.25.5-gke.1500 and kernel version 5.15.65. Let me test with modern-bpf.

Hmm, that's interesting. Yesterday I tried 1.27.3-gke.100 and it worked, so it seems only some versions are affected. Let us know if the modern probe works for you.

@EdikAndriasyan

EdikAndriasyan commented Oct 18, 2023

I tried modern-bpf in a v1.27.5-gke.200 GKE cluster with kernel version 5.15.120+ and got this error: libpman: ring buffer map type is not supported (errno: 1 | message: Operation not permitted). I used these options in the Helm chart values:

containerSecurityContext:
  capabilities:
    add:
    - CAP_SYS_RESOURCE
    - CAP_BPF
    - CAP_PERFMON
    - CAP_SYS_PTRACE
 
driver:
  enabled: true
  kind: modern-bpf

I also tried ebpf in the same environment and got a setrlimit failed: Operation not permitted error with the following values. I saw that there was an open issue for this error. As I understand it, the solution was to use the CAP_SYS_ADMIN capability, depending on the kernel version.

containerSecurityContext:
  capabilities:
    add:
    - CAP_SYS_RESOURCE
    - CAP_SYS_ADMIN
    - CAP_SYS_PTRACE

driver:
  enabled: true
  kind: ebpf
  ebpf:
    hostNetwork: true
    leastPrivileged: false

@Andreagit97
Member Author

> I tried modern-bpf in a v1.27.5-gke.200 GKE cluster with kernel version 5.15.120+ and got this error: libpman: ring buffer map type is not supported (errno: 1 | message: Operation not permitted). I used these options in the Helm chart values.

Hmm, this is strange: I tried the same environment and it works for me.
I created the GKE cluster this way:

gcloud container clusters create falco-test-1 \
      --release-channel=rapid \
      --cluster-version=1.27.5-gke.200 \
      --zone=europe-west1-c \
      --image-type=cos_containerd \
      --machine-type=e2-medium

So that gives GKE v1.27.5-gke.200 with kernel version Linux 5.15.120+.
The Helm chart I used is 3.8.0, which ships Falco 0.36.1.
This is what I changed in the original values.yaml shipped in 3.8.0 to deploy the modern BPF probe:

diff --git a/falco/values.yaml b/falco/values.yaml
index c6ed654..a3e7558 100644
--- a/falco/values.yaml
+++ b/falco/values.yaml
@@ -179,7 +179,7 @@ driver:
   # Always set it to false when using Falco with plugins.
   enabled: true
   # -- Tell Falco which driver to use. Available options: module (kernel driver), ebpf (eBPF probe), modern-bpf (modern eBPF probe).
-  kind: module
+  kind: modern-bpf
   # -- Configuration section for ebpf driver.
   ebpf:
     # -- Path where the eBPF probe is located. It comes handy when the probe have been installed in the nodes using tools other than the init
@@ -194,7 +194,7 @@ driver:
     # On kernel versions >= 5.8 'CAP_PERFMON' and 'CAP_BPF' could replace 'CAP_SYS_ADMIN' but please pay attention to the 'kernel.perf_event_paranoid' value on your system.
     # Usually 'kernel.perf_event_paranoid>2' means that you cannot use 'CAP_PERFMON' and you should fallback to 'CAP_SYS_ADMIN', but the behavior changes across different distros.
     # Read more on that here: https://falco.org/docs/event-sources/kernel/#least-privileged-mode-1
-    leastPrivileged: false
+    leastPrivileged: true
   # -- Configuration section for modern bpf driver.
   modern_bpf:
     # -- Constrain Falco with capabilities instead of running a privileged container.

It also works with leastPrivileged: false, but since you tested with true I did the same thing.
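For readers who prefer an override file to editing the chart's values.yaml, the `kind` change in the diff above can be sketched as a small shell helper. This is only a sketch: the function name `write_falco_values`, the file name `falco-override.yaml`, the release name `falco`, and the `falcosecurity/falco` repo alias are assumptions for illustration, not something taken from this thread. The accepted driver kinds are the ones listed in the chart comment quoted in the diff.

```shell
#!/bin/sh
# Sketch (hypothetical helper): print a minimal values override that selects
# the Falco driver kind. Only the keys visible in the thread's diff are emitted.

# write_falco_values KIND -> prints YAML on stdout, returns 1 on an unknown kind.
write_falco_values() {
    case "$1" in
        # Driver kinds named in the chart's values.yaml comment.
        module|ebpf|modern-bpf) ;;
        *) echo "unknown driver kind: $1" >&2; return 1 ;;
    esac
    cat <<EOF
driver:
  enabled: true
  kind: $1
EOF
}

# Possible usage (assumes the chart repo was added under the alias "falcosecurity"):
#   write_falco_values modern-bpf > falco-override.yaml
#   helm upgrade --install falco falcosecurity/falco --version 3.8.0 -f falco-override.yaml
```

Keeping the override in its own file leaves the chart defaults untouched, so a later chart upgrade only needs the `--version` value changed.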

> I also tried ebpf in the same environment and got a setrlimit failed: Operation not permitted error with the following values. I saw that there was an open issue, #2487, for this error. As I understand it, the solution was to use the CAP_SYS_ADMIN capability, depending on the kernel version.

Yes, exactly: the privileges required to run the legacy BPF probe really depend on the environment you use. You can find more info here: https://falco.org/docs/event-sources/kernel/#least-privileged-mode-1
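The rule of thumb from the chart's values.yaml comments (quoted in the diff in this thread) can be written down as a rough shell check. This is a sketch of that heuristic only, not Falco's actual logic: `suggest_caps` is a hypothetical helper name, and as the chart comment itself warns, behavior differs across distros, so treat the output as a starting point. `CAP_SYS_RESOURCE` and `CAP_SYS_PTRACE`, which appear in both sets of values posted above, are not covered here.

```shell
#!/bin/sh
# Sketch (hypothetical helper): suggest which capabilities to try for the
# least-privileged mode, based on the heuristic in the chart comments:
#   kernel >= 5.8 and kernel.perf_event_paranoid <= 2 -> CAP_BPF + CAP_PERFMON
#   otherwise                                         -> fall back to CAP_SYS_ADMIN

# suggest_caps KERNEL_RELEASE PERF_EVENT_PARANOID -> one capability per line.
suggest_caps() {
    # Extract MAJOR and MINOR from a release string such as "5.15.120+".
    caps_major=${1%%.*}
    caps_rest=${1#*.}
    caps_minor=${caps_rest%%[!0-9]*}

    caps_new_kernel=no
    if [ "$caps_major" -gt 5 ] || { [ "$caps_major" -eq 5 ] && [ "$caps_minor" -ge 8 ]; }; then
        caps_new_kernel=yes
    fi

    if [ "$caps_new_kernel" = yes ] && [ "$2" -le 2 ]; then
        printf '%s\n' CAP_BPF CAP_PERFMON
    else
        printf '%s\n' CAP_SYS_ADMIN
    fi
}

# Possible usage on a node:
#   suggest_caps "$(uname -r)" "$(cat /proc/sys/kernel/perf_event_paranoid)"
```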

@EdikAndriasyan

EdikAndriasyan commented Oct 19, 2023

I upgraded the chart from 3.7.1 to 3.8.0, and with modern-bpf it worked!
Thanks for your time!

@Andreagit97
Member Author

Andreagit97 commented Oct 19, 2023

You are welcome! I will close this issue, feel free to reopen it if necessary!
As a pointer for everyone in the future: when the kernel supports it (usually >= 5.8, but not always), try to use the modern-bpf probe instead of the legacy one.

These are the necessary changes to run it with Helm chart 3.8.0, and therefore Falco 0.36.1:

diff --git a/falco/values.yaml b/falco/values.yaml
index c6ed654..a3e7558 100644
--- a/falco/values.yaml
+++ b/falco/values.yaml
@@ -179,7 +179,7 @@ driver:
   # Always set it to false when using Falco with plugins.
   enabled: true
   # -- Tell Falco which driver to use. Available options: module (kernel driver), ebpf (eBPF probe), modern-bpf (modern eBPF probe).
-  kind: module
+  kind: modern-bpf
   # -- Configuration section for ebpf driver.
   ebpf:
     # -- Path where the eBPF probe is located. It comes handy when the probe have been installed in the nodes using tools other than the init
@@ -194,7 +194,7 @@ driver:
     # On kernel versions >= 5.8 'CAP_PERFMON' and 'CAP_BPF' could replace 'CAP_SYS_ADMIN' but please pay attention to the 'kernel.perf_event_paranoid' value on your system.
     # Usually 'kernel.perf_event_paranoid>2' means that you cannot use 'CAP_PERFMON' and you should fallback to 'CAP_SYS_ADMIN', but the behavior changes across different distros.
     # Read more on that here: https://falco.org/docs/event-sources/kernel/#least-privileged-mode-1
-    leastPrivileged: false
+    leastPrivileged: true
   # -- Configuration section for modern bpf driver.
   modern_bpf:
     # -- Constrain Falco with capabilities instead of running a privileged container.

It's possible that the legacy probe won't work on all kernel versions, but if the modern probe works we are good. The idea is to have at least one of the two working on the widest possible range of kernels.
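As a rough illustration of the "usually >= 5.8" pointer above, a node's kernel release can be mapped to a first driver to try. This is a sketch under that heuristic only: `suggest_driver` and `kernel_at_least` are hypothetical helper names, and since the threshold does not always hold, the result is a first guess rather than a guarantee that the probe will load.

```shell
#!/bin/sh
# Sketch (hypothetical helpers): pick a first driver kind to try from a kernel
# release string, using the ">= 5.8" heuristic mentioned in this thread.

# kernel_at_least RELEASE MIN_MAJOR MIN_MINOR -> exit status 0 if RELEASE >= MIN.
kernel_at_least() {
    # "5.15.120+" -> major=5, minor=15 (suffixes after the minor are ignored).
    kal_major=${1%%.*}
    kal_rest=${1#*.}
    kal_minor=${kal_rest%%[!0-9]*}
    [ "$kal_major" -gt "$2" ] || { [ "$kal_major" -eq "$2" ] && [ "$kal_minor" -ge "$3" ]; }
}

# suggest_driver RELEASE -> prints "modern-bpf" or "ebpf".
suggest_driver() {
    if kernel_at_least "$1" 5 8; then
        echo modern-bpf
    else
        echo ebpf
    fi
}

# Possible usage on a node:
#   suggest_driver "$(uname -r)"
```

If the suggested driver fails to load, falling back to the other one follows the idea stated above: at least one of the two should work on a given kernel.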
