Falco crash at startup when using ebpf driver #1761

Closed
jasiam opened this issue Oct 21, 2021 · 17 comments

@jasiam

jasiam commented Oct 21, 2021

Describe the bug

I'm migrating from the kernel-module driver to the eBPF driver (I compile it and mount it into the Falco pods), but Falco crashes at startup in eBPF mode.

How to reproduce it

Compile the ebpf driver using falco-builder image:

cmake -DBUILD_BPF=ON -DUSE_BUNDLED_DEPS=ON ..

make bpf

Then start a pod with Falco 0.29.1, mounting the generated probe.o file into the pod as /root/.falco/falco_centos_4.18.0-305.12.1.el8_4.x86_64_1.o. (The docs at https://falco.org/docs/getting-started/source/#enable-bpf-support say the build output should be named falco.o, but the file I get is probe.o.)
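
The target filename used here matches the pattern that falco-driver-loader prints in the log below (falco_<distro id>_<kernel release>_<kernel build number>.o). A minimal sketch of deriving that name, assuming the pattern inferred from this log holds for your Falco version; `kernel_build_number` and `probe_filename` are hypothetical helpers, not part of Falco:

```shell
#!/bin/bash
# Hypothetical helpers; the naming pattern is inferred from the log in this issue.

# Extract the build number from a `uname -v` style string,
# e.g. "#1 SMP Wed Aug 11 ..." yields 1.
kernel_build_number() {
  printf '%s\n' "$1" | sed -n 's/^[^#]*#\([0-9][0-9]*\).*/\1/p'
}

# Compose the probe filename from distro id, kernel release and build number.
probe_filename() {
  printf 'falco_%s_%s_%s.o\n' "$1" "$2" "$3"
}
```

On a node, this could be called as `probe_filename centos "$(uname -r)" "$(kernel_build_number "$(uname -v)")"`.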

Expected behaviour

No crash

Screenshots

* Setting up /usr/src links from host
* Running falco-driver-loader for: falco version=0.29.1, driver version=17f5df52a7d9ed6bb12d3b1768460def8439936d
* Running falco-driver-loader with: driver=bpf, compile=yes, download=yes
* Mounting debugfs
* Skipping download, eBPF probe is already present in /root/.falco/falco_centos_4.18.0-305.12.1.el8_4.x86_64_1.o
* Skipping compilation, eBPF probe is already present in /root/.falco/falco_centos_4.18.0-305.12.1.el8_4.x86_64_1.o
* eBPF probe located in /root/.falco/falco_centos_4.18.0-305.12.1.el8_4.x86_64_1.o
* Success: eBPF probe symlinked to /root/.falco/falco-bpf.o
Thu Oct 21 07:54:16 2021: Falco version 0.29.1 (driver version 17f5df52a7d9ed6bb12d3b1768460def8439936d)
Thu Oct 21 07:54:16 2021: Falco initialized with configuration file /etc/falco/falco.yaml
Thu Oct 21 07:54:16 2021: Loading rules from file /etc/falco/falco_rules.yaml:
Thu Oct 21 07:54:17 2021: Loading rules from file /etc/falco/falco_rules.local.yaml:
Thu Oct 21 07:54:17 2021: Loading rules from file /etc/falco/k8s_audit_rules.yaml:
0: (85) call bpf_get_smp_processor_id#8
1: (63) *(u32 *)(r10 -4) = r0
2: (bf) r2 = r10
3: (07) r2 += -4
4: (18) r1 = 0xffff9bb7ac3fdc00
6: (85) call bpf_map_lookup_elem#1
7: (15) if r0 == 0x0 goto pc+67
 R0_w=map_value(id=0,off=0,ks=4,vs=77,imm=0) R10=fp0 fp-8=mmmm????
8: (71) r1 = *(u8 *)(r0 +37)
 R0_w=map_value(id=0,off=0,ks=4,vs=77,imm=0) R10=fp0 fp-8=mmmm????
9: (67) r1 <<= 8
10: (71) r2 = *(u8 *)(r0 +36)
 R0_w=map_value(id=0,off=0,ks=4,vs=77,imm=0) R1_w=inv(id=0,umax_value=65280,var_off=(0x0; 0xff00)) R10=fp0 fp-8=mmmm????
11: (4f) r1 |= r2
12: (71) r3 = *(u8 *)(r0 +38)
 R0_w=map_value(id=0,off=0,ks=4,vs=77,imm=0) R1_w=inv(id=0) R2_w=inv(id=0,umax_value=255,var_off=(0x0; 0xff)) R10=fp0 fp-8=mmmm????
13: (71) r2 = *(u8 *)(r0 +39)
 R0_w=map_value(id=0,off=0,ks=4,vs=77,imm=0) R1_w=inv(id=0) R2_w=inv(id=0,umax_value=255,var_off=(0x0; 0xff)) R3_w=inv(id=0,umax_value=255,var_off=(0x0; 0xff)) R10=fp0 fp-8=mmmm????
14: (67) r2 <<= 8
15: (4f) r2 |= r3
16: (67) r2 <<= 16
17: (4f) r2 |= r1
18: (18) r1 = 0xfffffffd
20: (2d) if r1 > r2 goto pc+50
 R0=map_value(id=0,off=0,ks=4,vs=77,imm=0) R1=inv4294967293 R2=inv(id=0,smin_value=-9223372032559808516,umin_value=4294967293,var_off=(0xfffffffc; 0xffffffff00000003),s32_min_value=-3,s32_max_value=-1) R3=inv(id=0,umax_value=255,var_off=(0x0; 0xff)) R10=fp0 fp-8=mmmm????
21: (07) r2 += 3
22: (67) r2 <<= 32
23: (77) r2 >>= 32
24: (67) r2 <<= 3
25: (bf) r1 = r0
26: (1f) r1 -= r2
last_idx 26 first_idx 20
regs=4 stack=0 before 25: (bf) r1 = r0
regs=4 stack=0 before 24: (67) r2 <<= 3
regs=4 stack=0 before 23: (77) r2 >>= 32
regs=4 stack=0 before 22: (67) r2 <<= 32
regs=4 stack=0 before 21: (07) r2 += 3
regs=4 stack=0 before 20: (2d) if r1 > r2 goto pc+50
 R0_rw=map_value(id=0,off=0,ks=4,vs=77,imm=0) R1_rw=inv4294967293 R2_rw=invP(id=0) R3_w=inv(id=0,umax_value=255,var_off=(0x0; 0xff)) R10=fp0 fp-8=mmmm????
parent didn't have regs=4 stack=0 marks
last_idx 18 first_idx 0
regs=4 stack=0 before 18: (18) r1 = 0xfffffffd
regs=4 stack=0 before 17: (4f) r2 |= r1
regs=6 stack=0 before 16: (67) r2 <<= 16
regs=6 stack=0 before 15: (4f) r2 |= r3
regs=e stack=0 before 14: (67) r2 <<= 8
regs=e stack=0 before 13: (71) r2 = *(u8 *)(r0 +39)
regs=a stack=0 before 12: (71) r3 = *(u8 *)(r0 +38)
regs=2 stack=0 before 11: (4f) r1 |= r2
regs=6 stack=0 before 10: (71) r2 = *(u8 *)(r0 +36)
regs=2 stack=0 before 9: (67) r1 <<= 8
regs=2 stack=0 before 8: (71) r1 = *(u8 *)(r0 +37)
27: (71) r2 = *(u8 *)(r1 +65)
 R0=map_value(id=0,off=0,ks=4,vs=77,imm=0) R1_w=map_value(id=0,off=0,ks=4,vs=77,smin_value=-16,smax_value=0,umax_value=18446744073709551608,var_off=(0x0; 0xfffffffffffffff8),s32_max_value=2147483640,u32_max_value=-8) R2_w=invP(id=0,umax_value=16,var_off=(0x0; 0x18),s32_max_value=24,u32_max_value=24) R3=inv(id=0,umax_value=255,var_off=(0x0; 0xff)) R10=fp0 fp-8=mmmm????
R1 unbounded memory access, make sure to bounds check any such access
processed 26 insns (limit 1000000) max_states_per_insn 0 total_states 1 peak_states 1 mark_read 1
Thu Oct 21 07:53:25 2021: Runtime error: bpf_load_program() err=13 event=filler/terminate_filler message=0: (85) call bpf_get_smp_processor_id#8
1: (63) *(u32 *)(r10 -4) = r0
2: (bf) r2 = r10
3: (07) r2 += -4
4: (18) r1 = 0xffff9bb9f23f2000
6: (85) call bpf_map_lookup_elem#1
7: (15) if r0 == 0x0. Exiting.
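
The two lines that matter in a dump like this are the verifier's reason ("R1 unbounded memory access ...") and the Falco runtime error naming the rejected program (err=13 is EACCES on Linux). A minimal sketch for pulling them out of a saved log, assuming the message formats shown above; `summarize_verifier_failure` is a hypothetical helper and only recognizes the one verifier message seen in this issue:

```shell
#!/bin/bash
# Hypothetical helper: reads a Falco log on stdin and prints
#  - the verifier's "unbounded memory access" reason line, if present
#  - the errno and program name from the bpf_load_program() error
summarize_verifier_failure() {
  sed -n \
    -e 's/.*bpf_load_program() err=\([0-9][0-9]*\) event=\([^ ]*\).*/errno=\1 program=\2/p' \
    -e '/unbounded memory access/p'
}
```

For example, `kubectl logs <falco-pod> | summarize_verifier_failure` would reduce the output to those lines (the pod name is a placeholder).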

Environment

Kubernetes 1.20.10

  • Falco version:
    0.29.1
  • System info:
    {
    "machine": "x86_64",
    "nodename": "74e2a2435bbc",
    "release": "4.18.0-305.12.1.el8_4.x86_64",
    "sysname": "Linux",
    "version": "Digwatch compiler #1 SMP Wed Aug 11 01:59:55 UTC 2021"
    }
  • Cloud provider or hardware configuration:
  • OS:
    CentOS Linux release 8.4.210
  • Kernel:
    Linux 4.18.0-305.12.1.el8_4.x86_64 Digwatch compiler #1 SMP Wed Aug 11 01:59:55 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
  • Installation method:
    Kubernetes

Additional context

Arguments passed to falco on start:

        - /usr/bin/falco
        - '--cri'
        - /run/containerd/containerd.sock
        - '-K'
        - /var/run/secrets/kubernetes.io/serviceaccount/token
        - '-k'
        - 'https://kubernetes.default:443'
        - '-pk'
        - '--disable-cri-async'

This issue could be related to #1690 but the failure is not the same.

@FedeDP
Contributor

FedeDP commented Oct 22, 2021

Hi @jasiam !
Note that you are not experiencing a crash; instead, the kernel eBPF verifier is rejecting the compiled bytecode.
Many eBPF fixes were added in Falco 0.30; can you upgrade?
See here for more info: falcosecurity/libs#81

@jasiam
Author

jasiam commented Oct 22, 2021

Hi @FedeDP!
Thanks for the note about the eBPF verifier. I got the same failure with 0.30.0, but while checking your PR I noticed that the falco-builder:latest Docker image (which I use to compile the eBPF driver) includes clang version 5.0.1 (tags/RELEASE_501/final).
Could that be why the compiled code is rejected by the eBPF verifier? Should it be bumped to clang 7?

@FedeDP
Contributor

FedeDP commented Oct 25, 2021

Indeed, in the PR @jasondellaluce and I only tested as far back as clang 7, so that is the most likely explanation!
Are you able to build Falco in your environment with clang-7 and report back?
Running cmake -DUSE_BUNDLED_DEPS=True -DBUILD_BPF=true ../ && make falco && make bpf should be enough.
Then, to test, run sudo FALCO_BPF_PROBE=./driver/bpf/probe.o ./userspace/falco/falco.
Thank you!
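
Before rebuilding, it may help to confirm which clang the build environment actually uses. A minimal sketch, assuming clang 7 as the threshold only because it is the oldest version reported as tested in this thread (not an official requirement); `clang_major` and `clang_is_supported` are hypothetical helpers:

```shell
#!/bin/bash
# Assumption: clang 7 is the oldest version reported as tested in this thread.
MIN_CLANG_MAJOR=7

# Extract the major version from `clang --version` output,
# e.g. "clang version 5.0.1 (tags/RELEASE_501/final)" yields 5.
clang_major() {
  printf '%s\n' "$1" | sed -n 's/.*clang version \([0-9][0-9]*\)\..*/\1/p' | head -n 1
}

# Succeed only if a major version was found and it meets the threshold.
clang_is_supported() {
  major=$(clang_major "$1")
  [ -n "$major" ] && [ "$major" -ge "$MIN_CLANG_MAJOR" ]
}
```

Typical use would be `clang_is_supported "$(clang --version)" || echo "clang is older than $MIN_CLANG_MAJOR" >&2` before running make bpf.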

@jasiam
Author

jasiam commented Oct 25, 2021

I've just tested using clang-11 (the easiest version to install in my environment) and it's working fine apparently :-)

So if I've understood correctly, the official falcosecurity/falco-builder:latest image is not suitable for building the BPF probe. Am I right? Should I open a new issue for this?

Thanks!

@FedeDP
Contributor

FedeDP commented Oct 25, 2021

It's interesting! Can you build and test bpf with clang5 using the attached patch?
Thank you very much! If we can fix this, it is even better ;)

bpf.txt

(otherwise tomorrow i will test in qemu!)

@jasiam
Author

jasiam commented Oct 26, 2021

It's still failing (same error trace) :-(

I'll detail the steps I followed, just to rule out an error on my side. I started from a falco-builder:latest container with all required dependencies:

  • Clone falco repo and checkout 0.30.0 tag
  • cmake -DBUILD_BPF=ON -DUSE_BUNDLED_DEPS=ON ..
  • I've applied the bpf.patch to FALCO_REPO_FOLDER/build/falcosecurity-libs-repo/falcosecurity-libs-prefix/src/falcosecurity-libs/driver/bpf/fillers.h
  • Check patch has been applied successfully
  • make bpf
  • Use the built probe.o in my helm falco deployment

I'd feel more comfortable if you can test it too.

Thanks!

@FedeDP
Contributor

FedeDP commented Oct 26, 2021

Your steps seem fine!
By the way, can you tell me how you built BPF inside falco-builder? I am not able to. A step-by-step guide would be really helpful :)
I am currently building clang 5 on Arch Linux, but it will take quite a while!

EDIT: for reference, what i'm doing:

docker run -e BUILD_BPF=on -e MAKE_JOBS=8 --user $(id -u):$(id -g) -v /etc/passwd:/etc/passwd:ro -it -v mysource:/source -v mybuild:/build -v /lib/modules/5.14.14-arch1-1/build falcosecurity/falco-builder cmake
...

docker run -e BUILD_BPF=on -e MAKE_JOBS=8 --user $(id -u):$(id -g) -v /etc/passwd:/etc/passwd:ro -it -v mysource:/source -v mybuild:/build -v /lib/modules/5.14.14-arch1-1/build falcosecurity/falco-builder bpf
Scanning dependencies of target bpf
make[4]: warning: jobserver unavailable: using -j1.  Add '+' to parent make rule.
make[5]: *** No targets specified and no makefile found.  Stop.
make[4]: *** [Makefile:20: all] Error 2
make[3]: *** [driver/bpf/CMakeFiles/bpf.dir/build.make:57: driver/bpf/CMakeFiles/bpf] Error 2
make[2]: *** [CMakeFiles/Makefile2:786: driver/bpf/CMakeFiles/bpf.dir/all] Error 2
make[1]: *** [CMakeFiles/Makefile2:798: driver/bpf/CMakeFiles/bpf.dir/rule] Error 2
make: *** [Makefile:418: bpf] Error 2

EDIT2: forgot to mention that i can build falco target though.

@jasiam
Author

jasiam commented Oct 26, 2021

I use a custom entrypoint script as the command in a k8s job with falco-builder image. I do this because my nodes may have different kernels so I need to compile the driver on all of them (I deploy the k8s job on every node)

My entrypoint looks like this:

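#!/bin/bash
set -e

# Install the kernel headers for the target kernel.
rpm -i /root/falco/compilation/kernel-devel-${KERNEL_VERSION}.rpm

# Expose the headers as /lib/modules/<version>/build (absolute paths on both sides).
mkdir /lib/modules/${KERNEL_VERSION} && ln -s /usr/src/kernels/${KERNEL_VERSION} /lib/modules/${KERNEL_VERSION}/build

FALCO_REPO_PATH=/root/falco/compilation/falco
FALCO_BUILD_PATH=${FALCO_REPO_PATH}/build

mkdir -p ${FALCO_BUILD_PATH} && cd ${FALCO_BUILD_PATH}

# Configure with eBPF support, then build only the probe.
cmake -DBUILD_BPF=ON -DUSE_BUNDLED_DEPS=ON ..

make bpf

# Save the compiled probe as falco-bpf.o.
cp ./driver/bpf/probe.o /root/falco/drivers/falco-bpf.o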

@FedeDP
Contributor

FedeDP commented Oct 26, 2021

Thank you! Unfortunately, I guess my host's /lib/modules is not compatible with the builder image (it is too new):

scripts/mod/modpost: /lib64/libc.so.6: version `GLIBC_2.33' not found (required by scripts/mod/modpost)
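
This error means a binary from the host's kernel build tree needs a newer glibc than the one inside the container. A minimal sketch for checking that up front, assuming GNU `sort -V` is available; `required_glibc` and `version_ge` are hypothetical helpers:

```shell
#!/bin/bash
# Extract the glibc version named in a "version `GLIBC_x.y' not found" message.
required_glibc() {
  printf '%s\n' "$1" | sed -n 's/.*GLIBC_\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Succeed if version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort), a GNU coreutils extension.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

Inside the container, the installed version can be read with `getconf GNU_LIBC_VERSION` (which prints a line such as "glibc 2.17") and compared against the required one with `version_ge`.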

I'll have to wait a couple of hours for the clang5 build to finish :(

@FedeDP
Contributor

FedeDP commented Oct 27, 2021

Update: i reproduced the issue and currently have a working patch; i need more tests then i'll open a PR to fix it!

@FedeDP
Contributor

FedeDP commented Oct 27, 2021

Hi! You can find the PR here: falcosecurity/libs#109

You can replicate the patch and check if it's working for you too (any feedback is greatly appreciated!)
Again, thank you very much for all the time spent helping me to debug this one! @jasiam

@jasiam
Author

jasiam commented Oct 28, 2021

Hi!

I've just compiled probe.o with your new patch in a falco-builder container and I've got a new bpf error when falco starts up in my environment. You can check the error trace in this gist

Regarding the init error messages about not being able to compile the BPF probe: note that I'm mounting the compiled probe.o as /root/.falco/falco-bpf.o in my Falco container.

Thank you for your bpf knowledge @FedeDP!

@FedeDP
Contributor

FedeDP commented Oct 28, 2021

That's weird :( i had verifier issues on sys_setsockopt_x and i fixed them, in my env.
May i ask you to do a couple of tests?

  • comment out line 1157 of driver/bpf/fillers.h:
    res = bpf_val_to_ring_type(data, sockopt_optname_to_scap(level, optname), PT_FLAGS8);
    and try again; if it still fails with the same error, try commenting out line 1164 instead, i.e.:
    res = parse_sockopt(data, level, optname, (void*)optval, optlen);
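
This kind of bisection (disable one call, rebuild, retest) can be scripted. A minimal sketch, assuming the line numbers quoted here match your checkout of driver/bpf/fillers.h; `comment_out_line` is a hypothetical helper that prefixes a single line with a C++-style comment marker:

```shell
#!/bin/bash
# Hypothetical helper: comment out line number $2 of file $1 in place.
comment_out_line() {
  file=$1
  line=$2
  # Reject anything that is not a plain positive integer.
  case $line in
    ''|*[!0-9]*) return 1 ;;
  esac
  tmp=$(mktemp) || return 1
  # Write the edited copy to a temp file, then overwrite the original's
  # contents so its permissions are preserved.
  sed "${line}s|^|// |" "$file" > "$tmp" && cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  return $status
}
```

For example, `comment_out_line driver/bpf/fillers.h 1157` followed by `make bpf`. It is worth printing the line first (`sed -n '1157p' driver/bpf/fillers.h`) to confirm it is the statement you expect.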

Thank you very much! Hopefully we will be able to fix it!

@jasiam
Author

jasiam commented Oct 28, 2021

No luck :-(

Error using probe.o compiled with L1157 commented here

Error using probe.o compiled with L1164 commented here

I'm sorry I can't help you more than testing your fixes with these kind of errors.

@FedeDP
Contributor

FedeDP commented Oct 28, 2021

> I'm sorry I can't help you more than testing your fixes with these kind of errors.

You are already doing a great job! Moreover, i feel sorry to bother you with looots of tests :(

Anyway, the issue seems to be related to the sockopt_optname_to_scap() function, which I fixed yesterday.
Are you completely sure that you copied my whole patch from the open PR, i.e. the driver/ppm_flag_helpers.h part too?

Again, forgive me if this is a stupid question, but debugging bpf takes lots of time, thus i need to be absolutely sure that the bug is real :)

@jasiam
Author

jasiam commented Oct 28, 2021

Good news! I'm an idiot XD

I've been updating the wrong ppm_flag_helpers.h file. When I run the cmake command, two files with that name are created:

  • ./build/driver/src/ppm_flag_helpers.h
  • ./build/falcosecurity-libs-repo/falcosecurity-libs-prefix/src/falcosecurity-libs/driver/ppm_flag_helpers.h

I updated the first one but not the second (I didn't notice it the first time, and all later tests were run from my bash shell history).
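
A quick way to catch this is to list every copy of the header in the build tree and flag the ones that lack a line known to come from the patch. A minimal sketch; `unpatched_copies` is a hypothetical helper, and the marker string is whatever distinctive line you pick from the patch:

```shell
#!/bin/bash
# Hypothetical helper: print every file named $2 under directory $1
# that does NOT contain the fixed string $3 (i.e. is likely unpatched).
unpatched_copies() {
  find "$1" -type f -name "$2" ! -exec grep -qF -- "$3" {} \; -print | sort
}
```

For example, `unpatched_copies ./build ppm_flag_helpers.h '<a line added by the patch>'` should print nothing once every copy has been updated.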

Now I've applied your PR patch correctly and it works absolutely fine :-)

Thanks a lot for your work!!

@FedeDP
Contributor

FedeDP commented Oct 29, 2021

The libs PR is now merged, so I think we can consider this solved ;)
Again, thank you very much for your help, much appreciated!
