BPF issue on amazon linux 2 since we upgraded from 0.29.1 to 0.30 (not working on all kernel 4.1X and 5.X and using clang 7 or clang 11) #130

Closed
JoupainMD opened this issue Nov 15, 2021 · 24 comments · Fixed by #140
Labels
kind/bug Something isn't working

Comments

@JoupainMD

Describe the bug
We have been encountering an issue with the BPF module since we upgraded from Falco 0.29.1 to 0.30.
We build the BPF probe using our own Docker image (as an init container). We had long been using the default clang/LLVM version (11), and we had to switch to clang 7 as of 0.29, if my memory is correct.
But now it does not seem to work with either clang version.
We are getting output like this:

math between map_value pointer and register with unbounded min value is not allowed
2021-11-15T15:22:14+0000: Runtime error: bpf_load_program() err=22 event=filler/sys_read_x message=0: (bf) r6 = r1

How to reproduce it
Use EKS v1.18 with Amazon Linux 2 and clang 7 or clang 11 (latest from the Amazon repo).

Environment
OS: Amazon Linux 2 (kernel: 4.14.219-161.340.amzn2.x86_64, but we also use 5.X kernels and the issue is the same)
Using EKS (AWS Kubernetes) 1.18
Clang + LLVM: 7 and 11 (from the Amazon package repo)
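
For anyone reporting a similar problem, the environment details above can be collected in one go. This is only a minimal sketch: the function name `collect_env` is made up for illustration (it is not part of Falco), and it assumes `uname` is available while `clang` may or may not be on the PATH.

```shell
#!/bin/sh
# Hypothetical helper (not part of Falco): print the kernel release and the
# clang version, the two values that mattered most in this report.
collect_env() {
    echo "kernel: $(uname -r)"
    if command -v clang >/dev/null 2>&1; then
        # The first line of `clang --version` carries the version string.
        echo "clang: $(clang --version | head -n 1)"
    else
        echo "clang: not found"
    fi
}

collect_env
```

Running it inside the builder container rather than on the node is probably what matters here, since that is where the probe is compiled.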

  • Falco version: 0.30
  • System info:
  • Cloud provider or hardware configuration:
  • OS:
  • Kernel:
  • Installation method:

Additional context
We also tried to use the latest version of this repository, especially following this issue

@JoupainMD JoupainMD added the kind/bug Something isn't working label Nov 15, 2021
@FedeDP
Contributor

FedeDP commented Nov 15, 2021

Hi! Thanks for opening this bug report.

Note that

math between map_value pointer and register with unbounded min value is not allowed
2021-11-15T15:22:14+0000: Runtime error: bpf_load_program() err=22 event=filler/sys_read_x message=0: (bf) r6 = r1

is the kernel verifier rejecting the eBPF probe bytecode!
By the way, we should support clang 7 up to 14, as you can see here: #81.
As you can see in the support matrix, clang 11 passed all our tests.

Moreover, I recently pushed some fixes to support versions down to clang 5: #109.

All of this just to say (scream): "that's weird!" :D
I will try to blindly provide a patch for you (bpf.txt) to understand which part of the function is causing the error.
Unfortunately, unless I am able to reproduce it, I will need your help to understand where the problem is and try to fix it!

@JoupainMD
Author

The error seems to be the same. Just to be sure, here is how I build the BPF module.
In my falco-ebpf-builder init container I fetch Falco from the GitHub archive (using curl). Then I run this command (from the Falco repo root directory):

cmake \
        -DCMAKE_BUILD_TYPE="release" \
        -DBUILD_DRIVER=OFF \
        -DFALCOSECURITY_LIBS_VERSION="1ed3e2a15dad1347459f1d55838bbbb8ae352266" \
        -DFALCOSECURITY_LIBS_CHECKSUM="SHA256=14801610411317af51bd636cd0ae5800c056e5dd52ef013ed22c28c3bad0168a" \
        -DBUILD_BPF=ON \
        -DBUILD_WARNINGS_AS_ERRORS=ON \
        -DFALCO_VERSION="${FALCO_VERSION}" \
        -DDRAIOS_DEBUG_FLAGS="" \
        -DUSE_BUNDLED_DEPS=ON \
        . 

then I am applying the patch by replacing the file located at:

  • /falco-repo/build/falcosecurity-libs-repo/falcosecurity-libs-prefix/src/falcosecurity-libs/driver/bpf/fillers.h

and then I run make bpf from /falco-repo/build.

Does this look right to you? (According to my understanding of the Makefiles it seems correct.)
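
The file-replacement step described above can be sketched as a small shell helper. This is an illustration only: the function name `replace_fillers` and the `.orig` backup are assumptions, while the relative path of `fillers.h` and the `make bpf` target are the ones quoted in this comment.

```shell
#!/bin/sh
# Path of fillers.h inside the Falco build tree, as given in this comment.
FILLERS_REL="falcosecurity-libs-repo/falcosecurity-libs-prefix/src/falcosecurity-libs/driver/bpf/fillers.h"

# Hypothetical helper: replace fillers.h under BUILD_DIR with PATCHED_FILE,
# keeping a one-time backup of the original so the tree can be restored.
replace_fillers() {
    patched_file="$1"
    build_dir="$2"
    target="$build_dir/$FILLERS_REL"
    [ -f "$patched_file" ] || { echo "missing patch file: $patched_file" >&2; return 1; }
    [ -f "$target" ] || { echo "missing target: $target" >&2; return 1; }
    # Back up only once, so repeated runs never overwrite the pristine copy.
    [ -f "$target.orig" ] || cp "$target" "$target.orig"
    cp "$patched_file" "$target"
}

# Usage (paths from this thread; bpf.txt is the patched file attached earlier):
#   replace_fillers bpf.txt /falco-repo/build && make -C /falco-repo/build bpf
```

Keeping the `.orig` copy makes it easy to return to a clean file before trying the next patch.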

@FedeDP
Contributor

FedeDP commented Nov 16, 2021

Yes, it looks right! Thanks!

The error seems to be the same

Are you sure it is still about

err=22 event=filler/sys_read_x

? I'd expect sys_write_x!

@JoupainMD
Author

Ah sorry, it is sys_send_x. Here is the entire message:

math between map_value pointer and register with unbounded min value is not allowed
2021-11-16T09:20:33+0000: Runtime error: bpf_load_program() err=22 event=filler/sys_send_x message=0: (bf) r6 = r1

@FedeDP
Contributor

FedeDP commented Nov 16, 2021

Nice! So the good thing is that we found the guilty function. The bad thing is that now I will have to provide "pseudo-random" patches to try to fix it :)
I will come back with some patches to test!
Thank you!
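
Since the name of the rejected filler is what narrows the search, it can help to pull it out of the log mechanically instead of reading it by eye. A minimal sketch, assuming the log lines keep the `bpf_load_program() err=... event=filler/<name>` shape shown above; the function name `failing_fillers` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read Falco log text on stdin and print the name of
# each filler that bpf_load_program() rejected, one per line, de-duplicated.
failing_fillers() {
    sed -n 's/.*bpf_load_program() err=[0-9]* event=filler\/\([A-Za-z0-9_]*\).*/\1/p' | sort -u
}

# Usage (falco.log is a placeholder for wherever the output was saved):
#   failing_fillers < falco.log
```

Lines that do not mention `bpf_load_program()` are ignored, so the verifier message itself and unrelated Falco output can be fed in unchanged.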

@FedeDP
Contributor

FedeDP commented Nov 16, 2021

A couple of PRs:

Let me know if any of these work (i.e. if they pass the verifier; they do disable a feature, but this is needed to better locate the issue!)

EDIT: please apply any patch that I send you starting from a clean libs checkout (i.e. from master)! Thanks!

@JoupainMD
Author

JoupainMD commented Nov 16, 2021

Output with the "disable bpf_probe_read" patch:

falco 2021-11-16T16:28:45+0000: Falco version 0.30.0 (driver version 3aa7a83bf7b9e6229a3824e3fd1f4452d1e95cb4)
falco 2021-11-16T16:28:45+0000: Falco initialized with configuration file /etc/falco/falco.yaml
falco 2021-11-16T16:28:45+0000: Loading rules from file /etc/falco/falco_rules.yaml:
falco 2021-11-16T16:28:45+0000: Loading rules from file /etc/falco/falco_rules.local.yaml:
falco 2021-11-16T16:28:46+0000: Loading rules from file /etc/falco/k8s_audit_rules.yaml:
falco 2021-11-16T16:28:46+0000: Unable to load the driver.
falco 2021-11-16T16:28:46+0000: Runtime error: invalid filler name: sys_openat2_x. Exiting.

Same output with the "disable compute_snaplen" patch.

@FedeDP
Contributor

FedeDP commented Nov 16, 2021

Mmh, did you revert to master before applying the patches?
I mean, I expected one of them to fail on the kernel verifier :D

Runtime error: invalid filler name: sys_openat2_x. Exiting.

This error is weird, but let's first focus on the kernel verifier. (I guess this error can be caused by using a new BPF probe against an old Falco version. "Old" here means a Falco version that originally shipped with another libs version.)

@JoupainMD
Author

JoupainMD commented Nov 16, 2021

I did revert the file filler_helpers.h to master. But for the libs we are using commit 1ed3e2a, as mentioned in the cmake command above.
I will check again to be sure I did not miss anything or run the wrong container version.

As for Falco possibly asking for an older BPF probe version: that is possible. We are using the Falco 0.30 release (not master), but we currently fetch and build the libs from commit 1ed3e2a. To bypass the probe version check, our Dockerfile overrides the PROBE_VERSION variable from /falco-repo/cmake/modules/falcosecurity-libs.cmake so that it matches the version that Falco 0.30 expects (in this case, commit 3aa7a83).
Is that really crappy? If so, I can also try building Falco from master, but we really plan to use the Falco 0.30 release and not build it from source in production.
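
Because the PROBE_VERSION override hides the mismatch from Falco's own check, a separate comparison between the driver version Falco prints at startup and the libs commit actually built can make it visible again. A rough sketch, assuming the startup line keeps the `Falco version X (driver version HASH)` shape seen in the log earlier in this thread; both function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: pull the driver version out of Falco's startup line,
# e.g. "Falco version 0.30.0 (driver version 3aa7a83b...)". Reads stdin.
driver_version_from_log() {
    sed -n 's/.*Falco version [^ ]* (driver version \([0-9a-f]*\)).*/\1/p' | head -n 1
}

# Succeeds when LIBS_COMMIT (short or full hash) is a prefix of the driver
# version Falco expects; fails otherwise, including on empty input.
libs_matches_driver() {
    expected="$1"
    libs_commit="$2"
    [ -n "$expected" ] && [ -n "$libs_commit" ] || return 1
    case "$expected" in
        "$libs_commit"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage (falco.log is a placeholder; the commit is the one passed to cmake):
#   v="$(driver_version_from_log < falco.log)"
#   libs_matches_driver "$v" "1ed3e2a15dad1347459f1d55838bbbb8ae352266" \
#       || echo "libs commit differs from driver version $v"
```

With the values from this thread the check fails, which is consistent with a probe built from newer libs being loaded by the Falco 0.30 release.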

@FedeDP
Contributor

FedeDP commented Nov 16, 2021

Once the kernel verifier issue is fixed, we will try to understand the error you are getting about the invalid filler.
It should not be something to worry about though!

Edit: btw thanks for your time!

@FedeDP
Contributor

FedeDP commented Nov 17, 2021

But for the libs we are using commit 1ed3e2a, as mentioned in the cmake command above.

I guess that if you use an older version of libs the

falco 2021-11-16T16:28:46+0000: Runtime error: invalid filler name: sys_openat2_x. Exiting.

message will disappear :)

@leogr
Member

leogr commented Nov 17, 2021

We have been encountering an issue with the BPF module since we upgraded from Falco 0.29.1 to 0.30. We build the BPF probe using our own Docker image (as an init container). We had long been using the default clang/LLVM version (11), and we had to switch to clang 7 as of 0.29, if my memory is correct.

Hey @JoupainMD

Could you provide us with the Dockerfile (or the Docker image) you used as the init container? I think this is the only way for us to reproduce your issue exactly so we can then debug it. So far, I haven't been able to reproduce it.

Thanks in advance! 🙏

@JoupainMD
Author

Hello,

the Dockerfile

The entrypoint.sh

patch_falco, which I use to apply the latest patch you provided.
One thing to note is that we use docker.io/library/amazonlinux:2.0.20211005.0 to be able to access the kernel headers from the Amazon repo.
Maybe we could use a multi-stage build and only provide the kernel headers to, for example, a Debian image (and benefit from clang 12). Would that be better in your opinion?

@FedeDP
Contributor

FedeDP commented Nov 18, 2021

Hi @JoupainMD good news: i was able to reproduce your issue!
I am currently testing a fix 🤞

@FedeDP
Contributor

FedeDP commented Nov 19, 2021

I opened a PR; can you test? @JoupainMD
#140

Thanks!

(obviously it did fix the issue for me!)

@JoupainMD
Author

Hello @FedeDP, I tested it in our environment but unfortunately I am still getting the same error:
Runtime error: invalid filler name: sys_openat2_x. Exiting.

Here is the exact AMI we use: ami-02e17a76e494a9e99
Kernel version: 4.14.219-161.340.amzn2.x86_64

@FedeDP
Contributor

FedeDP commented Nov 19, 2021

Runtime error: invalid filler name: sys_openat2_x. Exiting.

This is an error coming from libsinsp, i.e. it has nothing to do with the kernel eBPF verifier (which is now satisfied!).
The issue is that you are using Falco v0.30 with new libs. You can try to backport my fix to the libs shipped with Falco v0.30; that should do the trick!

@JoupainMD
Author

OK, that's clear. I'll try that ASAP and keep you posted 🙏

@JoupainMD
Author

WORKING on 4.14.219-161.340.amzn2.x86_64 👍 💯
Thanks @FedeDP and @leogr! I will now try on our 5.X kernels as well 🙏

@JoupainMD
Author

Working on 5.4.149-73.259.amzn2.x86_64 as well.
Perfect, thank you again guys 🙏
It is OK for me to close this issue whenever you want.

@FedeDP
Contributor

FedeDP commented Nov 19, 2021

Top!
The issue will be automatically closed once the PR is merged ;)

Thanks for your time!

@JoupainMD
Author

I noticed some warnings on 5.4.149-73.259.amzn2.x86_64 (I am not sure whether they were present before your patch); see here.
Anyway, it's working, so this is not really important.

@FedeDP
Contributor

FedeDP commented Nov 19, 2021

I think you can safely ignore them. Are you building with clang11 or clang7?

Btw, if you could double-check that they were present before my patch too, it would be great :) (I am 100% sure they were, but a double check is worth the time!)

@JoupainMD
Author

Yep, you are right: they were already present in 0.29.1 and we just didn't notice (only on 5.X kernels, it looks like).

leogr pushed a commit to leogr/libs that referenced this issue Jan 5, 2023