The node-agent is repeatedly crashing with a CrashLoopBackOff error: cannot create bpf perf link #384

Closed
antonissmal opened this issue Jan 29, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@antonissmal
Description

I'm encountering an issue with the node-agent in my Kubernetes cluster running Kubescape. The node-agent is repeatedly crashing with a CrashLoopBackOff error, and I'm seeing the following error messages in the logs:

{"level":"error","ts":"2024-01-29T12:13:19Z","msg":"error starting exec tracing","error":"creating tracer: attaching exit tracepoint: cannot create bpf perf link: permission denied"}
{"level":"fatal","ts":"2024-01-29T12:13:19Z","msg":"error starting the container watcher","error":"starting app behavior tracing: creating tracer: attaching exit tracepoint: cannot create bpf perf link: permission denied"}
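Both log lines carry a wrapped error whose layers are joined with ": ". As a reading aid, here is a minimal Python sketch that splits the `error` field into those layers so the innermost cause is easy to see. The helper name `error_chain` is hypothetical and not part of node-agent or Kubescape; the sample line is the fatal entry quoted above.

```python
import json
from typing import List


def error_chain(log_line: str) -> List[str]:
    """Split the wrapped "error" field of one JSON log line into its layers."""
    record = json.loads(log_line)
    # Wrapped errors are joined with ": ", outermost context first.
    parts = record.get("error", "").split(": ")
    return [part.strip() for part in parts if part.strip()]


# The fatal log line from the report above.
sample = (
    '{"level":"fatal","ts":"2024-01-29T12:13:19Z",'
    '"msg":"error starting the container watcher",'
    '"error":"starting app behavior tracing: creating tracer: '
    'attaching exit tracepoint: cannot create bpf perf link: permission denied"}'
)

chain = error_chain(sample)
for depth, layer in enumerate(chain):
    # Indent each layer to show how the error was wrapped.
    print("  " * depth + layer)
```

The last element of `chain` is the underlying failure ("permission denied"), and the one before it names the operation that was refused (creating the BPF perf link).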

Environment

OS: Linux (kernel version: 5.14.0-362.13.1.el9_3.x86_64)
Kubescape: v3.0.3
Helm chart: v1.8.1
Kubernetes Server: v1.27.6
Kubernetes Client: v1.26.3

Expected behavior

The node-agent pods should start successfully and not crash with the CrashLoopBackOff error.

Actual Behavior

The node-agent pods are repeatedly crashing with a CrashLoopBackOff error, and the logs show errors related to creating a BPF perf link due to permission denied.

Additional context

The node-agent container is using the quay.io/kubescape/node-agent:v0.1.114 image, and the pod has resource limits and requests defined.

I've checked the AppArmor profile for the node-agent container, and it's set to "unconfined," indicating that it's not confined by AppArmor security policies.

I also reviewed the RBAC permissions and the ClusterRoleBinding for the node-agent service account, and it appears to have the necessary permissions. I have also checked the kernel capabilities, and they seem to be correct.

I use Cilium as the CNI on the cluster.
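The checks listed above cover AppArmor, RBAC, and capabilities. A "permission denied" on a perf link can also come from host-level settings, so the following is a minimal Python sketch, to be run on the node itself, that prints a few of them. Treat the choice of settings as an assumption about possible causes, not a confirmed diagnosis for this issue. The paths are standard Linux locations, but any of them may be absent on a given host; `read_setting` and `collect` are hypothetical helper names.

```python
from pathlib import Path
from typing import Dict, Optional

# Standard Linux locations (assumed relevant here, not confirmed as the cause).
CHECKS = {
    "perf_event_paranoid": "/proc/sys/kernel/perf_event_paranoid",
    "kernel_lockdown": "/sys/kernel/security/lockdown",
    "selinux_enforce": "/sys/fs/selinux/enforce",
}


def read_setting(path: str) -> Optional[str]:
    """Return the stripped file content, or None if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def collect(checks: Dict[str, str] = CHECKS) -> Dict[str, Optional[str]]:
    """Read every configured setting into a name -> value mapping."""
    return {name: read_setting(path) for name, path in checks.items()}


if __name__ == "__main__":
    for name, value in collect().items():
        print(f"{name}: {value if value is not None else 'not available'}")
```

On el9 kernels such as the one reported here, SELinux rather than AppArmor is typically the active security module, which is why the sketch reads the SELinux enforce flag.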

@antonissmal antonissmal added the bug Something isn't working label Jan 29, 2024
dwertent (Contributor) commented Jan 29, 2024

@antonissmal, thank you for reporting.
We are looking into it.
Can you please share some more information about the cluster? Is it a managed cluster, k3s, etc.?

Edit:
Does this help you?

antonissmal (Author) commented Jan 29, 2024

Hello,
Thank you for your response.

Self-managed kubeadm cluster: v1.27.6
containerd: containerd.io v1.6.22

One more detail: I already use Cilium as the CNI, which itself uses eBPF.

I also saw today that someone else has the same problem and opened an issue in the kubescape repo:
kubescape/kubescape#1596

dwertent (Contributor) commented

Hi, @antonissmal
Please follow the workaround mentioned in this link: here.
After you try it, let me know if it works for you. If so, we'll make sure to update the troubleshooting documentation accordingly.
