Incorrect event values for container.image.* when using image digests #1187

Closed
plasticine opened this issue May 5, 2020 · 38 comments

@plasticine

Hey there — thanks very much for Falco, it’s an amazing bit of software! 👋

We deploy almost all of our images by sha256 digest rather than by tag. While updating Falco from 0.17.0 to 0.22.1 on a number of our k8s clusters, we observed that all events seem to carry the following container.image.repository and container.image.tag values:

"container.image.repository":"sha256","container.image.tag":"[DIGEST]"
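
To gauge how many events are affected, a small filter over Falco's JSON output can help. The sketch below is illustrative only: it assumes JSON output is enabled and that the fields shown above appear under the `output_fields` key of each event; the function name and sample lines are made up for the example.

```python
import json


def is_digest_only_event(line: str) -> bool:
    """Return True if a JSON event reports 'sha256' as the image repository."""
    try:
        event = json.loads(line)
    except json.JSONDecodeError:
        return False  # skip non-JSON lines such as startup messages
    if not isinstance(event, dict):
        return False
    # Assumption: event fields live under "output_fields".
    fields = event.get("output_fields") or {}
    return fields.get("container.image.repository") == "sha256"


# Hypothetical sample lines, not real Falco output.
sample_lines = [
    '{"rule": "demo", "output_fields": {"container.image.repository": "sha256", "container.image.tag": "abc123"}}',
    '{"rule": "demo", "output_fields": {"container.image.repository": "some-image", "container.image.tag": "latest"}}',
    "not json",
]

affected = [line for line in sample_lines if is_digest_only_event(line)]
print(f"{len(affected)} of {len(sample_lines)} lines show the problem")
```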

How to reproduce it

I’ve validated this behavior on GKE nodes running 1.15.x on top of COS with Containerd, and have re-deployed our falco install from scratch using https://github.com/falcosecurity/falco/tree/master/integrations/k8s-using-daemonset/k8s-with-rbac

Expected behaviour

I would expect that the repository field would contain the actual image repository, and the tag field either the tag, or digest.

"container.image.repository":"some-image","container.image.tag":"sha256:[DIGEST]"

...or maybe even better, a new field digest in the event that tag is null and the image is being referenced by digest;

"container.image.repository":"some-image","container.image.tag":null,"container.image.digest":"sha256:[DIGEST]"
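
The split I have in mind could look roughly like the sketch below. This is not Falco's implementation, only an illustration of the expected behaviour under the assumption that references follow the usual `name[:tag][@algo:hex]` shape; `ImageRef` and `parse_image_ref` are hypothetical names.

```python
from typing import NamedTuple, Optional


class ImageRef(NamedTuple):
    repository: Optional[str]
    tag: Optional[str]
    digest: Optional[str]


def parse_image_ref(ref: str) -> ImageRef:
    """Split an image reference into repository, tag and digest."""
    # Everything after '@' is the digest, e.g. 'sha256:<hex>'.
    name, sep, digest = ref.partition("@")
    if not sep and name.startswith("sha256:"):
        # A bare image ID carries no repository name at all.
        return ImageRef(None, None, name)
    # Only a ':' after the last '/' separates a tag; an earlier one
    # belongs to a registry port such as 'localhost:5000/app'.
    last_slash = name.rfind("/")
    colon = name.rfind(":")
    if colon > last_slash:
        repository, tag = name[:colon], name[colon + 1:]
    else:
        repository, tag = name, None
    return ImageRef(repository, tag, digest if sep else None)


print(parse_image_ref("some-image@sha256:abc"))
print(parse_image_ref("sha256:abc"))
```

Note that when the runtime only reports a bare `sha256:...` image ID, no amount of parsing can recover the repository, so the name would have to come from another source.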

Environment

  • Falco version:
Falco version: 0.22.1
Driver version: a259b4bf49c3330d9ad6c3eed9eb1a31954259a6
  • System info:
{
  "machine": "x86_64",
  "nodename": "falco-bb4lb",
  "release": "4.19.104+",
  "sysname": "Linux",
  "version": "#1 SMP Wed Feb 19 05:26:34 PST 2020"
}
@fntlnz
Contributor

fntlnz commented May 8, 2020

Thanks for reporting @plasticine ! I'll try to investigate this.

@fntlnz
Contributor

fntlnz commented May 8, 2020

@plasticine
Author

Hey there @fntlnz — just wanted to check in and see if you managed to uncover anything here? We’re still seeing this issue, though I’ll try with the recently released 0.23.X.

@fntlnz
Copy link
Contributor

fntlnz commented May 21, 2020

Hi @plasticine, thanks for the detailed issue! @leodido and I debugged this for quite some time on GKE 1.16.8-gke.15 with the cos_containerd node image. We deployed Falco as described here https://github.com/falcosecurity/contrib/tree/master/deploy/kubernetes/kernel-and-k8s-audit with the FALCO_BPF_PROBE variable enabled.

After Falco was deployed and running, we deployed the event generator as follows:

kubectl run unsecure-falco-example --image falcosecurity/event-generator@sha256:fd2c6c80854e1ee894f8905f8e05fbd4059c6ce401434503801110f549b7d595 -- run

Here is a sample of some events we had:

10:23:02.135632174: Error Package management process launched in container (user=root command=apk container_id=b1ca96994aee container_name=unsecure-falco-example image=docker.io/falcosecurity/event-generator:latest) k8s.ns=default k8s.pod=unsecure-falco-example container=b1ca96994aee k8s.ns=default k8s.pod=unsecure-falco-example container=b1ca96994aee k8s.ns=default k8s.pod=unsecure-falco-example container=b1ca96994aee
10:16:03.381277704: Notice Unexpected setuid call by non-sudo, non-root program (user=bin cur_uid=2 parent=child command=child --loglevel info run ^syscall.NonSudoSetuid$ uid=root container_id=b5991414d3c5 image=docker.io/falcosecurity/event-generator) k8s.ns=default k8s.pod=unsecure-falco-example container=b5991414d3c5 k8s.ns=default k8s.pod=unsecure-falco-example container=b5991414d3c5 k8s.ns=default k8s.pod=unsecure-falco-example container=b5991414d3c5

Can you give us more details on how the pods are deployed? It would be useful if you could try the kubectl run command posted above and see whether the problem also happens with it. Be aware that the event-generator container triggers Falco a lot and is not safe to run in production environments.

@leodido
Member

leodido commented May 21, 2020

Also, in our debugging session, we have containerd version:

github.com/containerd/containerd 1.2.8 a4bc1d432a2c33aa2eed37f338dceabb93641310

Could you report the version you have in your setup?

@dpittner

Maybe this helps to pin-point things further, I'm seeing the same as @plasticine without BPF enabled on my env (falco 0.23.0):
{
"machine": "x86_64",
"nodename": "falco-2jd7j",
"release": "4.4.0-177-generic",
"sysname": "Linux",
"version": "#207-Ubuntu SMP Mon Mar 16 01:16:10 UTC 2020"
}
it's an IBM Cloud IKS 1.15

@plasticine
Author

plasticine commented May 29, 2020 via email

@stale

stale bot commented Jul 29, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. Issues labeled "cncf", "roadmap" and "help wanted" will not be automatically closed. Please refer to a maintainer to get such label added if you think this should be kept open.

@stale stale bot added wontfix and removed wontfix labels Jul 29, 2020
@poiana poiana closed this as completed Jul 29, 2020
@poiana

poiana commented Jul 29, 2020

@fntlnz: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@fntlnz
Contributor

fntlnz commented Jul 29, 2020

Oh no, I didn't want to close this!

/reopen

@poiana poiana reopened this Jul 29, 2020
@poiana

poiana commented Jul 29, 2020

@fntlnz: Reopened this issue.


@plasticine
Author

@fntlnz Sorry 🙈 I’ve been meaning to try and get back to this for ages, I’ll try and make time this week! I’ve been wondering if there is something about our gke k8s that is tripping things up; it’s a stretch, but I’m wondering if binary auth is possibly in the mix, as that’s something we leverage heavily.

@fntlnz
Contributor

fntlnz commented Jul 29, 2020

Thanks @plasticine - no need to be sorry, time flies! Thanks for your help!

@stale

stale bot commented Sep 27, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. Issues labeled "cncf", "roadmap" and "help wanted" will not be automatically closed. Please refer to a maintainer to get such label added if you think this should be kept open.

@stale stale bot added the wontfix label Sep 27, 2020
@leogr
Member

leogr commented Sep 28, 2020

Is this problem still present in Falco 0.25.0 ?

@stale stale bot removed the wontfix label Sep 28, 2020
@lucasteligioridis

Looks like this issue still exists on 0.26.2, just tested it now with the exact configuration that @plasticine has.

@lucasteligioridis

Are there any specific things I can provide to make debugging this easier? From our configuration that is.

@lucasteligioridis

@plasticine and I have a suspicion that this is related to having binary authorization enabled on our GKE clusters.
https://cloud.google.com/binary-authorization

@fntlnz was binary auth enabled when you tested this with your GKE configuration?

@leogr
Member

leogr commented Nov 24, 2020

Hey @lucasteligioridis

Could you share the detailed steps to reproduce the problem?

Thanks in advance!

@lucasteligioridis

@leogr We literally just create a new GKE cluster, now running 1.17, with Binary Authorization enabled, and then deploy the Falco workload.

You should then be able to replicate the issue as per the original post.

@poiana

poiana commented Feb 23, 2021

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

@poiana

poiana commented Mar 25, 2021

Stale issues rot after 30d of inactivity.

Mark the issue as fresh with /remove-lifecycle rotten.

Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle rotten

@poiana

poiana commented Apr 24, 2021

Rotten issues close after 30d of inactivity.

Reopen the issue with /reopen.

Mark the issue as fresh with /remove-lifecycle rotten.

Provide feedback via https://github.com/falcosecurity/community.
/close

@poiana

poiana commented Apr 24, 2021

@poiana: Closing this issue.


@poiana poiana closed this as completed Apr 24, 2021
@sebeliasson

I'm also seeing this behavior on Azure AKS v1.19.7. However, so far I've seen it only from the Nginx Ingress Controller. Other containers seem to work fine and output %container.image.repository and %container.image.tag correctly.

I tried to create a reproducible scenario on minikube but was unable to: everything worked fine there.

Falco info

* Setting up /usr/src links from host
* Running falco-driver-loader for: falco version=0.28.1, driver version=5c0b863ddade7a45568c0ac97d037422c9efb750
* Running falco-driver-loader with: driver=module, compile=yes, download=yes
* Unloading falco module, if present
* Trying to load a system falco module, if present
* Success: falco module found and loaded with modprob

Node info

{
  "architecture": "amd64",
  "bootID": "",
  "containerRuntimeVersion": "containerd://1.5.0-beta.git31a0f92df+azure",
  "kernelVersion": "5.4.0-1043-azure",
  "kubeProxyVersion": "v1.19.7",
  "kubeletVersion": "v1.19.7",
  "machineID": "",
  "operatingSystem": "linux",
  "osImage": "Ubuntu 18.04.5 LTS",
  "systemUUID": ""
}

Example alert

14:50:35.363121296: Notice Unexpected connection to K8s API Server from container (command=nginx-ingress-c --publish-service=test/lb-ingress-nginx-controller --election-id=ingress-controller-leader --ingress-class=nginx --configmap=test/lb-ingress-nginx-controller --validating-webhook=:8443 --validating-webhook-certificate=/usr/local/certificates/cert --validating-webhook-key=/usr/local/certificates/key --default-ssl-certificate=[masked] k8s.ns=test k8s.pod=lb-ingress-nginx-controller-597d69f489-gkt42 container=93b73c8ba8be image=sha256:0975b5aefeaca5f8398cf4c591b2e0024184839e3bf780e843b0c17ecd7a85e6 connection=10.0.40.30:54132->10.250.0.1:443) k8s.ns=test k8s.pod=lb-ingress-nginx-controller-597d69f489-gkt42 container=93b73c8ba8be k8s.ns=test k8s.pod=lb-ingress-nginx-controller-597d69f489-gkt42 container=93b73c8ba8be

Nginx Ingress Controller deployed like so

resource "helm_release" "ingress-controller-blue" {
    count             = 1
    name              = "lb"
    repository        = "https://kubernetes.github.io/ingress-nginx"
    chart             = "ingress-nginx"
    version           = "3.27.0"
    namespace         = "test"
    create_namespace  = true
}

@sebeliasson

/reopen

@poiana

poiana commented May 18, 2021

@ejderdal: You can't reopen an issue/PR unless you authored it or you are a collaborator.


@Andreagit97
Member

The issue seems to be still present: https://kubernetes.slack.com/archives/CMWH3EH32/p1681661535622169

@Andreagit97 Andreagit97 reopened this Apr 17, 2023
@Andreagit97 Andreagit97 added this to the 0.36.0 milestone Apr 17, 2023
@incertum
Contributor

Hi @plasticine 👋

We patched the container engine in this regard a bit here https://github.com/falcosecurity/libs/pull/771/files.

I also sometimes see sha256 as container.image.repository. I can help look 👀 into it; it was already on my list.

@sigurdfalk

For more testing context: some of our images also use the format <registry>/<repository>/<image>:<tag>@sha256:<digest> 😊

@incertum
Contributor

Amazing, ty! I will start looking into it next week after KubeCon and will ping you on Slack as well 🙏

@incertum
Contributor

@sigurdfalk I tagged you in the PR. I added backup lookups; beyond those I wouldn't know where else to extract the image from, as I searched the entire container status response. It certainly isn't a Falco bug: sometimes the reported image simply is sha256. I queried Kubernetes audit logs to confirm this. What I don't know, however, is whether in such corner cases the image from the annotations would also just be sha256. In that case it would be game over.

@poiana

poiana commented May 29, 2023

Rotten issues close after 30d of inactivity.

Reopen the issue with /reopen.

Mark the issue as fresh with /remove-lifecycle rotten.

Provide feedback via https://github.com/falcosecurity/community.
/close

@poiana poiana closed this as completed May 29, 2023
@poiana

poiana commented May 29, 2023

@poiana: Closing this issue.


@leogr
Member

leogr commented Jun 6, 2023

@incertum @FedeDP

My understanding is that this issue has been fixed in 0.35.0. Is that correct?

@FedeDP
Contributor

FedeDP commented Jun 6, 2023

Yep, Melissa fixed that, if I remember correctly!

@incertum
Contributor

incertum commented Jun 6, 2023

Correct. @gnosek's PR had the biggest impact (falcosecurity/libs#771), but to increase robustness further I added backup lookups (falcosecurity/libs#1067) for the Kubernetes cases (cri, containerd and cri-o).

In summary we now try to look up the container image from all possible places in the container status response, especially for the Kubernetes use case.
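
As a rough illustration of the backup-lookup idea (not the actual libs code), the sketch below walks a list of candidate image strings and prefers the first one that is not a bare digest. The candidate order, the dictionary keys and the example image name are all hypothetical and do not mirror the real container status response.

```python
from typing import Iterable, Optional


def is_bare_digest(value: str) -> bool:
    """True for values like 'sha256:<hex>' that carry no repository name."""
    return value.startswith("sha256:")


def pick_image_name(candidates: Iterable[Optional[str]]) -> Optional[str]:
    """Return the first candidate with a real name, else the first digest seen."""
    fallback = None
    for value in candidates:
        if not value:
            continue  # missing or empty field, try the next source
        if not is_bare_digest(value):
            return value
        if fallback is None:
            fallback = value  # keep the digest in case nothing better turns up
    return fallback


# Hypothetical field names and values, for illustration only.
status = {
    "primary_image": "sha256:0975b5ae",
    "secondary_image": "",
    "annotation_image": "registry.example/some-image:1.0",
}
print(pick_image_name(status.values()))
```

If every source only offers a bare digest, the function returns that digest, which matches the "game over" corner case discussed earlier in the thread.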

We can mark this as completed for 0.35.0 and should there still be issues, we can continue working on it.

@FedeDP
Contributor

FedeDP commented Jun 6, 2023

/milestone 0.35.0

@poiana poiana modified the milestones: 0.36.0, 0.35.0 Jun 6, 2023