
Trivy operator behaviour when containers are injected into deployment pods by admission controllers #1745

Open
andriktr opened this issue Jan 5, 2024 · 15 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. priority/backlog Higher priority than priority/awaiting-more-evidence. target/kubernetes Issues relating to kubernetes cluster scanning

Comments

@andriktr

andriktr commented Jan 5, 2024

Hello,
When we run a deployment in k8s, trivy-operator looks at the ReplicaSet to get the images for vulnerability scanning. The problem I see is that this does not cover cases where a mutating admission controller injects containers into the pod. A good example is injecting service mesh sidecars into an application: one or more containers are added to the pod, but the ReplicaSet normally remains unchanged and contains only the application image in its definition. As a result, the running pods contain the application containers plus the containers injected by admission controllers, and since trivy-operator looks only at the ReplicaSet, only the application containers are scanned.

Here is a grep for the images in the ReplicaSet:

[screenshot: kubectl output listing a single image]

As you can see, there is only one image.

And here we grep the images from the pods controlled by this ReplicaSet:

[screenshot: kubectl output listing three images]

As you can see, there are two additional images there: consul-dataplane:1.2.3 is the Consul service mesh sidecar container image, and consul-k8s-control-plane:1.2.3 is an init container image. Both of these images are injected into the pod by an admission controller, and neither is scanned by trivy-operator.

Any suggestions or opinions?

Thank you.

@andriktr andriktr added the kind/bug Categorizes issue or PR as related to a bug. label Jan 5, 2024
@chen-keinan
Collaborator

chen-keinan commented Jan 8, 2024

@andriktr that is a very good point. Is there a way to know that the service mesh has added sidecars, e.g. via labels or annotations?

@chen-keinan chen-keinan added kind/feature Categorizes issue or PR as related to a new feature. priority/backlog Higher priority than priority/awaiting-more-evidence. target/kubernetes Issues relating to kubernetes cluster scanning and removed kind/bug Categorizes issue or PR as related to a bug. labels Jan 8, 2024
@andriktr
Author

andriktr commented Jan 8, 2024

I believe this depends on the mesh solution, but typically you enable the service mesh by adding an annotation to the deployment/pod.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: static-client
  namespace: consul-test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: static-client
  template:
    metadata:
      name: static-client
      labels:
        app: static-client
      annotations:
        'consul.hashicorp.com/connect-inject': 'true'
....

However, it is also possible to enable injection by default for all workloads in the cluster, and I'm not sure whether the annotations are added in that case as well, but most probably yes.

@chen-keinan
Collaborator

I'm looking for an indicator which will help decide when to scan the pod and when to scan the ReplicaSet.

@andriktr
Author

andriktr commented Jan 8, 2024

It will probably be hard to have one indicator for all possible injection cases. Maybe it is possible to compare the ReplicaSet and the final pod: if the container count in the ReplicaSet equals the pod's container count, scan the ReplicaSet, otherwise scan the pod.

@alfsch

alfsch commented Feb 27, 2024

Comparing image names between the ReplicaSet and the final pods seems to be the only way to avoid trouble with sidecars injected by service meshes or mutating webhooks.
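The comparison suggested above could be sketched roughly like this (an illustrative Go sketch, not trivy-operator code; the static-client:1.0 image name is hypothetical, the Consul images are the ones from the screenshots in this thread):

```go
package main

import "fmt"

// extraImages returns the images present in the live pod but absent from the
// ReplicaSet template. A non-empty result suggests an admission controller
// injected containers, so the pod rather than the ReplicaSet should drive
// the scan.
func extraImages(replicaSetImages, podImages []string) []string {
	inTemplate := make(map[string]bool)
	for _, img := range replicaSetImages {
		inTemplate[img] = true
	}
	var extra []string
	for _, img := range podImages {
		if !inTemplate[img] {
			extra = append(extra, img)
		}
	}
	return extra
}

func main() {
	rs := []string{"static-client:1.0"} // image from the ReplicaSet template
	pod := []string{ // images observed on the running pod
		"static-client:1.0",
		"consul-dataplane:1.2.3",
		"consul-k8s-control-plane:1.2.3",
	}
	fmt.Println(extraImages(rs, pod))
	// → [consul-dataplane:1.2.3 consul-k8s-control-plane:1.2.3]
}
```

Comparing full image names rather than just container counts also catches the case where a webhook replaces an image instead of adding one.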

@chen-keinan
Collaborator

chen-keinan commented Mar 18, 2024

@andriktr @alfsch I have created fix #1917, which will still scan the app container when the sidecar is failing.
It's not perfect, but at least you'll get a vulnerability report for the main container.

Let me know if that is sufficient.

@andriktr
Author

andriktr commented Mar 18, 2024

Hmm... this issue is not about failing containers, but about the fact that a sidecar container is not scanned when it is injected by admission controller(s). Does your fix cover this as well?

@chen-keinan
Collaborator

chen-keinan commented Mar 18, 2024

> Hmm... this issue is not about failing containers, but about the fact that a sidecar container is not scanned when it is injected by admission controller(s). Does your fix cover this as well?

No, unfortunately. I thought it might help because, until now, if a pod had more than one container and one was failing while the other passed, you would not get any report for either container.

@alfsch

alfsch commented Mar 18, 2024

@andriktr the above-mentioned annotations and labels aren't added in all cases. If you use Kyverno image mutations, the mutation lives in a Kyverno rule resource and there is no annotation that can give a hint.
In the case of the Istio service mesh, injection can be enabled globally without any annotation or label, with a label on the namespace, or with a label on the workload's pod template.

@andriktr
Author

> @andriktr the above-mentioned annotations and labels aren't added in all cases. If you use Kyverno image mutations, the mutation lives in a Kyverno rule resource and there is no annotation that can give a hint. In the case of the Istio service mesh, injection can be enabled globally without any annotation or label, with a label on the namespace, or with a label on the workload's pod template.

This is actually obvious and depends on the solution. Actually, the most effective way would be to just query for the unique running images in a namespace and scan them, not rely on the ReplicaSet definition at all.

@chen-keinan
Collaborator

> This is actually obvious and depends on the solution. Actually, the most effective way would be to just query for the unique running images in a namespace and scan them, not rely on the ReplicaSet definition at all.

trivy-operator does not perform queries; it follows the operator pattern (event-based), reacting to every resource creation, update, and deletion.

@andriktr
Author

> trivy-operator does not perform queries; it follows the operator pattern (event-based), reacting to every resource creation, update, and deletion.

Then either the ReplicaSet and the final pod should be compared, or the images to scan should be taken from the pod only; if more than one pod runs the same image, the report should be deduplicated to avoid repeated information.

Alternatively, the operator behaviour could be changed to track the running images in a namespace instead of the running ReplicaSets.
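The namespace-tracking idea above could look roughly like this (an illustrative Go sketch using a minimal stand-in Pod type, not trivy-operator or client-go code; pod names and the static-client image are hypothetical, the Consul image is from this thread):

```go
package main

import (
	"fmt"
	"sort"
)

// Pod is a minimal stand-in for a Kubernetes Pod (illustrative only).
type Pod struct {
	Name   string
	Images []string // images of init + regular containers, post-mutation
}

// uniqueImages returns the sorted set of distinct images running in a
// namespace, regardless of which controller or webhook added them.
// Deduplication ensures one report per image, not one per pod replica.
func uniqueImages(pods []Pod) []string {
	set := make(map[string]struct{})
	for _, p := range pods {
		for _, img := range p.Images {
			set[img] = struct{}{}
		}
	}
	out := make([]string, 0, len(set))
	for img := range set {
		out = append(out, img)
	}
	sort.Strings(out)
	return out
}

func main() {
	pods := []Pod{
		{Name: "static-client-abc", Images: []string{"static-client:1.0", "consul-dataplane:1.2.3"}},
		{Name: "static-client-def", Images: []string{"static-client:1.0", "consul-dataplane:1.2.3"}},
	}
	fmt.Println(uniqueImages(pods))
	// → [consul-dataplane:1.2.3 static-client:1.0]
}
```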

@alfsch

alfsch commented Mar 18, 2024

@andriktr #1872 (comment) describes a case which also has to be handled. The truth is only in the pods running in a namespace, or in the relation between the higher-level workload descriptions (deployments/ReplicaSets/...) and their pods.

@jutley

jutley commented Mar 27, 2024

I have a couple of thoughts on how this could be handled. Like @andriktr, I am also missing scans on some images that only get added to pods via mutations.

The brute-force approach would be to watch all pods (I think this is already happening), get the set of images in them, and compare that set to the controller's (the ReplicaSet's, for example). If there are additional images in the pod's set, add them to the scan.

A slightly more elegant (but still incomplete) approach would be to leverage Kubernetes v1.28's support for sidecar containers. The mechanism is to run sidecar containers as init containers with restartPolicy set to Always. This capability is protected by a feature flag and is enabled by default in v1.29. However, many clusters won't have this capability, and even when they do, many mutating webhooks will not use it. This may be a good approach a year from now, but not for today.
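For reference, a native sidecar under Kubernetes 1.28+ is declared as an init container with restartPolicy: Always, which makes it visible in the workload spec itself and therefore scannable without pod diffing. An illustrative manifest, reusing the image names from this thread (static-client:1.0 is hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: static-client
spec:
  initContainers:
    - name: consul-dataplane
      image: consul-dataplane:1.2.3
      restartPolicy: Always   # marks this init container as a native sidecar
  containers:
    - name: static-client
      image: static-client:1.0
```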

@chen-keinan
Collaborator

> A slightly more elegant (but still incomplete) approach would be to leverage Kubernetes v1.28's support for sidecar containers.

@jutley thanks for this input. I'll have a look to see how it fits our operator.
