[FEATURE] Modify capabilities on injected container #282

Closed
tspearconquest opened this issue Nov 11, 2021 · 9 comments
Labels
enhancement New feature or request

Comments

@tspearconquest
Contributor

tspearconquest commented Nov 11, 2021

Is your feature request related to a problem? Please describe.

  • We want to further harden our infrastructure by adding an admission hook that prevents pods from being deployed to the cluster unless every container drops certain capabilities: via Pod.spec.containers.securityContext.capabilities.drop for containers and via Pod.spec.initContainers.securityContext.capabilities.drop for init containers, in the Pod or Deployment manifest.
  • Currently, AKV2K8S gives us no way to modify the manifest of the injected init container, so we cannot roll out this requirement: the admission controller would start blocking our pods because of the injected init container.

Describe the solution you'd like

  • We would like AKV2K8S-injected containers to drop, by default, all capabilities they do not need, or for AKV2K8S to provide a way to configure this via environment variables on the environment injector deployment.

Describe alternatives you've considered

  • None. This is a security requirement, and so is AKV2K8S.

Additional context

  • Please note that the namespace and other private details in the examples below have been changed from our real data for security purposes, but the overall manifests are very similar to what you see below.
  • The admission controller is configured via the Gatekeeper project, which we already run in our cluster for pods that do not require AKV2K8S.
  • Gatekeeper provides a CustomResourceDefinition that lets you create custom resources of kind: ConstraintTemplate.
  • The Constraint Template uses the Rego language to define the requirements of a Constraint for the K8S Admission Controller to follow.
    • This template is used by a second manifest (kind: Constraint) applied on each namespace to define the constraints on that namespace. I will share the constraint manifest further down.
  • We already use the Constraint Template below for namespaces that contain only pods that don't require AKV2K8S. For namespaces that do contain pods requiring AKV2K8S, we have to configure our CI pipelines to deploy neither this Constraint Template nor the Constraint. This reduces security in those namespaces, because containers there are not forced to drop capabilities they don't need:
apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  annotations:
    description: Controls Linux capabilities.
  name: k8sallowedcapabilities
spec:
  crd:
    spec:
      names:
        kind: k8sallowedcapabilities
      validation:
        openAPIV3Schema:
          properties:
            allowedCapabilities:
              items:
                type: string
              type: array
            requiredDropCapabilities:
              items:
                type: string
              type: array
  targets:
  - rego: |
      package k8sallowedcapabilities

      violation[{"msg": msg}] {
        initcontainer := input_init_containers[_]
        initcontainer_has_disallowed_capabilities(initcontainer)
        msg := sprintf("init container <%v> has a disallowed capability <%v>. Allowed capabilities are %v", [initcontainer.name, initcontainer.securityContext.capabilities.add, get_default(input.parameters, "initContainerAllowedCapabilities", "NONE")])
      }

      initcontainer_has_disallowed_capabilities(container) {
        allowed := {c | c := input.parameters.initContainerAllowedCapabilities[_]}
        not allowed["*"]
        capabilities := {c | c := container.securityContext.capabilities.add[_]}
        count(capabilities - allowed) > 0
      }

      violation[{"msg": msg}] {
        initcontainer := input_init_containers[_]
        initcontainer_missing_drop_capabilities(initcontainer)
        msg := sprintf("container <%v> is not dropping all required capabilities. Container must drop all of %v", [initcontainer.name, input.parameters.initContainerRequiredDropCapabilities])
      }

      initcontainer_missing_drop_capabilities(container) {
        must_drop := {c | c := input.parameters.initContainerRequiredDropCapabilities[_]}
        dropped := {c | c := container.securityContext.capabilities.drop[_]}
        count(must_drop - dropped) > 0
      }

      input_init_containers[c] {
          c := input.review.object.spec.initContainers[_]
      }

      violation[{"msg": msg}] {
        container := input_containers[_]
        container_has_disallowed_capabilities(container)
        msg := sprintf("container <%v> has a disallowed capability <%v>. Allowed capabilities are %v", [container.name, container.securityContext.capabilities.add, get_default(input.parameters, "allowedCapabilities", "NONE")])
      }

      container_has_disallowed_capabilities(container) {
        allowed := {c | c := input.parameters.allowedCapabilities[_]}
        not allowed["*"]
        capabilities := {c | c := container.securityContext.capabilities.add[_]}
        count(capabilities - allowed) > 0
      }

      violation[{"msg": msg}] {
        container := input_containers[_]
        container_missing_drop_capabilities(container)
        msg := sprintf("container <%v> is not dropping all required capabilities. Container must drop all of %v", [container.name, input.parameters.requiredDropCapabilities])
      }

      container_missing_drop_capabilities(container) {
        must_drop := {c | c := input.parameters.requiredDropCapabilities[_]}
        dropped := {c | c := container.securityContext.capabilities.drop[_]}
        count(must_drop - dropped) > 0
      }

      input_containers[c] {
          c := input.review.object.spec.containers[_]
      }

      get_default(obj, param, _default) = out {
        out = obj[param]
      }

      get_default(obj, param, _default) = out {
        not obj[param]
        not obj[param] == false
        out = _default
      }

    target: admission.k8s.gatekeeper.sh

  • Here is the constraint manifest:
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: k8sallowedcapabilities
metadata:
  annotations:
    description: Kubernetes cluster containers should only use allowed capabilities.
  name: pod-allowed-capabilities
spec:
  enforcementAction: deny
  match:
    namespaces:
    - akv2k8s-pods
    kinds:
    - apiGroups:
      - ""
      kinds:
      - Pod
  parameters:
    initContainerAllowedCapabilities: []
    allowedCapabilities: []
    initContainerRequiredDropCapabilities:
    - ALL
    requiredDropCapabilities:
    - ALL

  • Here is the initContainers section of the pod manifest, showing the init container injected by the environment injector. We get this by running kubectl get pod/pod-name -n akv2k8s-pods -o yaml on one of the pods in the akv2k8s-pods namespace. We can only modify certain things, such as the image URL and tag, via the environment variables on the env injector pod:
  initContainers:
  - command:
    - sh
    - -c
    - cp /usr/local/bin/azure-keyvault-env /azure-keyvault/
    image: registry.gitlab.com/private/registry/akv2k8s/azure-keyvault-env:v1.3.0
    imagePullPolicy: IfNotPresent
    name: copy-azurekeyvault-env
    resources:
      limits:
        cpu: 50m
        memory: 100Mi
      requests:
        cpu: 20m
        memory: 50Mi
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /azure-keyvault/
      name: azure-keyvault-env
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: pod-token-2p7fx
      readOnly: true

  • We need the above manifest to look like this instead:
  initContainers:
  - name: copy-azurekeyvault-env
    command:
    - sh
    - -c
    - cp /usr/local/bin/azure-keyvault-env /azure-keyvault/
    image: registry.gitlab.com/private/registry/akv2k8s/azure-keyvault-env:v1.3.0
    imagePullPolicy: IfNotPresent
    securityContext:      # <-- newly added lines start here
      capabilities:
        drop:
        - ALL             # <-- newly added lines end here
    resources:
      limits:
        cpu: 50m
        memory: 100Mi
      requests:
        cpu: 20m
        memory: 50Mi
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /azure-keyvault/
      name: azure-keyvault-env
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: pod-token-2p7fx
      readOnly: true

  • The output below is what we see when we try to deploy a pod with this constraint applied to the namespace:
[denied by pod-allowed-capabilities] container <copy-azurekeyvault-env> is not dropping all required capabilities. Container must drop all of ["ALL"]
Warning FailedCreate 10m (x8 over 11m) replicaset-controller Error creating: admission webhook "validation.gatekeeper.sh" denied the request: [denied by pod-allowed-capabilities] container <copy-azurekeyvault-env> is not dropping all required capabilities. Container must drop all of ["ALL"]

I should also add that we do not necessarily have to drop "ALL" capabilities: we can configure the constraint so that it only forces the dropping of the capabilities that are not required, rather than "ALL".
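
As a rough illustration of that last point, the constraint could list specific capabilities instead of ALL. This is only a sketch: the constraint name is hypothetical and the capability names are placeholders, not a recommendation. One caveat follows from the template above: it compares the literal strings in each container's drop list against the required list (must_drop - dropped), so under this sketch a container that drops only ALL would still be rejected, because ALL is not expanded into individual capabilities.

```yaml
# Sketch only: the name and the capability list are illustrative assumptions.
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: k8sallowedcapabilities
metadata:
  name: pod-drop-selected-capabilities   # hypothetical name
spec:
  enforcementAction: deny
  match:
    namespaces:
    - akv2k8s-pods
    kinds:
    - apiGroups:
      - ""
      kinds:
      - Pod
  parameters:
    initContainerAllowedCapabilities: []
    allowedCapabilities: []
    # Every container must list each of these strings under
    # securityContext.capabilities.drop; "ALL" alone would not match.
    initContainerRequiredDropCapabilities:
    - NET_RAW
    - SYS_ADMIN
    requiredDropCapabilities:
    - NET_RAW
    - SYS_ADMIN
```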

Additionally, I would like to point out that this is not an issue for the webhook pod itself, since we are able to modify the manifest for that pod. Only the manifest injected by the environment injector, which is generated from the Go code in this project, cannot be modified.

@tspearconquest tspearconquest added the enhancement New feature or request label Nov 11, 2021
@Xtr102

Xtr102 commented Feb 9, 2023

We have similar requirements. Also using Gatekeeper, but with different constraints.

Do you know if this has been addressed or if someone is working on it?

@torresdal
Collaborator

This is something we're planning on looking into now. We would like to drop all capabilities, but it's not 100% clear to me at the moment which ones to add back. Any pointers here could save us a lot of time. Thanks.

@Speeddymon

I would start by looking at the capabilities list here: https://man7.org/linux/man-pages/man7/capabilities.7.html

Check if any of those match anything in the code of the "wrapper" that runs in the entrypoint of the container once the webhook does its mutation.

If none match, dropping all will probably work.
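
A minimal sketch of the drop-all-then-add-back pattern being discussed, written as a container securityContext fragment. NET_BIND_SERVICE is purely a placeholder; nothing in this thread establishes that the wrapper needs it or any other capability. Note also that with the constraint template from the issue description, any capability listed under add would have to appear in allowedCapabilities (or initContainerAllowedCapabilities for init containers).

```yaml
# Fragment of a container spec, not a complete manifest.
securityContext:
  capabilities:
    drop:
    - ALL
    # Add back only what the process is actually found to need.
    # NET_BIND_SERVICE is a hypothetical example.
    add:
    - NET_BIND_SERVICE
```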

@Speeddymon

For the init container, I believe dropping all should work without any issue, because it's just using the shell to copy the file into the regular container(s).

@torresdal
Collaborator

Is this something you would like to look into @Olsenius? You have access to more test clusters than me at the moment.

@HammerNL89

We are using gatekeeper on our AKS clusters and are also interested in an option to configure the securityContext for the initContainer.
Related to this question: #498

@Speeddymon

@HammerNL89 I'm going to investigate soon whether we can accomplish this using Gatekeeper Mutations while we wait for the feature from the project.

@YouJinTou
Contributor

We've added a pull request for this:

#548

@tspearconquest
Contributor Author

tspearconquest commented Jul 20, 2023

I am good to close out this issue, and #498 can also be closed, as the feature was released in webhook-1.5.0.

For those who cannot immediately update to webhook version 1.5.0, I offer the following workaround if you are using Gatekeeper 3.10.0 in your cluster. The two manifests below use Gatekeeper's Assign mutation resource to override the injected container's security context (and image pull policy).

Apply both manifests to your cluster, then restart your workloads to get the updated injection.

As a side note: You may want to also consider changing your Gatekeeper mutating webhook's reinvocationPolicy from Never to IfNeeded to ensure that the mutating webhook will re-review the pod and mutate it after akv2k8s has, itself, mutated it.

apiVersion: mutations.gatekeeper.sh/v1
kind: Assign
metadata:
  name: assign-akv-injected-container-image-pull-policy
spec:
  applyTo:
  - groups:
    - ""
    kinds:
    - Pod
    versions:
    - v1
  location: 'spec.initContainers[name: copy-azurekeyvault-env].imagePullPolicy'
  match:
    excludedNamespaces:
    - kube-node-lease
    - kube-public
    - kube-system
    kinds:
    - apiGroups:
      - ""
      kinds:
      - Pod
    namespaceSelector:
      matchExpressions:
      - key: azure-key-vault-env-injection
        operator: Exists
    scope: Namespaced
  parameters:
    assign:
      value: Always
    pathTests:
    - subPath: "spec.initContainers[name: copy-azurekeyvault-env]"
      condition: MustExist
---
apiVersion: mutations.gatekeeper.sh/v1
kind: Assign
metadata:
  name: assign-akv-injected-container-security-context
spec:
  applyTo:
  - groups:
    - ""
    kinds:
    - Pod
    versions:
    - v1
  location: 'spec.initContainers[name: copy-azurekeyvault-env].securityContext'
  match:
    excludedNamespaces:
    - kube-node-lease
    - kube-public
    - kube-system
    kinds:
    - apiGroups:
      - ""
      kinds:
      - Pod
    namespaceSelector:
      matchExpressions:
      - key: azure-key-vault-env-injection
        operator: Exists
    scope: Namespaced
  parameters:
    assign:
      value:
        allowPrivilegeEscalation: false
        capabilities:
          drop:
          - ALL
        privileged: false
        readOnlyRootFilesystem: true
        runAsGroup: 10000
        runAsNonRoot: true
        runAsUser: 10000
        seccompProfile:
          type: RuntimeDefault
    pathTests:
    - subPath: "spec.initContainers[name: copy-azurekeyvault-env]"
      condition: MustExist
    - subPath: "spec.initContainers[name: copy-azurekeyvault-env].securityContext"
      condition: MustNotExist
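
For the reinvocationPolicy side note above, the field lives on each webhook entry of a MutatingWebhookConfiguration. The fragment below is a sketch: the object and webhook names are what a default Gatekeeper install typically uses, but treat them as assumptions and check your own cluster (for example with kubectl get mutatingwebhookconfigurations). Required fields such as clientConfig, sideEffects, and admissionReviewVersions are omitted, so this is not meant to be applied as-is.

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: gatekeeper-mutating-webhook-configuration  # assumed default name
webhooks:
- name: mutation.gatekeeper.sh                      # assumed default name
  # Never (the Kubernetes default): the webhook is called once per admission.
  # IfNeeded: the webhook may be called again if another admission plugin,
  # such as the akv2k8s injector, modified the pod after the first call.
  reinvocationPolicy: IfNeeded
```

If Gatekeeper is managed through Helm or a similar tool, a manual edit to this object may be overwritten on the next sync, so the change is probably better made in that tool's configuration.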
