
No automated way to wait on constraint template CRD upgrades before updating constraints #1499

Closed
Boojapho opened this issue Aug 14, 2021 · 7 comments
Labels: bug (Something isn't working), triaged

Boojapho commented Aug 14, 2021

Use case:

I am using a tool like flux or argo to manage deployments into a Kubernetes cluster. The Gatekeeper system is in one Helm deployment and the Gatekeeper constraints are in another Helm deployment with the latter dependent on the former.

On the first deployment, I can use helm with --wait to ensure that the Gatekeeper system is up and running and all CRDs are deployed before the Helm chart for the constraints runs.

Let's say I add a field to the CRD for a constraint template. On my next helm upgrade, the CRD must be in place before the constraint that uses that field gets deployed or it will fail due to schema validation. Since Helm sees the Gatekeeper deployment as already up and running with no changes and the ConstraintTemplate resource is deployed, it completes quickly. But, Gatekeeper has not processed the new ConstraintTemplate. The Constraint is then deployed next, but fails because the CRD doesn't have the new field.

There needs to be a way to identify whether the CRDs are up to date with the ConstraintTemplates, so the constraint Helm chart knows whether to proceed or wait. Some ideas:

  • On upgrade, force the deployment to go not-ready until Gatekeeper has had a chance to re-check all ConstraintTemplates. This would allow --wait to block until everything is processed before continuing. Maybe labels/annotations could do this.
  • Create an API query on Gatekeeper that reports whether it is in sync or out of sync, then use an init container in the constraint Helm chart to check this before continuing.
@Boojapho Boojapho added the bug Something isn't working label Aug 14, 2021
Boojapho (Author) commented:

For reference, I worked around this by adding the chart version to the template labels for the controller pod. Every time I bump the chart version due to a new constraint template, it re-rolls the controller pods (in a rolling update) which will complete after all of the new templates are read in. It is not ideal since the templates may not have changed, but it only takes a few seconds extra. The --wait option will then hold up any future helm charts (e.g. constraints) that are dependent on the changes.
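A minimal sketch of that workaround, assuming a standard Helm chart layout (the label key `chart-version` is illustrative):

```yaml
# In the Gatekeeper controller Deployment template of the chart:
spec:
  template:
    metadata:
      labels:
        # Bumping .Chart.Version changes this label, which forces a
        # rolling update of the controller pods; `helm upgrade --wait`
        # then blocks until the new pods (which re-read every
        # ConstraintTemplate on startup) are Ready.
        chart-version: {{ .Chart.Version | quote }}
```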

Another option I tried that worked was to create a checksum annotation of all the constraint templates on the controller pods in the deployment. But, this would require upkeep when constraint templates are added, removed, or renamed. I opted to keep it simple.
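That alternative sounds like the checksum-annotation pattern from the Helm docs, roughly as below; the file path and annotation name are illustrative, and each template file would need to be listed explicitly (hence the upkeep when templates are added, removed, or renamed):

```yaml
spec:
  template:
    metadata:
      annotations:
        # Any change to the listed ConstraintTemplate file changes the
        # checksum, which triggers a rolling restart of the controller pods.
        checksum/constraint-templates: {{ include (print $.Template.BasePath "/k8srequiredlabels.yaml") . | sha256sum }}
```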

maxsmythe commented Aug 16, 2021

I'm wondering if status.byPod would be a good fit for this?

status.byPod[].observedGeneration should equal metadata.generation for all pods once all pods have ingested the constraint template, which includes the updating of the constraint CRD.

In practice, any pod showing the correct observedGeneration should be sufficient for the code as currently written, but blocking on all pods is safer.

Here is an example constraint template:

apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"templates.gatekeeper.sh/v1beta1","kind":"ConstraintTemplate","metadata":{"annotations":{},"name":"k8srequiredlabels"},"spec":{"crd":{"spec":{"names":{"kind":"K8sRequiredLabels"},"validation":{"openAPIV3Schema":{"properties":{"labels":{"items":{"properties":{"allowedRegex":{"type":"string"},"key":{"type":"string"}},"type":"object"},"type":"array"},"message":{"type":"string"}}}}}},"targets":[{"rego":"package k8srequiredlabels\n\nget_message(parameters, _default) = msg {\n  not parameters.message\n  msg := _default\n}\n\nget_message(parameters, _default) = msg {\n  msg := parameters.message\n}\n\nviolation[{\"msg\": msg, \"details\": {\"missing_labels\": missing}}] {\n  provided := {label | input.review.object.metadata.labels[label]}\n  required := {label | label := input.parameters.labels[_].key}\n  missing := required - provided\n  count(missing) \u003e 0\n  def_msg := sprintf(\"you must provide labels: %v\", [missing])\n  msg := get_message(input.parameters, def_msg)\n}\n\nviolation[{\"msg\": msg}] {\n  value := input.review.object.metadata.labels[key]\n  expected := input.parameters.labels[_]\n  expected.key == key\n  # do not match if allowedRegex is not defined, or is an empty string\n  expected.allowedRegex != \"\"\n  not re_match(expected.allowedRegex, value)\n  def_msg := sprintf(\"Label \u003c%v: %v\u003e does not satisfy allowed regex: %v\", [key, value, expected.allowedRegex])\n  msg := get_message(input.parameters, def_msg)\n}\n","target":"admission.k8s.gatekeeper.sh"}]}}
  creationTimestamp: "2021-08-16T23:01:44Z"
  generation: 1
  name: k8srequiredlabels
  resourceVersion: "1010332"
  uid: d8e7bf94-da3a-4e3a-8b67-cbd6886dd2c2
spec:
  crd:
    spec:
      names:
        kind: K8sRequiredLabels
      validation:
        legacySchema: true
        openAPIV3Schema:
          properties:
            labels:
              items:
                properties:
                  allowedRegex:
                    type: string
                  key:
                    type: string
                type: object
              type: array
            message:
              type: string
  targets:
  - rego: |
      package k8srequiredlabels

      get_message(parameters, _default) = msg {
        not parameters.message
        msg := _default
      }

      get_message(parameters, _default) = msg {
        msg := parameters.message
      }

      violation[{"msg": msg, "details": {"missing_labels": missing}}] {
        provided := {label | input.review.object.metadata.labels[label]}
        required := {label | label := input.parameters.labels[_].key}
        missing := required - provided
        count(missing) > 0
        def_msg := sprintf("you must provide labels: %v", [missing])
        msg := get_message(input.parameters, def_msg)
      }

      violation[{"msg": msg}] {
        value := input.review.object.metadata.labels[key]
        expected := input.parameters.labels[_]
        expected.key == key
        # do not match if allowedRegex is not defined, or is an empty string
        expected.allowedRegex != ""
        not re_match(expected.allowedRegex, value)
        def_msg := sprintf("Label <%v: %v> does not satisfy allowed regex: %v", [key, value, expected.allowedRegex])
        msg := get_message(input.parameters, def_msg)
      }
    target: admission.k8s.gatekeeper.sh
status:
  byPod:
  - id: gatekeeper-audit-6445fb87b7-7nd7l
    observedGeneration: 1
    operations:
    - audit
    - mutation-status
    - status
    templateUID: d8e7bf94-da3a-4e3a-8b67-cbd6886dd2c2
  - id: gatekeeper-controller-manager-854d7945bb-r9kds
    observedGeneration: 1
    operations:
    - webhook
    templateUID: d8e7bf94-da3a-4e3a-8b67-cbd6886dd2c2
  - id: gatekeeper-controller-manager-854d7945bb-whs4c
    observedGeneration: 1
    operations:
    - webhook
    templateUID: d8e7bf94-da3a-4e3a-8b67-cbd6886dd2c2
  - id: gatekeeper-controller-manager-854d7945bb-zszwl
    observedGeneration: 1
    operations:
    - webhook
    templateUID: d8e7bf94-da3a-4e3a-8b67-cbd6886dd2c2
  created: true
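The check described above could be sketched as a small helper (hypothetical, not part of Gatekeeper; it operates on the parsed JSON of a ConstraintTemplate, e.g. from `kubectl get constrainttemplate k8srequiredlabels -o json`):

```python
def template_is_synced(obj: dict, require_all_pods: bool = True) -> bool:
    """Return True once Gatekeeper pods have ingested the latest
    ConstraintTemplate generation (which includes updating the
    constraint CRD)."""
    generation = obj["metadata"]["generation"]
    by_pod = obj.get("status", {}).get("byPod", [])
    if not by_pod:
        # No pod has reported status yet.
        return False
    observed = [p.get("observedGeneration") == generation for p in by_pod]
    # Blocking on all pods is safer; any single pod showing the correct
    # observedGeneration should be sufficient for the code as written.
    return all(observed) if require_all_pods else any(observed)

# Trimmed-down version of the object shown above:
template = {
    "metadata": {"generation": 1},
    "status": {"byPod": [
        {"id": "gatekeeper-audit-6445fb87b7-7nd7l", "observedGeneration": 1},
        {"id": "gatekeeper-controller-manager-854d7945bb-r9kds", "observedGeneration": 1},
    ]},
}
print(template_is_synced(template))  # → True
```

An init container (or pre-install hook) in the constraint chart could poll this until it returns true for every template before letting the constraints deploy.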

stale bot commented Jul 23, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix This will not be worked on label Jul 23, 2022
@ritazh ritazh added stale and removed wontfix This will not be worked on labels Aug 10, 2022
@stale stale bot removed the stale label Aug 10, 2022
stale bot commented Oct 11, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Oct 11, 2022
@stale stale bot closed this as completed Oct 25, 2022
maxsmythe (Contributor) commented:

Still something we should simplify.

@maxsmythe maxsmythe reopened this Oct 27, 2022
@stale stale bot removed the stale label Oct 27, 2022
jvossler-cogility commented:

I am seeing this issue when doing an initial deploy of Gatekeeper. The install is via flattened Helm charts (plain YAML files). The Gatekeeper software and all of the constraint templates deploy, but ALL of the constraints fail to deploy because the named template for each constraint cannot yet be found. The constraints deploy correctly later, once they can access the templates by name.

I can see this issue also happening on an upgrade with any template change and an associated constraint change, as originally posted. There needs to be a way to ensure all the template changes are deployed and current as a prerequisite for deploying or updating constraints.

@salaxander salaxander self-assigned this Feb 21, 2024
salaxander (Contributor) commented:

Thanks for reporting this. Unfortunately this isn't really a problem to be addressed on the Gatekeeper side, but more at the deployment orchestration stage. Going to close this for now, but feel free to add additional comments if needed.

No branches or pull requests · 6 participants