Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mutation failure should not block resource creation #633

Merged
merged 4 commits into from
Jan 16, 2020

Conversation

realshuting
Copy link
Member

@realshuting realshuting commented Jan 16, 2020

This PR fix #627, #636.

Mutation failure is only reported as a violation in this fix. The resource creation would not be blocked if any error occurs.

Further support can be tracked in #639.

@@ -108,13 +109,20 @@ func (ws *WebhookServer) HandleMutation(request *v1beta1.AdmissionRequest, resou
events := generateEvents(engineResponses, (request.Operation == v1beta1.Update))
ws.eventGen.Add(events...)

if isResponseSuccesful(engineResponses) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my understanding, previously if mutate rule fails to apply on a resource the rule will be failed and we deny the resource request.
But with this change, if a mutate rule fails then we don't deny the request, we just send the patches.

So scenarios when the JSON patch is incorrect or overlay is invalid, are skipped. As the JSON patch is not generated for these resources, the resource is never patched before being used in HandleValidate.

If the JSON patch on resource fails, then in processResourceWithPatches we return nil resource so the original resource is used.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shivdudhani the tradeoff to consider is if we block the resource or allow it to be created and report a violation.

Blocking resource creation seems bad. However, allowing creation of non-complaint resources is also bad.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To classify the mutation errors:

  1. A policy mutation rule is invalid.
  2. Error while generating a mutation patch. Could be type mismatch as we do JSON patch on resource after every rule processing. Or some bug or cases not handled while building the JSON patch in engine.Mutate.
  3. The patched resource fails schema validation.

For scenarios 1 & 2, ignoring them will not be ideal. I and @realshuting discussed this, and we could use validationFailureAction for mutation too(currently only operates on Validation rules). If there is some limitation, we can support audit mode only till its addressed.

For scenario 3, using the open API schema checks as implemented in the prototype branch will allow us to handle this case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In your classification, 1 and 2 make sense. However for:

  1. The patched resource fails schema validation.

How would Kyverno detect this error - it will happen after the data is sent back to the API server, right?

Even if we minimize the chances of an error with schema validation, etc. there always is a chance that something fails during policy processing.

We discussed adding the PolicyError CR to capture any error during processing of policies. Even then, should a mutation rule block the resource or allow it to be created?

If we need to introduce a user configurable action, it should be called mutationFailureAction.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going back to an example like this:

{
   "operation" : "POST",
   "url" : "https://10.244.1.242:10242/apis/apps/v1beta1/namespaces/velero/deployments",
   "errorCode" : 400,
   "message" : "admission webhook "nirmata.kyverno.resource.mutating-webhook" denied the request: Resource Deployment/velero/velero failed policy add-safe-to-evict:;rule autogen-annotate-empty-dir (Mutation): failed to apply JSON patches: add operation does not apply: doc is missing path: "/metadata/annotations/cluster-autoscaler.kubernetes.io~1safe-to-evict": missing value",
}

I feel that the best option is to not block the resource creation and create a PolicyError / PolicyViolation.

Its not a bad idea to allow the user to control this.

What will be the values for mutationFailureAction?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed, this fix would report the mutation error as a policy violation and allow the resource creation.

To make mutationFailureAction configurable, track #639.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Mutation failure should not block the resource creation
3 participants