Handle errors better in the disruption controller #85553

Open
mortent wants to merge 1 commit into master from mortent:BetterPDBErrorHandling

Conversation

@mortent
Member

mortent commented Nov 22, 2019

What type of PR is this?
/kind cleanup

What this PR does / why we need it:
As described in #77383, the behavior of PDBs can be surprising if the label selector for the PDB captures pods from different controllers. This PR does two things to address this. First, it lets the controller handle these situations better by computing the allowed disruptions based on the good pods instead of giving up whenever a pod with an unknown controller is encountered. Second, it clarifies the events generated in these situations so it is clearer what has happened.

For the example described in the issue, the controller will now compute the allowed disruptions based only on the pods that belong to the Deployment, so the status of the PDB will be set as if the pod from the CronJob never existed. During an eviction, the pod from the CronJob will still be covered by the PDB, which means it will affect the number of pods covered by the PDB that can be disrupted at the same time. But this just means that eviction will take slightly longer than if the pod were not included in the PDB; the number of available pods belonging to the Deployment will never drop below the threshold in the PDB.
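To make the intended behavior concrete, here is a minimal sketch (simplified, hypothetical types and helper names, not the actual controller code): pods whose controller cannot be resolved are reported via an event-style warning and skipped, and the expected scale is summed over the controllers of the remaining pods.

package main

import "fmt"

// pod is a stand-in for the information the controller needs here: the pod's
// name, the controller it belongs to ("" if none can be resolved), and the
// scale reported by that controller.
type pod struct {
	name       string
	controller string
	scale      int32
}

// expectedScale sums the scale of each distinct controller and returns
// warnings for pods whose controller could not be resolved, instead of
// failing the whole computation.
func expectedScale(pods []pod) (int32, []string) {
	var expected int32
	var warnings []string
	seen := map[string]bool{}
	for _, p := range pods {
		if p.controller == "" {
			warnings = append(warnings, fmt.Sprintf("found no controller for pod %q", p.name))
			continue
		}
		if !seen[p.controller] {
			seen[p.controller] = true
			expected += p.scale
		}
	}
	return expected, warnings
}

func main() {
	pods := []pod{
		{name: "web-1", controller: "deployment/web", scale: 3},
		{name: "web-2", controller: "deployment/web", scale: 3},
		{name: "web-3", controller: "deployment/web", scale: 3},
		{name: "backup-1"}, // e.g. a leftover CronJob pod with no resolvable scale
	}
	expected, warnings := expectedScale(pods)
	fmt.Println("expected scale:", expected) // 3: computed as if backup-1 never existed
	fmt.Println("warnings:", warnings)
}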

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:


Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


/sig apps

@k8s-ci-robot

Contributor

k8s-ci-robot commented Nov 22, 2019

@mortent: Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot

Contributor

k8s-ci-robot commented Nov 22, 2019

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: mortent
To complete the pull request process, please assign janetkuo
You can assign the PR to them by writing /assign @janetkuo in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

if isControllerNotFoundError(finderErr) {
	dc.recorder.Event(pdb, v1.EventTypeWarning, "ControllerResourceMissing",
		fmt.Sprintf("controller resource for pod %q missing: %s", pod.Name, finderErr.Error()))
	controllerMatch = true

@tedyu

tedyu Nov 22, 2019

Contributor

The controller is not found - why is the match flag set to true?

@mortent

mortent Nov 24, 2019

Author Member

The idea was to create events that would distinguish between the situation where a pod has a controller of a GK that the controller knows how to handle but the resource no longer exists (i.e. the pod is orphaned), and the situation where the GK of the controller is not one the controller knows how to handle. I have realized that the way scale subresources are handled means we don't actually know whether a resource doesn't exist or whether it exists but does not implement scale. I have updated the PR so it no longer tries to make this distinction.

@mortent mortent force-pushed the mortent:BetterPDBErrorHandling branch from 238500b to 5bccd6a Nov 24, 2019
bavarianbidi added a commit to bavarianbidi/charts that referenced this pull request Dec 2, 2019
* add configurable PDB for minio
* change label for post-install-create-bucket-job because
  until kubernetes/kubernetes#85553 is merged,
  PDB will also detect the minio-create-bucket-job and
  PDB won't work correctly (kubernetes/kubernetes#77383)

Signed-off-by: Mario Constanti <github@constanti.de>
@bavarianbidi bavarianbidi mentioned this pull request Dec 2, 2019
4 of 4 tasks complete
return
dc.recorder.Event(pdb, v1.EventTypeWarning, "NoControllerRef",
	fmt.Sprintf("found no controller ref for pod %q", pod.Name))
continue

@liggitt

liggitt Dec 9, 2019

Member

Is getExpectedScale only used in percentage cases? Our docs on uncontrolled pods (https://kubernetes.io/docs/tasks/run-application/configure-pdb/#arbitrary-controllers-and-selectors) don't mention that PDBs are unusable for those pods, only that percentages cannot be used.

Is there a reason we wouldn't treat uncontrolled pods or pods belonging to a controller that does not implement /scale as having an implicit expected scale of 1?
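For what it's worth, a minimal sketch of that idea (hypothetical shape, not code from this PR): pods without a controller, or whose controller does not implement /scale, would simply contribute 1 to the expected scale.

package main

import "fmt"

// expectedScale treats every pod that has no usable /scale source as having
// an implicit expected scale of 1, and otherwise uses each controller's
// reported scale once.
func expectedScale(controllerScales []int32, podsWithoutScale int) int32 {
	total := int32(podsWithoutScale) // each such pod counts as a scale of 1
	for _, s := range controllerScales {
		total += s
	}
	return total
}

func main() {
	// A Deployment of 3 plus one uncontrolled pod -> expected scale of 4.
	fmt.Println(expectedScale([]int32{3}, 1))
}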

@liggitt

liggitt Dec 9, 2019

Member

I also wonder if minAvailable: "100%", maxUnavailable: "0%", and maxUnavailable: 0 should also be specially handled, since they effectively block all evictions.

@mortent

mortent Dec 30, 2019

Author Member

So I think we could make these special cases and just set allowedDisruptions to 0 if we encounter any of them. But I'm not sure if we should.
If there happen to be more pods than specified by the scale of the controller(s), we probably should allow a disruption. This is probably a rare situation and something the workload controllers will clean up, but it seems like we should think about it.
Also, if we are considering "short-circuiting" the reconcile loop in these situations and not looking at the pods and their controllers at all, we will not have the information needed for the other properties in the PDB status object. These are not used by the Eviction API, but I think we should still try to keep that information correct.

@mortent

mortent Dec 30, 2019

Author Member

getExpectedScale is only called when maxUnavailable is used or if minAvailable is used with a percentage. I think this is covered by the documentation through the two criteria listed.

I'm not sure it would provide any value if we allowed uncontrolled pods to be used with maxUnavailable or minAvailable with a percentage. We would essentially compute the desired scale to be the current number of pods. A minAvailable of 50% would mean that half the pods will always be available for disruption, no matter how many pods are actually running. If I have 10 pods, I will be able to terminate 5 of them. Once that is done, I will be able to terminate another 2, and so on.
It might provide some protection against terminating pods when some of them are not in the Ready condition. We could calculate the scale based on all pods (even those that are not Ready), while we obviously will only use Ready pods for computing the number of actually available pods. I'm a little worried that this would lead to some confusing behavior.
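The arithmetic in question, as a small sketch (simplified; not the controller's actual code): if the expected scale is derived from the pods themselves, minAvailable: 50% keeps shrinking the target as pods are evicted.

package main

import (
	"fmt"
	"math"
)

// evictionsAllowed computes how many disruptions a percentage minAvailable
// would permit when the expected scale is just the current pod count.
func evictionsAllowed(current int, minAvailable float64) int {
	expected := current // derived from the pods themselves
	desiredHealthy := int(math.Ceil(minAvailable * float64(expected)))
	return current - desiredHealthy
}

func main() {
	pods := 10
	for {
		allowed := evictionsAllowed(pods, 0.5)
		fmt.Printf("%d pods -> %d evictions allowed\n", pods, allowed)
		if allowed == 0 {
			break
		}
		pods -= allowed // evict everything the budget allows, then recompute
	}
	// Prints: 10 -> 5, 5 -> 2, 3 -> 1, 2 -> 1, 1 -> 0.
}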

foundController = true
break
}
}
// If the pod has a controllerRef but no matching controller, it means that
// either the controller resource doesn't exist (i.e. we have an orphaned pod)
// or the controller does not support the scale subresource. If this happens,

@liggitt

liggitt Dec 9, 2019

Member

I think there are three cases:

  1. error: controller ownerRef for a group/kind we cannot resolve
  2. error: controller ownerRef that implements /scale but for which a call to /scale failed
  3. non-error?: controller ownerRef we can resolve but which does not implement /scale (we can tell this from discovery docs)

for purposes of determining scale, I would have expected case 3 to be treated like an uncontrolled pod

@mortent

mortent Dec 30, 2019

Author Member

So number 3 is currently being treated as an uncontrolled pod, but this is considered an error for maxUnavailable and minAvailable with percentage. If we have minAvailable with a number, then the controller doesn't call getExpectedScale so we never look at the ownerRef.
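A sketch of that branching (field and helper names are stand-ins, not the real controller code): only maxUnavailable and a percentage minAvailable need the expected scale, so an integer minAvailable never consults the ownerRefs.

package main

import (
	"fmt"
	"strings"
)

// budget is a simplified stand-in for the relevant PDB spec fields, with
// values rendered as strings ("" means unset).
type budget struct {
	minAvailable   string // e.g. "3" or "50%"
	maxUnavailable string // e.g. "1" or "25%"
}

// needsExpectedScale reports whether reconciling this budget has to resolve
// the pods' controllers to learn their scale.
func needsExpectedScale(b budget) bool {
	if b.maxUnavailable != "" {
		return true
	}
	return strings.HasSuffix(b.minAvailable, "%")
}

func main() {
	for _, b := range []budget{
		{minAvailable: "3"},
		{minAvailable: "50%"},
		{maxUnavailable: "1"},
	} {
		fmt.Printf("%+v -> needs expected scale: %v\n", b, needsExpectedScale(b))
	}
}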

@liggitt

Member

liggitt commented Dec 10, 2019

It would be helpful to decide/document/test the behavior of derived scale in the following scenarios (it's possible this is already documented somewhere, but I had a hard time finding it, and the documentation I did find didn't seem to agree with either the code or the API docs):

PDB inputs:

  • minAvailable: 0
  • minAvailable: <non-zero number>
  • minAvailable: 0%
  • minAvailable: 1-99%
  • minAvailable: 100%
  • maxUnavailable: 0
  • maxUnavailable: <non-zero number>
  • maxUnavailable: 0%
  • maxUnavailable: 1-99%
  • maxUnavailable: 100%

Encountered pods:

  • pod without controller ownerRef
  • pod with controller ownerRef for non-existent group/version/kind
  • pod with controller ownerRef for non-existent instance of a valid group/version/kind
  • pod with controller ownerRef for valid instance of a valid group/version/kind that does not implement the scale subresource
  • pod with controller ownerRef for valid instance of a valid group/version/kind that does implement the scale subresource but which gets an error calling the scale subresource
  • pod with controller ownerRef for valid instance of a valid group/version/kind that implements the scale subresource and successfully fetches the scale subresource
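One way to pin the matrix down would be a table-driven test skeleton along these lines (a sketch only; the package, helpers, and rows are hypothetical, and the expected results are exactly what needs to be decided):

package disruption

import "testing"

func TestDerivedScaleMatrix(t *testing.T) {
	cases := []struct {
		name          string
		pdbSpec       string // one of the PDB inputs above
		podController string // how the pod's controller ownerRef resolves
	}{
		{"int minAvailable, no ownerRef", "minAvailable: 2", "none"},
		{"pct minAvailable, no ownerRef", "minAvailable: 50%", "none"},
		{"maxUnavailable, controller without /scale", "maxUnavailable: 1", "no /scale"},
		{"maxUnavailable, /scale call fails", "maxUnavailable: 1", "scale error"},
		// ...one row per combination of the two lists above.
	}
	for _, tc := range cases {
		t.Run(tc.name, func(t *testing.T) {
			t.Skip("sketch only: drive a fake controller here and assert the decided behavior")
		})
	}
}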
k8s-ci-robot added a commit to helm/charts that referenced this pull request Dec 10, 2019
* add configurable PDB for minio
* change label for post-install-create-bucket-job because
  until kubernetes/kubernetes#85553 is merged,
  PDB will also detect the minio-create-bucket-job and
  PDB won't work correctly (kubernetes/kubernetes#77383)

Signed-off-by: Mario Constanti <github@constanti.de>
@mortent

Member Author

mortent commented Dec 30, 2019

@liggitt
I have created a document that describes the current behavior of the PDB controller for the given scenarios: https://docs.google.com/document/d/1ZqNjmZT_dFzkKV7tQawoFe85Rx96K67TaCnTGPzx4eA/edit?usp=sharing

@mortent mortent added this to GA blocking issues in PodDisruptionBudget Jan 6, 2020
dargolith added a commit to dargolith/charts that referenced this pull request Jan 10, 2020
* add configurable PDB for minio
* change label for post-install-create-bucket-job because
  until kubernetes/kubernetes#85553 is merged,
  PDB will also detect the minio-create-bucket-job and
  PDB won't work correctly (kubernetes/kubernetes#77383)

Signed-off-by: Mario Constanti <github@constanti.de>
arturrez added a commit to arturrez/charts that referenced this pull request Jan 28, 2020
* add configurable PDB for minio
* change label for post-install-create-bucket-job because
  until kubernetes/kubernetes#85553 is merged,
  PDB will also detect the minio-create-bucket-job and
  PDB won't work correctly (kubernetes/kubernetes#77383)

Signed-off-by: Mario Constanti <github@constanti.de>
Signed-off-by: Artur <artur@upbound.io>