This repository has been archived by the owner on Oct 24, 2023. It is now read-only.

fix: revert PodDisruptionBudget definitions #300

Merged
merged 1 commit into from Jan 12, 2019


jackfrancis
Member

In practice, these definitions negatively impact cordon/drain operations (upgrade/scale) on single-master clusters.

Reason for Change:

The recently introduced PodDisruptionBudget definitions were blocking scale and upgrade operations on (at least) single-master cluster configurations. In the log below, the drain times out after one hour with a coredns pod still pending:

time="2019-01-11T00:53:15Z" level=info msg="Node k8s-agent2-13795059-2 has been marked unschedulable." source="scaling command line"
time="2019-01-11T00:53:15Z" level=info msg="2 pods need to be removed/deleted" source="scaling command line"
time="2019-01-11T00:53:15Z" level=info msg="1 pods need to be removed/deleted" source="scaling command line"
time="2019-01-11T00:53:15Z" level=info msg="metrics-server-69b44566d5-vnqzx pod successfully evicted" source="scaling command line"
time="2019-01-11T00:53:15Z" level=info msg="kubernetes-dashboard-69f86d7cd9-kmz5r pod successfully evicted" source="scaling command line"
ERRO[3604] Failed to drain node k8s-agent2-13795059-1, got error Drain did not complete within 1h0m0s 
time="2019-01-11T01:53:15Z" level=error msg="There are pending pods when an error occurred: Drain did not complete within 1h0m0s\n" source="scaling command line"
time="2019-01-11T01:53:15Z" level=error msg="pod/coredns-d5b7bc49d-7d65l\n" source="scaling command line"
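
For readers unfamiliar with how a PodDisruptionBudget can stall a drain, here is a minimal sketch of the eviction arithmetic. It is an illustration only: the `Budget` class, the helper names, and the `min_available`/replica values are hypothetical and are not taken from the reverted manifests. It assumes a budget expressed as an absolute `minAvailable` count.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    """Hypothetical, simplified stand-in for a PodDisruptionBudget."""
    name: str
    min_available: int  # like spec.minAvailable given as an absolute count
    healthy_pods: int   # pods currently healthy and matched by the selector


def disruptions_allowed(budget: Budget) -> int:
    # A voluntary eviction is permitted only if the healthy count would
    # still be at least min_available afterwards; never report below zero.
    return max(0, budget.healthy_pods - budget.min_available)


def can_evict(budget: Budget) -> bool:
    return disruptions_allowed(budget) > 0


# Assumed values for illustration only.
single_replica = Budget("coredns", min_available=1, healthy_pods=1)
two_replicas = Budget("coredns", min_available=1, healthy_pods=2)

print(can_evict(single_replica))  # False
print(can_evict(two_replicas))    # True
```

Under these assumptions, a budget that requires one available pod while only one healthy pod exists allows zero disruptions. Every eviction attempt is then refused, and a drain that keeps retrying would run until its timeout, which would be consistent with the pending coredns pod in the log above.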

Issue Fixed:

Requirements:

Notes:

@jackfrancis
Member Author

@sylr FYI

Let's aim to reintroduce this in a way that accounts for single-master cluster configurations. Let's also spend more time validating that single-master scenarios are the only ones in which these PodDisruptionBudget definitions prevent cordon/drain from finishing on scheduled pods.

@codecov

codecov bot commented Jan 11, 2019

Codecov Report

Merging #300 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master     #300   +/-   ##
=======================================
  Coverage   53.16%   53.16%           
=======================================
  Files          95       95           
  Lines       14244    14244           
=======================================
  Hits         7573     7573           
  Misses       6006     6006           
  Partials      665      665

@mboersma
Member

lgtm

@CecileRobertMichon
Contributor

/lgtm

@acs-bot

acs-bot commented Jan 11, 2019

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: CecileRobertMichon, jackfrancis

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [CecileRobertMichon,jackfrancis]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jackfrancis jackfrancis added this to In progress in backlog Jan 12, 2019
@jackfrancis jackfrancis merged commit 89d7878 into Azure:master Jan 12, 2019
backlog automation moved this from In progress to Done Jan 12, 2019
@jackfrancis jackfrancis deleted the poddisruption-cordon-drain branch January 12, 2019 00:54
juhacket pushed a commit to juhacket/aks-engine that referenced this pull request Mar 14, 2019