Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CoreDNS "soft" nodeAffinity for controller nodes #188

Merged
merged 1 commit into from May 21, 2020

Conversation

dghubble
Copy link
Member

  • Add nodeAffinity to CoreDNS deployment PodSpec to prefer running CoreDNS pods on controllers, while relying on podAntiAffinity for spreading.
  • For single master clusters, running two CoreDNS pods on the master or running one pod on a worker is permissible.
  • Note: Its still possible to end up with CoreDNS pods all running on workers since we only express scheduling preference ("soft"), but unlikely. Plus the motivating scenario (below) is also rare.

Background:

  • CoreDNS replicas are set to the higher of 2 or the number of control plane nodes to (at a minimum) support Deployment updates or pod restarts and match the cluster size (e.g. 5 master/controller nodes likely means a larger cluster, so run 5 CoreDNS replicas)
  • In the past (before v1.14), we required kube-dns (CoreOS predecessor) to run CoreDNS pods on master nodes. With CoreDNS this node selection was relaxed. We'd like a gentler form of it now.

Motivation:

  • On clusters using 100% preemptible/spot workers, it is possible that CoreDNS pods schedule to workers that are all preempted at the same time, causing a loss of cluster internal DNS service until a CoreDNS pod reschedules (1 min). We'd like CoreDNS to prefer controller/master nodes (which aren't preempted) to reduce the possibility of control plane disruption

* Add nodeAffinity to CoreDNS deployment PodSpec to
prefer running CoreDNS pods on controllers, while
relying on podAntiAffinity for spreading.
* For single master clusters, running two CoreDNS pods
on the master or running one pod on a worker is
permissible.
* Note: Its still _possible_ to end up with CoreDNS pods
all running on workers since we only express scheduling
preference ("soft"), but unlikely. Plus the motivating
scenario (below) is also rare.

Background:

* CoreDNS replicas are set to the higher of 2 or the
number of control plane nodes to (at a minimum) support
Deployment updates or pod restarts and match the cluster
size (e.g. 5 master/controller nodes likely means a
larger cluster, so run 5 CoreDNS replicas)
* In the past (before v1.14), we required kube-dns (CoreOS
predecessor) to run CoreDNS pods on master nodes. With
CoreDNS this node selection was relaxed. We'd like a
gentler form of it now.

Motivation:

* On clusters using 100% preemptible/spot workers, it is
possible that CoreDNS pods schedule to workers that are all
preempted at the same time, causing a loss of cluster internal
DNS service until a CoreDNS pod reschedules (1 min). We'd like
CoreDNS to prefer controller/master nodes (which aren't preempted)
to reduce the possibility of control plane disruption
@dghubble dghubble merged commit a83ddbb into master May 21, 2020
@dghubble dghubble deleted the coredns-node-affinity branch May 21, 2020 05:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
1 participant