Scale kube-dns to multiple nodes #2

Closed
jakolehm opened this issue Mar 2, 2018 · 7 comments

jakolehm (Contributor) commented Mar 2, 2018

No description provided.

jakolehm added the enhancement label Mar 2, 2018
jakolehm added this to the 0.3 milestone Mar 5, 2018

SpComb (Contributor) commented Mar 5, 2018

Relevant pod template spec parts from the kube-dns deployment:

spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: beta.kubernetes.io/arch
                operator: In
                values:
                - amd64
      tolerations:
      - key: CriticalAddonsOnly
        operator: Exists
      - effect: NoSchedule
        key: node-role.kubernetes.io/master

There don't seem to be any node selectors/affinities that would limit which nodes the DNS pods get scheduled onto... they presumably end up on the master node because that's the first node that happens to be available. It should just be a matter of PATCHing .spec.replicas on the existing /apis/extensions/v1beta1/namespaces/kube-system/deployments/kube-dns (or shelling out to kubectl scale).
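
As a minimal sketch (assuming kubectl access to the cluster; the replica count of 2 is just illustrative), either of these would do it:

# shell out to kubectl scale
kubectl -n kube-system scale deployment/kube-dns --replicas=2

# or PATCH .spec.replicas on the deployment directly
kubectl -n kube-system patch deployment kube-dns --type=merge -p '{"spec":{"replicas":2}}'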

jakolehm (Contributor, Author) commented Mar 5, 2018

I think it should have self anti-affinity, see: kubernetes/kubernetes#57683

SpComb (Contributor) commented Mar 5, 2018

FWIW that PR was reverted in kubernetes/kubernetes#59357 due to the scaling issues in kubernetes/kubernetes#54164.

Also, that PR did not touch the kube-dns manifest used by kubeadm: https://github.com/kubernetes/kubernetes/blob/master/cmd/kubeadm/app/phases/addons/dns/manifests.go

SpComb (Contributor) commented Mar 5, 2018

kubeadm will update (PUT) the kube-dns deployment on upgrades, which will presumably lose any replicas/affinity changes made to it... so with the kubeadm kube-dns deployment, we would need to PATCH / kubectl scale it again after every upgrade as well?

An alternative (kubernetes/kubernetes#40063 (comment)) is to use the horizontal DNS autoscaling controller, with either --default-params={"linear":{"min":2}} (kubernetes/kubernetes#40281) or --default-params={"linear":{"min":1, "preventSinglePointFailure": true}} (kubernetes-sigs/cluster-proportional-autoscaler#23).
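
For illustration, the autoscaler side of that would look something like the following container spec in a kube-dns-autoscaler deployment (a sketch only; the image tag, ConfigMap name and linear parameters here are assumptions, not taken from this thread):

# cluster-proportional-autoscaler watching node/core counts and scaling kube-dns
containers:
- name: autoscaler
  image: k8s.gcr.io/cluster-proportional-autoscaler-amd64:1.1.2
  command:
  - /cluster-proportional-autoscaler
  - --namespace=kube-system
  - --configmap=kube-dns-autoscaler
  - --target=Deployment/kube-dns
  - --default-params={"linear":{"min":1,"preventSinglePointFailure":true}}
  - --logtostderr=true
  - --v=2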

SpComb (Contributor) commented Mar 5, 2018

The .spec.template.spec.affinity.podAntiAffinity with preferredDuringSchedulingIgnoredDuringExecution seems to be flawed: if all of the nodes temporarily go down, all of the kube-dns pods might end up getting scheduled back onto the one master node.

Once the other nodes go back online, I don't see what would end up rescheduling the other pod off the master node.

terom@terom-kube-master:~$ kubectl get nodes
NAME                STATUS    ROLES     AGE       VERSION
terom-kube-master   Ready     master    5d        v1.9.2
terom-kube-node1    Ready     <none>    4d        v1.9.2
terom-kube-node2    Ready     <none>    2d        v1.9.3
terom@terom-kube-master:~$ kubectl -n kube-system get deployments/kube-dns -o json | jq .spec.template.spec.affinity.podAntiAffinity
{
  "preferredDuringSchedulingIgnoredDuringExecution": [
    {
      "podAffinityTerm": {
        "labelSelector": {
          "matchExpressions": [
            {
              "key": "k8s-app",
              "operator": "In",
              "values": [
                "kube-dns"
              ]
            }
          ]
        },
        "topologyKey": "kubernetes.io/hostname"
      },
      "weight": 100
    }
  ]
}
terom@terom-kube-master:~$ kubectl -n kube-system get pods -o wide --selector k8s-app=kube-dns
NAME                       READY     STATUS    RESTARTS   AGE       IP          NODE
kube-dns-d9ddc5479-fg9tj   3/3       Running   0          12m       10.40.0.2   terom-kube-master
kube-dns-d9ddc5479-hhdj8   3/3       Running   0          15m       10.40.0.1   terom-kube-master

SpComb (Contributor) commented Mar 6, 2018

The pragmatic approach to this issue would be to make the number of DNS replicas a configurable parameter (maybe default to something sensible based on the number of nodes in the config?), and then PATCH the kubeadm-managed deployments/kube-dns to add .spec.replicas and .spec.template.spec.affinity.podAntiAffinity.requiredDuringSchedulingIgnoredDuringExecution. Those would presumably just need to be re-PATCHed after every kubeadm init/upgrade run.
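
A minimal sketch of such a PATCH (assuming a strategic merge patch via kubectl; the replica count and the exact anti-affinity term are illustrative):

# patch the kubeadm-managed deployment with a replica count and required anti-affinity
kubectl -n kube-system patch deployment kube-dns --patch '
spec:
  replicas: 2
  template:
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: k8s-app
                operator: In
                values:
                - kube-dns
            topologyKey: kubernetes.io/hostname
'

With required anti-affinity, a second replica would stay Pending rather than double up on the master when only one node is schedulable.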

Long-term I think the best idea would be to replace the problematic kubeadm-managed deployment with a daemonset using node labels for the DNS addon, but that would require more work?
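
Purely as an illustration of that daemonset idea (nothing here is taken from kubeadm; the node label is hypothetical and the pod spec is elided):

# hypothetical sketch: run the DNS addon as a DaemonSet pinned to labeled nodes
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-dns
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: kube-dns
  template:
    metadata:
      labels:
        k8s-app: kube-dns
    spec:
      nodeSelector:
        node-role.example.com/dns: "true"   # made-up label; nodes would be labeled explicitly
      # ...containers/tolerations as in the existing kube-dns deployment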

jakolehm (Contributor, Author) commented Mar 6, 2018

The pragmatic approach to this issue would be to make the number of DNS replicas a configurable parameter (maybe default to something sensible based on the number of nodes in the config?)

Yes, I think this is the way to go (for now).

Long-term I think the best idea would be to replace the problematic kubeadm-managed deployment with a daemonset using node labels for the DNS addon, but that would require more work?

Long-term solutions probably require contributions to kubeadm (to make it less hacky)?
