Conversation
35c74be to b133925
We also need to set clusterDNS in the kubelet config when this feature is enabled, so that it actually makes use of the local DNS server.
I think we should also:
- Collect metrics from local DNS.
- Make sure CoreDNS dashboard works properly with this change.
- Ensure HA of this solution. As far as I understand, the existing solution will cause a DNS outage during updates. See https://povilasv.me/kubernetes-node-local-dns-cache/ for more details.
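For the clusterDNS point above, a minimal sketch of the kubelet side, assuming the upstream nodelocaldns default link-local address 169.254.20.10 (the actual address and file layout used in this PR may differ):

```yaml
# Sketch of a KubeletConfiguration fragment pointing clusterDNS at the
# node-local DNS cache. 169.254.20.10 is the upstream nodelocaldns
# default link-local address, not necessarily the value in this PR.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
clusterDNS:
  - 169.254.20.10
```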
assets/terraform-modules/bare-metal/flatcar-linux/kubernetes/variables.tf
Oh, we don't need to do that. So CoreDNS on each host will bind on the link-local IP. More info: https://kubernetes.io/docs/tasks/administer-cluster/nodelocaldns/#configuration
Then this should go into the prometheus-operator stack rather than the node-local-dns chart.
Sure, once the above is taken care of, I will look at it.
I will look into this.
c2e5e62 to 9413540
Signed-off-by: Suraj Deshmukh <suraj@kinvolk.io>
- Add servicemonitor.
- Replicate the coredns grafana dashboard.
- Add a test verifying that node-local-dns is being scraped.

Signed-off-by: Suraj Deshmukh <suraj@kinvolk.io>
9413540 to 3797b39
I am still concerned that upgrading the cluster/control plane, when it includes a local DNS update, will disrupt DNS resolution on all nodes.
I think we should still consider the configuration used in https://github.com/contentful-labs/coredns-nodecache, with a set of 2 DaemonSets, so we can avoid the downtime.
Otherwise LGTM (minus some suggested code changes).
AFAIK this won't be an issue while upgrading, but only when the pod is OOMKilled by the kernel. When the pod starts, it creates an interface and a bunch of iptables rules. If the pod is killed gracefully, it cleans these up, and hence all the traffic will start routing to the upstream CoreDNS.
Move the logic to select the Kubelet and/or NodeLocalDNS control plane chart into the "common" control plane charts in the function CommonControlPlaneCharts. The function now makes the decision based on the struct provided. Signed-off-by: Suraj Deshmukh <suraj@kinvolk.io>
node-local-dns runs on the host network, and its metrics port is exposed on 9253; on AWS it needs to be exposed explicitly. Signed-off-by: Suraj Deshmukh <suraj@kinvolk.io>
The default IP for node-local-dns from the link-local IP range is stored in a variable that can be accessed from various locations. Signed-off-by: Suraj Deshmukh <suraj@kinvolk.io>
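As a rough sketch of such a variable (the name and file location are assumptions; 169.254.20.10 is the upstream nodelocaldns default, which may or may not be the default chosen in this PR):

```hcl
# Hypothetical variable holding the node-local-dns link-local IP; the
# real name and default value in the PR may differ.
variable "node_local_dns_ip" {
  description = "Link-local IP address on which node-local-dns listens."
  type        = string
  default     = "169.254.20.10"
}
```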
3797b39 to 33625d4
```diff
-charts := platform.CommonControlPlaneCharts(options.UpgradeKubelets)
+charts := c.platform.Meta().ControlplaneCharts
+if !options.UpgradeKubelets {
+	charts = removeKubeletChart(charts)
+}
```
Very soft suggestion, more like a thought: we could also write:
```diff
-charts = removeKubeletChart(charts)
+charts = withoutKubeletChart(charts)
```
Or withoutCharts(charts, []string{platform.KubeletChartName})
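A minimal sketch of what such a generalized helper could look like; the Chart type, chart names, and the withoutCharts signature here are illustrative assumptions, not the code in this PR:

```go
package main

import "fmt"

// Chart is a minimal stand-in for the control plane chart type used in
// the PR; the real type lives in the platform package.
type Chart struct {
	Name string
}

// withoutCharts returns a copy of charts with the named charts removed.
func withoutCharts(charts []Chart, names []string) []Chart {
	skip := make(map[string]bool, len(names))
	for _, n := range names {
		skip[n] = true
	}
	out := make([]Chart, 0, len(charts))
	for _, c := range charts {
		if !skip[c.Name] {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	charts := []Chart{{Name: "kubelet"}, {Name: "node-local-dns"}, {Name: "coredns"}}
	// Drop the kubelet chart, mirroring the UpgradeKubelets=false path.
	for _, c := range withoutCharts(charts, []string{"kubelet"}) {
		fmt.Println(c.Name)
	}
}
```

A name-based filter like this keeps the call site readable and extends naturally to skipping more than one chart.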
LGTM
To enable node-local DNS in your cluster, add the following line to the controller config:
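The exact line is not shown in this excerpt; as a hedged sketch, in Lokomotive-style HCL it would be a boolean toggle inside the cluster block, along these lines (the attribute name here is a guess, so check the variables.tf changed in this PR for the real one):

```hcl
# Hypothetical toggle; the real attribute name is defined in the
# platform's variables.tf and may differ.
cluster "bare-metal" {
  # ... existing controller configuration ...
  enable_node_local_dns = true
}
```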