
Overview

Using Kubernetes' upstream node-local-dns cache should solve most DNS issues, but there are a few reasons why we built a custom version here:

  1. It got OOM-killed in our stress tests.
  2. Older Kubernetes versions couldn't use it.
  3. Its behavior is hard to control, especially on managed Kubernetes offerings such as GKE.

We build in all CoreDNS plugins, filter AAAA records by default, and more.
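
As a rough illustration, AAAA filtering in CoreDNS is usually done with the template plugin; the stanza below is a hedged sketch, not the exact contents of configmap.yaml.sample:

```sh
# Hedged sketch: a Corefile that filters AAAA records via the CoreDNS
# template plugin. The real configuration lives in configmap.yaml and
# may differ.
cat > Corefile <<'EOF'
.:53 {
    # Answer every AAAA query with an empty NOERROR response, so
    # IPv6-unaware clusters skip the useless lookup.
    template IN AAAA {
        rcode NOERROR
    }
    cache 30
    forward . /etc/resolv.conf
}
EOF
```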

Known issues

  • A DaemonSet rolling update will cause DNS query timeouts for a few seconds.

DNS-related issues without this

Warning: Network policy

If running with a network policy, please see the README for node-local-dns; the network policy will likely need to be configured to allow traffic to and from the node-local-dns agent.

Other caveats

  • If the local-dns process crashes, DNS resolution will not function on that node until it is restarted.

How it works

Normally the local-dns agent:

  • claims an otherwise-unused link-local IP (typically 169.254.20.10)
  • relies on the kubelet --cluster-dns flag to tell pods to use that IP (169.254.20.10) as their DNS server
  • runs CoreDNS as a DaemonSet on every node, configured to listen on that IP (169.254.20.10)
  • installs iptables rules so that DNS traffic to that IP avoids conntrack/NAT (see the sketch after this list)
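
A hedged sketch of the kind of conntrack-bypass rules involved; the real agent manages its own chains, and the exact matches may differ:

```sh
# NOTRACK rules in the raw table exempt DNS packets to/from the
# link-local listen address from conntrack, so no NAT state is created.
iptables -t raw -A PREROUTING -d 169.254.20.10/32 -p udp --dport 53 -j NOTRACK
iptables -t raw -A PREROUTING -d 169.254.20.10/32 -p tcp --dport 53 -j NOTRACK
iptables -t raw -A OUTPUT     -s 169.254.20.10/32 -p udp --sport 53 -j NOTRACK
```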

In this mode, we instead intercept the existing kube-dns service IP. A few things make that work:

  • we configure the local-dns agent to intercept the kube-dns service IP
  • kubelet is already configured to send queries to that service IP by default
  • when the local-dns agent claims the kube-dns service IP and applies its conntrack/NAT bypass, that takes precedence over the normal DNS service routing (see the sketch below)
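
Why the bypass wins: kube-proxy implements service routing with DNAT in the nat table, which only applies to packets tracked by conntrack; rules in the raw table run first and can exempt DNS packets, so they reach a locally bound copy of the service IP instead. A hedged sketch, where 10.96.0.10 is an assumed kube-dns ClusterIP and nodelocaldns is an assumed dummy-interface name:

```sh
# Assumed kube-dns ClusterIP; find yours with:
#   kubectl -n kube-system get svc kube-dns
KUBE_DNS_IP=10.96.0.10

# Bind the service IP on a local dummy interface so the node itself
# can terminate traffic addressed to it.
ip link add nodelocaldns type dummy
ip addr add "${KUBE_DNS_IP}/32" dev nodelocaldns

# Exempt DNS packets from conntrack in the raw table; untracked packets
# skip kube-proxy's nat-table DNAT, so they stay on this node.
iptables -t raw -A PREROUTING -d "${KUBE_DNS_IP}/32" -p udp --dport 53 -j NOTRACK
```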

Installation

A script is provided; simply run cd ./hack && cp configmap.yaml.sample configmap.yaml && ./install.sh
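
A hedged way to check the result afterwards, assuming the DaemonSet lands in kube-system (the grep pattern and image tag here are assumptions):

```sh
# Confirm the cache pods are up on every node.
kubectl -n kube-system get pods -o wide | grep -i dns

# Exercise DNS resolution from a throwaway pod.
kubectl run dns-test --rm -it --image=busybox:1.36 --restart=Never \
  -- nslookup kubernetes.default.svc.cluster.local
```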

Removal

Removal is more complicated than installation. We can remove the DaemonSet, and as part of pod shutdown the local-dns cache should remove the IP interception rules. However, if something goes wrong during removal, the interception rules will remain in place with no local-dns cache running to serve the intercepted traffic, and DNS lookups will be broken on that node. Restarting the machine removes the interception rules, so if removal is done as part of a cluster update the system will self-heal.

The procedure therefore is:

  • Run cd ./hack && ./uninstall.sh
  • Upgrade cluster
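
If DNS on a node breaks after removal, a hedged way to look for leftover interception state (the interface name is an assumption):

```sh
# Leftover conntrack-bypass rules in the raw table.
iptables -t raw -S | grep -i notrack

# A leftover dummy interface still holding the intercepted IP
# ("nodelocaldns" is an assumed name).
ip addr show dev nodelocaldns 2>/dev/null
```

A reboot clears both, as noted above.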

References