Support NodeLocal DNSCache #1024

Closed
davidnuzik opened this issue May 19, 2021 · 6 comments
Labels
kind/dev-validation (Dev will be validating this issue) · kind/rke1-feature-parity

Comments

@davidnuzik
Contributor

Node Local Cache work for Rancher 2.6.x

@davidnuzik davidnuzik added this to the Rancher 2.6.x milestone May 19, 2021
@davidnuzik davidnuzik added this to To Triage in Development [DEPRECATED] via automation May 19, 2021
@davidnuzik davidnuzik moved this from To Triage to Backlog in Development [DEPRECATED] May 19, 2021
@cjellick cjellick moved this from Backlog to Next Up in Development [DEPRECATED] Jul 14, 2021
@cjellick cjellick changed the title Node Local Cache Support NodeLocal DNSCache Jul 14, 2021
@cjellick
Contributor

RKE1 did this. @superseb can give more context. Or perhaps @Oats87 can.

If feasible, we should also do this in k3s.

@manuelbuil
Contributor

The mentions above are wrong; I mixed this issue up with the coreDNS autoscaler one. Sorry for the confusion.

@manuelbuil
Contributor

manuelbuil commented Jul 20, 2021

Tasks:

  • Add nodelocal dns manifests to coredns rke2-charts
  • Prepare logic to enable it
  • Prepare different logic for ipvs and normal iptables
  • Remove the nodelocaldns interface in the rke2-uninstall script
  • Remove the generated nodelocaldns iptables rules in the rke2-uninstall script (see the cleanup sketch after this list)
  • Add new images into rke2 airgap
  • Document everything
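
A minimal sketch of what the uninstall cleanup could look like, assuming the dummy interface is named nodelocaldns and binds 169.254.20.10 and 10.43.0.10 (as shown in the validation further down); the actual rke2-uninstall change may differ:

# Hypothetical cleanup sketch, not the actual rke2-uninstall change.
# Remove the dummy interface created by node-local-dns, if it exists.
ip link delete nodelocaldns 2>/dev/null || true
# Drop any iptables rules that reference the nodelocal bind addresses.
iptables-save | grep -v -e 169.254.20.10 -e 10.43.0.10 | iptables-restore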

@manuelbuil
Contributor

How to test:

  1. Enable it via nodelocal.enabled: true
  2. Verify there is a new node-local-dns daemonset and that its pods have no errors in the logs
  3. Verify there is a node-local-dns configMap with a Corefile binding to 169.254.20.10 and 10.43.0.10
  4. Check that there is a new nodelocaldns interface on the node
  5. Run a dummy pod and check that dns resolution works for an internal service (e.g. rke2-metrics-server.kube-system) and for an external service (www.google.com). A command-level sketch of these checks follows this list.
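
One possible set of commands to walk through these checks (the daemonset label selector used below is an assumption, not taken from the chart):

# Steps 2-4: daemonset, pod logs, configmap binds, and the node interface.
kubectl get ds -n kube-system node-local-dns
kubectl logs -n kube-system -l k8s-app=node-local-dns --tail=20   # label selector is assumed
kubectl get configmap -n kube-system node-local-dns -o yaml | grep bind
ip addr show nodelocaldns

# Step 5: resolve an internal and an external name from a throwaway pod.
kubectl run dnstest --rm -it --restart=Never --image=busybox:1.28 -- \
  nslookup rke2-metrics-server.kube-system
kubectl run dnstest --rm -it --restart=Never --image=busybox:1.28 -- \
  nslookup www.google.com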

@manuelbuil manuelbuil moved this from Next Up to Working in Development [DEPRECATED] Jul 26, 2021
@manuelbuil manuelbuil moved this from Working to Peer Review in Development [DEPRECATED] Jul 27, 2021
@manuelbuil manuelbuil moved this from Peer Review to To Test in Development [DEPRECATED] Aug 2, 2021
@rancher-max
Contributor

Here's an example HelmChartConfig for enabling nodelocal:

apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: rke2-coredns
  namespace: kube-system
spec:
  valuesContent: |
    nodelocal:
      enabled: true
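
One way to apply this, assuming a default RKE2 data dir and an arbitrary file name, is to drop it into the server's manifests directory, where rke2 should pick it up and redeploy the rke2-coredns chart:

# Adjust the path if rke2 was started with a non-default --data-dir.
cp rke2-coredns-config.yaml /var/lib/rancher/rke2/server/manifests/
kubectl get ds -n kube-system node-local-dns -w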

@bmdepesa bmdepesa added the kind/dev-validation (Dev will be validating this issue) label Aug 12, 2021
@galal-hussein
Contributor

Validated against master commit 09bb5c2

  1. Was able to successfully enable using HelmChartConfig:
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: rke2-coredns
  namespace: kube-system
spec:
  valuesContent: |
    nodelocal:
      enabled: true
  2. There was a new daemonset and running pod with no errors.
# kubectl get ds -n kube-system
NAME                            DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
node-local-dns                  1         1         1       1            1           kubernetes.io/os=linux   13m
  3. Confirmed there was a new configmap, which correctly has the data:
# kubectl get configmap -n kube-system node-local-dns -o yaml
apiVersion: v1
data:
  Corefile: |
    cluster.local:53 {
        errors
        cache {
                success 9984 30
                denial 9984 5
        }
        reload
        loop
        bind 169.254.20.10 10.43.0.10
        forward . __PILLAR__CLUSTER__DNS__ {
                force_tcp
        }
        prometheus :9253
        health 169.254.20.10:8080
        }
    in-addr.arpa:53 {
        errors
        cache 30
        reload
        loop
        bind 169.254.20.10 10.43.0.10
        forward . __PILLAR__CLUSTER__DNS__ {
                force_tcp
        }
        prometheus :9253
        }
    ip6.arpa:53 {
        errors
        cache 30
        reload
        loop
        bind 169.254.20.10 10.43.0.10
        forward . __PILLAR__CLUSTER__DNS__ {
                force_tcp
        }
        prometheus :9253
        }
    .:53 {
        errors
        cache 30
        reload
        loop
        bind 169.254.20.10 10.43.0.10
        forward . __PILLAR__UPSTREAM__SERVERS__ {
                force_tcp
        }
        prometheus :9253
        }
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: rke2-coredns
    meta.helm.sh/release-namespace: kube-system
  creationTimestamp: "2021-09-23T18:28:18Z"
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
    app.kubernetes.io/managed-by: Helm
  name: node-local-dns
  namespace: kube-system
  resourceVersion: "642"
  uid: c5a6b6af-8d4f-4ff8-b694-e8747bfed77d
  4. There was a new interface on the node with the expected values:
$ ip addr
...
16: nodelocaldns: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default 
    link/ether c2:1c:2d:03:67:b1 brd ff:ff:ff:ff:ff:ff
    inet 169.254.20.10/32 scope global nodelocaldns
       valid_lft forever preferred_lft forever
    inet 10.43.0.10/32 scope global nodelocaldns
       valid_lft forever preferred_lft forever
  5. Ran a dummy pod: kubectl run tester --image=ranchertest/mytestcontainer and a simple deployment with service:
apiVersion: v1
kind: Service
metadata:
  name: busyb
spec:
  selector:
    app: busy
  clusterIP: None
  ports:
  - name: foo # Actually, no port is needed.
    port: 1234
    targetPort: 1234
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: busydep
spec:
  replicas: 3
  selector:
    matchLabels:
      app: busy
  template:
    metadata:
      labels:
        app: busy
    spec:
      containers:
        - name: busybox
          image: busybox:1.28
          command:
            - sleep
            - "3600"
  6. From the dummy pod, dns resolution and pings are successful:
$ kubectl exec -it tester -- /bin/bash
# nslookup busyb.default.svc.cluster.local 
Server:		10.43.0.10
Address:	10.43.0.10#53

Name:	busyb.default.svc.cluster.local
Address: 10.42.0.12
Name:	busyb.default.svc.cluster.local
Address: 10.42.0.10
Name:	busyb.default.svc.cluster.local
Address: 10.42.0.11

#  ping busyb.default.svc.cluster.local
PING busyb.default.svc.cluster.local (10.42.0.12) 56(84) bytes of data.
64 bytes from 10-42-0-12.busyb.default.svc.cluster.local (10.42.0.12): icmp_seq=1 ttl=63 time=0.083 ms
64 bytes from 10-42-0-12.busyb.default.svc.cluster.local (10.42.0.12): icmp_seq=2 ttl=63 time=0.055 ms
64 bytes from 10-42-0-12.busyb.default.svc.cluster.local (10.42.0.12): icmp_seq=3 ttl=63 time=0.048 ms


# ping google.com
PING google.com (142.251.33.78) 56(84) bytes of data.
64 bytes from sea09s28-in-f14.1e100.net (142.251.33.78): icmp_seq=1 ttl=89 time=8.48 ms
64 bytes from sea09s28-in-f14.1e100.net (142.251.33.78): icmp_seq=2 ttl=89 time=8.33 ms
64 bytes from sea09s28-in-f14.1e100.net (142.251.33.78): icmp_seq=3 ttl=89 time=8.29 ms

# nslookup rke2-metrics-server.kube-system.svc.cluster.local
Server:		10.43.0.10
Address:	10.43.0.10#53

Name:	rke2-metrics-server.kube-system.svc.cluster.local
Address: 10.43.231.21
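
As an optional extra check (not part of the validation above): since the Corefile binds 169.254.20.10 and exposes metrics on :9253, the node-local instance can be queried directly from the node; metric names depend on the bundled CoreDNS version, so the grep below is only a sketch:

# Run on the node: query the local listener and peek at its metrics.
nslookup rke2-metrics-server.kube-system.svc.cluster.local 169.254.20.10
curl -s http://169.254.20.10:9253/metrics | grep -i '^coredns_' | head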

Development [DEPRECATED] automation moved this from To Test to Done Issue / Merged PR Sep 23, 2021