DNS Lookups for cluster-domain (and subdomains) of cluster don't work #2157

Closed
maxboone opened this issue Nov 21, 2021 · 3 comments

@maxboone

Environmental Info:
RKE2 Version:

rke2 version v1.21.6+rke2r1 (b915fc986e84582458af7131fe7f4e686f2af493)
go version go1.16.6b7

Node(s) CPU architecture, OS, and Version:

Linux zone.hostname.tld 5.4.0-90-generic #101-Ubuntu SMP Fri Oct 15 20:00:55 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

Cluster Configuration:

One server (4C/8G) and one agent (4C/16G)

Describe the bug:

Lookups for the node's domain / hostname / FQDN and its subdomains do not resolve: the two CoreDNS pods return no address for them.

Steps To Reproduce:

  • Installed RKE2 from the tarball, using the following configuration YAML:

write-kubeconfig-mode: "0600"
tls-san:
  - zone.hostname.tld
node-ip: a.b.c.d
advertise-address: a.b.c.d
node-external-ip: x.y.z.v
cluster-domain: zone.hostname.tld
resolv-conf: '/run/systemd/resolve/resolv.conf'

Here a.b.c.d is the node's address on the internal network, and x.y.z.v is its public IP.

Expected behavior:

root@zone:~# kubectl run -i -t busybox --image=busybox:1.28.4 --restart=Never
If you don't see a command prompt, try pressing enter.
/ # nslookup zone.hostname.tld
Server:    10.43.0.10
Address 1: 10.43.0.10 rke2-coredns-rke2-coredns.kube-system.svc.zone.hostname.tld

Name:       zone.hostname.tld
Address 1: x.y.z.v zone.hostname.tld

(or 127.0.1.1 or a.b.c.d)

Actual behavior:

root@zone:~# kubectl run -i -t busybox --image=busybox:1.28.4 --restart=Never
If you don't see a command prompt, try pressing enter.
/ # nslookup zone.hostname.tld
Server:    10.43.0.10
Address 1: 10.43.0.10 rke2-coredns-rke2-coredns.kube-system.svc.zone.hostname.tld

nslookup: can't resolve 'zone.hostname.tld'

and

/ # nslookup node0.zone.hostname.tld
Server:    10.43.0.10
Address 1: 10.43.0.10 rke2-coredns-rke2-coredns.kube-system.svc.zone.hostname.tld

nslookup: can't resolve 'node0.zone.hostname.tld'

Additional context / logs:

In-container /etc/resolv.conf:

search default.svc.zone.hostname.tld svc.zone.hostname.tld zone.hostname.tld
nameserver 10.43.0.10
options ndots:5
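
The search list and ndots:5 above explain why one nslookup produces several queries in the CoreDNS logs. The following is a minimal sketch of that expansion, assuming glibc-style ordering (search list first when the name has fewer dots than ndots); `candidate_queries` is a hypothetical helper, not part of any resolver library, and the busybox resolver in this report may send the same names in a different order.

```python
def candidate_queries(name, search, ndots=5):
    """Return the fully qualified names a stub resolver would try, in order.

    Simplified model of resolv.conf `search` + `options ndots:N` handling.
    """
    if name.endswith("."):
        # A trailing dot marks the name as absolute; the search list is skipped.
        return [name]

    expanded = [f"{name}.{domain}." for domain in search]
    absolute = f"{name}."

    if name.count(".") >= ndots:
        # Enough dots: try the name as given first, then the search list.
        return [absolute] + expanded
    # Too few dots: walk the search list first, then try the name as given.
    return expanded + [absolute]


if __name__ == "__main__":
    search = [
        "default.svc.zone.hostname.tld",
        "svc.zone.hostname.tld",
        "zone.hostname.tld",
    ]
    for query in candidate_queries("node0.zone.hostname.tld", search):
        print(query)
```

Every name this produces for node0.zone.hostname.tld ends in zone.hostname.tld, so all of them fall inside the configured cluster domain.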

CoreDNS logs:

[INFO] 10.42.0.12:43667 - 35592 "AAAA IN zone.hostname.tld.svc.zone.hostname.tld. udp 61 false 512" NXDOMAIN qr,aa,rd 172 0.000105982s
[INFO] 10.42.0.12:58377 - 63764 "A IN zone.hostname.tld.svc.zone.hostname.tld. udp 61 false 512" NXDOMAIN qr,aa,rd 172 0.00382279s
[INFO] 10.42.0.12:45038 - 206 "AAAA IN zone.hostname.tld. udp 37 false 512" NOERROR qr,aa,rd 148 0.000130157s
[INFO] 10.42.0.12:47147 - 6392 "AAAA IN zone.hostname.tld.zone.hostname.tld. udp 57 false 512" NXDOMAIN qr,aa,rd 168 0.000328197s
[INFO] 10.42.0.12:50792 - 27315 "A IN zone.hostname.tld.zone.hostname.tld. udp 57 false 512" NXDOMAIN qr,aa,rd 168 0.000237969s
[INFO] 10.42.0.12:38446 - 43046 "A IN zone.hostname.tld. udp 37 false 512" NOERROR qr,aa,rd 148 0.000186775s

and, for the subdomain:

[INFO] 10.42.1.26:48679 - 3 "AAAA IN node0.zone.hostname.tld. udp 43 false 512" NXDOMAIN qr,aa,rd 154 0.000257197s
[INFO] 10.42.1.26:46284 - 4 "AAAA IN node0.zone.hostname.tld.default.svc.zone.hostname.tld. udp 75 false 512" NXDOMAIN qr,aa,rd 186 0.000222705s
[INFO] 10.42.1.26:55694 - 5 "AAAA IN node0.zone.hostname.tld.svc.zone.hostname.tld. udp 67 false 512" NXDOMAIN qr,aa,rd 178 0.000418749s
[INFO] 10.42.1.26:57758 - 6 "AAAA IN node0.zone.hostname.tld.zone.hostname.tld. udp 63 false 512" NXDOMAIN qr,aa,rd 174 0.000183775s
[INFO] 10.42.1.26:51525 - 7 "A IN node0.zone.hostname.tld. udp 43 false 512" NXDOMAIN qr,aa,rd 154 0.000170884s
[INFO] 10.42.1.26:34331 - 8 "A IN node0.zone.hostname.tld.default.svc.zone.hostname.tld. udp 75 false 512" NXDOMAIN qr,aa,rd 186 0.000197147s
[INFO] 10.42.1.26:57777 - 10 "A IN node0.zone.hostname.tld.zone.hostname.tld. udp 63 false 512" NXDOMAIN qr,aa,rd 174 0.000131879s
[INFO] 10.42.1.26:34558 - 9 "A IN node0.zone.hostname.tld.svc.zone.hostname.tld. udp 67 false 512" NXDOMAIN qr,aa,rd 178 0.000484659s
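
To read the log lines above more easily, here is a small sketch that pulls out the query name, response code, and response flags, then groups response codes by name. The regular expression is an assumption based only on the lines shown in this report; `parse_line` and `rcodes_by_name` are hypothetical helpers.

```python
import re
from collections import defaultdict

# Matches lines shaped like the ones in this report, e.g.
# [INFO] 10.42.0.12:45038 - 206 "AAAA IN zone.hostname.tld. udp 37 false 512" NOERROR qr,aa,rd 148 0.000130157s
LOG_RE = re.compile(
    r'\[INFO\] (?P<client>\S+) - (?P<id>\d+) '
    r'"(?P<qtype>\S+) (?P<qclass>\S+) (?P<name>\S+) [^"]*" '
    r'(?P<rcode>\S+) (?P<flags>\S+) '
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_RE.match(line.strip())
    return match.groupdict() if match else None


def rcodes_by_name(lines):
    """Map each queried name to the set of response codes seen for it."""
    result = defaultdict(set)
    for line in lines:
        record = parse_line(line)
        if record:
            result[record["name"]].add(record["rcode"])
    return dict(result)


if __name__ == "__main__":
    sample = [
        '[INFO] 10.42.0.12:38446 - 43046 "A IN zone.hostname.tld. udp 37 false 512" NOERROR qr,aa,rd 148 0.000186775s',
        '[INFO] 10.42.1.26:51525 - 7 "A IN node0.zone.hostname.tld. udp 43 false 512" NXDOMAIN qr,aa,rd 154 0.000170884s',
    ]
    for name, rcodes in rcodes_by_name(sample).items():
        print(name, sorted(rcodes))
```

In the logs above, every response carries the aa (authoritative answer) flag, which is consistent with CoreDNS answering these names itself rather than forwarding them upstream.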
@maxboone
Author

I presume this is a networking misconfiguration: I set cluster-domain: zone.hostname.tld, and it should probably be something like zone-hostname-tld to avoid conflicts.

Nevertheless, I've been struggling with getting this thing running for hours now and am curious about the underlying cause of this issue and wonder if it's related to (the root cause of):

If this is a misconfiguration, I'd like to make a PR / MR that adds a warning when this configuration is used.

@maxboone
Author

Renaming the cluster domain to zone-hostname-tld did indeed fix the DNS issues. I changed it using:
kubectl edit configmaps -n kube-system cluster-dns

I'm still wondering why using the cluster name as the DNS domain is bad practice.

@maxboone maxboone changed the title DNS Lookups for hostname (and subdomains) of cluster don't work DNS Lookups for cluster-domain (and subdomains) of cluster don't work Nov 21, 2021
@brandond
Contributor

kube-dns creates synthetic records for everything under the cluster-domain. It does not merge them with any records that might exist on your existing DNS server. If you set zone.hostname.tld as both the cluster domain and the hostname for the load-balancer that points at the cluster members, you won't be able to resolve the load-balancer address from within the cluster, because it is shadowed by CoreDNS's cluster domain records.
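
The shadowing described above can be sketched as a simple zone-membership check. This is a simplified model, not CoreDNS's actual routing code: it assumes that any name at or under the cluster domain is answered from cluster records only, and that everything else goes to the upstream resolver. `in_zone` and `handled_by` are hypothetical names.

```python
def in_zone(qname, zone):
    """True if qname equals zone or is a subdomain of it (label-wise match)."""
    q_labels = qname.rstrip(".").lower().split(".")
    z_labels = zone.rstrip(".").lower().split(".")
    return len(q_labels) >= len(z_labels) and q_labels[-len(z_labels):] == z_labels


def handled_by(qname, cluster_domain):
    """Return which side answers the query in this simplified model."""
    # Names inside the cluster domain never reach the upstream DNS server,
    # so real records published there are invisible from inside the cluster.
    return "cluster" if in_zone(qname, cluster_domain) else "upstream"


if __name__ == "__main__":
    for domain in ("zone.hostname.tld", "zone-hostname-tld"):
        for name in ("zone.hostname.tld", "node0.zone.hostname.tld"):
            print(f"cluster-domain={domain} query={name} -> {handled_by(name, domain)}")
```

With cluster-domain set to zone.hostname.tld, both the node's name and its subdomains land on the cluster side of this check. With zone-hostname-tld they fall through to the upstream resolver, which matches the fix reported earlier in this thread.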
