CRITICAL: Use Debian as a base container instead of Alpine since Alpine causes DNS issues. #1161
So here's the issue...
In some clusters, DNS names do not resolve correctly because Alpine, which cert-manager uses as its base image, mishandles DNS resolution.
This is a critical problem as I'm unable to get this to work within my large Kubernetes cluster with Let's Encrypt.
There is a HUGE chain of issues that describes what's happening. Essentially, Alpine does not resolve DNS queries correctly and returns incorrect answers, in some cases depending on whether the provider uses Cloudflare.
I'm unfamiliar with Bazel, but it'd be good to change it from Alpine to Debian here:
How to replicate the bug and what happens:
helm install \
  --name cert-manager \
  --namespace kube-system \
  --version v0.5.2 \
  stable/cert-manager
Now try to do an nslookup within the cert-manager container:
▶ kubectl exec -it cert-manager-5d5bc6cd7f-fw7dx -n kube-system -- /bin/sh
/ $ nslookup letsencrypt.org
nslookup: can't resolve '(null)': Name does not resolve
Name: letsencrypt.org
Address 1: 184.108.40.206 ec2-23-23-86-44.compute-1.amazonaws.com
This returns an INCORRECT DNS entry. The reasoning behind this can be found in multiple issues: kubernetes/kubernetes#30215, gliderlabs/docker-alpine#8, JiscSD/rdss-arkivum-nextcloud#24, kubernetes/dns#119
Larger projects have also switched over to using Debian instead of Alpine due to an incredible amount of DNS issues: apache/openwhisk#4052
This is due to Alpine not resolving the
/ $ cat /etc/resolv.conf
nameserver 10.96.0.10
search kube-system.svc.cluster.local svc.cluster.local cluster.local net
options ndots:5
/ $
After removing "net" (provided by Kubernetes) from /etc/resolv.conf, DNS now resolves correctly:
/ $ nslookup letsencrypt.org
nslookup: can't resolve '(null)': Name does not resolve
Name: letsencrypt.org
Address 1: 220.127.116.11 a23-195-219-207.deploy.static.akamaitechnologies.com
Address 2: 2600:140a:0:384::ce0 g2600-140a-0000-0384-0000-0000-0000-0ce0.deploy.static.akamaitechnologies.com
Address 3: 2600:140a:0:3b0::ce0 g2600-140a-0000-03b0-0000-0000-0000-0ce0.deploy.static.akamaitechnologies.com
/ $ ^C
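For anyone wondering why the "net" entry matters: with `options ndots:5`, a name with fewer than five dots is normally tried against every search domain before it is tried as-is. Here is a rough sketch of that lookup order, assuming conventional resolv.conf(5) semantics. The `dns_candidates` function is a made-up helper for illustration, and individual resolvers (musl, Go's built-in one) may deviate from this order.

```shell
#!/bin/sh
# Hypothetical helper: print the lookup order a resolv.conf-style resolver
# would typically try for a name (assumes conventional ndots/search rules).
# usage: dns_candidates NAME NDOTS SEARCH_DOMAIN...
dns_candidates() {
  name=$1
  ndots=$2
  shift 2

  # A trailing dot marks the name as fully qualified: no search list applies.
  case $name in
    *.) printf '%s\n' "$name"; return ;;
  esac

  # Count the dots in the name by deleting every other character.
  only_dots=$(printf '%s' "$name" | tr -cd '.')
  dots=${#only_dots}

  # Enough dots: the name is tried as-is before the search list.
  if [ "$dots" -ge "$ndots" ]; then
    printf '%s.\n' "$name"
  fi

  # Each search domain is appended in the order it is listed.
  for domain in "$@"; do
    printf '%s.%s.\n' "$name" "$domain"
  done

  # Too few dots: the name is tried as-is only after the search list.
  if [ "$dots" -lt "$ndots" ]; then
    printf '%s.\n' "$name"
  fi
}

# Values taken from the /etc/resolv.conf shown above.
dns_candidates letsencrypt.org 5 \
  kube-system.svc.cluster.local svc.cluster.local cluster.local net
```

Under these assumptions `letsencrypt.org.net.` is queried before the bare `letsencrypt.org.`, so if that name happens to resolve, its answer would be returned first. That would be consistent with the different addresses seen before and after removing "net".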
I highly suggest changing the base image from Alpine (the current one) to Debian to resolve these DNS issues. At the moment, cert-manager is incompatible with Let's Encrypt because DNS names do not resolve correctly with the current Alpine image.
I'd honestly open a PR, but it looks like the Alpine image is being built somewhere else and is pushed to
Here's an article that outlines the issue with Alpine:
Here's an open issue with regards to running Alpine on Kubernetes clusters:
Another open issue with Alpine + Go + DNS dropping:
An open issue with Rancher:
Another open issue on the Alpine repo, which even involved editing an AWS ECS AMI:
There are 6 more projects I found that have the exact same issue, but I'm not going to post any more haha.
Actually, it was an issue on my host, but regardless, Alpine will not take in multiple DNS servers and will not fall back to another "search" entry in /etc/resolv.conf.
I ended up removing "net" from my host /etc/resolv.conf and it fixed the issue. But regardless, I think we should still switch to Debian.
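In case it helps anyone with the same setup, the edit can be sketched like this. It is only an illustration: `remove_search_domain` is a made-up helper name, it prints the result instead of modifying the file, and whatever generates your host's /etc/resolv.conf (DHCP client, NetworkManager, etc.) may overwrite a manual change.

```shell
#!/bin/sh
# Hypothetical helper: print FILE with DOMAIN dropped from its "search" line.
# The input file is not modified; redirect the output to review it first.
# usage: remove_search_domain DOMAIN FILE
remove_search_domain() {
  awk -v drop="$1" '
    $1 == "search" {
      # Rebuild the search line, skipping the unwanted domain.
      line = $1
      for (i = 2; i <= NF; i++)
        if ($i != drop)
          line = line " " $i
      print line
      next
    }
    # Every other line (nameserver, options, ...) passes through unchanged.
    { print }
  ' "$2"
}

# Example (not executed here):
#   remove_search_domain net /etc/resolv.conf > /tmp/resolv.conf.new
```

Writing to a separate file first makes it easy to diff against the original before replacing anything.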
Can't reproduce your issue.
The only thing that doesn't work for me is resolving hosts in the LAN (*.fritz.box) but that seems to be a config issue with coredns rather than with alpine.
Just FYI: cert-manager won't use Alpine's (musl) DNS resolver, it's fully statically compiled and uses Go's built-in resolver. So