cmd/{k8s-nameserver,k8s-operator},k8s-operator: add a kube nameserver, make operator deploy it #11919
Conversation
….net DNS names in cluster. Adds a simple nameserver that can respond to A record queries for ts.net DNS names. It responds to queries from in-memory records, populated from a ConfigMap mounted at /config, and it dynamically updates its records as the ConfigMap contents change. It responds with NXDOMAIN to queries for any other record types (AAAA is to be implemented in the future). It can respond to queries over UDP or TCP. It runs a miekg/dns DNS server with a single registered handler for ts.net domain names; queries for other domain names will be refused. The intended use of this is: 1) to allow non-tailnet cluster workloads to talk to HTTPS tailnet services exposed via Tailscale operator egress over HTTPS, 2) to allow non-tailnet cluster workloads to talk to workloads in the same cluster that have been exposed to the tailnet over their MagicDNS names but on their cluster IPs. Updates #10499 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
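As a rough illustration of the approach described above (not the actual cmd/k8s-nameserver code; the record contents, listen address, and locking details are illustrative assumptions), a miekg/dns server with a single ts.net handler could look like this:

```go
package main

import (
	"log"
	"net"
	"sync"

	"github.com/miekg/dns"
)

// records is an illustrative in-memory store of A records keyed by FQDN; in
// the real nameserver these would be populated from the mounted ConfigMap.
var (
	mu      sync.Mutex
	records = map[string][]net.IP{"foo.tailnet-xyz.ts.net.": {net.ParseIP("10.0.0.12")}}
)

func handleTsNet(w dns.ResponseWriter, r *dns.Msg) {
	m := new(dns.Msg)
	m.SetReply(r)
	if len(r.Question) == 0 {
		w.WriteMsg(m)
		return
	}
	q := r.Question[0]
	switch q.Qtype {
	case dns.TypeA:
		mu.Lock()
		ips := records[q.Name]
		mu.Unlock()
		if len(ips) == 0 {
			m.SetRcode(r, dns.RcodeNameError) // NXDOMAIN for unknown names
			break
		}
		for _, ip := range ips {
			m.Answer = append(m.Answer, &dns.A{
				Hdr: dns.RR_Header{Name: q.Name, Rrtype: dns.TypeA, Class: dns.ClassINET, Ttl: 0},
				A:   ip,
			})
		}
	default:
		// AAAA and other record types are not served yet; answer NXDOMAIN.
		m.SetRcode(r, dns.RcodeNameError)
	}
	w.WriteMsg(m)
}

func main() {
	// Only ts.net names get a handler; queries for other domains are not
	// answered by this handler.
	dns.HandleFunc("ts.net.", handleTsNet)
	go func() {
		log.Fatal((&dns.Server{Addr: ":1053", Net: "udp"}).ListenAndServe())
	}()
	log.Fatal((&dns.Server{Addr: ":1053", Net: "tcp"}).ListenAndServe())
}
```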
…ce Definition. The DNSConfig CRD can be used to configure the operator to deploy the kube nameserver (./cmd/k8s-nameserver) to the cluster. Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Adds a new reconciler that reconciles DNSConfig resources. If a DNSConfig is deployed to the cluster, the reconciler creates kube nameserver resources. This reconciler is only responsible for creating the nameserver resources, not for populating the nameserver's records. Signed-off-by: Irbe Krumina <irbe@tailscale.com>
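As a rough sketch of the shape such a controller-runtime reconciler could take (the group/version/kind below and the resources created are assumptions for illustration, not the actual operator code):

```go
package operator

import (
	"context"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/reconcile"
)

// NameserverReconciler creates nameserver resources whenever a DNSConfig
// resource exists in the cluster. Sketch only; the real reconciler also
// handles ownership, status, and cleanup.
type NameserverReconciler struct {
	client.Client
}

func (a *NameserverReconciler) Reconcile(ctx context.Context, req reconcile.Request) (reconcile.Result, error) {
	// Fetch the DNSConfig that triggered this reconcile. The GVK here is an
	// assumption for illustration purposes.
	cfg := &unstructured.Unstructured{}
	cfg.SetGroupVersionKind(schema.GroupVersionKind{Group: "tailscale.com", Version: "v1alpha1", Kind: "DNSConfig"})
	if err := a.Get(ctx, req.NamespacedName, cfg); err != nil {
		if apierrors.IsNotFound(err) {
			return reconcile.Result{}, nil // DNSConfig deleted; nothing to create
		}
		return reconcile.Result{}, err
	}

	// Create or update the nameserver's Deployment, Service, ConfigMap, etc.
	// here. Populating the nameserver's records is a separate reconciler's job.
	return reconcile.Result{}, nil
}
```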
… append to static manifests Signed-off-by: Irbe Krumina <irbe@tailscale.com>
cmd/k8s-nameserver/main.go
Outdated
m = m.SetRcode(r, dns.RcodeNameError)
return
}
// TODO (irbekrm): what TTL?
Since this is in-cluster and I assume(?) peers can join/leave the tailnet fairly rapidly, I think a low TTL would be good.
I've changed that comment to state that the TTL is currently 0, but we may want to look at what's better in the future.
cmd/k8s-nameserver/main.go
Outdated
m.Answer = append(m.Answer, rr)
}
case dns.TypeAAAA:
// TODO (irbekrm): implement IPv6 support
Worth doing now?
As we discussed, I've added a comment to mention that IPv6 Pod CIDR ranges are currently less common, so it should be OK not to do it in this PR.
}

ip4 := make(map[dnsname.FQDN][]net.IP)
defer func() {
Neat pattern!
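For context, the pattern being praised here appears to be: build a fresh records map outside the lock, then swap it in under the lock in a deferred function, so readers only ever see a complete set of records. A paraphrased sketch (type and field names assumed from the surrounding diff, not the exact code):

```go
package nameserver

import (
	"net"
	"sync"

	"tailscale.com/util/dnsname"
)

type nameserver struct {
	mu  sync.Mutex
	ip4 map[dnsname.FQDN][]net.IP
}

// updateRecords rebuilds the A-record map from newly read ConfigMap contents.
// The fresh map is populated without holding the lock; the deferred func then
// swaps it in atomically once population is done.
func (n *nameserver) updateRecords(parsed map[dnsname.FQDN][]net.IP) {
	ip4 := make(map[dnsname.FQDN][]net.IP, len(parsed))
	defer func() {
		n.mu.Lock()
		defer n.mu.Unlock()
		n.ip4 = ip4
	}()
	for fqdn, ips := range parsed {
		ip4[fqdn] = ips
	}
}
```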
if n.ip4 == nil {
return nil
}
n.mu.Lock()
You can achieve better concurrency here by using a sync.RWMutex and using n.mu.RLock() here.
So, I was thinking about this earlier. If I understand it correctly, sync.RWMutex will prefer writers if writers are around. And cluster updates that would result in new writes often tend to come in batches (i.e. a user applies an update that creates X new Tailscale proxies, or a cluster is being upgraded, so a lot of things move around and change Pod IPs). With that (and also because the records are not cached), I'd be worried that any such cluster change would result in MagicDNS names not being resolvable for a period of time because all reads are waiting for all writes to finish.
Although it's true that we don't need an exclusive lock on reads..
Maybe the answer is to allow caching.
Or perhaps we need to add some metrics to this so we can trace what's actually happening.
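For readers following along, the suggested read path would look roughly like this (a sketch of the suggestion, not code from this PR; field names are assumed from the surrounding diff). Whether it helps depends on the read/write mix discussed above, since Go's RWMutex blocks new readers while a writer is waiting:

```go
package nameserver

import (
	"net"
	"sync"

	"tailscale.com/util/dnsname"
)

type nameserver struct {
	mu  sync.RWMutex // RWMutex lets many lookups proceed concurrently
	ip4 map[dnsname.FQDN][]net.IP
}

// lookupIP4 is the read path: it only takes the read lock, so concurrent
// queries do not serialize on each other. Record updates still take the full
// write lock.
func (n *nameserver) lookupIP4(fqdn dnsname.FQDN) []net.IP {
	n.mu.RLock()
	defer n.mu.RUnlock()
	if n.ip4 == nil {
		return nil
	}
	return n.ip4[fqdn]
}
```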
)

func (a *NameserverReconciler) Reconcile(ctx context.Context, req reconcile.Request) (res reconcile.Result, err error) {
logger := a.logger.With("dnsConfig", req.Name)
Why not just configure a.logger itself with "dnsConfig"?
This logger takes key/value pairs as args where, in this case, the key is dnsConfig (the type of the resource) and the value will be the name of the dnsConfig resource this method is currently reconciling (req.Name). Example output with different keys/values from here:
{"level":"debug","ts":"2024-04-30T10:26:33Z","logger":"service-reconciler","msg":"reconciling statefulset tailscale/ts-kuard-gccxj","service-ns":"default","service-name":"kuard"}
Worked in my testing, generally LGTM, just a few suggestions.
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
d88d5e3 to c2f8b8d
Thanks for the review @oxtoacart, I am going to merge this and rebase #11019.
This is #11017, which was already merged once, but I reverted it just before the 1.64 release in #11669 as we did not have the bandwidth to review the second part (#11019) and I did not want to release half of a feature.
Below is the original PR description, the contents haven't changed:
This PR is the first part of the work towards enabling support for MagicDNS name resolution from within a Kubernetes cluster.
The second part, which updates the nameserver config with 'DNS records', is in #11019.
This PR:
- adds a DNSConfig custom resource that can be used to tell the operator to deploy the nameserver; the operator creates the nameserver resources in response to DNSConfigs
- the operator advertises the nameserver's Service IP address on the DNSConfig status; users can read it from there to update clusterDNS

Example flow:

- user deploys a DNSConfig: