
cmd/{k8s-nameserver,k8s-operator},k8s-operator: add a kube nameserver, make operator deploy it #11919

Merged · 5 commits merged into main from irbekrm/kubednsconfig on Apr 30, 2024

Conversation

@irbekrm (Contributor) commented on Apr 29, 2024

This is #11017, which was already merged once, but I reverted it just before the 1.64 release in #11669, as we did not have the bandwidth to review the second part (#11019) and I did not want to release half of a feature.
Below is the original PR description; the contents haven't changed:

This PR is the first part of the work towards enabling support for MagicDNS name resolution from within a Kubernetes cluster.
The second part, which updates the nameserver config with 'DNS records', is in #11019.
This PR:

  • implements a simple nameserver that reads hosts records from a file and responds to A record DNS queries over UDP and TCP
  • adds a DNSConfig custom resource that can be used to tell the operator to deploy the nameserver
  • adds a new control loop that deploys/syncs nameserver resources in response to users creating DNSConfigs
  • adds some infra for the nameserver (make target etc.) and CRD generation

The operator advertises the nameserver's Service IP address in the DNSConfig status. Users can read it from there to update clusterDNS.
Example flow:

  1. Apply a DNSConfig:
apiVersion: tailscale.com/v1alpha1
kind: DNSConfig
metadata:
  name: ts-dns
spec:
  nameserver:
    image:
      repo: gcr.io/csi-test-290908/nameserver # official images are not published yet; this is from a registry of mine
      tag: v0.0.64dns
  2. Wait for the DNSConfig's status to be updated:
$ kubectl get dnsconfig ts-dns
NAME     NAMESERVERIP
ts-dns   10.44.11.171
  3. Update cluster DNS to add a stub nameserver for ts.net (the following example is for kube-dns; CoreDNS's Corefile uses a different syntax, see the sketch after this ConfigMap):
apiVersion: v1
data:
  stubDomains: |
    {
      "ts.net": ["10.44.11.171"]
    }
kind: ConfigMap
metadata:
  labels:
    addonmanager.kubernetes.io/mode: EnsureExists
  name: kube-dns
  namespace: kube-system
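
For clusters running CoreDNS rather than kube-dns, the equivalent change is a stub zone in the Corefile. The following is an illustrative sketch, not from this PR; the nameserver IP is the one advertised on the DNSConfig status above:

ts.net:53 {
    errors
    forward . 10.44.11.171
}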

….net DNS names in cluster.

Adds a simple nameserver that can respond to A record queries for
ts.net DNS names. It responds to queries from in-memory records,
populated from a ConfigMap mounted at /config, and dynamically updates
its records as the ConfigMap contents change.
It responds with NXDOMAIN to queries for any other record types
(AAAA is to be implemented in the future).
It can respond to queries over UDP or TCP. It runs a miekg/dns
DNS server with a single registered handler for ts.net domain names;
queries for other domain names will be refused.

The intended use of this is:
1) to allow non-tailnet cluster workloads to talk to HTTPS tailnet
services exposed via Tailscale operator egress
2) to allow non-tailnet cluster workloads to talk to workloads in
the same cluster that have been exposed to the tailnet over their
MagicDNS names but on their cluster IPs.

Updates #10499

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
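
To make the behavior described in that commit message concrete, here is a minimal, self-contained sketch of the same pattern built on miekg/dns. It is not the PR's actual code: the nameserver type, the record-map shape, and the listen address are assumptions for illustration.

package main

import (
	"log"
	"net"
	"sync"

	"github.com/miekg/dns" // the DNS server library the commit message names
)

type nameserver struct {
	mu  sync.Mutex
	ip4 map[string][]net.IP // FQDN -> A records, rebuilt from the mounted ConfigMap
}

func (n *nameserver) handle(w dns.ResponseWriter, r *dns.Msg) {
	m := new(dns.Msg)
	m.SetReply(r)
	q := r.Question[0]
	switch q.Qtype {
	case dns.TypeA:
		n.mu.Lock()
		ips := n.ip4[q.Name]
		n.mu.Unlock()
		if len(ips) == 0 {
			m = m.SetRcode(r, dns.RcodeNameError) // NXDOMAIN for unknown names
			break
		}
		for _, ip := range ips {
			m.Answer = append(m.Answer, &dns.A{
				Hdr: dns.RR_Header{Name: q.Name, Rrtype: dns.TypeA, Class: dns.ClassINET, Ttl: 0},
				A:   ip,
			})
		}
	default:
		// Only A records are served; everything else gets NXDOMAIN
		// (AAAA to be implemented in the future).
		m = m.SetRcode(r, dns.RcodeNameError)
	}
	w.WriteMsg(m)
}

func main() {
	ns := &nameserver{ip4: make(map[string][]net.IP)}
	// A single handler registered for ts.net; queries for other
	// domain names are refused by the server.
	dns.HandleFunc("ts.net.", ns.handle)
	go func() {
		log.Fatal((&dns.Server{Addr: ":1053", Net: "udp"}).ListenAndServe())
	}()
	log.Fatal((&dns.Server{Addr: ":1053", Net: "tcp"}).ListenAndServe())
}

Note the Ttl: 0 on answers; the review thread below discusses whether a low non-zero TTL would be better.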
…ce Definition

The DNSConfig CRD can be used to configure
the operator to deploy the kube nameserver (./cmd/k8s-nameserver) to the cluster.

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Adds a new reconciler that reconciles DNSConfig resources.
If a DNSConfig is deployed to the cluster,
the reconciler creates the kube nameserver resources.
This reconciler is only responsible for creating
nameserver resources, not for populating the nameserver's records.

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
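
A compact sketch of the control-loop shape that commit describes, using controller-runtime. The import path for the DNSConfig type, the object names, and the reduced set of created resources are assumptions for illustration, not the PR's actual code:

package operator // hypothetical package, for illustration only

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/reconcile"

	tsapi "tailscale.com/k8s-operator/apis/v1alpha1" // assumed import path for the DNSConfig type
)

type NameserverReconciler struct {
	client.Client
}

func (a *NameserverReconciler) Reconcile(ctx context.Context, req reconcile.Request) (reconcile.Result, error) {
	var dnsCfg tsapi.DNSConfig
	err := a.Get(ctx, req.NamespacedName, &dnsCfg)
	if apierrors.IsNotFound(err) {
		return reconcile.Result{}, nil // DNSConfig deleted; nothing to reconcile
	} else if err != nil {
		return reconcile.Result{}, err
	}
	// Create the nameserver resources (the real reconciler also creates a
	// Deployment, Service, and RBAC). Populating the records ConfigMap is
	// deliberately left to the second part, #11019.
	cm := &corev1.ConfigMap{
		ObjectMeta: metav1.ObjectMeta{Name: "dnsrecords", Namespace: "tailscale"}, // names assumed
	}
	if err := a.Create(ctx, cm); err != nil && !apierrors.IsAlreadyExists(err) {
		return reconcile.Result{}, err
	}
	return reconcile.Result{}, nil
}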
… append to static manifests

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
@irbekrm changed the title from Irbekrm/kubednsconfig to cmd/{k8s-nameserver,k8s-operator},k8s-operator: add a kube nameserver, make operator deploy it on Apr 29, 2024
		m = m.SetRcode(r, dns.RcodeNameError)
		return
	}
	// TODO (irbekrm): what TTL?
@oxtoacart (Contributor) commented:
Since this is in-cluster and I assume(?) peers can join/leave the tailnet fairly rapidly, I think a low TTL would be good.

@irbekrm (Contributor, author) replied:

I've changed that comment to state that the TTL is currently 0, but we may want to look at what's better in the future.

			m.Answer = append(m.Answer, rr)
		}
	case dns.TypeAAAA:
		// TODO (irbekrm): implement IPv6 support
@oxtoacart (Contributor) commented:
Worth doing now?

@irbekrm (Contributor, author) replied:
As we discussed later, I've added a comment to mention that IPv6 Pod CIDR ranges are as of now less common, so it should be OK not to do this in this PR.

	}

	ip4 := make(map[dnsname.FQDN][]net.IP)
	defer func() {
@oxtoacart (Contributor) commented:
Neat pattern!
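
The pattern being praised, reconstructed as a hedged sketch: the new records map is built outside the lock, and a deferred closure swaps it in, so the mutex is held only for the assignment. Field and helper names are assumed; dnsname is tailscale.com/util/dnsname, and the nameserver type is the one from the quoted snippets (assuming net, sync, and dnsname are imported):

func (n *nameserver) resetRecords(hosts map[string][]string) error {
	ip4 := make(map[dnsname.FQDN][]net.IP)
	defer func() {
		// Runs on every return path, including errors: the freshly
		// built map replaces the old records under the lock.
		n.mu.Lock()
		defer n.mu.Unlock()
		n.ip4 = ip4
	}()
	for host, addrs := range hosts {
		fqdn, err := dnsname.ToFQDN(host)
		if err != nil {
			return err
		}
		for _, a := range addrs {
			if ip := net.ParseIP(a).To4(); ip != nil {
				ip4[fqdn] = append(ip4[fqdn], ip)
			}
		}
	}
	return nil
}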

	if n.ip4 == nil {
		return nil
	}
	n.mu.Lock()
@oxtoacart (Contributor) commented:
You can achieve better concurrency here by using a sync.RWMutex and using n.mu.RLock() here.
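For illustration, the suggestion would look roughly like this (field names follow the quoted snippet; a sketch, not the PR's code, assuming sync and tailscale.com/util/dnsname are imported):

type nameserver struct {
	mu  sync.RWMutex // writers (ConfigMap updates) take Lock; readers take RLock
	ip4 map[dnsname.FQDN][]net.IP
}

func (n *nameserver) lookupIP4(fqdn dnsname.FQDN) []net.IP {
	n.mu.RLock()
	defer n.mu.RUnlock()
	if n.ip4 == nil {
		return nil
	}
	return n.ip4[fqdn]
}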

@irbekrm (Contributor, author) replied:
So, I was thinking about this earlier. If I understand it correctly, sync.RWMutex will prefer writers when writers are around. And cluster updates that result in new writes often tend to come in batches (e.g. a user applies an update that creates X new Tailscale proxies, or a cluster is being upgraded, so a lot of things move around and change Pod IPs). Given that (and also because the records are not cached), I'd be worried that any such cluster change would leave MagicDNS names unresolvable for a period of time, because all reads would be waiting for all writes to finish.
Although it's true that we don't need to lock on reads...

@irbekrm (Contributor, author) added:
Maybe the answer is to allow caching. Or perhaps we need to add some metrics to this so we can trace what's actually happening.

)

func (a *NameserverReconciler) Reconcile(ctx context.Context, req reconcile.Request) (res reconcile.Result, err error) {
	logger := a.logger.With("dnsConfig", req.Name)
@oxtoacart (Contributor) commented:
Why not just configure a.logger itself with "dnsConfig"?

@irbekrm (Contributor, author) replied on Apr 30, 2024:
This logger takes key/value pairs as args where, in this case, the key is dnsConfig (the type of the resource) and the value will be the name of the DNSConfig resource this method is currently reconciling (req.Name). Example output, with different keys/values, from here:

{"level":"debug","ts":"2024-04-30T10:26:33Z","logger":"service-reconciler","msg":"reconciling statefulset tailscale/ts-kuard-gccxj","service-ns":"default","service-name":"kuard"}

@oxtoacart (Contributor) left a review:
Worked in my testing, generally LGTM, just a few suggestions.

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
@irbekrm (Contributor, author) commented on Apr 30, 2024:

Thanks for the review @oxtoacart, I am going to merge this and rebase #11019.
I am going to think about and experiment with #11919 (comment) a bit more separately (I will create an issue once these three initial pieces get merged).

@irbekrm merged commit 44aa809 into main on Apr 30, 2024 (49 checks passed)
@irbekrm deleted the irbekrm/kubednsconfig branch on Apr 30, 2024