[Feature] Support Node Local DNS Cache #3673

Open
damienwebdev opened this issue May 21, 2023 · 12 comments
Labels
feature-request Requested Features

Comments

@damienwebdev

damienwebdev commented May 21, 2023

Is your feature request related to a problem? Please describe.
I'm a developer using NodeJS to server-side render frontend applications. I'm attempting to improve the TTFB of my renders, and in the course of doing so I'm seeing ~8ms of DNS latency when doing DNS lookups. The important thing to know here is that NodeJS does not cache DNS lookups either in-process or between processes (it relies on OS-specific functions and caching, such as getaddrinfo), leading to a higher-than-expected volume of DNS requests. There are many articles on the topic:

  1. https://httptoolkit.com/blog/configuring-nodejs-dns/
  2. https://adambrodziak.pl/dns-performance-issues-in-kubernetes-cluster
  3. A video by one of the creators of Libuv

Describe the solution you'd like
I would like to leverage Node Local DNS Cache as described by the Kubernetes team.

Describe alternatives you've considered

  1. nodelocaldns aks routing issue #1642
  2. node-local-dns daemonset is automatically deleted #1435
  3. I've also considered implementing keep-alive connections in SSR.
  4. [Feature] Node local DNS #1492

It looks like (in #1492) the AKS team has already considered this and has already done some intense work to improve network capabilities, but I'm confused (and concerned) about why #1492 was closed without implementing the original feature. It looks (from the outside) like this feature was used as a "placeholder" as a fix for a completely different issue.

Can someone clarify why #1492 was closed?

Additionally, @jnoller points out that user-driven attempts to remedy this problem are also subverted by AKS. Can you explain why? Could we get a flag that allows us to switch this to a daemonset? Otherwise, I'm left quite confused and left with slow HTTP requests for a reason that seems beyond me.

@damienwebdev damienwebdev added the feature-request Requested Features label May 21, 2023
@damienwebdev
Author

This could be closed, since it's possible for users to implement this themselves, but it would be nice to have the AKS team document this specifically for AKS.

@dengliu

dengliu commented Oct 4, 2023

Hi @damienwebdev,
Have you been able to deploy Node Local DNS to AKS?
I tried both the official solution from k8s and the suggested AKS solution here; neither of them works on AKS.

@timja

timja commented Oct 5, 2023

@Neurobion

Hi @timja, it has been a few months since your comment and I want to ask if you are still using it without any problems or if you have found something more suitable? Thanks

@artificial-aidan

I just implemented this today, and it seems to be working. A few notes for someone new to this.

In @timja's example, the DNS IP is 10.0.0.10, but this may not be the case for you. This command can be used to query the IP: kubectl get svc kube-dns -n kube-system -o jsonpath={.spec.clusterIP}

(source: https://kubernetes.io/docs/tasks/administer-cluster/nodelocaldns/)
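As a rough sketch of how that IP is typically used: the linked Kubernetes page has you substitute it into the upstream nodelocaldns.yaml template before applying it. The function names below are hypothetical, and the placeholder names are the ones used by the upstream template for kube-proxy in iptables mode; check them against the manifest you actually download.

```shell
# Print the ClusterIP of the kube-dns Service (the command quoted above).
kube_dns_ip() {
  kubectl get svc kube-dns -n kube-system -o 'jsonpath={.spec.clusterIP}'
}

# Fill the placeholders in the nodelocaldns.yaml template.
# Reads the template on stdin and writes the rendered manifest to stdout.
#   $1 = kube-dns ClusterIP
#   $2 = cluster domain
#   $3 = node-local listen address
render_nodelocaldns() {
  sed -e "s/__PILLAR__LOCAL__DNS__/$3/g" \
      -e "s/__PILLAR__DNS__DOMAIN__/$2/g" \
      -e "s/__PILLAR__DNS__SERVER__/$1/g"
}
```

A possible invocation, assuming the common defaults cluster.local and 169.254.20.10 (both are assumptions, so verify them for your cluster): render_nodelocaldns "$(kube_dns_ip)" cluster.local 169.254.20.10 < nodelocaldns.yaml | kubectl apply -f -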

The default memory requests were far too small for my use case: I was seeing 25 MB+ of memory used by the node-local caches. Make sure you set that correctly, because a node-local pod getting OOM-killed results in DNS downtime on that node.
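One way to act on the memory note, as a minimal sketch. It assumes the DaemonSet is named node-local-dns and its pods carry the label k8s-app=node-local-dns, as in the upstream manifest, and that metrics-server is available for kubectl top. The function names are hypothetical, and the right request size depends on your own observed usage.

```shell
# Show current memory usage of the node-local cache pods.
# Requires metrics-server; the label is an assumption from the upstream manifest.
nodelocal_memory_usage() {
  kubectl top pod -n kube-system -l k8s-app=node-local-dns
}

# Raise the memory request on the DaemonSet.
#   $1 = new memory request, for example 64Mi (a hypothetical value)
set_nodelocal_memory_request() {
  kubectl set resources daemonset node-local-dns -n kube-system --requests="memory=$1"
}
```

For example, run nodelocal_memory_usage first, then set_nodelocal_memory_request 64Mi if the observed usage sits well above the current request.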

@timja

timja commented Feb 7, 2024

It shouldn't get OOM-killed if there's no limit set, but yes, it could probably request more than that if needed.

@artificial-aidan

If no limit is set and the node comes under memory pressure, the pod is a higher-priority candidate for being killed whenever it is using more than its requested memory.

@lomboboo

@artificial-aidan @timja
Have you figured it out? We also run an AKS cluster, and after I installed nodelocalcache as per the Kubernetes docs (except that I had to remove the addonmanager.kubernetes.io/mode: Reconcile label), I don't think it is working as I would expect.

When creating a new pod in that cluster based on the dnsutils image, for example, and running nslookup google.com, we get Server: 10.0.0.10 instead of Server: 169.254.20.10. I would expect to get 10.0.0.10 on the first call and 169.254.20.10 on all calls after that, since the result should be cached by node-local-dns.

I am curious whether it is even supposed to work with AKS, or whether anything else has to be done for it to work in an AKS managed cluster.
Or am I testing it wrongly altogether?
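
A possible explanation, offered as an assumption rather than a confirmed diagnosis for this cluster: according to the Kubernetes NodeLocal DNSCache documentation, when kube-proxy runs in iptables mode the node-local-dns pods listen on both the kube-dns service IP and the node-local address. Pods then keep the kube-dns IP in /etc/resolv.conf, and nslookup keeps reporting that server even when the local cache answers, so the Server line alone may not show whether the cache is in use. The sketch below only prints the nameserver a pod is configured with; the function name is hypothetical.

```shell
# Print the nameserver addresses a pod is configured to use.
#   $1 = pod name
#   $2 = namespace (defaults to "default")
pod_nameservers() {
  kubectl exec "$1" -n "${2:-default}" -- cat /etc/resolv.conf |
    awk '$1 == "nameserver" { print $2 }'
}
```

For example, pod_nameservers dnsutils prints the address that nslookup will report as Server for that pod.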

@artificial-aidan

I think the way I tested it was to look at DNS queries on the kube-dns metrics. They went way down once nodelocal was working.

@lomboboo

lomboboo commented Jun 27, 2024

Thanks for the response.

Can you please elaborate on this a little bit? How did you install nodelocaldns in your AKS?

Did you use curl <coredns-pod-xxx>:9153/metrics to get different metrics? If so, which one did you pay attention to?

@artificial-aidan

I use Prometheus to scrape all the metrics and don't remember exactly where they came from, but both nodelocal and CoreDNS export the cache hit metric, and you should be able to see the nodelocal metric increasing. I followed the same steps @timja did.
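
For readers without Prometheus, the same check can be approximated by hand. This is a minimal sketch, assuming the metrics endpoint returns Prometheus text format without timestamps and that the cache counter is named coredns_cache_hits_total, as in the CoreDNS cache plugin; metric names can differ between versions, and the function name is hypothetical.

```shell
# Sum one counter across all of its label sets.
# Reads Prometheus text-format metrics on stdin.
#   $1 = metric name, for example coredns_cache_hits_total
sum_metric() {
  awk -v name="$1" '
    /^#/ { next }                          # skip HELP/TYPE comment lines
    {
      i = index($1, "{")                   # strip the {label="..."} part, if any
      metric = (i > 0) ? substr($1, 1, i - 1) : $1
      if (metric == name) total += $NF     # sample value is the last field
    }
    END { printf "%d\n", total }
  '
}
```

Possible usage: curl -s <coredns-pod-xxx>:9153/metrics | sum_metric coredns_cache_hits_total, run before and after a burst of lookups. The upstream node-local-dns Corefile exposes its own metrics on port 9253 of the node, which is also an assumption to verify against your manifest; if that counter grows while the CoreDNS request counters stay flat, the local cache is likely serving the queries.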

@muadnan

muadnan commented Aug 29, 2024

Hey @lomboboo, have you figured out whether it is working as expected?


7 participants