# GA graduation criteria for NodeLocal DNSCache #1351

Changes from 2 commits
@@ -192,7 +192,15 @@ Having the pods query the nodelocal cache introduces a single point of failure.
## Graduation Criteria
This feature has been alpha since the 1.13 release and beta since the 1.15 release.

Graduation criteria for GA (targeted for the 1.18 release):
- Upgrade to a newer CoreDNS version (1.16.x) in [node-cache](https://github.com/kubernetes/dns/pull/328).
- Ensure that Kubernetes [e2e tests with NodeLocal DNSCache](https://k8s-testgrid.appspot.com/sig-network-gce#gci-gce-kube-dns-nodecache) are passing.
- Run scalability tests with NodeLocal DNSCache enabled, verifying the [HA modes](https://github.com/kubernetes/enhancements/blob/master/keps/sig-network/20190424-NodeLocalDNS-beta-proposal.md#design-details) as well as the regular mode.
- Have N clusters (number TBD) running in production with NodeLocal DNSCache enabled.
- Provide clear documentation on using NodeLocal DNSCache aimed at cluster operators.
**Comment:** I'd like to get the documentation updated and improved, so that there is a good-quality Task page, before GA. I'd like to see docs clear enough that someone who has just qualified as a CKA can follow that Task page without help, and without needing to do research or fill in missing details. At the moment I don't think https://kubernetes.io/docs/tasks/administer-cluster/nodelocaldns/ meets that criterion.

**Comment:** As this is an add-on, does it also make sense to document how to remove it? (Documenting that would help encourage production use.)

**Comment:** Thanks for the feedback! I submitted the change to docs on how to remove the add-on - kubernetes/website@master...prameshj:patch-4. Can you help improve the Tasks page with suggestions on what to add? Would adding cluster create commands/kubectl commands to check that node-local-dns pods are running and healthy make it more useful?

**Comment:** I can try to help with documenting this add-on (I've already opened kubernetes/website#17944). I'd want to be clear on intent here. A few things, from the proposed changes: What are the prerequisites for that to work? (And are those explained to the reader?)

**Comment:** @prameshj, are you planning to make a pull request based on kubernetes/website@master...prameshj:patch-4?

**Comment:** For this KEP, I think it's enough to add an extra line about providing clear, comprehensible documentation aimed at cluster operators.

**Comment:** I just did.. I thought I already had, thanks for the reminder!

**Comment:** Added documentation to the criteria.
## Rollout Plan
This feature will be launched with Alpha support in the first release. Master versions v1.13 and above will deploy the new add-on. Node versions v1.13 and above will have kubelet code to modify pods' resolv.conf. Nodes running older versions will still run the nodelocal DaemonSet, but it will not be used automatically; the user can specify a custom dnsConfig to use this local cache DNS server.
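As an illustration of the opt-in path on older nodes, a Pod could point its DNS at the cache explicitly via a custom `dnsConfig`. A minimal sketch, assuming the 169.254.20.10 listen address discussed in this review and a hypothetical Pod name; the search list and ndots values mirror typical cluster defaults:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: dns-example          # hypothetical Pod name
spec:
  containers:
  - name: app
    image: busybox:1.28
    command: ["sleep", "3600"]
  dnsPolicy: "None"           # kubelet writes exactly the config below into resolv.conf
  dnsConfig:
    nameservers:
    - 169.254.20.10           # node-local DNS cache listen address
    searches:
    - default.svc.cluster.local
    - svc.cluster.local
    - cluster.local
    options:
    - name: ndots
      value: "5"
```

With `dnsPolicy: "None"`, kubelet ignores the cluster DNS settings and renders only this `dnsConfig` into the Pod's resolv.conf, so queries go straight to the node-local cache.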
@@ -202,6 +210,7 @@ This feature will be launched with Alpha support in the first release. Master ve
* 2018-10-05 - Creation of the KEP
* 2018-10-30 - Follow up comments and choice of cache agent
* 2018-11-14 - [Changes](https://github.com/kubernetes/kubernetes/pull/70555) to support running NodeLocal DNSCache were merged.
* 2019-11-02 - Added GA graduation criteria
## Drawbacks [optional]
**Comment:** I've just had a thought: 169.254.20.10 is link-local. If NodeLocal DNSCache relies, by default, on Pods being able to route to 169.254.20.10, then that is a dependency on undefined, nonstandard behavior. According to (my reading of) internet standards, 169.254.0.0/16 is only reachable for a client on the same link. I recommend addressing that, somehow, before GA. If the decision is to go against the standard, let's record that explicitly.
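The classification this comment relies on can be checked mechanically; a small sketch using Python's standard-library `ipaddress` module (the kube-dns ClusterIP used for contrast is a typical but hypothetical value):

```python
import ipaddress

# The default node-local-dns listen address discussed above.
listen_addr = ipaddress.ip_address("169.254.20.10")

# 169.254.0.0/16 is the IPv4 link-local range (RFC 3927); addresses in it
# are not intended to be forwarded off the local link.
print(listen_addr.is_link_local)                              # True
print(listen_addr in ipaddress.ip_network("169.254.0.0/16"))  # True

# By contrast, a typical kube-dns ClusterIP sits in ordinary unicast space.
print(ipaddress.ip_address("10.0.0.10").is_link_local)        # False
```

This is why reaching the address works only because it is assigned to an interface on the same node, not because anything routes it.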
**Comment:** The link-local IP is attached to an interface on the host. You are right, it does not exist on each pod. Since all pod traffic goes through the host network and is subject to the iptables rules configured there, this works. I am not aware of any CNI plugins where this would not work. From what I understand, only 127.0.0.0/8 traffic does not leave the pod.
**Comment:** A CNI plugin will need to route traffic to 169.254.20.10, but there's no standardized mechanism to make this happen (because 169.254.20.10 is link-local, and link-local addresses aren't routable).
**Comment:** The packets just need to hit the host network, since the link-local IP address exists on an interface in the node. This creates a route in the "local" routing table of the form:

    local 169.254.20.10 dev nodelocaldns proto kernel scope host src 169.254.20.10

This is sufficient to deliver the packet to anything listening locally on that address. Isn't that true?

Also, if the k8s cluster is using kube-proxy in iptables mode, the pods can use either the link-local IP or the kube-dns service IP to access node-local-dns.
**Comment:** I'm really not keen on setting that in stone. Allocating from the link-local range like this and expecting routing to work, even within only one host, just doesn't seem like an approach that can scale.

It's different when a cloud provider does it at the network layer, because there's no concept of those packets being routed. There's a (virtual) link, and it turns out that when the instance boots, something else is defending 169.254.169.254 or whatever. That works fine.

Ideally, I can set up a Kubernetes cluster that puts all Pods on the same link-local network, and have that be a valid, supported configuration (provided I find a CNI that makes it happen). I realize that 169.254.0.0/16 looks like a convenient range to borrow an address from, and I can see why it's often going to be a pragmatic choice.

To be clear: my concern is about making and documenting 169.254.20.10 as a default value. If you have to provide an IPv4 address at deploy time and the documentation suggests picking an address inside 169.254.0.0/16, then that isn't a problem and doesn't set any problematic precedent.
**Comment:** OK, if I understand correctly, we want to tweak the documentation to say: "Selecting a listen IP for NodeLocal DNSCache from 169.254.0.0/16 is recommended; in our examples we will use 169.254.20.10." Is that correct? I have added this to the website doc PR - kubernetes/website#18716. We can continue the discussion there. From the GA graduation perspective, are the changes here OK to implement?
**Comment:** (Retrospectively) I think you've taken an appropriate and pragmatic tack. Thanks.