Node-local DNS cache support #542

Closed
mumoshu opened this issue Feb 14, 2019 · 14 comments · Fixed by #550

Comments

@mumoshu
Contributor

mumoshu commented Feb 14, 2019

TL;DR: I'd like to add a new configuration key for nodegroups to specify the value of the kubelet --cluster-dns flag.

Why do you want this feature?

Currently both internal and external DNS lookups are processed by the CoreDNS service deployed by EKS. This isn't ideal from a reliability perspective.

An internal lookup resolves the cluster-internal names of Kubernetes services. It may fail under various conditions, such as:

  • kube-proxy failed on either (1) the node where the pod that sent the DNS query is running, or (2) a node where the CoreDNS pods are running
  • A kube-dns (CoreDNS) pod failed
  • The Kubernetes endpoint controller failed
  • Some other node-to-node communication failure (an EC2 or VPC issue)

An external lookup resolves names managed outside the cluster, such as RDS DB clusters/instances, SQS endpoints, S3, or your own services served via Route 53. It may fail under conditions like:

  • Any of the above conditions
  • The Amazon-provided DNS server in your VPC failed
  • Route 53 failed (in the case that the name is served by it)

What feature/behavior/change do you want?

The only thing I want to propose for eksctl is the ability to override the value of the --cluster-dns flag passed to the kubelet.

This allows us to deploy a node-local DNS cache, which has been proposed upstream to resolve exactly this issue. You can find the upstream proposal at https://github.com/kubernetes/enhancements/blob/master/keps/sig-network/0030-nodelocal-dns-cache.md. It is implemented in kubernetes/kubernetes#70555.

Looking at how it works and how it is implemented, the only thing we can't do outside of eksctl is pass the "magic" IP address that forwards all DNS lookups to the node-local cache to kubelet's --cluster-dns flag; see https://github.com/kubernetes/kubernetes/blob/a3877b1776cc55f5a32103d7a072a73e18c3d939/hack/local-up-cluster.sh#L704.

As we're leaning towards exposing new configuration only via config files, I would be glad if this added a new field named e.g. clusterDNS. For example, it would be specified like:

nodeGroups:
- name: nodegroup1
  clusterDNS: 169.254.20.10
  # snip
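
To make the intent concrete, here is a minimal sketch of what that setting boils down to on a node. This is not eksctl's actual userdata; it assumes the stock EKS AMI's /etc/eks/bootstrap.sh entry point and a hypothetical cluster name:

```bash
# Sketch only: point the kubelet at the node-local cache address instead of the
# CoreDNS service IP. eksctl would generate the equivalent in nodegroup userdata.
/etc/eks/bootstrap.sh my-cluster \
  --kubelet-extra-args '--cluster-dns=169.254.20.10'
```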
@mumoshu
Contributor Author

mumoshu commented Feb 14, 2019

Btw, I believe in the effectiveness of the node-local DNS cache, as I've been using it for more than a year in kube-aws (kubernetes-retired/kube-aws#792).

Implementation-wise, kube-aws uses a dnsmasq daemonset as the node-local cache, whereas upstream uses a CoreDNS daemonset instead, but for the same purpose.


mumoshu added a commit to mumoshu/eksctl that referenced this issue Feb 18, 2019
Add a new field named `clusterDNS` that accepts the IP address of the DNS server used for all internal/external DNS lookups, i.e. the `--cluster-dns` flag of `kubelet`.

```yaml
nodeGroups:
- name: nodegroup1
  clusterDNS: 169.254.20.10
  # snip
```

This, in combination with `k8s-dns-node-cache` deployed as a daemonset on your cluster, allows all DNS lookups from your pods to be routed first to the node-local DNS server, which adds more reliability.

See [cluster/addons/dns/nodelocaldns](https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/dns/nodelocaldns) in the upstream repository for more details.

The concrete steps to enable node-local DNS look like the following (an end-to-end sketch follows this commit message):

- Decide which IP address to use for binding the node-local DNS. Typically this is `169.254.20.10`.
- Add `clusterDNS: 169.254.20.10` to your nodegroup in the cluster config.
- Deploy [nodelocaldns.yaml](https://github.com/kubernetes/kubernetes/blob/master/cluster/addons/dns/nodelocaldns/nodelocaldns.yaml), replacing:
  `__PILLAR__LOCAL__DNS__` with `169.254.20.10`, `__PILLAR__DNS__DOMAIN__` with `cluster.local`, and `__PILLAR__DNS__SERVER__` with [`10.100.0.10` or `172.20.0.10`](https://github.com/weaveworks/eksctl/blob/master/pkg/nodebootstrap/userdata.go#L87-L94) according to your VPC CIDR

Resolves eksctl-io#542
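
A minimal end-to-end sketch of those steps, assuming kube-proxy runs in the default iptables mode and the kube-dns service IP is `172.20.0.10` (use `10.100.0.10` instead depending on your VPC CIDR):

```bash
# Sketch only: fetch the upstream manifest, substitute the template variables,
# and deploy the node-local DNS cache daemonset.
curl -sLO https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/dns/nodelocaldns/nodelocaldns.yaml
sed -i -e 's/__PILLAR__LOCAL__DNS__/169.254.20.10/g' \
       -e 's/__PILLAR__DNS__DOMAIN__/cluster.local/g' \
       -e 's/__PILLAR__DNS__SERVER__/172.20.0.10/g' nodelocaldns.yaml
kubectl apply -f nodelocaldns.yaml
```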
mumoshu added a commit to mumoshu/eksctl that referenced this issue Feb 18, 2019
mumoshu added a commit to mumoshu/eksctl that referenced this issue Feb 18, 2019
Add a new field named `clusterDNS` that accepts the IP address of the DNS server used for all internal/external DNS lookups, i.e. the `--cluster-dns` flag of `kubelet`.

```yaml
nodeGroups:
- name: nodegroup1
  clusterDNS: 169.254.20.10
  # snip
```

This, in combination with `k8s-dns-node-cache` deployed as a daemonset on your cluster, allows all DNS lookups from your pods to be routed first to the node-local DNS server, which adds more reliability.

The configuration key `clusterDNS` is intentionally per-nodegroup, not per-cluster, so that you can use the node-local DNS selectively. This, in combination with proper use of node labels/taints, allows you to test the node-local DNS on only a subset of your workload.
It would also be nice to add `clusterDNS` as a cluster-level config key later, but I believe it isn't a must-have for this change.

See [cluster/addons/dns/nodelocaldns](https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/dns/nodelocaldns) in the upstream repository for more details.

The concrete steps to enable node-local DNS look like the following:

- Decide which IP address to use for binding the node-local DNS. Typically this is `169.254.20.10`.
- Add `clusterDNS: 169.254.20.10` to your nodegroup in the cluster config.
- Deploy [nodelocaldns.yaml](https://github.com/kubernetes/kubernetes/blob/master/cluster/addons/dns/nodelocaldns/nodelocaldns.yaml), replacing:
  `__PILLAR__LOCAL__DNS__` with `169.254.20.10`, `__PILLAR__DNS__DOMAIN__` with `cluster.local`, and `__PILLAR__DNS__SERVER__` with [`10.100.0.10` or `172.20.0.10`](https://github.com/weaveworks/eksctl/blob/master/pkg/nodebootstrap/userdata.go#L87-L94) according to your VPC CIDR

Resolves eksctl-io#542
mumoshu added a commit to mumoshu/eksctl that referenced this issue Feb 18, 2019
@mumoshu mumoshu changed the title Node-local DNS cache support wip: Node-local DNS cache support Feb 18, 2019
@mumoshu mumoshu changed the title wip: Node-local DNS cache support Node-local DNS cache support Feb 18, 2019
D3nn pushed a commit that referenced this issue Feb 21, 2019
@StevenACoffman
Contributor

StevenACoffman commented Apr 18, 2019

@mumoshu What are your thoughts on the nodelocaldns addon? https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/dns/nodelocaldns

@StevenACoffman
Contributor

StevenACoffman commented Apr 18, 2019

Ah, I am sorry, I missed that you already did this. Quoting your commit message below:

Add a new field named clusterDNS that accepts the IP address of the DNS server used for all internal/external DNS lookups, i.e. the --cluster-dns flag of kubelet.

nodeGroups:
- name: nodegroup1
  clusterDNS: 169.254.20.10
  # snip

This, in combination with k8s-dns-node-cache deployed as a daemonset on your cluster, allows all DNS lookups from your pods to be routed first to the node-local DNS server, which adds more reliability.

The configuration key clusterDNS is intentionally per-nodegroup, not per-cluster, so that you can use the node-local DNS selectively. This, in combination with proper use of node labels/taints, allows you to test the node-local DNS on only a subset of your workload.
It would also be nice to add clusterDNS as a cluster-level config key later, but I believe it isn't a must-have for this change.

See the cluster/addons/dns/nodelocaldns in the upstream repository for more details.

The concrete steps to enable node-local DNS look like the following:

  • Decide which IP address to use for binding the node-local DNS. Typically this is 169.254.20.10
  • Add clusterDNS: 169.254.20.10 to your nodegroup in the cluster config
  • Deploy nodelocaldns.yaml, replacing:
    __PILLAR__LOCAL__DNS__ with 169.254.20.10, __PILLAR__DNS__DOMAIN__ with cluster.local, __PILLAR__DNS__SERVER__ with 10.100.0.10 or 172.20.0.10 according to your VPC CIDR
    See local-up-cluster.sh:
  sed -i -e "s/__PILLAR__DNS__DOMAIN__/${KUBE_DNS_NAME:-cluster.local}/g" nodelocaldns.yaml
  sed -i -e "s/__PILLAR__DNS__SERVER__/${KUBE_DNS_SERVER_IP:-10.0.0.10}/g" nodelocaldns.yaml
  sed -i -e "s/__PILLAR__LOCAL__DNS__/${KUBE_LOCAL_DNS_IP:-169.254.20.10}/g" nodelocaldns.yaml

@StevenACoffman
Contributor

Ref: #550 (comment)

austinorth referenced this issue in kubernetes/kubernetes Oct 16, 2019
With this change, node-local-dns pods will mount the kube-dns configmap
and update the Corefile with the stubDomain config.
@ghostsquad

I apologize for resurrecting this thread, but the nodelocaldns.yaml file actually has 5 variables, and I'm confused by the difference between DNS__SERVER and CLUSTER__DNS:

__PILLAR__DNS__DOMAIN__ == cluster.local
__PILLAR__DNS__SERVER__ == ??
__PILLAR__LOCAL__DNS__ == 169.254.20.10
__PILLAR__CLUSTER__DNS__ == <ClusterIP of Kube/CoreDNS service, e.g 172.20.0.10>
__PILLAR__UPSTREAM__SERVERS__ == /etc/resolv.conf

@kam1kaze

@ghostsquad according to kubernetes/kubernetes#84383

We have the following variables in the yaml:
__PILLAR__DNS__SERVER__ - set to kube-dns service IP.
__PILLAR__LOCAL__DNS__ - set to the link-local IP (169.254.20.10 by default).
__PILLAR__DNS__DOMAIN__ - set to the cluster domain (cluster.local by default).

The following variables will be set by the node-cache images - k8s.gcr.io/k8s-dns-node-cache:1.15.6 or later.
The values will be determined by reading the kube-dns ConfigMap for custom upstream server configuration.
__PILLAR__CLUSTER__DNS__ - Upstream server for in-cluster queries.
__PILLAR__UPSTREAM__SERVERS__ - Upstream servers for external queries.
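
In other words, with the newer image only the first three variables are substituted by hand; the last two stay in the manifest for the image to fill in. A sketch for kube-proxy in iptables mode, assuming the usual EKS values:

```bash
# Substitute only the variables the node-cache image does not populate itself;
# __PILLAR__CLUSTER__DNS__ and __PILLAR__UPSTREAM__SERVERS__ are left as-is.
sed -i -e "s/__PILLAR__LOCAL__DNS__/169.254.20.10/g" \
       -e "s/__PILLAR__DNS__DOMAIN__/cluster.local/g" \
       -e "s/__PILLAR__DNS__SERVER__/172.20.0.10/g" nodelocaldns.yaml
```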

@prameshj

Also, we are making the listen IP address for the nodelocaldns cache both the kube-dns service IP and the link-local IP, so sending requests to either IP will get a response from the cache instance. If we use the kube-dns service IP as the listen IP for the cache, we need a different IP for the cache to talk to kube-dns/CoreDNS in case of cache misses. That is why we introduced "PILLAR__CLUSTER__DNS": a new service will be created with the same selectors as kube-dns, and the clusterIP of this service will be filled in as "PILLAR__CLUSTER__DNS".
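
For reference, a quick way to see that extra service once the manifest is applied (assuming it keeps the name used in the upstream manifest, `kube-dns-upstream`; that name is an assumption, not something stated in this thread):

```bash
# The ClusterIP of this service is what gets filled in for __PILLAR__CLUSTER__DNS__.
kubectl -n kube-system get svc kube-dns-upstream -o jsonpath='{.spec.clusterIP}'
```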

@prameshj

Using the kube-dns service IP as the listen IP for the cache will not work in IPVS clusters. This is because IPVS creates its own interface and binds all the service IPs there, so the node-local-dns interface will not be able to bind that IP again.

@stafot

stafot commented Jan 14, 2020

@prameshj

Also, we are making the listen IP address for the nodelocaldns cache both the kube-dns service IP and the link-local IP, so sending requests to either IP will get a response from the cache instance. If we use the kube-dns service IP as the listen IP for the cache, we need a different IP for the cache to talk to kube-dns/CoreDNS in case of cache misses. That is why we introduced "PILLAR__CLUSTER__DNS": a new service will be created with the same selectors as kube-dns, and the clusterIP of this service will be filled in as "PILLAR__CLUSTER__DNS".

Could you elaborate on this a bit more? Is it safe to set both of them (PILLAR__CLUSTER__DNS, PILLAR__UPSTREAM__SERVERS) to 172.20.0.10?

@prameshj

PILLAR__CLUSTER__DNS and PILLAR__UPSTREAM__SERVERS will be set by the node-cache image. It uses this ConfigMap YAML, substitutes these 2 variables, and generates the Corefile to use. We don't want both of these set to the kube-dns service IP. Is there a reason you want to set them to this value?

@stafot

stafot commented Jan 14, 2020

Thanks. No, I wasn't sure whether it could be left as is for my use case (AWS EKS), because the existing documentation is out of date and uses an older version of the nodelocaldns manifest, without the stub-domain implementation.

@bkruger99

@prameshj

PILLAR__CLUSTER__DNS, PILLAR__UPSTREAM__SERVERS will be set by the node-cache image. It uses this config map yaml, substitutes these 2 variables and generates the corefile to use. We don't want both these set to kube-dns service IP. Is there a reason you want to set them to this value ?

Unless using IPVS, correct? The docs say that if you use IPVS you need to set PILLAR__CLUSTER__DNS. Is that still the case?
(I'm also using EKS, but it's crashing with: [FATAL] Error parsing flags - Invalid localip specified - "", Exiting.) Thanks.

@prameshj

Could you reference the docs that say this? PILLAR__CLUSTER__DNS should be set to kube-dns service IP in IPVS mode.
https://kubernetes.io/docs/tasks/administer-cluster/nodelocaldns/#configuration does say to do this:

If kube-proxy is running in IPVS mode:

 sed -i "s/__PILLAR__LOCAL__DNS__/$localdns/g; s/__PILLAR__DNS__DOMAIN__/$domain/g; s/__PILLAR__DNS__SERVER__//g; s/__PILLAR__CLUSTER__DNS__/$kubedns/g" nodelocaldns.yaml
In this mode, node-local-dns pods listen only on <node-local-address>. 
