coredns pods fail to deploy if cluster is initialized without coredns #1328

Closed
brandond opened this issue Jan 21, 2020 · 13 comments
Labels
kind/bug Something isn't working

@brandond
Contributor

Version:
v1.17.1-rc1+k3s1
./k3s.sh server --no-deploy=coredns

Describe the bug
Adding nodes without the coredns manifest in place prevents coredns from being deployed later. This appears to be because the NodeHosts configmap key is only created or updated when nodes are added. If the coredns manifest is added after all nodes have joined, the NodeHosts entry will never be created and coredns will fail to start.
https://github.com/rancher/k3s/blob/master/pkg/node/controller.go#L40

To Reproduce

  1. install k3s server without coredns
  2. copy stock coredns manifest to /var/lib/rancher/k3s/server/manifests/

Expected behavior
coredns receives configuration and starts as usual

Actual behavior

E0120 18:01:29.201824   32534 nestedpendingoperations.go:270] Operation for "\"kubernetes.io/configmap/5eaea202-9251-42c4-b527-03a78086831b-config-volume\" (\"5eaea202-9251-42c4-b527-03a78086831b\")" failed. No retries permitted until 2020-01-20 18:03:31.201771173 -0800 PST m=+337.323654190 (durationBeforeRetry 2m2s). Error: "MountVolume.SetUp failed for volume \"config-volume\" (UniqueName: \"kubernetes.io/configmap/5eaea202-9251-42c4-b527-03a78086831b-config-volume\") pod \"coredns-d798c9dd-dtjnm\" (UID: \"5eaea202-9251-42c4-b527-03a78086831b\") : configmap references non-existent config key: NodeHosts"

Additional context
I was doing this to customize my coredns configuration, since the current k3s server overwrites files on startup.

@brandond
Contributor Author

brandond commented Jan 22, 2020

Won't #1345 make this worse? Now if I ask to skip coredns, the NodeHosts ConfigMap key will no longer be dynamically updated as I add nodes. Is there any way I can keep the existing NodeHosts behavior while still being able to customize the CoreDNS Corefile without it getting overwritten on startup?

@erikwilson
Contributor

The proper thing to do is probably restart k3s without the --no-deploy=coredns flag.

It sounds like the workflow for modifying the coredns manifest (or manifests in general) is a different issue.

@daniel198609

I am running into the same problem. How can it be solved?

@brandond
Contributor Author

brandond commented Mar 30, 2020

@daniel198609 if you're using the stock k3s coredns yaml, you need to set the NodeHosts configmap key manually. It should contain an /etc/hosts style list of IPs and hostnames for all k3s nodes:

apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  NodeHosts: |
    10.0.1.20 seago.khaus
    10.0.1.21 maersk.khaus
    10.0.1.22 sealand.khaus
  Corefile: |
    .:53 {
        errors
        health
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
          pods insecure
          upstream
          fallthrough in-addr.arpa ip6.arpa
        }
        hosts /etc/coredns/NodeHosts {
          ttl 60
          reload 15s
          fallthrough
        }
        prometheus :9153
        forward . 10.0.1.1:53
        cache 30
        loop
        reload
        loadbalance
    }
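
Writing the NodeHosts block by hand gets tedious as the cluster grows. The following is a minimal sketch, not something k3s ships, of how the same `/etc/hosts` style text could be derived from the output of `kubectl get nodes -o json`. The function name `render_node_hosts` is hypothetical, and the sketch assumes each node reports an `InternalIP` address and usually a `Hostname` address in `status.addresses`.

```python
def render_node_hosts(node_list: dict) -> str:
    """Build /etc/hosts style text from a parsed `kubectl get nodes -o json` document.

    Assumption: each node lists an InternalIP in status.addresses. Nodes
    without one are skipped, and the node name is used when no Hostname
    address is reported.
    """
    lines = []
    for node in node_list.get("items", []):
        addresses = node.get("status", {}).get("addresses", [])
        ip = next((a["address"] for a in addresses if a["type"] == "InternalIP"), None)
        host = next((a["address"] for a in addresses if a["type"] == "Hostname"), None)
        if ip is None:
            continue  # nothing to map for this node
        lines.append(f"{ip} {host or node['metadata']['name']}")
    # One entry per line with a trailing newline, like a hosts file.
    return "".join(line + "\n" for line in lines)
```

The returned string can be pasted under `NodeHosts: |` in the ConfigMap above. It has to be regenerated whenever nodes are added or removed.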

@m4rcu5

m4rcu5 commented Apr 24, 2020

I think I am hitting the issue as well.

I disabled the coredns deployment using --no-deploy=coredns, copied the stock coredns.yml file, substituted the %{CLUSTER_DOMAIN}% variables, and, the important change for me, switched to a DaemonSet.
With this setup my containers won't start; they get stuck at:

MountVolume.SetUp failed for volume "config-volume" : configmap references non-existent config key: NodeHosts

If I re-enable the default coredns deployment, I end up having a deployment and my daemonset working, which is somewhat of a workaround.

@brandond
Contributor Author

brandond commented Apr 24, 2020

@m4rcu5 see the comment directly above yours. You have to include a NodeHosts entry in the configmap, which doesn't exist in the on-disk manifest since it's created and updated on-demand by k3s when nodes are added to or removed from the cluster.
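
To see why the mount fails, it can help to compare the keys a pod's ConfigMap volume asks for against the keys the ConfigMap actually holds. The sketch below is only an illustration of that check; `missing_configmap_keys` is a hypothetical helper, and it assumes the coredns volume lists both `Corefile` and `NodeHosts` under `items`, as the error message suggests.

```python
def missing_configmap_keys(volume: dict, configmap: dict) -> list:
    """Return the keys requested by a configMap volume that the ConfigMap lacks.

    `volume` is one entry of a pod spec's `volumes` list and `configmap` is
    the parsed ConfigMap object. A volume marked `optional: true` tolerates
    missing keys, so it is reported as having none missing.
    """
    source = volume.get("configMap", {})
    if source.get("optional", False):
        return []
    present = set(configmap.get("data") or {}) | set(configmap.get("binaryData") or {})
    return [item["key"] for item in source.get("items", []) if item["key"] not in present]
```

For a ConfigMap whose `data` contains only `Corefile`, a volume that requests `Corefile` and `NodeHosts` yields `["NodeHosts"]`, which matches the key named in the MountVolume.SetUp error.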

@m4rcu5

m4rcu5 commented Apr 25, 2020

@brandond I was hoping for a more integrated way, so we do not have to change the configmap by hand when adding or removing hosts.
Are there any plans to allow customizing (maybe with Kustomize) the manifests deployed by K3s?

@ekeih

ekeih commented May 1, 2021

I am currently replacing the packaged coredns with my own copy of the manifest to increase the replicas and add an anti-affinity.

I am wondering what those node DNS records are used for? I found out that @erikwilson added this two years ago in 31cf2bc but I was unable to figure out if this is just nice to have or if k3s relies on this in some way?
For now I just removed this in my setup and so far I don't see any issues, but this doesn't mean it won't cause some in the future I guess.

@erikwilson @brandond maybe one of you can provide some insights if those DNS records are strictly required or if we can just drop them in custom setups?

Thank you in advance!

@brandond
Contributor Author

brandond commented May 1, 2021

It makes it so that nodes can be accessed by hostname from within the cluster, even if the environment they are deployed into does not have dns or lacks dns entries for their hostnames. Many things assume the node hostnames are resolvable, and will break if they are not.

@ekeih

ekeih commented May 1, 2021

Thanks for your fast reply :)
Do you have an example of what would break? Do you mean internal k3s things or other services in general?

@brandond
Contributor Author

brandond commented May 1, 2021

I can't remember off the top of my head. I think metrics-server won't work? Probably other things not bundled with k3s that interact with nodes would not work either, as functioning DNS is a requirement for Kubernetes.

@ekeih

ekeih commented May 1, 2021

Ok, thank you for your help! :) I will test this a bit further in my cluster and then probably look for a way to update the configmap accordingly.
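
One possible way to update the configmap without touching a customized Corefile is a JSON merge patch that sets only the `NodeHosts` key. This is a sketch under that assumption, not a k3s feature; `node_hosts_patch` is a hypothetical helper name.

```python
import json


def node_hosts_patch(hosts_text: str) -> str:
    """Return a JSON merge patch body that replaces only data.NodeHosts.

    With a merge patch, keys that are not mentioned (such as Corefile)
    are left unchanged on the existing ConfigMap.
    """
    return json.dumps({"data": {"NodeHosts": hosts_text}})
```

The resulting string could be passed to `kubectl -n kube-system patch configmap coredns --type merge -p "$PATCH"`, for example from a cron job or a small controller that reruns whenever the node list changes.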

@caroline-suse-rancher
Contributor

Closing due to age
