Bind coredns containers to linux nodes to avoid Windows scheduling #69940

Merged
k8s-ci-robot merged 1 commit into kubernetes:master on Jan 23, 2019

Conversation

@MarcPow
Contributor

MarcPow commented Oct 17, 2018

What this PR does / why we need it:

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #69773

Special notes for your reviewer:

This prevents scheduling attempts that are sure to fail on Windows agents. Given the current set of Windows issues, scheduling attempts may sometimes hang and prevent DNS from coming up cleanly in a cluster.

Release note:

CoreDNS is only officially supported on Linux at this time. As such, when kubeadm is used to deploy this component into your Kubernetes cluster, it will be restricted (using nodeSelectors) to run only on nodes with that operating system. This ensures that in clusters which include Windows nodes, the scheduler will never attempt to place CoreDNS pods on these machines, reducing setup latency and enhancing initial cluster stability.
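
For readers unfamiliar with nodeSelectors, the restriction described in the release note can be sketched as a Deployment fragment. This is an illustration only, not the exact kubeadm manifest: the replica count and image tag are assumptions, and only the `nodeSelector` entry is taken from this PR.

```yaml
# Illustrative fragment only; not the exact manifest shipped by kubeadm.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: coredns
  namespace: kube-system
spec:
  replicas: 2                          # assumed value
  selector:
    matchLabels:
      k8s-app: kube-dns
  template:
    metadata:
      labels:
        k8s-app: kube-dns
    spec:
      # The scheduler only considers nodes whose labels contain this
      # exact key/value pair, so Windows nodes are never candidates.
      nodeSelector:
        beta.kubernetes.io/os: linux
      containers:
      - name: coredns
        image: coredns/coredns:1.2.6   # assumed tag, for illustration
```

With this in place, a pod that cannot be placed on any Linux node stays Pending rather than being attempted on a Windows node.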
@MarcPow

Contributor Author

MarcPow commented Oct 18, 2018

/kind bug

@k8s-ci-robot k8s-ci-robot added kind/bug and removed needs-kind labels Oct 18, 2018

@MarcPow

Contributor Author

MarcPow commented Oct 18, 2018

/sig Windows

This is a micro-improvement to Windows clusters at startup time.

@MrHohn

Member

MrHohn commented Oct 18, 2018

@chrisohaver @rajansandeep do you mind taking a look? Thanks.

@chrisohaver

Contributor

chrisohaver commented Oct 18, 2018

Is CoreDNS the only core k8s component that can't run on Windows?

@MarcPow

Contributor Author

MarcPow commented Oct 18, 2018

Is CoreDNS the only core k8s component that can't run on Windows?

I don't have sufficient expertise (yet) to comment on that. I just happened to get bitten by this one when setting up a cluster on a slightly older environment in Azure, where a combination of (separately filed) bugs in the Docker EE + Azure CNI stacks led to the container getting 'stuck' on a particular node, leaving the cluster without DNS.

@chrisohaver
Contributor

chrisohaver left a comment

Makes sense to me. CoreDNS can't run on Windows nodes currently.

The change should be proposed in other places too...

  • kubernetes/kubeadm (has its own copy of the coredns manifests)
  • coredns/deployment (intended to be the authoritative k8s deployment default configuration)
@rajansandeep

Member

rajansandeep commented Oct 19, 2018

The change should be proposed in other places too...

@chrisohaver I'll update them in other install tools and in coredns/deployment
/ok-to-test

@MarcPow

Contributor Author

MarcPow commented Oct 21, 2018

/retest

@MarcPow

Contributor Author

MarcPow commented Oct 21, 2018

Test failures related to pre-existing flaky test: #66542

@MrHohn

Member

MrHohn commented Oct 22, 2018

/approve

@MarcPow

Contributor Author

MarcPow commented Oct 22, 2018

/assign @luxas
/assign @fabriziopandini

@timothysc

Member

timothysc commented Nov 14, 2018

/hold

@@ -237,6 +237,8 @@ spec:
         operator: Exists
       - key: {{ .MasterTaintKey }}
         effect: NoSchedule
+      nodeSelector:
+        beta.kubernetes.io/os: linux

@timothysc

timothysc Nov 14, 2018

Member

Where are the docs that outline this is how we are planning to do host OS steering? It's weird to use labels for things that need to be enforced.

@MarcPow

MarcPow Nov 14, 2018

Author Contributor

The os node labels are automatically injected.

But if you're suggesting this should be less manual, that's a fair point - I'd love to see the scheduler inspect the relevant images and dynamically impose these constraints. But in the absence of such a feature, this seems like a reasonable place to start.
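
As a rough illustration of the automatically injected labels mentioned above: the kubelet labels its own Node object with the operating system and architecture it runs on. A sketch of what two nodes in a mixed cluster might look like follows; the node names are hypothetical, and real nodes carry additional labels.

```yaml
# Excerpt of Node metadata; node names are hypothetical placeholders.
apiVersion: v1
kind: Node
metadata:
  name: linux-node-1
  labels:
    beta.kubernetes.io/os: linux       # set by the kubelet on this node
    beta.kubernetes.io/arch: amd64
---
apiVersion: v1
kind: Node
metadata:
  name: windows-node-1
  labels:
    beta.kubernetes.io/os: windows     # never matches "os: linux"
    beta.kubernetes.io/arch: amd64
```

A nodeSelector of `beta.kubernetes.io/os: linux` matches only the first node here, with no manual labeling step by the cluster operator.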

@neolit123

neolit123 Nov 14, 2018

Member

probably best to have an issue in the k/kubeadm repo about this and add a TODO comment in the code if this change is to make it in.

@timothysc

timothysc Nov 14, 2018

Member

@brendanburns - ping! What's the policy you are using for denoting Windows nodes? And is there a KEP somewhere?

/cc @jbeda

@chrisohaver

Contributor

chrisohaver commented Nov 14, 2018

CoreDNS does not work on Windows yet

To clarify, CoreDNS itself works fine on Windows. But we don't compile a Windows executable into the CoreDNS docker manifest (AFAIK neither does kube-dns, so we probably want to add the node selector there too).

As I understand it, there are some hurdles in generally getting K8s cluster DNS on Windows (not CoreDNS specific). A significant one is that Windows systems don't have an /etc/resolv.conf, and kubelet is geared toward assuming all pods do. So there needs to be some way of informing Windows Pods of the DNS policy.

@neolit123

Member

neolit123 commented Nov 14, 2018

To clarify, CoreDNS itself works fine on Windows.

there is a new WG in k8s that is currently discussing how multi-platform images are to be built and tagged.

@k8s-ci-robot k8s-ci-robot requested a review from jbeda Nov 14, 2018

@chrisohaver

Contributor

chrisohaver commented Nov 14, 2018

IMO, if it makes sense to add this node selector for the CoreDNS deployment manifests, it should also be added to the kube-dns deployment manifests. AFAIK, both have the same issue.

@timothysc timothysc removed the approved label Nov 14, 2018

@timothysc

Member

timothysc commented Nov 14, 2018

I'm taking off approval; we don't have well-defined policies on heterogeneous clustering that I'm aware of.

@daschott


daschott commented Nov 14, 2018

There are no CoreDNS pods based on windowsservercore container images, so I do think we need a way to schedule to the right OS. The current thoughts are captured here:
https://docs.google.com/document/d/1XLs8Mbz1-xOIiDW9XSSuhx9fshpxJM1NDD1a0oVbzfc/edit

and I'm sure it will be discussed again in future SIG-Windows meetings.

@neolit123

Member

neolit123 commented Nov 14, 2018

what is preventing the workaround of updating the deployment with amended node selector toleration?
in kubeadm we had reports that CNCF clusters work with this approach, but it was a slightly different case.

making a node selector OS bound makes sense only as a temporary measure.
cc @kubernetes/sig-windows-misc

@timothysc

Member

timothysc commented Nov 14, 2018

@daschott This requires a KEP and affects multiple SIGs.
There needs to be:

  • KEP outlining long term strategy
  • Buy-in from both sig-cluster-lifecycle and sig-node
  • Test automation!!!!
  • etc.
@michmike


michmike commented Nov 27, 2018

cc @PatrickLang for our RuntimePolicy discussion

@PatrickLang

Contributor

PatrickLang commented Jan 11, 2019

@timothysc What should the KEP be titled? I'm trying to understand the scope of what you're looking for.

@neolit123

Member

neolit123 commented Jan 11, 2019

going back to @chrisohaver's comment.

As I understand it, there are some hurdles in generally getting K8s cluster DNS on Windows (not CoreDNS specific). A significant one is that Windows systems don't have an /etc/resolv.conf, and kubelet is geared toward assuming all pods do. So there needs to be some way of informing Windows Pods of the DNS policy.

it feels to me that we might want to get a DNS solution working on Windows eventually.
node selectors seem like a temporary solution.

@PatrickLang

Contributor

PatrickLang commented Jan 11, 2019

DNS works on Windows pods and is a function of CNI doing the right thing.

CoreDNS is used as a Kubernetes service. Windows pods can simply do DNS lookups using that service IP, and the request will route to a Linux node running a matching pod for that service. Because CoreDNS doesn't have a Windows container published, it needs the nodeSelector.
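
To make the routing described above concrete, here is a sketch of the kind of Service that fronts the cluster DNS pods. The `clusterIP` value is an assumption (it depends on the cluster's service CIDR), and the fragment is illustrative rather than the exact kubeadm manifest.

```yaml
# Illustrative Service fragment; the clusterIP below is an assumed value.
apiVersion: v1
kind: Service
metadata:
  name: kube-dns
  namespace: kube-system
spec:
  clusterIP: 10.96.0.10      # assumption; varies with the service CIDR
  # Traffic to the service IP is forwarded to pods carrying this label,
  # wherever they run. With the nodeSelector, that is always a Linux node.
  selector:
    k8s-app: kube-dns
  ports:
  - name: dns
    port: 53
    protocol: UDP
  - name: dns-tcp
    port: 53
    protocol: TCP
```

A pod on a Windows node resolves names by querying the service IP; it does not need a DNS server pod on its own node.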

@PatrickLang PatrickLang added this to In Review in SIG-Windows Jan 11, 2019

@PatrickLang

Contributor

PatrickLang commented Jan 22, 2019

cc @yujuhong I discussed with SIG-Node today, and they don't have any concern with this as a point-in-time solution. NodeSelectors are widely used, and the labels needed in this PR are stable as of v1.14 #73048.

Long term, we could remove this node selector if one of the following is done:

  1. CoreDNS is published as a multi-arch, multi-OS image.
  2. Scheduler is updated to automatically pick nodes based on OS/arch (or other criteria) without NodeSelector. This could happen on its own, or if RuntimeClass is updated to factor in OS/arch.
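
On the point about label stability: assuming the stable labels referred to above are the un-prefixed `kubernetes.io/os` and `kubernetes.io/arch` forms (an assumption on the editor's part; the diff merged in this PR uses the `beta.` prefix), the equivalent selector on a v1.14 or later cluster might look like this:

```yaml
# Sketch only: assumes the node carries the un-prefixed OS label.
# The manifest merged in this PR uses beta.kubernetes.io/os instead.
spec:
  template:
    spec:
      nodeSelector:
        kubernetes.io/os: linux
```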
@neolit123
Member

neolit123 left a comment

I discussed with SIG-Node today, and they don't have any concern with this as a point-in-time solution. NodeSelectors are widely used, and the labels needed in this PR are stable as of v1.14 #73048.

Long term, we could remove this node selector if one of the following is done:

  1. CoreDNS is published as a multi-arch, multi-OS image.
  2. Scheduler is updated to automatically pick nodes based on OS/arch (or other criteria) without NodeSelector. This could happen on its own, or if RuntimeClass is updated to factor in OS/arch.

SGTM,
will leave the approval to @timothysc

@timothysc
Member

timothysc left a comment

/approve
/lgtm

@k8s-ci-robot

Contributor

k8s-ci-robot commented Jan 23, 2019

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: MarcPow, MrHohn, timothysc

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@michmike


michmike commented Jan 23, 2019

@timothysc could you please remove the hold so that it can be merged in?

@neolit123

Member

neolit123 commented Jan 23, 2019

/hold cancel

@PatrickLang

Contributor

PatrickLang commented Jan 23, 2019

/test pull-kubernetes-godeps

@k8s-ci-robot k8s-ci-robot merged commit baaaa15 into kubernetes:master Jan 23, 2019

15 of 19 checks passed

pull-kubernetes-e2e-gce: Job triggered.
pull-kubernetes-e2e-gce-100-performance: Job triggered.
pull-kubernetes-e2e-kops-aws: Job triggered.
pull-kubernetes-kubemark-e2e-gce-big: Job triggered.
cla/linuxfoundation: MarcPow authorized
pull-kubernetes-bazel-build: Job succeeded.
pull-kubernetes-bazel-test: Job succeeded.
pull-kubernetes-cross: Skipped
pull-kubernetes-e2e-gce-device-plugin-gpu: Job succeeded.
pull-kubernetes-e2e-gke: Job succeeded.
pull-kubernetes-e2e-kubeadm-gce: Skipped
pull-kubernetes-godeps: Job succeeded.
pull-kubernetes-integration: Job succeeded.
pull-kubernetes-local-e2e: Skipped
pull-kubernetes-local-e2e-containerized: Skipped
pull-kubernetes-node-e2e: Job succeeded.
pull-kubernetes-typecheck: Job succeeded.
pull-kubernetes-verify: Job succeeded.
tide: In merge pool.

SIG-Windows automation moved this from In Review to Done (v.1.14) Jan 23, 2019

@k8s-ci-robot

Contributor

k8s-ci-robot commented Jan 23, 2019

@MarcPow: The following test failed, say /retest to rerun them all:

Test name | Commit | Details | Rerun command
pull-kubernetes-e2e-kops-aws | eb818f9 | link | /test pull-kubernetes-e2e-kops-aws

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
