Bind coredns containers to linux nodes to avoid Windows scheduling #69940
/kind bug
/sig Windows
This is a micro-improvement to Windows clusters at startup time.
@chrisohaver @rajansandeep do you mind taking a look? Thanks.
Is CoreDNS the only core k8s component that can't run on Windows?
I don't have sufficient expertise (yet) to comment on that. I just happened to get bitten by this one when setting a cluster up on a slightly older environment in Azure (where a combination of [separately filed] bugs in the Docker EE + Azure CNI stacks led to the container getting 'stuck' on a particular node), leaving the cluster without DNS.
Makes sense to me. CoreDNS can't run on windows nodes currently.
The change should be proposed in other places too...
- kubernetes/kubeadm (has its own copy of the coredns manifests)
- coredns/deployment (intended to be the authoritative k8s deployment default configuration)
@chrisohaver I'll update them in other install tools and in
/retest
Test failures related to pre-existing flaky test: #66542
/approve
/assign @luxas
/hold
@@ -237,6 +237,8 @@ spec:
         operator: Exists
       - key: {{ .MasterTaintKey }}
         effect: NoSchedule
+      nodeSelector:
+        beta.kubernetes.io/os: linux
Where are the docs that outline that this is how we plan to do host-OS steering? It's weird to use labels for things that need to be enforced.
The os node labels are automatically injected.
But if you're suggesting this should be less manual, that's a fair point. I'd love to see the scheduler inspect the relevant images and dynamically impose these constraints. In the absence of such a feature, though, this seems like a reasonable place to start.
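For readers less familiar with how a nodeSelector constrains placement, here is a minimal sketch of the matching rule: every key/value pair in the selector must appear in the node's labels. This is a simplified stand-in written for illustration, not the scheduler's actual code; `matchesNodeSelector` and the sample label sets are hypothetical.

```go
package main

import "fmt"

// matchesNodeSelector reports whether a node's labels satisfy a pod's
// nodeSelector: every selector key must be present on the node with an
// identical value. An empty selector matches any node.
// (Hypothetical helper for illustration; not a Kubernetes API.)
func matchesNodeSelector(nodeLabels, selector map[string]string) bool {
	for key, want := range selector {
		if got, ok := nodeLabels[key]; !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	// The selector added by this PR.
	selector := map[string]string{"beta.kubernetes.io/os": "linux"}

	// Hypothetical node label sets; real nodes carry many more labels.
	linuxNode := map[string]string{"beta.kubernetes.io/os": "linux"}
	windowsNode := map[string]string{"beta.kubernetes.io/os": "windows"}

	fmt.Println(matchesNodeSelector(linuxNode, selector))   // true
	fmt.Println(matchesNodeSelector(windowsNode, selector)) // false
}
```

Under this rule, a node that lacks the OS label entirely would also be rejected, which is why the selector depends on the label being injected automatically.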
probably best to have an issue in the k/kubeadm repo about this and add a TODO comment in the code if this change is to make it in.
@brendanburns - ping! What's the policy you are using for denoting windows nodes? and is there a KEP somewhere?
/cc @jbeda
To clarify, CoreDNS itself works fine on Windows. But we don't compile a Windows executable into the CoreDNS Docker manifest (AFAIK neither does kube-dns, so we probably want to add the node selector there too). As I understand it, there are some hurdles in generally getting K8s cluster DNS on Windows (not CoreDNS specific). A significant one is that Windows systems don't have an
There is a new WG in k8s that is currently discussing how multi-platform images are to be built and tagged.
IMO, if it makes sense to add this node selector for the CoreDNS deployment manifests, it should also be added to the kube-dns deployment manifests. AFAIK, both have the same issue.
I'm taking off approval; we don't have well-defined policies on heterogeneous clustering that I'm aware of.
There are no CoreDNS pods based on windowsservercore container images, so I do think we need a way to schedule to the right OS. The current thoughts are captured here: and I'm sure it will be discussed again in future SIG-Windows meetings.
What is preventing the workaround of updating the deployment with amended node selector toleration? Making a node selector OS-bound makes sense only as a temporary measure.
@daschott This requires a KEP and affects multiple SIGs.
cc @PatrickLang for our RuntimePolicy discussion
@timothysc What should the KEP be titled? I'm trying to understand the scope of what you're looking for.
Going back to @chrisohaver's comment.
It feels to me that we might want to get a DNS solution working on Windows eventually.
DNS works on Windows pods and is a function of the CNI doing the right thing. CoreDNS is used as a Kubernetes service. Windows pods can simply do DNS lookups using that service IP, and the request will be routed to a Linux node running a matching pod for that service. Because CoreDNS doesn't have a Windows container image published, it needs the node selector.
cc @yujuhong I discussed with SIG-Node today, and they don't have any concern with this as a point in time solution. NodeSelectors are widely used, and the labels needed in this PR are stable as of v1.14 #73048. Long term, we could remove this node selector if one of the following is done
- CoreDNS is published as a multi-arch, multi-OS image
- Scheduler is updated to automatically pick nodes based on OS/arch (or other criteria) without NodeSelector. This could happen on its own, or if RuntimeClass is updated to factor in os/arch
SGTM,
will leave the approval to @timothysc
/approve
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: MarcPow, MrHohn, timothysc
@timothysc could you please remove the hold so that it can be merged in?
/hold cancel
/test pull-kubernetes-godeps
@MarcPow: The following test failed, say
What this PR does / why we need it:
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #69773
Special notes for your reviewer:
This prevents scheduling attempts that are sure to fail on Windows agents. Given the current set of Windows issues, scheduling attempts may sometimes hang and prevent DNS from coming up cleanly in a cluster.
Release note: