exclude neighbourhood pods from Nodes where scheduling is disabled #11

Closed
ghouscht opened this issue Dec 1, 2020 · 3 comments

ghouscht commented Dec 1, 2020

Problem statement

Currently kubenurse discovers all running neighbour Pods (see kubediscovery.go). If we perform maintenance on a Node, the kubenurse instance on that Node may be unreachable, which is not necessarily a problem. Graphs and metrics might nevertheless show errors or even trigger false alarms.

Proposal

Exclude kubenurse instances from Nodes where scheduling is disabled.
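
A minimal sketch of what such a filter could look like, not the actual kubenurse code. The `Neighbour` type, the `schedulableNeighbours` function, and the pod and node names are hypothetical and use plain Go types so the example stays self-contained. In the Kubernetes API, the information would come from `Pod.Spec.NodeName` and `Node.Spec.Unschedulable`.

```go
package main

import "fmt"

// Neighbour is a hypothetical, simplified stand-in for a discovered kubenurse pod.
type Neighbour struct {
	PodName  string
	NodeName string // would come from Pod.Spec.NodeName
}

// schedulableNeighbours returns only the neighbours whose node still allows scheduling.
// unschedulable maps a node name to true when scheduling is disabled on that node
// (assumed to be derived from Node.Spec.Unschedulable).
func schedulableNeighbours(all []Neighbour, unschedulable map[string]bool) []Neighbour {
	kept := make([]Neighbour, 0, len(all))
	for _, n := range all {
		if unschedulable[n.NodeName] {
			continue // node is cordoned: do not check this neighbour
		}
		kept = append(kept, n)
	}
	return kept
}

func main() {
	neighbours := []Neighbour{
		{PodName: "kubenurse-a", NodeName: "node-1"},
		{PodName: "kubenurse-b", NodeName: "node-2"},
	}
	cordoned := map[string]bool{"node-2": true}

	for _, n := range schedulableNeighbours(neighbours, cordoned) {
		fmt.Println(n.PodName)
	}
}
```

Nodes missing from the map are treated as schedulable, so an incomplete node list would lead to extra checks rather than silently dropped neighbours.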

Further enhancement

Disable checks entirely on a kubenurse instance if the node the instance runs on has scheduling disabled (to avoid possible service check errors for example).
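
One possible shape for this, as a rough sketch only: the `Check` type and `runChecks` function are hypothetical names, and how the instance learns that its own node is unschedulable (for example by reading `Node.Spec.Unschedulable` for the node it runs on) is assumed and left out.

```go
package main

import (
	"errors"
	"fmt"
)

// Check is a hypothetical named health check.
type Check struct {
	Name string
	Run  func() error
}

// runChecks executes every check and collects the results by name.
// If the local node is unschedulable, no check is run and skipped is true.
func runChecks(nodeUnschedulable bool, checks []Check) (results map[string]error, skipped bool) {
	if nodeUnschedulable {
		return nil, true
	}
	results = make(map[string]error, len(checks))
	for _, c := range checks {
		results[c.Name] = c.Run()
	}
	return results, false
}

func main() {
	checks := []Check{
		{Name: "service", Run: func() error { return errors.New("service unreachable") }},
	}

	// On a cordoned node the failing check is never executed.
	_, skipped := runChecks(true, checks)
	fmt.Println("skipped:", skipped)

	// On a schedulable node the check runs and its error is reported.
	results, _ := runChecks(false, checks)
	fmt.Println("service check error:", results["service"])
}
```

Reporting `skipped` separately from the results would let metrics distinguish "checks disabled" from "checks passed".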


ghouscht commented Dec 1, 2020

Because kubenurse runs as a DaemonSet, we can't evict its pods from Nodes that are not schedulable: https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/#taints-and-tolerations. This means we need to solve this in the discovery mechanism of kubenurse.

ghouscht closed this as completed Dec 9, 2020
ghouscht reopened this Dec 9, 2020

ghouscht commented Dec 9, 2020

Neighbour pods are now excluded with #13. Let's now find a way to disable checks if the node is unschedulable.

clementnuss commented

@ghouscht

As there wasn't much interaction on this issue, I think there is not enough interest in disabling checks when the node is unschedulable.

Also, given the default tolerations of the DaemonSet, when a node is NotReady and has the node.kubernetes.io/unreachable:NoExecute taint, the pod will be deleted.

If someone needs this feature in the future, feel free to reopen 🙃
