TaintNodesByCondition not working as expected #53804

Closed
jamiehannaford opened this Issue Oct 12, 2017 · 11 comments

jamiehannaford commented Oct 12, 2017

What happened:

When using the new --feature-gates=TaintNodesByCondition=true feature added in v1.8, pods still cannot be scheduled onto NotReady nodes, even when they tolerate the corresponding taints. Steps to recreate:

  1. Set --feature-gates=TaintNodesByCondition=true on apiserver, scheduler and controller manager

  2. Have a 1-master, 1-node cluster, neither of which has CNI set up. This means that each should be NotReady and its NetworkUnavailable condition should be True, like so:

    $ kubectl get nodes
    NAME       STATUS     ROLES     AGE       VERSION
    c1         NotReady   master    15m       v1.8.0
    worker-1   NotReady   <none>    15m       v1.8.0
    
    $ kubectl get nodes -o jsonpath='{range .items[*]}{@.metadata.name}{"\n"}{range @.status.conditions[*]}{@.type}={@.status}{"\n"}{end}{end}'
    c1
    OutOfDisk=False
    MemoryPressure=False
    DiskPressure=False
    Ready=False
    worker-1
    OutOfDisk=False
    MemoryPressure=False
    DiskPressure=False
    Ready=False
  3. Deploy a Pod with the following manifest:

kind: Pod
apiVersion: v1
metadata:
  name: test
spec:
  tolerations:
  - key: node.kubernetes.io/network-unavailable
    operator: Exists
    effect: NoSchedule
  - key: node.kubernetes.io/notReady
    operator: Exists
    effect: NoSchedule
  containers:
  - name: nginx
    image: nginx

The Pod is not scheduled, though:

Name:         test
Namespace:    default
Node:         <none>
Labels:       <none>
Annotations:  <none>
Status:       Pending
IP:
Containers:
  nginx:
    Image:        nginx
    Port:         <none>
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-jrrr7 (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  default-token-jrrr7:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-jrrr7
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.alpha.kubernetes.io/notReady:NoExecute for 300s
                 node.alpha.kubernetes.io/unreachable:NoExecute for 300s
                 node.kubernetes.io/network-unavailable:NoSchedule
                 node.kubernetes.io/notReady:NoSchedule
Events:
  Type     Reason            Age                 From               Message
  ----     ------            ----                ----               -------
  Warning  FailedScheduling  30s (x87 over 25m)  default-scheduler  No nodes are available that match all of the predicates: NodeNotReady (2), PodToleratesNodeTaints (1).
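Side note: a quick way to check whether the condition taints were actually added to the node spec (the commands below are my own diagnostic, not part of the original repro steps):

$ kubectl get nodes -o jsonpath='{range .items[*]}{@.metadata.name}{"\n"}{range @.spec.taints[*]}{@.key}:{@.effect}{"\n"}{end}{end}'
$ kubectl describe node worker-1 | grep -i taints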

What you expected to happen:

A few things:

  1. The Pod to be scheduled onto worker-1. Maybe I'm using this feature incorrectly, but my assumption was that NotReady would no longer block scheduling on its own once the more granular, condition-based taints are used and tolerated.

  2. The node.kubernetes.io/network-unavailable condition to appear when I run either kubectl get nodes or kubectl describe node (see the sketch below). This is probably orthogonal to this issue.
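For illustration, this is roughly what I expected kubectl describe node to report for the worker once the feature gate was active; the taint keys are taken from the tolerations in the manifest above, and this is a sketch rather than actual output from my cluster:

$ kubectl describe node worker-1 | grep -A1 -i taints
Taints:  node.kubernetes.io/notReady:NoSchedule
         node.kubernetes.io/network-unavailable:NoSchedule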

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.0", GitCommit:"6e937839ac04a38cac63e6a7a306c5d035fe7b0a", GitTreeState:"clean", BuildDate:"2017-09-28T22:57:57Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.1", GitCommit:"f38e43b221d08850172a9a4ea785a86a3ffa3b3a", GitTreeState:"clean", BuildDate:"2017-10-11T23:16:41Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
  • OS (e.g. from /etc/os-release):
NAME="Ubuntu"
VERSION="16.04.3 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.3 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial
  • Kernel (e.g. uname -a): Linux c1 4.4.0-93-generic #116-Ubuntu SMP Fri Aug 11 21:17:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools: kubeadm

/cc @k82cn @kubernetes/sig-scheduling-bugs

k82cn commented Oct 12, 2017

@jamiehannaford, I think we need to cherry-pick #52723.

jamiehannaford commented Oct 12, 2017

@k82cn Cool, okay. Have these test failures been fixed btw?

@jdumars @jpbetz Would it be possible to cherry pick #52723 into v1.8.2?

k82cn commented Oct 12, 2017

> Have these test failures been fixed btw?

Yes, I think we need to cherry-pick #53480, #52723 and #53184, including the integration test. I can create a cherry-pick PR for review/approval.

liggitt commented Oct 12, 2017

> @jdumars @jpbetz Would it be possible to cherry pick #52723 into v1.8.2?

In my opinion, we shouldn't be picking alpha feature bugfixes to release branches.

jamiehannaford commented Oct 12, 2017

Hmm, okay. Why is that? Also, is there a formalised policy on what can/should be cherry-picked? Having granular taints would make self-hosting work a lot better in kubeadm (which is one of our key milestones for v1.9), so it'd be great if we don't have to wait until the end of the dev cycle.

Related issues: #45717 and #45665.

/cc @timothysc @luxas

liggitt commented Oct 12, 2017

> In my opinion, we shouldn't be picking alpha feature bugfixes to release branches.

Let me qualify that slightly. Changes in a release branch are always about balancing risk against benefit. If a cherry pick only benefits an alpha feature, but carries any risk to supported features (like refactoring the scheduler start-up flow), I don't think that is a good trade. If a cherry pick has significant benefits to an alpha feature, and is small and completely isolated, that seems more reasonable to consider (though I still wouldn't expect it to be the norm).

> Having granular taints would make self-hosting work a lot better in kubeadm (which is one of our key milestones for v1.9), so it'd be great if we don't have to wait until the end of the dev cycle.

Why does kubeadm have to wait until the end of the dev cycle?

luxas commented Oct 14, 2017

I agree with @liggitt that if this feature just doesn't work in v1.8 and the amount of changes that would have to be backported to v1.8 is substantial, the risk is too high. The feature is just broken then :(

Instead, let's make it rock-solid on the master branch -- v1.9 is not far away in any case.

> Why does kubeadm have to wait until the end of the dev cycle?

I don't understand that either, but the sooner the feature is stable and working at HEAD, the better.

jamiehannaford commented Oct 14, 2017

Forgot to comment earlier, but yeah, @liggitt's points make a lot of sense and I agree. I mistakenly assumed that we could only code against releases, but obviously if it exists in the master branch, we can work on features that rely on it side by side.

k82cn commented Oct 14, 2017

So we will not cherry-pick them :)

luxas commented Oct 27, 2017

@k82cn @jamiehannaford Does this work as it should now at HEAD? If so, we should close this.

jamiehannaford commented Oct 27, 2017

Yep, I got this working on HEAD.

k82cn referenced this issue on Apr 19, 2018: Upgrade TaintNodesByCondition to Beta #62109 (Open, 13 of 15 tasks complete)