
drain failed #251

Closed
obeyler opened this Issue Sep 17, 2018 · 5 comments

@obeyler

obeyler commented Sep 17, 2018

What happened:
Drain of some worker nodes failed during a BOSH update deployment.

https://github.com/cloudfoundry-incubator/kubo-release/blob/master/jobs/kubelet/templates/bin/drain.erb#26

During drain, the following command is launched:
kubectl drain 'vm-06ea0261-c8f5-410b-99d2-cf93a2d6fd76 vm-a0708eae-26fb-4bc9-8adb-dae5012563c1' --grace-period 10 --force --delete-local-data --ignore-daemonsets

with this result:
Error from server (NotFound): nodes "vm-06ea0261-c8f5-410b-99d2-cf93a2d6fd76 vm-a0708eae-26fb-4bc9-8adb-dae5012563c1" not found

because some nodes share the same IP (see 192.168.245.204). I don't know why this occurred.

kubectl get nodes -o wide -L spec.ip

NAME                                      STATUS                     ROLES     AGE       VERSION   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME     SPEC.IP
vm-06ea0261-c8f5-410b-99d2-cf93a2d6fd76   Ready,SchedulingDisabled   <none>    2d        v1.11.2   <none>        Ubuntu 14.04.5 LTS   4.4.0-134-generic   docker://17.12.1-ce   192.168.245.204
vm-3f527920-636a-4b7e-ab2a-28596a8f019b   Ready                      <none>    4d        v1.11.2   <none>        Ubuntu 14.04.5 LTS   4.4.0-131-generic   docker://17.12.1-ce   192.168.245.210
vm-534bea32-27ab-44f4-81c3-3b58f55e05ef   Ready                      <none>    4d        v1.11.2   <none>        Ubuntu 14.04.5 LTS   4.4.0-131-generic   docker://17.12.1-ce   192.168.245.211
vm-6a38f5ca-a84c-45d7-9b68-12e5933297bb   Ready                      <none>    4d        v1.11.2   <none>        Ubuntu 14.04.5 LTS   4.4.0-131-generic   docker://17.12.1-ce   192.168.245.206
vm-a0708eae-26fb-4bc9-8adb-dae5012563c1   Ready,SchedulingDisabled   <none>    4d        v1.11.2   <none>        Ubuntu 14.04.5 LTS   4.4.0-131-generic   docker://17.12.1-ce   192.168.245.204
vm-d037e4e8-b115-4fad-8143-7e8186f4fd8c   Ready                      <none>    4d        v1.11.2   <none>        Ubuntu 14.04.5 LTS   4.4.0-131-generic   docker://17.12.1-ce   192.168.245.208
vm-e7afe971-69d4-40dd-8be5-d843891bb793   Ready                      <none>    4d        v1.11.2   <none>        Ubuntu 14.04.5 LTS   4.4.0-131-generic   docker://17.12.1-ce   192.168.245.209

What you expected to happen:
Maybe the drain script should detect that two nodes share the same IP and issue a separate drain command for each node.
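A minimal, runnable sketch of the quoting problem and the per-node fix (node names taken from the output above; the kubectl invocation is only echoed so the sketch does not require a cluster, and the exact variable names are hypothetical, not from drain.erb):

```shell
#!/bin/sh
# Assumption: the drain script ends up with BOTH node names that resolve to
# the shared IP in one whitespace-separated string.
node_names="vm-06ea0261-c8f5-410b-99d2-cf93a2d6fd76 vm-a0708eae-26fb-4bc9-8adb-dae5012563c1"

# Broken: quoting the whole string passes both names as ONE argument, so the
# API server looks up a node literally named "vm-06ea... vm-a070..." and
# replies: Error from server (NotFound): nodes "..." not found
#   kubectl drain "${node_names}" --grace-period 10 --force \
#     --delete-local-data --ignore-daemonsets

# Fix: iterate over the names so each node is drained with its own command.
for node in ${node_names}; do
  echo "kubectl drain ${node} --grace-period 10 --force --delete-local-data --ignore-daemonsets"
done
```

The loop relies on the shell's word splitting of the unquoted `${node_names}`, which is exactly what the single-quoted form suppresses.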

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:
We have a lot of pods stuck in Terminating state with errors (around 80); maybe this is a source of trouble for the node/IP mapping.

Environment:

  • Deployment Info (bosh -d <deployment> deployment):
    cfcr-deployment-21
  • Environment Info (bosh -e <environment> environment):
  • Kubernetes version (kubectl version):
Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.2", GitCommit:"81753b10df112992bf51bbc2c2f85208aad78335", GitTreeState:"clean", BuildDate:"2018-04-27T09:22:21Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.2", GitCommit:"bb9ffb1654d4a729bb4cec18ff088eacc153c239", GitTreeState:"clean", BuildDate:"2018-08-07T23:08:19Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider (e.g. aws, gcp, vsphere):
    openstack/ FE Orange
@cf-gitbot


cf-gitbot commented Sep 17, 2018

We have created an issue in Pivotal Tracker to manage this:

https://www.pivotaltracker.com/story/show/160559868

The labels on this github issue will be updated when the story is started.

@alex-slynko


Member

alex-slynko commented Sep 21, 2018

Hi @obeyler

I created a small PR for your problem. It will run through the pipeline and then we will merge it. In the meantime, you can build kubo-release from that branch and try it in your environment.

@cf-gitbot cf-gitbot added in progress and removed scheduled labels Sep 28, 2018

@obeyler


obeyler commented Sep 28, 2018

It seems to work. Do you plan to integrate this PR in the next release?

@iainsproat


Member

iainsproat commented Oct 1, 2018

PR has been merged into our develop branch. We're currently waiting for it to pass the main CI pipeline. We plan to include this in our next release.

@cf-gitbot cf-gitbot added delivered and removed in progress labels Oct 2, 2018

@cf-gitbot cf-gitbot added accepted and removed delivered labels Oct 12, 2018

@alex-slynko


Member

alex-slynko commented Oct 19, 2018

The fix is included in 0.23.

@cf-gitbot cf-gitbot removed the accepted label Oct 19, 2018
