This repository has been archived by the owner on Nov 7, 2018. It is now read-only.

Killing machine doesn't migrate the pod #6

Closed
ehacke opened this issue Mar 12, 2015 · 5 comments

ehacke commented Mar 12, 2015

I'm running your elasticsearch cluster project on top of your kubernetes coreos vagrant project.

First of all, thanks for both of these, they work great and have saved me a lot of work.

However, it looks like if I deploy the elasticsearch cluster on top of 1 master and 4 minions, and then kill one of the machines running an elasticsearch node, kubernetes does not migrate the pod to another machine. In fact, it never even seems to notice that the machine is dead. If I run kubectl get pods it still shows the pod as "running", even though the machine is gone.

I've tried updating to kubernetes 0.12.1, and tried killing different elasticsearch machines, and it doesn't seem to make a difference.

Also, if I just kill the docker container on the machine, but leave the machine up, kubernetes will notice and move the pod as expected.

Any insight into why this is happening? Am I missing something?

Or should I take this up with the kubernetes team?

Author

ehacke commented Mar 12, 2015

Ha. Ok, so I stand corrected. It does work.

It just takes a really long time: it took 13 minutes to figure out that the machine was gone. That seems far too long, but I imagine there is some way I can override the default heartbeat timeout.

You can close this issue if you'd like.
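
To illustrate the point about the heartbeat timeout, here is a minimal sketch of a timeout-based failure detector. It is not Kubernetes code: `HeartbeatMonitor`, `detection_delay`, `timeout_s`, `check_interval_s`, and the node name `minion-1` are all hypothetical names chosen for illustration, and the numbers are arbitrary rather than Kubernetes defaults. It only shows that, under this simple model, a dead machine cannot be reported until the timeout has fully elapsed, so a long timeout directly translates into a long delay before the pod can be moved.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class HeartbeatMonitor:
    """Toy failure detector (hypothetical, not the Kubernetes implementation).

    A node counts as dead once its last heartbeat is older than timeout_s.
    """
    timeout_s: float
    last_seen: Dict[str, float] = field(default_factory=dict)

    def heartbeat(self, node: str, now: float) -> None:
        # Record the time at which the node last reported in.
        self.last_seen[node] = now

    def dead_nodes(self, now: float) -> List[str]:
        # A node is dead only when strictly more than timeout_s has passed.
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )


def detection_delay(timeout_s: float, check_interval_s: float) -> float:
    """Simulate a machine that sends one heartbeat at t=0 and then dies.

    Returns the simulated time at which a checker that runs every
    check_interval_s seconds first reports the machine as dead.
    """
    monitor = HeartbeatMonitor(timeout_s)
    monitor.heartbeat("minion-1", now=0.0)  # assumed node name
    now = 0.0
    while "minion-1" not in monitor.dead_nodes(now):
        now += check_interval_s
    return now


if __name__ == "__main__":
    # Short timeout: the dead machine is noticed quickly.
    print(detection_delay(timeout_s=40.0, check_interval_s=5.0))
    # A 13-minute (780 s) timeout: nothing is reported for over 13 minutes.
    print(detection_delay(timeout_s=780.0, check_interval_s=10.0))
```

In this model the delay is always at least the timeout plus up to one check interval, which is why lowering the timeout (wherever the real system exposes it) would be the first thing to try.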

ehacke closed this as completed Mar 13, 2015
Owner

pires commented Mar 13, 2015

@ehacke I believe this is a Kubernetes issue that I've discussed before, but I couldn't find it when I looked. Can you open a new issue there and reference this one, please?

pires reopened this Mar 13, 2015
pires added the bug label Mar 13, 2015
Author

ehacke commented Mar 13, 2015

@pires Yes, will do.

Owner

pires commented Mar 17, 2015

Thank you, @ehacke. It seems I was right. Let's keep in touch and close this whenever the fix is merged.

Author

ehacke commented Mar 17, 2015

@pires yeah it looks like they are fixing it fairly quickly.

I'm not certain whether my original 13-minute pod rejection issue is real or reproducible, but it does look like they are fixing the erroneous pod status reporting.

pires closed this as completed May 1, 2015
pires added the wontfix label and removed the bug label May 1, 2015