
Pods cycling endlessly through Pending and Running in Guestbook example #4414

Closed
kbeecher opened this issue Feb 13, 2015 · 22 comments
Labels: priority/important-soon, sig/cluster-lifecycle

Comments

@kbeecher
Contributor

I am trying to get the guestbook example (https://github.com/GoogleCloudPlatform/kubernetes/blob/master/examples/guestbook/README.md) working on a local vagrant cluster using Kubernetes v.0.10.1.

If you follow the guestbook tutorial, you initialise a cluster with one minion. In this case, when you create the two redis-slaves, only one gets attached to the minion and the other remains unassigned. The same goes for the three frontend-controllers.

I tried with four minions and all pods were assigned. However, I noticed some strange behaviour: each frontend-controller kept periodically going into Pending and then back into Running, and every time it was re-run it was assigned the next IP address (i.e. the previous IP address + 1).

I also tried initialising my four-minion cluster with https://github.com/pires/kubernetes-vagrant-coreos-cluster, but got the same results.

Could this be a problem with the example, or with Kubernetes itself?

@pires
Contributor

pires commented Feb 13, 2015

I'm having all sorts of strange behavior with 0.10.1 as well in #4415. Can you try with 0.9.2 instead?

@kbeecher
Contributor Author

@pires The versions I've already tried with are: 0.9.3, 0.9.2 and 0.10.1.

@brendandburns
Contributor

Karl,
Chances are that your pod containers are crashing. Can you try kubectl log and see what it prints?

Join us on irc #google-containers for more interactive debugging.

@pires, can you file issues on the problems you're seeing with 0.10.1?

Thanks!
Brendan

@pires
Contributor

pires commented Feb 13, 2015

@brendandburns you just saw it, but it's #4415.

@saad-ali added the priority/important-soon and sig/cluster-lifecycle labels Feb 13, 2015
@errordeveloper
Member

I am also seeing this.

There are many entries like this:

core@kube-01 ~ $ journalctl -u docker | grep error | tail -n1
Feb 16 18:45:57 kube-01 dockerd[975]: time="2015-02-16T18:45:57Z" level="error" msg="HTTP Error: statusCode=500 Cannot start container 9e94ba5ceb600550ab7868003f958228c0e05205915668480cccd04040b85abf: cannot join network of a non running container: bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903"

So let's look at just that last one...

core@kube-01 ~ $ journalctl -u docker | grep bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903
Feb 16 18:44:56 kube-01 dockerd[975]: time="2015-02-16T18:44:56Z" level="info" msg="+job log(create, bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903, kubernetes/pause:go)"
Feb 16 18:44:56 kube-01 dockerd[975]: time="2015-02-16T18:44:56Z" level="info" msg="-job log(create, bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903, kubernetes/pause:go) = OK (0)"
Feb 16 18:44:56 kube-01 dockerd[975]: time="2015-02-16T18:44:56Z" level="info" msg="POST /containers/bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903/start"
Feb 16 18:44:56 kube-01 dockerd[975]: time="2015-02-16T18:44:56Z" level="info" msg="+job start(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903)"
Feb 16 18:44:56 kube-01 dockerd[975]: time="2015-02-16T18:44:56Z" level="info" msg="+job allocate_interface(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903)"
Feb 16 18:44:56 kube-01 dockerd[975]: time="2015-02-16T18:44:56Z" level="info" msg="-job allocate_interface(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903) = OK (0)"
Feb 16 18:44:56 kube-01 dockerd[975]: time="2015-02-16T18:44:56Z" level="info" msg="+job allocate_port(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903)"
Feb 16 18:44:56 kube-01 dockerd[975]: time="2015-02-16T18:44:56Z" level="info" msg="-job allocate_port(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903) = OK (0)"
Feb 16 18:44:56 kube-01 dockerd[975]: time="2015-02-16T18:44:56Z" level="info" msg="+job log(start, bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903, kubernetes/pause:go)"
Feb 16 18:44:56 kube-01 dockerd[975]: time="2015-02-16T18:44:56Z" level="info" msg="-job log(start, bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903, kubernetes/pause:go) = OK (0)"
Feb 16 18:44:56 kube-01 dockerd[975]: time="2015-02-16T18:44:56Z" level="warning" msg="WARNING: Your kernel does not support OOM notifications: open /sys/fs/cgroup/memory/dockerbfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903/memory.oom_control: no such file or directory"
Feb 16 18:44:56 kube-01 dockerd[975]: time="2015-02-16T18:44:56Z" level="info" msg="-job start(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903) = OK (0)"
Feb 16 18:44:56 kube-01 dockerd[975]: time="2015-02-16T18:44:56Z" level="info" msg="GET /containers/bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903/json"
Feb 16 18:44:56 kube-01 dockerd[975]: time="2015-02-16T18:44:56Z" level="info" msg="+job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903)"
Feb 16 18:44:56 kube-01 dockerd[975]: time="2015-02-16T18:44:56Z" level="info" msg="-job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903) = OK (0)"
Feb 16 18:44:56 kube-01 dockerd[975]: time="2015-02-16T18:44:56Z" level="info" msg="GET /v1.16/containers/bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903/json"
Feb 16 18:44:56 kube-01 dockerd[975]: time="2015-02-16T18:44:56Z" level="info" msg="+job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903)"
Feb 16 18:44:56 kube-01 dockerd[975]: time="2015-02-16T18:44:56Z" level="info" msg="-job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903) = OK (0)"
Feb 16 18:44:59 kube-01 dockerd[975]: time="2015-02-16T18:44:59Z" level="info" msg="GET /containers/bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903/json"
Feb 16 18:44:59 kube-01 dockerd[975]: time="2015-02-16T18:44:59Z" level="info" msg="+job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903)"
Feb 16 18:44:59 kube-01 dockerd[975]: time="2015-02-16T18:44:59Z" level="info" msg="-job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903) = OK (0)"
Feb 16 18:45:00 kube-01 dockerd[975]: time="2015-02-16T18:45:00Z" level="info" msg="GET /containers/bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903/json"
Feb 16 18:45:00 kube-01 dockerd[975]: time="2015-02-16T18:45:00Z" level="info" msg="+job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903)"
Feb 16 18:45:00 kube-01 dockerd[975]: time="2015-02-16T18:45:00Z" level="info" msg="-job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903) = OK (0)"
Feb 16 18:45:05 kube-01 dockerd[975]: time="2015-02-16T18:45:05Z" level="info" msg="GET /containers/bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903/json"
Feb 16 18:45:05 kube-01 dockerd[975]: time="2015-02-16T18:45:05Z" level="info" msg="+job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903)"
Feb 16 18:45:05 kube-01 dockerd[975]: time="2015-02-16T18:45:05Z" level="info" msg="-job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903) = OK (0)"
Feb 16 18:45:06 kube-01 dockerd[975]: time="2015-02-16T18:45:06Z" level="info" msg="GET /containers/bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903/json"
Feb 16 18:45:06 kube-01 dockerd[975]: time="2015-02-16T18:45:06Z" level="info" msg="+job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903)"
Feb 16 18:45:06 kube-01 dockerd[975]: time="2015-02-16T18:45:06Z" level="info" msg="-job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903) = OK (0)"
Feb 16 18:45:10 kube-01 dockerd[975]: time="2015-02-16T18:45:10Z" level="info" msg="GET /containers/bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903/json"
Feb 16 18:45:10 kube-01 dockerd[975]: time="2015-02-16T18:45:10Z" level="info" msg="+job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903)"
Feb 16 18:45:10 kube-01 dockerd[975]: time="2015-02-16T18:45:10Z" level="info" msg="-job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903) = OK (0)"
Feb 16 18:45:15 kube-01 dockerd[975]: time="2015-02-16T18:45:15Z" level="info" msg="GET /containers/bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903/json"
Feb 16 18:45:15 kube-01 dockerd[975]: time="2015-02-16T18:45:15Z" level="info" msg="+job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903)"
Feb 16 18:45:15 kube-01 dockerd[975]: time="2015-02-16T18:45:15Z" level="info" msg="-job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903) = OK (0)"
Feb 16 18:45:16 kube-01 dockerd[975]: time="2015-02-16T18:45:16Z" level="info" msg="GET /containers/bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903/json"
Feb 16 18:45:16 kube-01 dockerd[975]: time="2015-02-16T18:45:16Z" level="info" msg="+job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903)"
Feb 16 18:45:16 kube-01 dockerd[975]: time="2015-02-16T18:45:16Z" level="info" msg="-job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903) = OK (0)"
Feb 16 18:45:20 kube-01 dockerd[975]: time="2015-02-16T18:45:20Z" level="info" msg="GET /containers/bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903/json"
Feb 16 18:45:20 kube-01 dockerd[975]: time="2015-02-16T18:45:20Z" level="info" msg="+job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903)"
Feb 16 18:45:20 kube-01 dockerd[975]: time="2015-02-16T18:45:20Z" level="info" msg="-job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903) = OK (0)"
Feb 16 18:45:25 kube-01 dockerd[975]: time="2015-02-16T18:45:25Z" level="info" msg="GET /containers/bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903/json"
Feb 16 18:45:25 kube-01 dockerd[975]: time="2015-02-16T18:45:25Z" level="info" msg="+job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903)"
Feb 16 18:45:25 kube-01 dockerd[975]: time="2015-02-16T18:45:25Z" level="info" msg="-job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903) = OK (0)"
Feb 16 18:45:26 kube-01 dockerd[975]: time="2015-02-16T18:45:26Z" level="info" msg="GET /containers/bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903/json"
Feb 16 18:45:26 kube-01 dockerd[975]: time="2015-02-16T18:45:26Z" level="info" msg="+job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903)"
Feb 16 18:45:26 kube-01 dockerd[975]: time="2015-02-16T18:45:26Z" level="info" msg="-job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903) = OK (0)"
Feb 16 18:45:30 kube-01 dockerd[975]: time="2015-02-16T18:45:30Z" level="info" msg="GET /containers/bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903/json"
Feb 16 18:45:30 kube-01 dockerd[975]: time="2015-02-16T18:45:30Z" level="info" msg="+job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903)"
Feb 16 18:45:30 kube-01 dockerd[975]: time="2015-02-16T18:45:30Z" level="info" msg="-job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903) = OK (0)"
Feb 16 18:45:35 kube-01 dockerd[975]: time="2015-02-16T18:45:35Z" level="info" msg="GET /containers/bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903/json"
Feb 16 18:45:35 kube-01 dockerd[975]: time="2015-02-16T18:45:35Z" level="info" msg="+job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903)"
Feb 16 18:45:35 kube-01 dockerd[975]: time="2015-02-16T18:45:35Z" level="info" msg="-job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903) = OK (0)"
Feb 16 18:45:36 kube-01 dockerd[975]: time="2015-02-16T18:45:36Z" level="info" msg="GET /containers/bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903/json"
Feb 16 18:45:36 kube-01 dockerd[975]: time="2015-02-16T18:45:36Z" level="info" msg="+job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903)"
Feb 16 18:45:36 kube-01 dockerd[975]: time="2015-02-16T18:45:36Z" level="info" msg="-job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903) = OK (0)"
Feb 16 18:45:40 kube-01 dockerd[975]: time="2015-02-16T18:45:40Z" level="info" msg="GET /containers/bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903/json"
Feb 16 18:45:40 kube-01 dockerd[975]: time="2015-02-16T18:45:40Z" level="info" msg="+job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903)"
Feb 16 18:45:40 kube-01 dockerd[975]: time="2015-02-16T18:45:40Z" level="info" msg="-job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903) = OK (0)"
Feb 16 18:45:45 kube-01 dockerd[975]: time="2015-02-16T18:45:45Z" level="info" msg="GET /containers/bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903/json"
Feb 16 18:45:45 kube-01 dockerd[975]: time="2015-02-16T18:45:45Z" level="info" msg="+job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903)"
Feb 16 18:45:45 kube-01 dockerd[975]: time="2015-02-16T18:45:45Z" level="info" msg="-job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903) = OK (0)"
Feb 16 18:45:46 kube-01 dockerd[975]: time="2015-02-16T18:45:46Z" level="info" msg="GET /containers/bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903/json"
Feb 16 18:45:46 kube-01 dockerd[975]: time="2015-02-16T18:45:46Z" level="info" msg="+job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903)"
Feb 16 18:45:46 kube-01 dockerd[975]: time="2015-02-16T18:45:46Z" level="info" msg="-job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903) = OK (0)"
Feb 16 18:45:50 kube-01 dockerd[975]: time="2015-02-16T18:45:50Z" level="info" msg="GET /containers/bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903/json"
Feb 16 18:45:50 kube-01 dockerd[975]: time="2015-02-16T18:45:50Z" level="info" msg="+job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903)"
Feb 16 18:45:50 kube-01 dockerd[975]: time="2015-02-16T18:45:50Z" level="info" msg="-job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903) = OK (0)"
Feb 16 18:45:55 kube-01 dockerd[975]: time="2015-02-16T18:45:55Z" level="info" msg="GET /containers/bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903/json"
Feb 16 18:45:55 kube-01 dockerd[975]: time="2015-02-16T18:45:55Z" level="info" msg="+job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903)"
Feb 16 18:45:55 kube-01 dockerd[975]: time="2015-02-16T18:45:55Z" level="info" msg="-job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903) = OK (0)"
Feb 16 18:45:56 kube-01 dockerd[975]: time="2015-02-16T18:45:56Z" level="info" msg="POST /containers/bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903/stop?t=10"
Feb 16 18:45:56 kube-01 dockerd[975]: time="2015-02-16T18:45:56Z" level="info" msg="+job stop(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903)"
Feb 16 18:45:56 kube-01 dockerd[975]: time="2015-02-16T18:45:56Z" level="info" msg="+job log(die, bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903, kubernetes/pause:go)"
Feb 16 18:45:56 kube-01 dockerd[975]: time="2015-02-16T18:45:56Z" level="info" msg="-job log(die, bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903, kubernetes/pause:go) = OK (0)"
Feb 16 18:45:56 kube-01 dockerd[975]: time="2015-02-16T18:45:56Z" level="info" msg="+job release_interface(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903)"
Feb 16 18:45:56 kube-01 dockerd[975]: time="2015-02-16T18:45:56Z" level="info" msg="-job release_interface(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903) = OK (0)"
Feb 16 18:45:56 kube-01 dockerd[975]: time="2015-02-16T18:45:56Z" level="info" msg="+job log(stop, bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903, kubernetes/pause:go)"
Feb 16 18:45:56 kube-01 dockerd[975]: time="2015-02-16T18:45:56Z" level="info" msg="-job log(stop, bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903, kubernetes/pause:go) = OK (0)"
Feb 16 18:45:56 kube-01 dockerd[975]: time="2015-02-16T18:45:56Z" level="info" msg="-job stop(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903) = OK (0)"
Feb 16 18:45:57 kube-01 dockerd[975]: Cannot start container 9e94ba5ceb600550ab7868003f958228c0e05205915668480cccd04040b85abf: cannot join network of a non running container: bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903
Feb 16 18:45:57 kube-01 dockerd[975]: time="2015-02-16T18:45:57Z" level="error" msg="Handler for POST /containers/{name:.*}/start returned error: Cannot start container 9e94ba5ceb600550ab7868003f958228c0e05205915668480cccd04040b85abf: cannot join network of a non running container: bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903"
Feb 16 18:45:57 kube-01 dockerd[975]: time="2015-02-16T18:45:57Z" level="error" msg="HTTP Error: statusCode=500 Cannot start container 9e94ba5ceb600550ab7868003f958228c0e05205915668480cccd04040b85abf: cannot join network of a non running container: bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903"
Feb 16 18:46:00 kube-01 dockerd[975]: time="2015-02-16T18:46:00Z" level="info" msg="GET /containers/bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903/json"
Feb 16 18:46:00 kube-01 dockerd[975]: time="2015-02-16T18:46:00Z" level="info" msg="+job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903)"
Feb 16 18:46:00 kube-01 dockerd[975]: time="2015-02-16T18:46:00Z" level="info" msg="-job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903) = OK (0)"
Feb 16 18:46:01 kube-01 dockerd[975]: time="2015-02-16T18:46:01Z" level="info" msg="GET /containers/bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903/json"
Feb 16 18:46:01 kube-01 dockerd[975]: time="2015-02-16T18:46:01Z" level="info" msg="+job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903)"
Feb 16 18:46:01 kube-01 dockerd[975]: time="2015-02-16T18:46:01Z" level="info" msg="-job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903) = OK (0)"
Feb 16 18:46:06 kube-01 dockerd[975]: time="2015-02-16T18:46:06Z" level="info" msg="GET /containers/bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903/json"
Feb 16 18:46:06 kube-01 dockerd[975]: time="2015-02-16T18:46:06Z" level="info" msg="+job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903)"
Feb 16 18:46:06 kube-01 dockerd[975]: time="2015-02-16T18:46:06Z" level="info" msg="-job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903) = OK (0)"
Feb 16 18:47:01 kube-01 dockerd[975]: time="2015-02-16T18:47:01Z" level="info" msg="GET /containers/bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903/json"
Feb 16 18:47:01 kube-01 dockerd[975]: time="2015-02-16T18:47:01Z" level="info" msg="+job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903)"
Feb 16 18:47:01 kube-01 dockerd[975]: time="2015-02-16T18:47:01Z" level="info" msg="-job container_inspect(bfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903) = OK (0)"

It looks like kubernetes/pause:go dies while we are still trying to use its namespace...

@errordeveloper
Member

Looks like #2252 might be related.

@errordeveloper
Member

For me this only happens with 0.10.1; 0.9.3 is perfectly fine.

errordeveloper added a commit to errordeveloper/weave-demos that referenced this issue Feb 17, 2015
@rsokolowski
Contributor

I'm investigating the issue and can confirm that the bug is reproducible using the guestbook example.
From looking at the kubelet logs, it seems that all deaths start with the following log line:
I0217 14:09:11.265717 4900 kubelet.go:1089] pod "frontend-controller-jzjpe.default.api" container "php-redis" hash changed (3923444835 vs 3989308515). Container will be killed and re-created.

@rsokolowski
Contributor

I've added some logging to the code that computes the hash of a container (https://github.com/GoogleCloudPlatform/kubernetes/blob/master/pkg/kubelet/kubelet.go#L1049):
expectedHash := dockertools.HashContainer(&container)
glog.Warningf("Expected hash is %d for container %#v", expectedHash, container)

and I get weird results. The printed content of the container is the same, but the hash is different:
W0217 15:19:55.897365 4841 kubelet.go:1050] Expected hash is 3923444835 for container api.Container{Name:"php-redis", Image:"kubernetes/example-guestbook-php-redis", Command:[]string(nil), WorkingDir:"", Ports:[]api.Port{api.Port{Name:"", HostPort:0, ContainerPort:80, Protocol:"TCP", HostIP:""}}, Env:[]api.EnvVar(nil), Resources:api.ResourceRequirements{Limits:api.ResourceList{"cpu":resource.Quantity{Amount:0.100, Format:"DecimalSI"}, "memory":resource.Quantity{Amount:50000000.000, Format:"DecimalSI"}}}, VolumeMounts:[]api.VolumeMount(nil), LivenessProbe:(*api.Probe)(nil), ReadinessProbe:(*api.Probe)(nil), Lifecycle:(*api.Lifecycle)(nil), TerminationMessagePath:"/dev/termination-log", Privileged:false, ImagePullPolicy:"IfNotPresent", Capabilities:api.Capabilities{Add:[]api.CapabilityType(nil), Drop:[]api.CapabilityType(nil)}}
W0217 15:20:14.753869 4841 kubelet.go:1050] Expected hash is 3989308515 for container api.Container{Name:"php-redis", Image:"kubernetes/example-guestbook-php-redis", Command:[]string(nil), WorkingDir:"", Ports:[]api.Port{api.Port{Name:"", HostPort:0, ContainerPort:80, Protocol:"TCP", HostIP:""}}, Env:[]api.EnvVar(nil), Resources:api.ResourceRequirements{Limits:api.ResourceList{"cpu":resource.Quantity{Amount:0.100, Format:"DecimalSI"}, "memory":resource.Quantity{Amount:50000000.000, Format:"DecimalSI"}}}, VolumeMounts:[]api.VolumeMount(nil), LivenessProbe:(*api.Probe)(nil), ReadinessProbe:(*api.Probe)(nil), Lifecycle:(*api.Lifecycle)(nil), TerminationMessagePath:"/dev/termination-log", Privileged:false, ImagePullPolicy:"IfNotPresent", Capabilities:api.Capabilities{Add:[]api.CapabilityType(nil), Drop:[]api.CapabilityType(nil)}}

@rsokolowski
Contributor

I wonder whether it might be related to #4462, reported by @smarterclayton, since the container in question comes from the Kubelet.pods member.

@brendandburns
Contributor

I think the key is:

Your kernel does not support OOM notifications: open /sys/fs/cgroup/memory/dockerbfd28d56f538d8bd2bbdbffa7422a79d790b63334d1adff4acf9d250ce1b9903/memory.oom_control

@dchen1107 should we gracefully degrade here?

@dchen1107
Member

@brendandburns I assume this is reported by docker, not by kubelet or cAdvisor.

The memory.oom_control file is for OOM notification and other controls. docker/libcontainer registers to listen for OOM events from the kernel and processes them. Without that information, when a docker container is OOM-killed by the kernel, the container ends up in the terminated state without proper OOM-killed information. The kubelet picks up that information, if it exists, and reports it as part of ContainerStatus.

In this case I don't think that is the root cause, but I have seen many people report this kind of failure to docker since the 1.4 release; one example is docker/issues/9902.

Now back to the initial issue: what is the root cause of the container deaths? A system OOM kill, or hitting the memory limit? I checked the example: only the frontend container has a memory limit, which means both master and slave pods run without limits (bounded by node capacity, of course). The question is why we only see issues with the latest release. Do the daemons (kubelet, docker, proxy, etc.) in the latest release use more memory, which triggers a system OOM? Or is the problem limited to the frontend pods, in which case we are hitting the memory limit of the container itself?

@dchen1107
Member

btw, the "does not support OOM notifications" issue from docker should be fixed in the 1.5 release. Even if it is not the root cause, it causes confusion about the failure and hurts the debuggability of the ecosystem.

@kbeecher
Contributor Author

@brendandburns You asked for log output. I chose one of the frontend-controllers.

$ kubectl log frontend-controller-nyxxv php-redis
2015-02-18T12:15:08.471077604Z 2015-02-18 12:15:08,470 CRIT Supervisor running as root (no user in config file)
2015-02-18T12:15:08.471193473Z 2015-02-18 12:15:08,471 WARN Included extra file "/etc/supervisor/conf.d/supervisord-apache2.conf" during parsing
2015-02-18T12:15:08.490090604Z 2015-02-18 12:15:08,490 INFO RPC interface 'supervisor' initialized
2015-02-18T12:15:08.490193710Z 2015-02-18 12:15:08,490 WARN cElementTree not installed, using slower XML parser for XML-RPC
2015-02-18T12:15:08.490277084Z 2015-02-18 12:15:08,490 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2015-02-18T12:15:08.490413161Z 2015-02-18 12:15:08,490 INFO supervisord started with pid 9
2015-02-18T12:15:09.493136452Z 2015-02-18 12:15:09,492 INFO spawned: 'apache2' with pid 10
2015-02-18T12:15:10.512577827Z 2015-02-18 12:15:10,512 INFO success: apache2 entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)

That's all there was. Other pods returned the same kind of log output.

@rsokolowski
Contributor

I've synced my workspace to HEAD and applied @brendandburns' patch #4494, and it still doesn't seem to fix the problem. I've added quite a lot of debug printing around the HashContainer function, and exactly when the issue happens (the death of a container due to a hash mismatch) I see two calls to that function with the same container and a different output hash:
I0218 13:55:55.782519 19728 docker.go:573] Hashing container: &{php-redis kubernetes/example-guestbook-php-redis [] [{ 8000 80 TCP }] [] {map[cpu:{0.100 DecimalSI} memory:{50000000.000 DecimalSI}]} [] <nil> /dev/termination-log false IfNotPresent {[] []}}
I0218 13:55:55.782751 19728 docker.go:579] Sum32: 4121691387

I0218 13:55:55.782802 19728 docker.go:573] Hashing container: &{php-redis kubernetes/example-guestbook-php-redis [] [{ 8000 80 TCP }] [] {map[cpu:{0.100 DecimalSI} memory:{50000000.000 DecimalSI}]} [] <nil> /dev/termination-log false IfNotPresent {[] []}}
I0218 13:55:55.782982 19728 docker.go:579] Sum32: 4055827707

I've been staring at the hashing code for a while and can't figure out where the issue is coming from.
At one point I noticed that the printing function doesn't always print maps in the same order (see memory printed before cpu), but that doesn't seem to be the cause of the issue:
I0218 12:46:31.148443 31391 docker.go:565] Hashing container: &{php-redis kubernetes/example-guestbook-php-redis [] [{ 8000 80 TCP }] [] {map[memory:{50000000.000 DecimalSI} cpu:{0.100 DecimalSI}]} [] <nil> /dev/termination-log false IfNotPresent {[] []}}
I0218 12:46:31.148513 31391 docker.go:569] Sum32: 4055827707

@rsokolowski
Contributor

Ok, I think I can finally confirm that the issue is due to the non-deterministic printing behavior of the following construct:
spew.Fprintf(hash, "%#v", *container)
that is used in https://github.com/GoogleCloudPlatform/kubernetes/blob/af2ded7b0286dfa7ab5a243aa2b200582f24581c/pkg/util/hash.go#L28:
func DeepHashObject(hasher hash.Hash, objectToWrite interface{}) {
spew.Fprintf(hasher, "%#v", objectToWrite)
}

I wrapped the hasher with an io.Writer that prints the input to the hash before hashing it, and the result is not deterministic with respect to map elements (this is the []byte-to-string conversion of the input to the adler32.Write() method):
(api.Container){Name:(string)php-redis Image:(string)kubernetes/example-guestbook-php-redis Command:([]string) WorkingDir:(string) Ports:([]api.Port)[{Name:(string) HostPort:(int)8000 ContainerPort:(int)80 Protocol:(api.Protocol)TCP HostIP:(string)}] Env:([]api.EnvVar) Resources:(api.ResourceRequirements){Limits:(api.ResourceList)map[memory:50M cpu:100m]} VolumeMounts:([]api.VolumeMount) LivenessProbe:(*api.Probe) ReadinessProbe:(*api.Probe) Lifecycle:(*api.Lifecycle) TerminationMessagePath:(string)/dev/termination-log Privileged:(bool)false ImagePullPolicy:(api.PullPolicy)IfNotPresent Capabilities:(api.Capabilities){Add:([]api.CapabilityType) Drop:([]api.CapabilityType)}}
(api.Container){Name:(string)php-redis Image:(string)kubernetes/example-guestbook-php-redis Command:([]string) WorkingDir:(string) Ports:([]api.Port)[{Name:(string) HostPort:(int)8000 ContainerPort:(int)80 Protocol:(api.Protocol)TCP HostIP:(string)}] Env:([]api.EnvVar) Resources:(api.ResourceRequirements){Limits:(api.ResourceList)map[cpu:100m memory:50M]} VolumeMounts:([]api.VolumeMount) LivenessProbe:(*api.Probe) ReadinessProbe:(*api.Probe) Lifecycle:(*api.Lifecycle) TerminationMessagePath:(string)/dev/termination-log Privileged:(bool)false ImagePullPolicy:(api.PullPolicy)IfNotPresent Capabilities:(api.Capabilities){Add:([]api.CapabilityType) Drop:([]api.CapabilityType)}}

I will work on the fix.
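
For anyone who wants to see this outside the kubelet, here is a minimal standalone sketch (not the actual Kubernetes code; the container type and values below are illustrative): hashing spew's %#v output of a struct that contains a map can produce different sums across calls because Go randomizes map iteration order, while a spew ConfigState with SortKeys set makes the output, and therefore the hash, stable. The SortKeys approach is an assumption for the sketch, not necessarily the eventual fix.

package main

import (
	"fmt"
	"hash/adler32"

	"github.com/davecgh/go-spew/spew"
)

// container mimics the shape of the problem: a struct whose fields include a
// map (api.ResourceList in the real code). The type and values are
// illustrative only.
type container struct {
	Name   string
	Image  string
	Limits map[string]string
}

func main() {
	c := container{
		Name:   "php-redis",
		Image:  "kubernetes/example-guestbook-php-redis",
		Limits: map[string]string{"cpu": "100m", "memory": "50M"},
	}

	// Non-deterministic: %#v prints map keys in Go's randomized iteration
	// order, so the bytes fed to the hash (and hence the sum) can differ
	// between calls even though the value is unchanged.
	for i := 0; i < 5; i++ {
		h := adler32.New()
		spew.Fprintf(h, "%#v", c)
		fmt.Println("default spew:", h.Sum32())
	}

	// One way to make the output deterministic: a spew ConfigState that sorts
	// map keys before printing, so the same value always hashes the same way.
	printer := spew.ConfigState{SortKeys: true}
	for i := 0; i < 5; i++ {
		h := adler32.New()
		printer.Fprintf(h, "%#v", c)
		fmt.Println("sorted spew: ", h.Sum32())
	}
}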

@errordeveloper
Member

Great findings!


@smarterclayton
Contributor

Are we using the fuzzer to test the hasher? Would be a good way to verify the hasher.
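
As a sketch of what such a check could look like (this is not the actual Kubernetes fuzzer; the local deepHashObject helper and fakeContainer type below are stand-ins), a test can hash the same value repeatedly and fail on any disagreement. Run against the DeepHashObject helper quoted above, a test like this would fail intermittently, which is exactly the symptom in this issue:

package hashcheck

import (
	"hash"
	"hash/adler32"
	"testing"

	"github.com/davecgh/go-spew/spew"
)

// deepHashObject mirrors the DeepHashObject helper quoted earlier in this thread.
func deepHashObject(hasher hash.Hash, objectToWrite interface{}) {
	spew.Fprintf(hasher, "%#v", objectToWrite)
}

// fakeContainer stands in for api.Container; any struct containing a map
// reproduces the problem.
type fakeContainer struct {
	Name   string
	Limits map[string]string
}

// TestDeepHashObjectIsDeterministic hashes the same value many times and
// fails if any two results disagree.
func TestDeepHashObjectIsDeterministic(t *testing.T) {
	c := fakeContainer{
		Name:   "php-redis",
		Limits: map[string]string{"cpu": "100m", "memory": "50M"},
	}

	reference := adler32.New()
	deepHashObject(reference, c)
	want := reference.Sum32()

	for i := 0; i < 100; i++ {
		h := adler32.New()
		deepHashObject(h, c)
		if got := h.Sum32(); got != want {
			t.Fatalf("hash changed between calls: got %d, want %d", got, want)
		}
	}
}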


@brendandburns
Contributor

Thanks for digging so deep on this!

Yeah we should add hash validation to the fuzzer tests...

I'll file an issue.
Brendan

@brendandburns
Contributor

The right answer here might be a more targeted hash function that wipes out fields that we don't care about for the purpose of hashing...
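
One illustration of that idea (purely a sketch: the fields being wiped and the package paths are assumptions, not the actual change): copy the container, clear the fields that should not influence restart decisions, and hash only what remains.

package kubelethash

import (
	"hash/adler32"

	"github.com/GoogleCloudPlatform/kubernetes/pkg/api"
	"github.com/GoogleCloudPlatform/kubernetes/pkg/util"
)

// hashContainerSelective sketches a "targeted" hash: it copies the container,
// zeroes fields we don't want to influence the hash (the fields chosen here
// are assumptions for illustration only), and hashes the rest.
func hashContainerSelective(container *api.Container) uint32 {
	clone := *container
	clone.TerminationMessagePath = "" // assumed irrelevant to restart decisions
	clone.ImagePullPolicy = ""        // assumed irrelevant to restart decisions

	hasher := adler32.New()
	util.DeepHashObject(hasher, clone)
	return hasher.Sum32()
}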

@rsokolowski
Contributor

After making the hash function deterministic, containers are no longer cycling in my cluster: the frontend container has been running without a restart for 15 hours. Closing.

@dchen1107
Member

Thanks for fixing the issue here.
