Error setting up network for a pod defined as manifest #55
I think this is the relevant part of the kubelet log:
When an error occurs during the setup phase, the pod seems to stay up, but without network configuration.
@slaws Thanks for raising this. I'll be taking a look at this issue today and I'll post my findings. My initial hunch is that it is either a race condition when creating pods via an on-disk manifest, or potentially a subtle bug in the Kubelet's network plugin API, but I'll be able to say for sure after some more digging.
@slaws - It looks to me like this is an ordering problem in the Kubelet where the Kubernetes API is not informed of this pod until after the network plugin is called. I've raised kubernetes/kubernetes#14992 to address this issue.
@caseydavenport thanks a lot for this feedback. If this is the cause, a potential workaround would be to use the experimental DaemonSet. I'll try to test this next week.
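For reference, a DaemonSet-based workaround would replace the on-disk manifest with something along these lines (a hypothetical sketch; the name, labels, and image are placeholders, and the `extensions/v1beta1` API group is assumed since DaemonSet was experimental at the time):

```yaml
# Hypothetical DaemonSet replacing the static manifest.
# Pods created this way are registered with the API server first,
# so the network plugin can find them when it runs.
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: my-static-service
spec:
  template:
    metadata:
      labels:
        app: my-static-service
    spec:
      containers:
      - name: my-static-service
        image: nginx
```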
@alexhersh has submitted a fix for this in kubernetes/kubernetes#16894
@slaws - The fix for this has been merged to Kubernetes upstream - hopefully it will be included in the next bugfix release. If you'd like, you can try it out using a master build of Kubernetes. I'm going to close this issue for now.
Hello,
It seems I've hit an issue with pods defined as manifests. The very first time (at least) a pod is started from a manifest, networking setup seems to stop before applying a profile. The container is not reachable from another host.
Some details on my setup:
calico-kubernetes plugin version: 0.2.0
calicoctl version: 0.7.0
Pool configured: 172.17.0.0/16 (IPIP, with the workaround from calico-docker issue #426)
Node IP is 192.168.200.6
Master IP is 192.168.200.2
After a "fresh install", here are the logs from the plugin: https://gist.github.com/slaws/995ae34856b6f8d8ddf0
Then a reboot : https://gist.github.com/slaws/bbb67679ba978c6000e3
And logs after a docker kill on the pod : https://gist.github.com/slaws/7bef22b5ae712c5fa756
After the docker kill, I can reach containers from another host.
I'll attach logs from the kubelet (with --v=5) ASAP.
After talking with