Failing to bring up pods #3

Open
ktsakalozos opened this Issue Jul 25, 2017 · 2 comments


ktsakalozos commented Jul 25, 2017

Was deploying the bundle http://paste.ubuntu.com/25168507/ on AWS.

Some pods do not start. Here is the output of kubectl get po --all-namespaces:

 NAMESPACE     NAME                                    READY     STATUS             RESTARTS   AGE
default       default-http-backend-44ppj              1/1       Running            0          28m
default       nginx-ingress-controller-2165r          0/1       CrashLoopBackOff   18         28m
kube-system   heapster-v1.2.0.1-3929440214-25gsb      4/4       Running            0          26m
kube-system   kube-dns-4101612645-bcq53               2/4       CrashLoopBackOff   15         30m
kube-system   kubernetes-dashboard-3543765157-lck0f   0/1       CrashLoopBackOff   13         30m
kube-system   monitoring-influxdb-grafana-v4-13h6g    2/2       Running            0          30m

Here is the kubernetes-worker syslog: http://paste.ubuntu.com/25168517/
And here is the master's syslog: http://paste.ubuntu.com/25168528/

Here is the output of kubectl logs po/kube-dns-4101612645-bcq53 -n kube-system kubedns http://paste.ubuntu.com/25168569/
And here is the description of kubedns http://paste.ubuntu.com/25168572/

I see some errors in the ovn-k8s-watcher.log: http://paste.ubuntu.com/25168607/

AakashKT commented Jul 28, 2017

@ktsakalozos
I talked about this with Gurucharan Shetty @shettyg, the author of ovn-kubernetes (https://github.com/shettyg/ovn-kubernetes).
OVN does not currently support hostNetworking. What I mean by that is that it has no facility to make a pod reachable at the host's IP. This is why the liveness probe fails with connection refused, the pod restarts, the probe fails again, and so on in an infinite loop.
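To make the restart loop concrete, here is the general shape of a Kubernetes liveness probe like the ones failing here. This is illustrative only: the field names are standard Kubernetes, but the path and port values are assumptions, not taken from the actual manifests in this deployment.

```yaml
# Sketch of a liveness probe (path/port are assumed values).
livenessProbe:
  httpGet:
    path: /healthz
    port: 10254        # assumed port, for illustration
  initialDelaySeconds: 10
  timeoutSeconds: 5
```

When the kubelet's HTTP GET to the pod's address is refused (as happens when OVN cannot make a hostNetwork pod reachable at the host's IP), the probe fails, the container is restarted, the probe fails again, and kubectl reports the pod as CrashLoopBackOff.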

We also noticed that the failing pods have "hostNetwork: true" set ('kubectl get pod nginx-ingress-controller-xxxx -o json' will show this), which makes them use the host's IP instead of the IP assigned to them by OVN. If we could somehow set that to false, we could test whether it actually works (most likely it will).
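As a quick sketch of checking and flipping that flag: the snippet below works on a tiny stand-in manifest written to /tmp/pod.json (a made-up minimal example, not the real nginx-ingress manifest); on a live cluster you would fetch the real one with kubectl instead.

```shell
# Hypothetical minimal pod manifest for illustration only.
cat > /tmp/pod.json <<'EOF'
{
  "spec": {
    "hostNetwork": true,
    "containers": []
  }
}
EOF

# On a live cluster you would fetch the real manifest instead:
#   kubectl get pod nginx-ingress-controller-xxxx -o json > /tmp/pod.json

# Check whether the pod requested host networking:
grep '"hostNetwork"' /tmp/pod.json

# Flip it to false for testing (a sketch; on a real deployment you would
# patch the template the pod is created from and re-deploy it):
sed -i 's/"hostNetwork": true/"hostNetwork": false/' /tmp/pod.json
grep '"hostNetwork"' /tmp/pod.json
```

Note that editing a running pod's manifest this way is only for inspection; the change has to be made in the source template so that re-created pods pick it up.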

To make it work as it is, some changes have to go into OVN, and that will most likely happen next week.

Thank you @AakashKT and @shettyg for your input on this. It is great to see that we will have hostNetwork support within the next couple of weeks.

@AakashKT here is how you can test deploying services without hostNetwork. The nginx-ingress controller yaml is part of the kubernetes-worker charm; you can find it under the templates folder. Patch the ingress yaml and rebuild the charm.
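For reference, the patch would look roughly like the fragment below. The exact filename and surrounding structure inside the charm's templates folder are assumptions; check the templates folder for the actual file.

```yaml
# Sketch of the relevant part of the nginx-ingress controller template
# (file name and layout assumed; see the kubernetes-worker charm's
# templates folder for the real one).
spec:
  template:
    spec:
      hostNetwork: false   # was true; let OVN assign the pod IP instead
```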

For the rest of the cycling services you can provide your own cdk-addons Juju resource. You will need to clone https://github.com/juju-solutions/cdk-addons.git. Look at the README to see how to build the addons package, then go through the Makefile to see the steps it follows; the get-addon-templates target grabs the yaml files of all the services, and you may need to patch those yaml files before continuing with the rest of the build steps in the Makefile. In the end you should get a cdk-addons_1.X.Y_amd64.snap file. Attach this file as a resource to the kubernetes-master charm you are releasing, e.g.:

charm attach cs:~kos.tsakalozos/kubernetes-master-2  cdk-addons=cdk-addons_1.X.Y_amd64.snap
charm release cs:~kos.tsakalozos/kubernetes-master-2 -r cdk-addons-20