Compatibility with Calico / Canal CNI networking #442
Comments
I hit this issue too, running on a k8s 1.3.4 cluster in AWS w/ calico networking and workflow v2.3.0. I couldn't deploy applications via a I don't have time to further debug this at the moment but feel free to ping me if you need a test subject when trying to resolve this issue. For reference I've included the errors I saw in this scenario. From my local system:
From the deis controller:
|
So it looks like calico networking disallows the host from communicating with the pod IPs. That is an interesting assumption that we were not aware of. We assumed that service IPs are available across the entire cluster, both for pods and for hosts. I'm not sure what the resolution would be here. Perhaps there's some way to allow worker nodes to communicate with service IPs with calico networking enabled? |
@bacongobbler Have you set up a k8s cluster with canal networking to check what's going on? (You can do that with kube-aws) |
I have not. However, I recall a little birdy inside the Deis org or in #community in Slack reaching out, and we debugged their networking issues. As it turned out, they were due to host -> pod networking "failing" with Calico/canal networking. |
I am seeing the same behavior on a kube-aws / k8s 1.4.1 cluster with workflow v2.7.0 and flannel. Deploying buildpack apps works, but
|
@felixbuenemann @jdumars and I were actually talking about this in PM on slack. Can you try to jump onto one of the worker nodes and determine if there is something listening on port 5555 on the host? In our debugging we found that for some reason kubernetes did not allocate port 5555 on the host for the registry-proxy pod, which would explain why the connection is being refused. |
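For anyone who wants to script that check rather than poke at the node by hand, here is a minimal sketch in Python. It assumes it is run directly on the worker node and only tests whether a TCP connection to port 5555 succeeds. The helper name `port_is_listening` and the `127.0.0.1` address are illustrative assumptions, not part of Workflow or Kubernetes.

```python
import socket


def port_is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds.

    Hypothetical helper for illustration; a refused or timed-out
    connection is reported as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # 5555 is the host port discussed above for the registry-proxy pod.
    # 127.0.0.1 assumes the script runs on the worker node itself.
    print(port_is_listening("127.0.0.1", 5555))
```

A `False` result here would be consistent with the "connection refused" symptom, but it does not by itself show why the host port was not allocated.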
Also note that this seems to be CoreOS-specific. We're not seeing this issue on other providers using kube-up (Vagrant, AWS and GKE) or minikube, which are the 4 we are using according to our release test matrix. Vagrant uses Fedora, AWS uses Ubuntu, GKE uses Debian and Minikube uses a custom ISO. |
@bacongobbler I ran |
I think I was the little birdy @bacongobbler mentioned. My investigations led me to this upstream issue that appeared to be relevant: projectcalico/k8s-exec-plugin#52 (comment). I planned on investigating the issue in more depth but haven't had a chance to mess with calico/canal networking again. |
I think this could be related to kubernetes/kubernetes#31307, where there are issues with host ports when using the CNI plugin. |
Another related issue upstream would be kubernetes/kubernetes#23920, so it does seem like host ports on some providers (like CoreOS) are not working. |
Since we seem to have found the root issue, I'm going to close this as a duplicate of deis/registry#64, where I've posted a patch that users can try to bypass the issue. Thank you everyone for all your help! |
I tried to switch my coreos-kubernetes 1.3.4 / deis workflow 2.3.0 based cluster to use calico networking today.
This worked fine for the whole cluster, except for Dockerfile builds, which had problems with network connectivity.
My guess would be that the docker-in-docker does not use the proper cni network plugin.