NetworkUnavailable NoRouteCreated on GCE with CNI network plugin #44254

Open
davidjbrady opened this Issue Apr 10, 2017 · 7 comments

@davidjbrady

davidjbrady commented Apr 10, 2017

Is this a request for help? Yes

What keywords did you search in Kubernetes issues before filing this one?
cloud, provider, gce, network, unavailable, cni, route, routecontroller


Is this a BUG REPORT or FEATURE REQUEST?
Bug Report

Kubernetes version: v1.6.1

Environment:

  • Cloud provider or hardware configuration: GCE
  • OS (e.g. from /etc/os-release): Container Linux by CoreOS beta (1353.4.0)
  • Kernel 4.10.4-coreos-r1
  • Install tools: N/A: Static Binaries launched via Systemd unit files
  • Others:

What happened:
If the following flags are set on the Kubernetes components, pods cannot be scheduled. The scheduler reports that no nodes are available, and every node reports the condition 'NetworkUnavailable True Mon, 01 Jan 0001 00:00:00 +0000 NoRouteCreated RouteController failed to create a route'.

Flags

  • kube-apiserver, kube-controller-manager, kubelet components started with the --cloud-provider=gce flag
  • kube-controller-manager started with the --cloud-provider=gce --allocate-node-cidrs=false and --configure-cloud-routes=false
  • kubelet component configured with the --cloud-provider=gce --network-plugin=cni
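
For reference, the failing condition can be inspected on any node with something like the following (the node name is illustrative, not from the original report):

kubectl describe node <node-name>
# The Conditions table shows:
#   NetworkUnavailable  True  ...  NoRouteCreated  RouteController failed to create a route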

What you expected to happen:
I expected the kube-controller-manager RouteController not to attempt to create routes, delegating that responsibility to the CNI network plugin instead, and for the nodes to remain available for scheduling.

How to reproduce it:
/opt/bin/kube-apiserver \
  --bind-address=0.0.0.0 \
  --etcd-servers=http://127.0.0.1:2379 \
  --allow-privileged=true \
  --service-cluster-ip-range=172.16.0.0/24 \
  --secure-port=443 \
  --advertise-address=${COREOS_PRIVATE_IPV4} \
  --authorization-mode=RBAC \
  --admission-control=NamespaceLifecycle,NamespaceExists,LimitRanger,SecurityContextDeny,ServiceAccount,DefaultStorageClass,ResourceQuota \
  --tls-cert-file=/etc/ssl/certs/server.pem \
  --tls-private-key-file=/etc/ssl/certs/server-key.pem \
  --client-ca-file=/etc/ssl/certs/ca.pem \
  --service-account-key-file=/etc/ssl/certs/server-key.pem \
  --storage-backend=etcd3 \
  --runtime-config=api/all=true \
  --anonymous-auth=false \
  --cloud-provider=gce

/opt/bin/kube-controller-manager \
  --master=http://127.0.0.1:8080 \
  --address=127.0.0.1 \
  --kubeconfig=/etc/kubernetes/kubeconfig \
  --cluster-name=cluster.local \
  --service-cluster-ip-range=172.16.0.0/24 \
  --service-account-private-key-file=/etc/ssl/certs/server-key.pem \
  --root-ca-file=/etc/ssl/certs/ca.pem \
  --v=2 \
  --cloud-provider=gce \
  --allocate-node-cidrs=false \
  --configure-cloud-routes=false

/opt/bin/kube-scheduler \
  --master=http://127.0.0.1:8080 \
  --address=127.0.0.1 \
  --kubeconfig=/etc/kubernetes/kubeconfig \
  --v=2

/opt/bin/kubelet \
  --kubeconfig=/etc/kubernetes/kubeconfig \
  --require-kubeconfig=true \
  --allow-privileged=true \
  --register-node=true \
  --cni-bin-dir=/opt/cni/bin \
  --cni-conf-dir=/etc/kubernetes/cni/net.d \
  --network-plugin=cni \
  --cluster-dns=172.16.0.10 \
  --cluster-domain=cluster.local \
  --client-ca-file=/etc/ssl/certs/ca.pem \
  --cloud-provider=gce
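
The CNI configuration itself is not included in the report. As a point of reference only, a minimal config of the kind the kubelet loads from --cni-conf-dir could look like this; it uses the standard bridge and host-local plugins and an illustrative subnet, not whatever plugin was actually deployed:

# Purely illustrative CNI config dropped into the directory named above.
cat <<'EOF' > /etc/kubernetes/cni/net.d/10-mynet.conf
{
  "cniVersion": "0.3.0",
  "name": "mynet",
  "type": "bridge",
  "bridge": "cni0",
  "isGateway": true,
  "ipMasq": true,
  "ipam": {
    "type": "host-local",
    "subnet": "10.244.0.0/16"
  }
}
EOF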

Anything else we need to know:
Removing --cloud-provider=gce from all components resolves the issue, but it also makes other cloud-provider features, such as the kubernetes.io/gce-pd storage class provisioner, unavailable.
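
For context, this is roughly the kind of StorageClass that stops working without the cloud provider enabled; the name and parameters are illustrative, not taken from the cluster above:

# Illustrative StorageClass: the in-tree kubernetes.io/gce-pd provisioner
# relies on the GCE cloud provider being configured on the control plane.
cat <<'EOF' | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-standard
EOF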

@chrislovecnm

Member

chrislovecnm commented May 3, 2017

I believe this is a duplicate... #33573 would possibly fix this.

kubernetes/kops#2087 may have a workaround

@davidopp

Member

davidopp commented May 20, 2017

@burdiyan

burdiyan commented Jun 2, 2017

This is a huge blocker for us right now. I'm building a Kubernetes cluster on GCE with CoreOS and Weave Net as the CNI plugin, and there is no way to start the kubelet with --cloud-provider=gce, because pods cannot be scheduled that way.

@glerchundi

glerchundi commented Jun 16, 2017

Same problem here with v1.6.4.

What is the recommended way forward? Is there any workaround?

@jryberg

jryberg commented Dec 22, 2017

Hi,

Was this ever fixed?

I have manually built a Kubernetes cluster on top of Google Cloud using this method: https://github.com/kelseyhightower/kubernetes-the-hard-way

Google Cloud is just used as "bare metal", and it was working perfectly fine until I wanted to use gcePersistentDisk and enable --cloud-provider=gce.

I'm no longer able to create new pods, since I'm hitting the exact same issue, only on v1.8.0 instead.

I found a workaround here: #56794, and I was able to trick the system temporarily using the suggested method.

# Work-around from #56794: rewrite each node's NetworkUnavailable condition
# through the API server's insecure local port (127.0.0.1:8080 here).
for i in `kubectl get nodes -o jsonpath='{.items[*].metadata.name}'`; do
    curl http://localhost:8080/api/v1/nodes/$i/status > a.json
    # Flip the condition to NetworkUnavailable=False / RouteCreated.
    cat a.json | tr -d '\n' | sed 's/{[^}]\+NetworkUnavailable[^}]\+}/{"type": "NetworkUnavailable","status": "False","reason": "RouteCreated","message": "Manually set through k8s api"}/g' > b.json
    # PUT the edited status back.
    curl -X PUT http://localhost:8080/api/v1/nodes/$i/status -H "Content-Type: application/json" -d @b.json
done
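
If the PUT calls succeed, a quick check like this (assuming kubectl points at the same cluster) should show False for every node:

kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="NetworkUnavailable")].status}{"\n"}{end}'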

Is there actually no way of running Kubernetes on top of Google Cloud as "bare metal"?

Best regards, Johan Ryberg

@fejta-bot

fejta-bot commented Mar 22, 2018

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@chrislovecnm

Member

chrislovecnm commented Mar 25, 2018

/lifecycle frozen
/remove-lifecycle stale
