Weave randomly picks ipv6 and everything breaks #45858

hollowimage · 2017-05-15T23:07:57Z

Is this a request for help? (If yes, you should use our troubleshooting guide and community support channels, see http://kubernetes.io/docs/troubleshooting/.): no

What keywords did you search in Kubernetes issues before filing this one? (If you have found any duplicates, you should instead reply there.):

Is this a BUG REPORT or FEATURE REQUEST? (choose one): bug report

Kubernetes version (use kubectl version): 1.6.2 and 1.6.3

Environment:

Cloud provider or hardware configuration: AWS
OS (e.g. from /etc/os-release): debian jessie
Kernel (e.g. uname -a): n/a
Install tools: kops
Others: weave for CNI

What happened:
This morning, after our cluster scaled its nodes back up, kube-dns would fail to start. After endless troubleshooting the issue went away on its own, but I did notice one thing. I believe this is related to the weave pods ending up with only an ipv6 address on the weave interface inside the pod, or really the lack of an ipv4 address:

weave     Link encap:Ethernet  HWaddr ca:aa:32:95:7d:4b
          inet6 addr: fe80::c8aa:32ff:fe95:7d4b/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1376  Metric:1
          RX packets:7 errors:0 dropped:0 overruns:0 frame:0
          TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:460 (460.0 B)  TX bytes:648 (648.0 B)

and in kubectl get pods, kube-dns would show up with the IP of the node instead of an address from the in-cluster 10.x range.

after it "fixed itself" the weave interface looked like:

weave     Link encap:Ethernet  HWaddr 76:e6:c0:5a:2e:c5
          inet addr:10.36.0.0  Bcast:0.0.0.0  Mask:255.240.0.0
          inet6 addr: fe80::74e6:c0ff:fe5a:2ec5/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1376  Metric:1
          RX packets:775 errors:0 dropped:0 overruns:0 frame:0
          TX packets:9 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:22188 (21.6 KiB)  TX bytes:690 (690.0 B)

What you expected to happen: not using ipv6, or adding a check to the CNI networking? not sure how best to handle it.
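One way to picture the kind of check mentioned above is a small script that looks for an IPv4 address in the interface output. This is only a sketch: the function name `ipv4_of` is hypothetical, it assumes the net-tools `ifconfig` format pasted in this issue (`inet addr:A.B.C.D`), and it is not something Weave or the CNI plugin is known to do.

```python
import re
from typing import Optional

# net-tools style ifconfig prints IPv4 as "inet addr:A.B.C.D";
# "inet6 addr:" lines do not match because of the "6" after "inet".
_INET_RE = re.compile(r"^\s*inet addr:(\d{1,3}(?:\.\d{1,3}){3})\b", re.MULTILINE)


def ipv4_of(ifconfig_block: str) -> Optional[str]:
    """Return the IPv4 address in one ifconfig interface block, or None."""
    match = _INET_RE.search(ifconfig_block)
    return match.group(1) if match else None


# The two states reported in this issue (trimmed to the relevant lines).
BROKEN = """weave     Link encap:Ethernet  HWaddr ca:aa:32:95:7d:4b
          inet6 addr: fe80::c8aa:32ff:fe95:7d4b/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1376  Metric:1"""

HEALTHY = """weave     Link encap:Ethernet  HWaddr 76:e6:c0:5a:2e:c5
          inet addr:10.36.0.0  Bcast:0.0.0.0  Mask:255.240.0.0
          inet6 addr: fe80::74e6:c0ff:fe5a:2ec5/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1376  Metric:1"""

if __name__ == "__main__":
    for name, block in (("broken", BROKEN), ("healthy", HEALTHY)):
        print(name, ipv4_of(block))
```

A node where this returns None for the weave interface would be in the first state shown above, with no IPv4 address assigned yet.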

How to reproduce it (as minimally and precisely as possible):
n/a -- i do not know. it may be a race condition? but it went away as mysteriously as it happened, and i was not able to change anything over the course of the whole day.

Anything else we need to know:
during the course of the day, the cluster was "reset" (i.e. i terminated all instances) 10+ times.

the issue manifests by kube-dns not being able to start, and nodes bouncing up/down due to the PLEG events throwing a negative.

cmluciano · 2017-05-16T15:18:08Z

@hollowimage Is there a bug open on the Weave repository as well?

cc @bboreham

cmluciano · 2017-05-16T15:18:28Z

/area network

k8s-ci-robot · 2017-05-16T15:18:33Z

@cmluciano: These labels do not exist in this repository: area/network.

In response to this:

/area network

cmluciano · 2017-05-16T15:19:27Z

/label sig/network

hollowimage · 2017-05-16T15:26:05Z

@cmluciano i don't know, but i pinged the guys in their slack with this issue. honestly i just wanted to make a record of this somewhere at the time before it got lost.

cmluciano · 2017-05-16T16:38:37Z

I think we should open the issue on the Weave repository unless this affects more than one CNI plugin.

bboreham · 2017-05-16T17:00:10Z

Agreed you should open this issue on the Weave repo.

Looking at the symptoms, that ipv6 address is just a link-local one automagically generated; the issue is that the Weave Net startup has not yet assigned an IPv4 address.

The reason for that may be in the (Docker) logs of the weave container.
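To illustrate why that ipv6 address carries no information about Weave's address allocation, here is a small sketch that derives a link-local address from a MAC using the modified EUI-64 scheme. It assumes the kernel is generating EUI-64-based interface identifiers (other address generation modes exist), and the helper name `eui64_link_local` is made up for this example. Both HWaddr values from the ifconfig output in this issue reproduce the reported fe80:: addresses.

```python
import ipaddress


def eui64_link_local(mac: str) -> ipaddress.IPv6Address:
    """Derive the fe80::/64 link-local address from a MAC (modified EUI-64)."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError(f"expected a 6-octet MAC address, got {mac!r}")
    octets[0] ^= 0x02  # flip the universal/local bit of the first octet
    # Insert ff:fe in the middle of the MAC to form the 64-bit interface ID.
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    prefix = bytes.fromhex("fe80") + bytes(6)  # fe80::/64 prefix, 8 bytes
    return ipaddress.IPv6Address(prefix + interface_id)


if __name__ == "__main__":
    # HWaddr values taken from the two ifconfig outputs in this issue.
    for mac in ("ca:aa:32:95:7d:4b", "76:e6:c0:5a:2e:c5"):
        addr = eui64_link_local(mac)
        print(mac, "->", addr, "link-local:", addr.is_link_local)
```

Since the address follows from the MAC alone, it appears as soon as the interface is up, whether or not an IPv4 address has been assigned.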

cmluciano · 2017-05-16T17:52:21Z

Thanks @bboreham . @hollowimage please open this issue on the Weave repository

cmluciano · 2017-05-16T17:52:28Z

/assign

cmluciano · 2017-05-16T17:52:35Z

/close

cmluciano added the sig/network Categorizes an issue or PR as relevant to SIG Network. label May 16, 2017

k8s-ci-robot assigned cmluciano May 16, 2017

k8s-ci-robot closed this as completed May 16, 2017

hollowimage mentioned this issue May 16, 2017

Weave fail to assign IPv4 / Remove ephemeral peers from Weave Net via AWS ASG lifecycle hook weaveworks/weave#2970

Closed
