The product_uuid and the hostname should be unique across nodes #31

Closed
mikedanese opened this Issue Nov 22, 2016 · 12 comments

@mikedanese
Member

mikedanese commented Nov 22, 2016

From @vganapathy1 on October 26, 2016 6:50

Is this a BUG REPORT or FEATURE REQUEST? (choose one):
BUG REPORT

Kubernetes version (use kubectl version):

Client Version: version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.4", GitCommit:"3b417cc4ccd1b8f38ff9ec96bb50a81ca0ea9d56", GitTreeState:"clean", BuildDate:"2016-10-21T02:48:38Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.1", GitCommit:"33cf7b9acbb2cb7c9c72a10d6636321fb180b159", GitTreeState:"clean", BuildDate:"2016-10-10T18:13:36Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}
`root@Ubuntu14041-mars08:/# kubeadm version
kubeadm version: version.Info{Major:"1", Minor:"5+", GitVersion:"v1.5.0-alpha.1.409+714f816a349e79", GitCommit:"714f816a349e7978bc93b35c67ce7b9851e53a6f", GitTreeState:"clean", BuildDate:"2016-10-17T13:01:29Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}
`

Environment:
VMware vCloud Air VM

  • Cloud provider or hardware configuration:
    7.797 GiB RAM, 2 CPUs each
  • OS (e.g. from /etc/os-release):
 NAME="Ubuntu"
 VERSION="16.04.1 LTS (Xenial Xerus)"
  • Kernel (e.g. uname -a):
    Kernel Version: 4.4.0-43-generic
  • Install tools:
  • Others:

What happened:
We used kubeadm and the procedure in Installing Kubernetes on Linux with kubeadm and for the most part the installation went well.

The weave-kube installation failed with a peer name collision:

INFO: 2016/10/26 05:24:32.585405 ->[10.63.33.46:6783] attempting connection
INFO: 2016/10/26 05:24:32.587778 ->[10.63.33.46:6783|72:47:96:69:16:bb(Ubuntu14041-mars09)]: connection shutting down due to error: local "72:47:96:69:16:bb(Ubuntu14041-mars09)" and remote "72:47:96:69:16:bb(Ubuntu14041-mars08)" peer names collision

What you expected to happen:
The weave-kube installation should have succeeded and brought kube-dns up.

How to reproduce it (as minimally and precisely as possible):

On master node:
kubeadm init --api-advertise-addresses=$IP

On Node:
kubeadm join --token $actualtoken $IP

Installed weave-kube as below:
`# kubectl apply -f https://git.io/weave-kube

daemonset "weave-net" created`

kube-dns did not start as expected.

Both the master and the node get assigned the same HWaddr, causing the name collision:
` on Master
docker logs a65253346635
INFO: 2016/10/26 05:24:20.719919 Command line options: map[ipalloc-range:10.32.0.0/12 nickname:Ubuntu14041-mars08 no-dns:true docker-api: http-addr:127.0.0.1:6784 ipalloc-init:consensus=2 datapath:datapath name:72:47:96:69:16:bb port:6783]
INFO: 2016/10/26 05:24:20.730839 Communication between peers is unencrypted.
INFO: 2016/10/26 05:24:20.971010 Our name is 72:47:96:69:16:bb(Ubuntu14041-mars08)

On Node,
docker logs a65253346635
INFO: 2016/10/26 05:23:39.312294 Command line options: map[datapath:datapath ipalloc-range:10.32.0.0/12 name:72:47:96:69:16:bb port:6783 docker-api: http-addr:127.0.0.1:6784 ipalloc-init:consensus=2 nickname:Ubuntu14041-mars09 no-dns:true]
INFO: 2016/10/26 05:23:39.314095 Communication between peers is unencrypted.
INFO: 2016/10/26 05:23:39.323302 Our name is 72:47:96:69:16:bb(Ubuntu14041-mars09)
`

Curling kube-apiserver from a node:
root@Ubuntu14041-mars09:~# curl -k https://10.0.0.1
Unauthorized

nslookup on both master & node:
root@Ubuntu14041-mars08:/# curl -k https://10.0.0.1
Unauthorized
root@Ubuntu14041-mars08:/# nslookup kubernetes.default
Server: 10.30.48.37
Address: 10.30.48.37#53

** server can't find kubernetes.default: NXDOMAIN

Anything else we need to know:
iptables -S


`root@Ubuntu14041-mars08:/# iptables -S
-P INPUT ACCEPT
-P FORWARD ACCEPT
-P OUTPUT ACCEPT
-N DOCKER
-N DOCKER-ISOLATION
-N KUBE-FIREWALL
-N KUBE-SERVICES
-N WEAVE-NPC
-N WEAVE-NPC-DEFAULT
-N WEAVE-NPC-INGRESS
-A INPUT -j KUBE-FIREWALL
-A INPUT -d 172.17.0.1/32 -i docker0 -p tcp -m tcp --dport 6783 -j DROP
-A INPUT -d 172.17.0.1/32 -i docker0 -p udp -m udp --dport 6783 -j DROP
-A INPUT -d 172.17.0.1/32 -i docker0 -p udp -m udp --dport 6784 -j DROP
-A INPUT -i docker0 -p udp -m udp --dport 53 -j ACCEPT
-A INPUT -i docker0 -p tcp -m tcp --dport 53 -j ACCEPT
-A FORWARD -i docker0 -o weave -j DROP
-A FORWARD -j DOCKER-ISOLATION
-A FORWARD -o docker0 -j DOCKER
-A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT
-A FORWARD -o weave -j WEAVE-NPC
-A FORWARD -o weave -j LOG --log-prefix "WEAVE-NPC:"
-A FORWARD -o weave -j DROP
-A OUTPUT -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT -j KUBE-FIREWALL
-A DOCKER-ISOLATION -j RETURN
-A KUBE-FIREWALL -m comment --comment "kubernetes firewall for dropping marked packets" -m mark --mark 0x8000/0x8000 -j DROP
-A KUBE-SERVICES -d 10.0.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns has no endpoints" -m udp --dport 53 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 10.0.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp has no endpoints" -m tcp --dport 53 -j REJECT --reject-with icmp-port-unreachable
-A WEAVE-NPC -m state --state RELATED,ESTABLISHED -j ACCEPT
-A WEAVE-NPC -m state --state NEW -j WEAVE-NPC-DEFAULT
-A WEAVE-NPC -m state --state NEW -j WEAVE-NPC-INGRESS
-A WEAVE-NPC-DEFAULT -m set --match-set weave-k?Z;25^M}|1s7P3|H9i;*;MhG dst -j ACCEPT
-A WEAVE-NPC-DEFAULT -m set --match-set weave-iuZcey(5DeXbzgRFs8Szo]<@p dst -j ACCEPT
`

kube-proxy-amd logs had the following entries,



`I1026 06:39:50.083990       1 iptables.go:339] running iptables-restore [--noflush --counters]
I1026 06:39:50.093036       1 proxier.go:751] syncProxyRules took 58.207586ms
I1026 06:39:50.093063       1 proxier.go:523] OnEndpointsUpdate took 58.262934ms for 4 endpoints
I1026 06:39:50.970922       1 config.go:99] Calling handler.OnEndpointsUpdate()
I1026 06:39:50.974755       1 proxier.go:758] Syncing iptables rules
I1026 06:39:50.974769       1 iptables.go:362] running iptables -N [KUBE-SERVICES -t filter]
I1026 06:39:50.976635       1 healthcheck.go:86] LB service health check mutation request Service: default/kubernetes - 0 Endpoints []
I1026 06:39:50.978146       1 iptables.go:362] running iptables -N [KUBE-SERVICES -t nat]
I1026 06:39:50.980501       1 iptables.go:362] running iptables -C [OUTPUT -t filter -m comment --comment kubernetes service portals -j KUBE-SERVICES]
I1026 06:39:50.982778       1 iptables.go:362] running iptables -C [OUTPUT -t nat -m comment --comment kubernetes service portals -j KUBE-SERVICES]
I1026 06:39:50.984762       1 iptables.go:362] running iptables -C [PREROUTING -t nat -m comment --comment kubernetes service portals -j KUBE-SERVICES]
I1026 06:39:50.986536       1 iptables.go:362] running iptables -N [KUBE-POSTROUTING -t nat]
I1026 06:39:50.988244       1 iptables.go:362] running iptables -C [POSTROUTING -t nat -m comment --comment kubernetes postrouting rules -j KUBE-POSTROUTING]
I1026 06:39:50.990022       1 iptables.go:298] running iptables-save [-t filter]
I1026 06:39:50.992581       1 iptables.go:298] running iptables-save [-t nat]
I1026 06:39:50.995184       1 proxier.go:1244] Restoring iptables rules: *filter
:KUBE-SERVICES - [0:0]
-A KUBE-SERVICES -m comment --comment "kube-system/kube-dns:dns has no endpoints" -m udp -p udp -d 10.0.0.10/32 --dport 53 -j REJECT
-A KUBE-SERVICES -m comment --comment "kube-system/kube-dns:dns-tcp has no endpoints" -m tcp -p tcp -d 10.0.0.10/32 --dport 53 -j REJECT
COMMIT
`
@errordeveloper, please refer to the previous conversation:
https://github.com/kubernetes/kubernetes/issues/34884

Copied from original issue: kubernetes/kubernetes#35591

@mikedanese

Member

mikedanese commented Nov 22, 2016

From @luxas on October 26, 2016 16:49

cc @kubernetes/sig-cluster-lifecycle

@mikedanese

Member

mikedanese commented Nov 22, 2016

From @soualid on November 13, 2016 22:34

I got the following error, which may be related, in weave-kube when trying to set up a bare-metal 3-node cluster running Ubuntu 16.04 with kubeadm. The same HWAddr was assigned on the master and the worker node:

INFO: 2016/11/13 22:09:06.134487 ->[163.172.221.165:35727|ca:dd:16:be:df:42(sd-110872)]: connection shutting down due to error: local "ca:dd:16:be:df:42(sd-110872)" and remote "ca:dd:16:be:df:42(sd-100489)" peer names collision

Is there a way to force the renewal of the HWAddr?


Setup information:

kubeadm version

kubeadm version: version.Info{Major:"1", Minor:"5+", GitVersion:"v1.5.0-alpha.2.421+a6bea3d79b8bba", GitCommit:"a6bea3d79b8bbaa5e8b57482c9fff9265d402708", GitTreeState:"clean", BuildDate:"2016-11-03T06:54:50Z", GoVersion:"go1.7.1", Compiler:"gc", Platform:"linux/amd64"}

kubectl version

Client Version: version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.4", GitCommit:"3b417cc4ccd1b8f38ff9ec96bb50a81ca0ea9d56", GitTreeState:"clean", BuildDate:"2016-10-21T02:48:38Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.4", GitCommit:"3b417cc4ccd1b8f38ff9ec96bb50a81ca0ea9d56", GitTreeState:"clean", BuildDate:"2016-10-21T02:42:39Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}

/etc/os-release

NAME="Ubuntu"
VERSION="16.04.1 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.1 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial

uname -a

Linux sd-110872 4.4.0-47-generic #68-Ubuntu SMP Wed Oct 26 19:39:52 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

docker -v

Docker version 1.12.1, build 23cf638

@mikedanese

Member

mikedanese commented Nov 22, 2016

From @vganapathy1 on November 15, 2016 15:38

@soualid Surprisingly, the UUID was the same for all the VMs I had, which caused the name collision; changing the UUID resolved the issue.

To get the UUID:
cat /sys/class/dmi/id/product_uuid

I had to get the UUID changed on all the VMs to get this to work. If that doesn't work for you, you can check with @errordeveloper; he had a weave-kube patch which also worked for me.
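
To compare the value across machines, a minimal sketch along these lines may help. It assumes you have already collected one `node id` pair per line, for example by running the `cat` command above on each node. The function name `find_duplicate_ids` and the sample node names and UUIDs are illustrative only; they are not part of kubeadm or Weave.

```shell
#!/bin/sh
# Hypothetical helper: reads "node id" pairs on stdin (one per line) and
# prints every id that is reported by more than one node, followed by the
# nodes that share it.
find_duplicate_ids() {
  awk '
    { count[$2]++; nodes[$2] = nodes[$2] " " $1 }
    END {
      for (id in count)
        if (count[id] > 1) print id ":" nodes[id]
    }
  '
}

# Example with made-up values:
#   printf '%s\n' 'mars08 AAAA-1111' 'mars09 AAAA-1111' 'mars10 BBBB-2222' \
#     | find_duplicate_ids
# prints: AAAA-1111: mars08 mars09
```

An empty result means every collected id was unique. The same function works for MAC addresses if those are collected in the same two-column form.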

@mikedanese

Member

mikedanese commented Nov 22, 2016

From @soualid on November 15, 2016 21:7

@vganapathy1 thanks, I confirm that the machine UUIDs are identical on my machines. The box provider (online.net) must use a "clone" install system (similar to Symantec Ghost) that does not properly change this UUID between boxes. I will contact them about this issue, but it would be great to be able to work around it by overriding this value at runtime through a kubeadm parameter.

Thank you !

@luxas luxas changed the title from wave-cube DNS name collision and kube-dns dosn't start to The product_uuid and the hostname should be unique across nodes Nov 25, 2016

@davidcomeyne

davidcomeyne commented Jan 26, 2017

Is there a solution/workaround for this yet?

I read about changing the UUID, how exactly should I do that?

@rhuss

rhuss commented Apr 4, 2017

I ran into the very same issue with a Raspberry Pi cluster running on Hypriot; all my nodes get the same HW address assigned in Weave:

k logs weave-net-x1z25 --namespace=kube-system weave
INFO: 2017/04/04 06:16:06.910766 Command line options: map[http-addr:127.0.0.1:6784 ipalloc-init:consensus=4 nickname:n3 status-addr:0.0.0.0:6782 docker-api: conn-limit:30 datapath:datapath ipalloc-range:10.32.0.0/12 no-dns:true port:6783]
INFO: 2017/04/04 06:16:06.911597 Communication between peers is unencrypted.
INFO: 2017/04/04 06:16:07.062159 Our name is 8e:0e:19:5d:4e:5e(n3)
INFO: 2017/04/04 06:16:07.062426 Launch detected - using supplied peer list: [192.168.23.200 192.168.23.201 192.168.23.202 192.168.23.203]
INFO: 2017/04/04 06:16:07.062669 Checking for pre-existing addresses on weave bridge
INFO: 2017/04/04 06:16:07.072861 [allocator 8e:0e:19:5d:4e:5e] No valid persisted data
INFO: 2017/04/04 06:16:07.158094 [allocator 8e:0e:19:5d:4e:5e] Initialising via deferred consensus
INFO: 2017/04/04 06:16:07.159120 Sniffing traffic on datapath (via ODP)
INFO: 2017/04/04 06:16:07.163257 ->[192.168.23.202:6783] attempting connection
INFO: 2017/04/04 06:16:07.163770 ->[192.168.23.201:6783] attempting connection
INFO: 2017/04/04 06:16:07.164761 ->[192.168.23.203:6783] attempting connection
INFO: 2017/04/04 06:16:07.165369 ->[192.168.23.200:6783] attempting connection
INFO: 2017/04/04 06:16:07.165999 ->[192.168.23.203:48229] connection accepted
INFO: 2017/04/04 06:16:07.173375 ->[192.168.23.203:6783|8e:0e:19:5d:4e:5e(n3)]: connection shutting down due to error: cannot connect to ourself
INFO: 2017/04/04 06:16:07.174156 ->[192.168.23.203:48229|8e:0e:19:5d:4e:5e(n3)]: connection shutting down due to error: cannot connect to ourself
INFO: 2017/04/04 06:16:07.185573 ->[192.168.23.202:6783|8e:0e:19:5d:4e:5e(n3)]: connection shutting down due to error: local "8e:0e:19:5d:4e:5e(n3)" and remote "8e:0e:19:5d:4e:5e(n2)" peer names collision
INFO: 2017/04/04 06:16:07.189360 Listening for HTTP control messages on 127.0.0.1:6784
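
A collision like the one in the log above can be picked out of a longer log with a small filter. This is only a sketch based on the message format shown in this thread; the function name `find_peer_collisions` is made up, and the pattern may need adjusting if the log wording differs in other Weave versions.

```shell
#!/bin/sh
# Hypothetical helper: reads Weave log lines on stdin and, for each
# "peer names collision" error, prints the local peer name followed by
# the local and remote nicknames.
find_peer_collisions() {
  sed -n 's/.*local "\([0-9a-f:]*\)(\([^)]*\))" and remote "\([0-9a-f:]*\)(\([^)]*\))" peer names collision.*/\1 \2 \4/p'
}

# Possible usage, reusing the pod name from the log above:
#   kubectl logs weave-net-x1z25 --namespace=kube-system -c weave | find_peer_collisions
```

For the collision line in the log above, this would print `8e:0e:19:5d:4e:5e n3 n2`, which shows at a glance which two nodes share a peer name.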
@jamiehannaford

Member

jamiehannaford commented Jun 16, 2017

I think all we can do from a kubeadm perspective is document unique product UUIDs as a potential requirement. I imagine there are so many different ways of resolving this per OS that we can't really suggest a specific one.

@luxas

Member

luxas commented Jun 17, 2017

Agreed @jamiehannaford, we should just list this as a requirement for everything running smoothly.
Kubernetes and things running on top might require/assume that
a) The product_uuid is unique
b) The MAC address is unique
for every node.

@jamiehannaford Can you document that please?
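
The two per-node values listed above can be read on a Linux node roughly as follows. This is a sketch, not an official check: `list_macs` is a made-up helper name, and the optional argument (an alternative sysfs root) exists only so the function can be exercised without touching the real `/sys`.

```shell
#!/bin/sh
# Hypothetical helper: prints "interface mac" for every network interface
# found under the given sysfs root (defaults to /sys), skipping loopback.
list_macs() {
  root="${1:-/sys}"
  for dev in "$root"/class/net/*; do
    [ -r "$dev/address" ] || continue
    name=$(basename "$dev")
    if [ "$name" = "lo" ]; then
      continue
    fi
    printf '%s %s\n' "$name" "$(cat "$dev/address")"
  done
}

# On a real node (reading product_uuid usually needs root):
#   sudo cat /sys/class/dmi/id/product_uuid
#   list_macs
```

Running both commands on every node and comparing the results by hand is enough for a small cluster; `ip link` or `ifconfig -a` show the same MAC addresses.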

@jamiehannaford

Member

jamiehannaford commented Jun 17, 2017

@luxas Sure, I'll get to it next week

@luxas

Member

luxas commented Jun 17, 2017

Perfect, thank you!

@jamiehannaford

Member

jamiehannaford commented Jun 26, 2017

This is now documented, so we can close 🎉

@luxas

Member

luxas commented Jun 26, 2017

Yayy 🎉

@luxas luxas closed this Jun 26, 2017

steveperry-53 added a commit to kubernetes/website that referenced this issue Oct 13, 2017

adding explanation for product_uuid uniqueness (#5885)
added steps for how to verify macaddress and product_uuid with
reference to the bug in github that identified this need originally
(kubernetes/kubeadm#31)