
Remove deleted k8s nodes from Weave Net #2797

Closed
bboreham opened this issue Feb 14, 2017 · 38 comments · Fixed by #3149

@bboreham
Contributor

weave-kube adds all current nodes as peers at startup, but never checks back to see if some nodes have been deleted.

In a situation such as a regularly expanding and contracting auto-scale group, the IPAM ring will eventually become clogged with peers that have gone away.

We need to run weave rmpeer on deleted nodes and, less importantly, weave forget. We will need some interlock to ensure that weave rmpeer is run only once for each deleted node.

How do we even detect that a Weave Net peer originated as a Kubernetes node? This logic should be resilient to users adding non-Kubernetes peers to the network, even on a host that was previously a Kubernetes peer.
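
For context, a minimal sketch of the manual cleanup being discussed, run once a Kubernetes node is gone for good (the node name here is hypothetical, and it assumes rmpeer accepts the peer's nickname, as the HTTP API calls later in this thread do):

# Reclaim the departed peer's IPAM space on this host (run on ONE host only).
weave rmpeer ip-10-83-92-195

# Optional: stop trying to reconnect to it, to quiet the logs.
weave forget ip-10-83-92-195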

@brb
Contributor

brb commented Feb 17, 2017

@pmcq suggests (in #2807) using a preStop hook for the weave-kube Pod. The hook would run some of the steps from weave reset (most importantly DELETE /peers).

However, it seems the hook would run on every termination, even when k8s is just restarting the Pod. That would disconnect all containers running on the node from the cluster. It would also open a window for IPAM races.

In addition, the hook won't run if a machine is stopped non-gracefully.


One very complicated solution is to subscribe to the k8s API server and run Paxos to elect a leader, which would then run rmpeer.

@bboreham
Contributor Author

We don't need to run Paxos ourselves; Kubernetes is built on a fully consistent store, which we can use for arbitrary purposes via annotations. Example code (that example uses an annotation on a Service; I guess we could use one on the DaemonSet).
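
A minimal sketch of such an interlock, assuming a hypothetical annotation key weave.works/rmpeer-lock on the weave-net DaemonSet and a hypothetical stale peer name; kubectl annotate's --resource-version flag makes the write fail if anyone else has updated the object since it was read, so only one node proceeds:

# Read the current resourceVersion of the DaemonSet.
RV=$(kubectl -n kube-system get ds weave-net -o jsonpath='{.metadata.resourceVersion}')

# Try to take the lock; this fails if anyone else modified the object
# (and therefore bumped resourceVersion) after we read it.
if kubectl -n kube-system annotate ds weave-net \
     --resource-version="$RV" --overwrite \
     weave.works/rmpeer-lock="$(hostname)"; then
  # We won the race: reclaim the dead peer's space exactly once.
  curl -X DELETE http://localhost:6784/peer/ip-10-83-92-195
fi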

@mikebryant
Collaborator

What happens if you run rmpeer for the same node more than once? Does that actually break something, or does it just mean one of the rmpeer calls fails harmlessly?

@bboreham
Contributor Author

bboreham commented Feb 23, 2017

weave rmpeer foo means "remove foo from the cluster and claim all of its IP address space on this node", so if you run it on two nodes close enough in time they will both claim foo's space and then the data structure is inconsistent. Hence, for an automated system, there has to be an interlock.

@rade
Member

rade commented Feb 23, 2017

This is covered in the FAQ:

You cannot call weave rmpeer on more than one host. The address space that was owned by the stale peer cannot be left dangling, and as a result it gets reassigned; in this instance, it is reassigned to the peer on which weave rmpeer was run. Therefore, if you run weave forget and then weave rmpeer on more than one host at a time, you end up with duplicate IPs on more than one host.

@mikebryant
Collaborator

Oh, I see, thanks. Apologies, I didn't see that.

@mikebryant
Collaborator

mikebryant commented Feb 23, 2017

If anyone else is being hit by this on a prod cluster and wants an interim fix, this is what we hacked together today: https://gist.github.com/mikebryant/f5b25f9b14e5d6275ff0d3e934f73f12

It assumes that all of your weave peers are Kubernetes nodes, and that none of your node names are strict prefixes of other node names.

(Leader election by cloud-provider volume mounting)

@shamil

shamil commented Jun 21, 2017

Is this something that is planned? It has become unmanageable. Every time I remove a node from the cluster I need to run rmpeer once and forget on each node (for each removed node).

Writing a script that does it is fine, but it is very hacky, especially considering that most other Kubernetes components handle such duties by themselves.

@bboreham
Contributor Author

Implementation is under way at #3022, but see the comments for some ugly complications.

Could you say why you call forget? It shouldn't be necessary, except to avoid some noise in the logs.

@itskingori

This bit us hard today. Our K8s masters all struggled to come up, making our cluster unstable. One of the etcd peers was timing out trying to access another peer at an incorrect IP ... which meant all the masters were screwed.

Guided by #2797 (comment), we got into a weave pod and ran this:

#!/bin/bash

set -eu

# install kubectl
kubectl_version="1.6.7"
curl -o /usr/local/bin/kubectl "https://storage.googleapis.com/kubernetes-release/release/v${kubectl_version}/bin/linux/amd64/kubectl"
chmod +x /usr/local/bin/kubectl

# get list of nicknames from weave
curl -H "Accept: application/json" http://localhost:6784/report | jq -r '.IPAM.Entries[].Nickname' | sort -u > /tmp/nicknames

# get list of available nodes from kubernetes
kubectl get node -o custom-columns=name:.metadata.name --no-headers | cut -d'.' -f1 | sort > /tmp/node-names

# diff: nicknames with no matching node, i.e. peers that have gone away
grep -F -x -v -f /tmp/node-names /tmp/nicknames > /tmp/nodes-unavailable

# rmpeer unavailable nodes
cat /tmp/nodes-unavailable | xargs -n 1 -I '{}' curl -H "Accept: application/json" -X DELETE 'http://localhost:6784/peer/{}'

This is what we had before:

/home/weave # ./weave --local status ipam
da:14:02:dd:7b:3c(ip-10-83-92-195)        8214 IPs (00.4% of total) (22 active)
6a:2c:f6:77:cf:91(ip-10-83-117-144)        128 IPs (00.0% of total) - unreachable!
a6:a6:ce:25:0c:58(ip-10-83-98-214)       16384 IPs (00.8% of total) - unreachable!
02:26:07:6b:3f:42(ip-10-83-103-200)      16384 IPs (00.8% of total) - unreachable!
f2:05:af:47:c5:92(ip-10-83-115-32)        8192 IPs (00.4% of total) - unreachable!
ba:36:17:bc:2e:0a(ip-10-83-50-12)         4096 IPs (00.2% of total) - unreachable!
2a:d8:15:8c:7d:33(ip-10-83-40-21)          128 IPs (00.0% of total) - unreachable!
be:06:02:f0:d1:96(ip-10-83-46-255)         384 IPs (00.0% of total) - unreachable!
fa:c2:ac:cf:17:df(ip-10-83-90-145)        2048 IPs (00.1% of total) - unreachable!
be:7e:d2:47:43:86(ip-10-83-92-216)          26 IPs (00.0% of total) - unreachable!
1e:c4:43:bd:59:e5(ip-10-83-89-20)        49152 IPs (02.3% of total) - unreachable!
92:07:d6:6d:75:83(ip-10-83-80-44)            3 IPs (00.0% of total) - unreachable!
2e:22:22:a1:7a:c1(ip-10-83-36-147)        8219 IPs (00.4% of total)
7e:82:5c:75:3f:3d(ip-10-83-114-142)        128 IPs (00.0% of total) - unreachable!
66:d7:7d:a9:8b:4a(ip-10-83-112-138)         32 IPs (00.0% of total) - unreachable!
1e:a5:00:85:4d:e8(ip-10-83-81-246)        4096 IPs (00.2% of total) - unreachable!
c2:7b:87:5f:15:e3(ip-10-83-67-231)       32768 IPs (01.6% of total) - unreachable!
f2:c6:e3:10:c4:05(ip-10-83-45-198)           5 IPs (00.0% of total) - unreachable!
2e:88:88:46:75:75(ip-10-83-127-236)         16 IPs (00.0% of total) - unreachable!
fe:eb:e5:a5:8e:95(ip-10-83-62-202)          16 IPs (00.0% of total)
16:dc:f6:a5:2a:db(ip-10-83-38-54)        32768 IPs (01.6% of total) - unreachable!
7e:1f:42:06:d0:21(ip-10-83-77-72)          128 IPs (00.0% of total) - unreachable!
06:cb:d3:f5:ef:8d(ip-10-83-68-140)        2048 IPs (00.1% of total) - unreachable!
de:a4:37:90:0d:84(ip-10-83-115-89)      393216 IPs (18.8% of total) - unreachable!
26:e0:ab:a4:74:0b(ip-10-83-68-162)         256 IPs (00.0% of total) - unreachable!
02:49:5c:88:7a:0d(ip-10-83-111-31)          64 IPs (00.0% of total) - unreachable!
8e:5d:49:84:a5:a9(ip-10-83-84-87)          128 IPs (00.0% of total) - unreachable!
a6:5a:77:b5:61:46(ip-10-83-84-116)        1024 IPs (00.0% of total) - unreachable!
4a:a6:cc:c8:13:a4(ip-10-83-125-169)        512 IPs (00.0% of total) - unreachable!
8a:5c:20:8d:15:f8(ip-10-83-68-157)       16405 IPs (00.8% of total)
5e:e0:60:22:50:7d(ip-10-83-98-247)         256 IPs (00.0% of total) - unreachable!
1a:0c:62:5f:86:6d(ip-10-83-48-66)          128 IPs (00.0% of total) - unreachable!
f6:ae:7c:aa:1c:3c(ip-10-83-63-48)           96 IPs (00.0% of total) - unreachable!
ba:b4:ed:9b:cf:63(ip-10-83-65-104)        8192 IPs (00.4% of total) - unreachable!
2a:bd:84:68:31:51(ip-10-83-76-199)         128 IPs (00.0% of total) - unreachable!
46:91:34:07:8e:90(ip-10-83-121-222)      49152 IPs (02.3% of total) - unreachable!
3a:24:a5:a6:64:ac(ip-10-83-102-251)     131072 IPs (06.2% of total) - unreachable!
66:20:7e:60:d8:07(ip-10-83-116-44)        4096 IPs (00.2% of total) - unreachable!
4e:fb:dd:99:97:d9(ip-10-83-99-3)          4096 IPs (00.2% of total) - unreachable!
fa:ab:cd:8c:18:a5(ip-10-83-114-159)          8 IPs (00.0% of total) - unreachable!
66:46:4f:c7:00:5f(ip-10-83-123-232)       1024 IPs (00.0% of total) - unreachable!
4e:cb:d4:93:41:c8(ip-10-83-109-130)         64 IPs (00.0% of total) - unreachable!
1e:74:28:eb:8f:06(ip-10-83-66-200)       32768 IPs (01.6% of total) - unreachable!
e2:34:d5:7e:b7:a9(ip-10-83-82-178)          64 IPs (00.0% of total) - unreachable!
3e:28:56:5c:2e:73(ip-10-83-105-188)        384 IPs (00.0% of total) - unreachable!
f6:d3:e5:2e:5c:9d(ip-10-83-102-39)        8196 IPs (00.4% of total)
8a:13:b7:4b:31:06(ip-10-83-76-105)          32 IPs (00.0% of total) - unreachable!
12:97:7f:b8:69:12(ip-10-83-125-85)         522 IPs (00.0% of total)
8a:9b:c1:bd:33:d2(ip-10-83-68-64)          283 IPs (00.0% of total)
36:82:78:ca:d8:a6(ip-10-83-32-246)         128 IPs (00.0% of total) - unreachable!
a6:ec:79:8d:df:e4(ip-10-83-76-83)           64 IPs (00.0% of total) - unreachable!
e6:0d:e7:33:19:b8(ip-10-83-114-144)         32 IPs (00.0% of total) - unreachable!
f6:a6:19:34:7f:01(ip-10-83-76-208)         513 IPs (00.0% of total)
6a:81:32:7c:47:8f(ip-10-83-115-223)         64 IPs (00.0% of total) - unreachable!
82:29:b3:47:2a:dc(ip-10-83-105-141)        512 IPs (00.0% of total) - unreachable!
2e:0d:d5:2c:60:49(ip-10-83-93-139)          16 IPs (00.0% of total) - unreachable!
5a:38:28:74:39:09(ip-10-83-44-56)          768 IPs (00.0% of total) - unreachable!
02:ca:92:6d:0b:49(ip-10-83-69-142)          32 IPs (00.0% of total) - unreachable!
32:b6:fe:6d:f5:53(ip-10-83-34-112)      720896 IPs (34.4% of total) - unreachable!
6e:0f:10:25:b4:d0(ip-10-83-66-64)         6144 IPs (00.3% of total) - unreachable!
5e:ba:58:ee:b1:4b(ip-10-83-100-132)          6 IPs (00.0% of total) - unreachable!
6e:6a:25:28:a4:0f(ip-10-83-32-97)           64 IPs (00.0% of total) - unreachable!
d6:74:d8:c3:76:9a(ip-10-83-76-8)           256 IPs (00.0% of total) - unreachable!
4e:82:65:f2:58:42(ip-10-83-111-20)       32768 IPs (01.6% of total) - unreachable!
3e:b7:04:00:d4:f2(ip-10-83-106-240)          1 IPs (00.0% of total) - unreachable!
6e:83:f0:c0:a9:b8(ip-10-83-85-120)          32 IPs (00.0% of total) - unreachable!
42:f0:d4:06:4e:7d(ip-10-83-68-156)          24 IPs (00.0% of total) - unreachable!
e6:5c:cc:12:1a:bc(ip-10-83-116-20)          32 IPs (00.0% of total) - unreachable!
da:a2:cb:69:82:e9(ip-10-83-104-165)      32768 IPs (01.6% of total) - unreachable!
72:80:f4:5a:0c:11(ip-10-83-34-65)         3072 IPs (00.1% of total) - unreachable!
8a:a4:b9:4f:ad:49(ip-10-83-82-106)           4 IPs (00.0% of total) - unreachable!
0a:cf:fc:47:e9:17(ip-10-83-84-163)         128 IPs (00.0% of total) - unreachable!
2e:a6:7b:93:88:78(ip-10-83-35-94)        16384 IPs (00.8% of total) - unreachable!
fa:81:f9:33:18:e5(ip-10-83-115-220)        128 IPs (00.0% of total) - unreachable!
a2:43:ca:64:17:a2(ip-10-83-83-38)       196608 IPs (09.4% of total) - unreachable!
f2:dc:c2:14:25:59(ip-10-83-73-117)          16 IPs (00.0% of total) - unreachable!
6a:fa:ce:5d:14:3d(ip-10-83-103-152)          1 IPs (00.0% of total) - unreachable!
e6:4f:3b:d8:b6:1e(ip-10-83-101-153)       2048 IPs (00.1% of total) - unreachable!
46:49:f4:b9:18:94(ip-10-83-117-26)        4096 IPs (00.2% of total) - unreachable!
0a:9e:de:c4:f9:69(ip-10-83-59-150)       49152 IPs (02.3% of total) - unreachable!
2a:58:df:a1:9a:b1(ip-10-83-92-126)        1024 IPs (00.0% of total) - unreachable!
ae:20:1b:df:2b:14(ip-10-83-51-31)           32 IPs (00.0% of total) - unreachable!
22:ae:03:ac:b7:de(ip-10-83-65-204)        1024 IPs (00.0% of total) - unreachable!
7a:b3:89:22:05:78(ip-10-83-44-203)         763 IPs (00.0% of total)
d6:d7:8d:fe:c9:4b(ip-10-83-124-62)         256 IPs (00.0% of total) - unreachable!
8e:63:7a:cf:c4:08(ip-10-83-55-154)          32 IPs (00.0% of total) - unreachable!
1a:04:58:59:b9:70(ip-10-83-72-144)          64 IPs (00.0% of total) - unreachable!
f2:d6:c3:39:67:61(ip-10-83-43-79)           16 IPs (00.0% of total) - unreachable!
de:a0:14:57:16:71(ip-10-83-101-185)       8212 IPs (00.4% of total)
7e:41:70:af:ae:d8(ip-10-83-35-75)          256 IPs (00.0% of total) - unreachable!
e2:89:7e:dd:57:f5(ip-10-83-102-35)        4096 IPs (00.2% of total) - unreachable!
7a:0e:89:2b:68:a1(ip-10-83-60-172)          64 IPs (00.0% of total) - unreachable!
4e:94:5c:4d:5e:de(ip-10-83-91-195)           8 IPs (00.0% of total) - unreachable!
6e:4d:e1:dd:c7:d8(ip-10-83-92-146)      131072 IPs (06.2% of total) - unreachable!
0e:2a:a2:0d:38:5c(ip-10-83-79-230)          32 IPs (00.0% of total) - unreachable!
1a:9e:26:4a:a6:51(ip-10-83-68-119)         512 IPs (00.0% of total) - unreachable!
ca:dd:86:02:25:51(ip-10-83-61-104)         256 IPs (00.0% of total) - unreachable!
0e:b5:e3:1f:b9:f0(ip-10-83-37-205)           4 IPs (00.0% of total) - unreachable!
96:e6:fd:76:f7:c1(ip-10-83-107-140)       2048 IPs (00.1% of total) - unreachable!
ee:00:45:23:9f:d4(ip-10-83-79-61)         3072 IPs (00.1% of total) - unreachable!
8a:10:e6:e9:f9:43(ip-10-83-92-250)        1024 IPs (00.0% of total) - unreachable!
5a:65:ea:d1:6d:27(ip-10-83-83-123)        2048 IPs (00.1% of total) - unreachable!
26:93:7d:fa:12:e0(ip-10-83-119-159)      32768 IPs (01.6% of total) - unreachable!
8a:d1:11:50:94:d6(ip-10-83-50-167)           3 IPs (00.0% of total) - unreachable!
02:e1:63:27:96:af(ip-10-83-124-250)         64 IPs (00.0% of total) - unreachable!
9e:ef:0b:28:b6:e1(ip-10-83-121-202)       4096 IPs (00.2% of total) - unreachable!
12:d0:13:4a:65:32(ip-10-83-46-23)           20 IPs (00.0% of total) - unreachable!

And now we have 😰:

/home/weave # ./weave --local status ipam
da:14:02:dd:7b:3c(ip-10-83-92-195)     2054023 IPs (97.9% of total) (22 active)
8a:5c:20:8d:15:f8(ip-10-83-68-157)       16405 IPs (00.8% of total)
f6:d3:e5:2e:5c:9d(ip-10-83-102-39)        8196 IPs (00.4% of total)
7a:b3:89:22:05:78(ip-10-83-44-203)         763 IPs (00.0% of total)
f6:a6:19:34:7f:01(ip-10-83-76-208)         513 IPs (00.0% of total)
12:97:7f:b8:69:12(ip-10-83-125-85)         522 IPs (00.0% of total)
8a:9b:c1:bd:33:d2(ip-10-83-68-64)          283 IPs (00.0% of total)
de:a0:14:57:16:71(ip-10-83-101-185)       8212 IPs (00.4% of total)
2e:22:22:a1:7a:c1(ip-10-83-36-147)        8219 IPs (00.4% of total)
fe:eb:e5:a5:8e:95(ip-10-83-62-202)          16 IPs (00.0% of total)

@itskingori

@bboreham based on the result in #2797 (comment) ... what are the implications of this? 97.9% of the total is on one host:

da:14:02:dd:7b:3c(ip-10-83-92-195)     2054023 IPs (97.9% of total) (22 active)

@bboreham
Contributor Author

That is what you achieved by reclaiming all the "unreachable" space on that one peer. As other peers run out of space, or start anew, they will request space from that one.

It's an ok state to be in, unless you shut down that peer for good without telling Weave Net, in which case you will be back in the previous situation.

@itskingori

@bboreham great ... thanks for the explanation.

@caarlos0
Contributor

caarlos0 commented Sep 8, 2017

After removing the peers, I now have this:

/home/weave # ./weave --local status ipam
1a:a5:74:30:7c:3d(ip-10-10-201-41)           1 IPs (00.0% of total) (1 active)
ee:bb:0c:34:ab:31(ip-10-10-201-254)       2048 IPs (00.2% of total) - unreachable!
52:60:b2:bd:1a:a3(ip-10-10-200-19)       49152 IPs (04.7% of total) - unreachable!
66:42:12:77:e0:d2(ip-10-10-200-33)       65536 IPs (06.2% of total) - unreachable!
f6:34:d7:e3:af:c9(ip-10-10-201-185)      32768 IPs (03.1% of total) - unreachable!
5a:9f:68:70:ca:de(ip-10-10-201-122)     736103 IPs (70.2% of total) - unreachable!
8e:63:bc:56:4c:ca(ip-10-10-200-46)         512 IPs (00.0% of total) - unreachable!
72:a9:cc:8a:c5:3b(ip-10-10-200-125)       4096 IPs (00.4% of total) - unreachable!
a2:b6:29:8b:be:b7(ip-10-10-200-97)      131072 IPs (12.5% of total) - unreachable!
c2:1a:8b:80:ab:4c(ip-10-10-200-14)         512 IPs (00.0% of total) - unreachable!
ea:43:20:d6:aa:c8(ip-10-10-200-200)       8192 IPs (00.8% of total) - unreachable!
d2:45:4c:6c:e9:d6(ip-10-10-201-125)          8 IPs (00.0% of total) - unreachable!
6a:9b:f8:c5:11:1e(ip-10-10-201-116)         16 IPs (00.0% of total) - unreachable!
2a:5c:6c:40:2b:fb(ip-10-10-201-122)        128 IPs (00.0% of total) - unreachable!
2e:5e:65:75:fc:71(ip-10-10-200-83)       16384 IPs (01.6% of total) - unreachable!
7e:f5:cb:0b:3e:a3(ip-10-10-201-109)       1024 IPs (00.1% of total) - unreachable!
9e:68:a6:65:ee:1d(ip-10-10-200-22)        1024 IPs (00.1% of total) - unreachable!

what are the implications of that? Should I be worried?

@caarlos0
Contributor

caarlos0 commented Sep 9, 2017

It turns out I should; things are not working :D

@caarlos0
Contributor

caarlos0 commented Sep 9, 2017

BTW: I had to remove everything related to weave and re-create it.

Domain names outside the cluster were no longer being resolved. Not sure whether that is related to this problem or not.

This was a fun Friday night.

@itskingori

@caarlos0 99.9% of your cluster was unreachable ...

0.2 + 4.7 + 6.2 + 3.1 + 70.2 + 0.4 + 12.5 + 0.8 + 1.6 + 0.1 + 0.1 = 99.9

@mikebryant shared how he recovered from this in #2797 (comment). I've tried to make his solution clearer in #2797 (comment). @bboreham explains what's happening in #2797 (comment).

@caarlos0
Contributor

caarlos0 commented Sep 9, 2017

@itskingori yeah, this all-unreachable state was after I removed the peers that no longer existed, using the scripts provided.

I did that, pods started to launch again, but DNS to the "outside world" inside the containers wasn't working.

Because pods kept restarting, my entire cluster entered a broken state, where nodes were failing with ContainerGCFailed/ImageGCFailed and nothing worked anymore.

Ultimately, I had to terminate all nodes, remove the weave DaemonSet and re-create it (also upgrading from 1.9.4 to 2.0.4).

@bricef
Contributor

bricef commented Oct 25, 2017

Just an update for everyone here. @bboreham's #3022 fix has been rebased and tested and it's in review at #3149. Should make it to mainline soon.

@natewarr

natewarr commented Nov 1, 2017

@bboreham It seems that reclaiming IP addresses to a single node creates a potential single point of failure. When the node that holds all of the IPAM allocations dies, how do they get reclaimed? Sure, if we are constantly running the script, they would theoretically be reclaimed to another running node once k8s realizes the script's pod is no longer up and reschedules it. This could take several precarious seconds, assuming all the pieces fall correctly.

Wouldn't it be better if the IPs were allocated evenly across the cluster instead of hoarded by one node? Is there a way to do this?

@bboreham
Contributor Author

bboreham commented Nov 1, 2017

IP ownership is generally evenly spread, so at any one time we would be reclaiming some fraction of all IPs from those nodes that have gone away without telling us. Re-running the reclaim periodically, instead of just when a node starts, is a worthwhile improvement to reduce the window.

You've certainly identified an edge case, @natewarr, but I think I'd want to see evidence that it can happen for real before making the implementation much more complicated.
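
Until that lands, one rough sketch of re-running the reclaim periodically is simply to loop the diff-and-delete logic from the script earlier in this thread inside a single, designated weave pod (the 300-second interval and the use of xargs -r are assumptions):

while true; do
  # Peers weave knows about, by nickname.
  curl -s -H "Accept: application/json" http://localhost:6784/report \
    | jq -r '.IPAM.Entries[].Nickname' | sort -u > /tmp/nicknames
  # Nodes Kubernetes knows about.
  kubectl get node -o custom-columns=name:.metadata.name --no-headers \
    | cut -d'.' -f1 | sort > /tmp/node-names
  # Remove peers that no longer have a matching node (-r: do nothing if the list is empty).
  grep -F -x -v -f /tmp/node-names /tmp/nicknames \
    | xargs -r -n 1 -I '{}' curl -s -X DELETE "http://localhost:6784/peer/{}"
  sleep 300
done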

@natewarr

natewarr commented Nov 2, 2017

I probably need to change my name to "TheEdgeCase". We will find an acceptable workaround. Thanks for your work on this bug!

@natewarr

natewarr commented Nov 2, 2017

Any suggestions on this situation? It looks like a new node got spun up with the same IP as the old node, and that happens to be the node on which we are running rmpeer.

0e:1c:47:1d:8f:0b(ip-172-30-3-14)       671746 IPs (64.1% of total) (9 active)
f6:28:bd:1b:f2:16(ip-172-30-3-19)         8192 IPs (00.8% of total)
12:1e:f8:cc:46:61(ip-172-30-2-48)        49153 IPs (04.7% of total)
1a:99:77:49:5f:cf(ip-172-30-2-29)        32768 IPs (03.1% of total)
7e:58:ab:e7:6d:5f(ip-172-30-2-56)        32768 IPs (03.1% of total)
82:05:7d:62:71:de(ip-172-30-3-14)        65533 IPs (06.2% of total) - unreachable!
82:23:bd:9b:89:b1(ip-172-30-2-16)        24576 IPs (02.3% of total)
ea:1f:35:fb:b8:61(ip-172-30-2-37)        49152 IPs (04.7% of total)
ca:ad:8f:66:32:1c(ip-172-30-2-58)        32768 IPs (03.1% of total)
32:92:fc:78:08:df(ip-172-30-3-18)        16384 IPs (01.6% of total)
ee:e3:80:f7:d2:2b(ip-172-30-3-24)        32768 IPs (03.1% of total)
da:ab:47:b9:80:e0(ip-172-30-2-28)        32768 IPs (03.1% of total)

@bboreham
Contributor Author

bboreham commented Nov 6, 2017

@natewarr I can see that AWS is re-using IPs, and hence hostnames; Weave Net internally works off the "unique peer ID", which is generated in various ways.

I am unclear what you need suggestions for, sorry.

@caarlos0
Contributor

caarlos0 commented Nov 6, 2017

FWIW, this unreachable-peers issue happens a lot more often if you have an elastic cluster (obviously, since more instances are launched and terminated).

@natewarr

natewarr commented Nov 6, 2017

I see. I was getting hung up on the use of nicknames in the gist hack. I was able to run this to reclaim the peer with the duplicate IP:

curl -H "Accept: application/json" -X DELETE 'http://localhost:6784/peer/82:05:7d:62:71:de'

@bboreham
Contributor Author

bboreham commented Nov 6, 2017

@natewarr ok, that makes perfect sense. So my suggestion would have been to drop down to the peer ID (the hex number that looks like a MAC address), which you did 😄
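
For anyone else hitting a duplicated nickname, a hedged sketch of finding the stale peer ID from the same report endpoint used above (this assumes each IPAM entry exposes a Peer field alongside the Nickname field; check your own report output first), then deleting by ID rather than by nickname:

# List peer-ID / nickname pairs so the stale entry can be picked out by eye.
curl -s -H "Accept: application/json" http://localhost:6784/report \
  | jq -r '.IPAM.Entries[] | "\(.Peer) \(.Nickname)"' | sort -u

# Then remove the stale peer by its ID, exactly as in the comment above.
curl -H "Accept: application/json" -X DELETE 'http://localhost:6784/peer/82:05:7d:62:71:de'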

@bboreham
Contributor Author

Please note the code to remove deleted Kubernetes peers from a cluster was released today, in Weave Net version 2.1.1.

@alok87
Contributor

alok87 commented Jan 28, 2018

@bboreham @natewarr we are also running an older version of weave and were facing this issue in our staging cluster, so we ran the script on the prod cluster just to be safe (as there were many unreachable IPs there as well). But now 87.9% of the IPs are on a single node. How do we avoid this? That node going down would recreate the problem.

ca:5d:67:8a:1b:f9(ip-10-0-21-172.ap-southeast-1.compute.internal)  1843200 IPs (87.9% of total) (25 active)
3e:26:65:4d:b4:52(ip-10-0-21-187.ap-southeast-1.compute.internal)      512 IPs (00.0% of total)

@natewarr

@alok87 that's as far as the workaround they wrote up will get you. The other nodes can ask this node to share with them as needed, so it's not technically an error condition. If you lose that node without reclaiming the space somehow, you are back in the error condition. I imagine the weave guys will just tell you to update to 2.1.1 or later.

@itskingori

@alok87 try spreading the clearing out across different instances of weave (the script clears everything from one weave instance ... and the instance that runs the clearing claims the cleared IPs). The fundamental command is mentioned in #2797 (comment). A rough sketch of spreading the removals follows below.
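
This sketch assumes the standard manifest labels (name=weave-net in kube-system), a container named weave with curl available (as in the script earlier in this thread), and the /tmp/nodes-unavailable list that script produces; each stale peer is deleted from a different weave pod in round-robin fashion, so the reclaimed space doesn't all land on one node:

pods=($(kubectl -n kube-system get pods -l name=weave-net -o jsonpath='{.items[*].metadata.name}'))
i=0
while read -r peer; do
  pod=${pods[$((i % ${#pods[@]}))]}
  # Ask this particular weave instance to reclaim the stale peer's space.
  kubectl -n kube-system exec "$pod" -c weave -- \
    curl -s -X DELETE "http://127.0.0.1:6784/peer/$peer"
  i=$((i + 1))
done < /tmp/nodes-unavailable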

@caarlos0
Contributor

caarlos0 commented Mar 19, 2018

Still having this issue in a new kubernetes 1.8.8 cluster (launched with kops 1.8) and weave 2.2.0 on a pre-existing VPC.

The cluster was still small (it went from 3 nodes to 2 nodes, plus the master, so from 4 to 3 in total) - maybe that is the reason (not enough instances for a quorum)?

On the other hand, this also still happens on two old Kubernetes 1.5.x clusters running weave 2.2.0 on the same VPC. Those clusters have more nodes - one has between 6 and 10 (commonly 7 or 8) and the other between 5 and 8 (commonly 6).

All the 3 clusters run cluster-autoscaler (different versions due to kubernetes version limitations).

Is there something I should be looking at? Any guesses as to the reason for this problem?

@bboreham
Contributor Author

@caarlos0 please could you open a new issue detailing what you are seeing? There is no minimum number of nodes, but note that the cleanup in #3149 only runs when a weave container starts.
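
Because the cleanup only runs at container start, one hedged workaround until a node naturally churns is to bounce the weave pods one at a time (again assuming the standard name=weave-net label in kube-system), so each fresh container re-runs the reclaim:

# Delete one weave-net pod at a time; the DaemonSet recreates it and the new
# container performs the peer cleanup on startup.
for pod in $(kubectl -n kube-system get pods -l name=weave-net -o jsonpath='{.items[*].metadata.name}'); do
  kubectl -n kube-system delete pod "$pod"
  sleep 60   # crude: give the replacement pod time to start before bouncing the next one
done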

@caarlos0
Contributor

ohh, so that's it: a node goes down and the cleanup doesn't run until a new node comes up... OK then, no issue 👍

thanks

@sstarcher

I have seen this same issue with weave 2.2.0, kops 1.9.0-beta-2, k8s 1.9.3

@sean-krail

I'm also seeing this issue with weave 2.3.0, kops 1.9.0, k8s 1.9.3

@bboreham
Contributor Author

Please don’t comment on old, closed issues. Open a new issue and provide the details which will allow your issue to be debugged.
