Description
Is this a request for help?: No
Is this an ISSUE or FEATURE REQUEST? (choose one): Issue
Which release version?: master + cherry-pick of #212
Which component (CNI/IPAM/CNM/CNS): CNI
Which Operating System (Linux/Windows): Windows Server version 1803
Which Orchestrator and version (e.g. Kubernetes, Docker): Kubernetes
What happened:
After scaling up a replica set, some containers failed to start. When this happened, their IPs were not freed. Here's an example of the end state after scaling back down: only one pod IP should be in use on the node, but three are marked as in use in the IPAM file.
kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE
psh-5d98ff98b5-qpbjv 1/1 Running 0 18h 10.240.0.141 k8s-linuxpool-13955535-1
whoami-1803-78fd64846f-lq9m7 1/1 Running 0 18h 10.240.0.99 13955k8s9001
# Run on 13955k8s9001
(get-content c:\k\azure-vnet-ipam.json | convertfrom-json).IPAM.AddressSpaces.local.Pools.'10.240.0.0/12'.Addresses
10.240.0.100 : @{ID=; Addr=10.240.0.100; InUse=False}
10.240.0.101 : @{ID=; Addr=10.240.0.101; InUse=False}
10.240.0.102 : @{ID=; Addr=10.240.0.102; InUse=False}
10.240.0.103 : @{ID=; Addr=10.240.0.103; InUse=False}
10.240.0.104 : @{ID=; Addr=10.240.0.104; InUse=False}
10.240.0.105 : @{ID=; Addr=10.240.0.105; InUse=False}
10.240.0.106 : @{ID=; Addr=10.240.0.106; InUse=False}
10.240.0.107 : @{ID=; Addr=10.240.0.107; InUse=False}
10.240.0.108 : @{ID=; Addr=10.240.0.108; InUse=False}
10.240.0.109 : @{ID=; Addr=10.240.0.109; InUse=False}
10.240.0.110 : @{ID=; Addr=10.240.0.110; InUse=False}
10.240.0.111 : @{ID=; Addr=10.240.0.111; InUse=False}
10.240.0.112 : @{ID=; Addr=10.240.0.112; InUse=False}
10.240.0.113 : @{ID=; Addr=10.240.0.113; InUse=True}
10.240.0.114 : @{ID=; Addr=10.240.0.114; InUse=False}
10.240.0.115 : @{ID=; Addr=10.240.0.115; InUse=False}
10.240.0.116 : @{ID=; Addr=10.240.0.116; InUse=False}
10.240.0.117 : @{ID=; Addr=10.240.0.117; InUse=False}
10.240.0.118 : @{ID=; Addr=10.240.0.118; InUse=False}
10.240.0.119 : @{ID=; Addr=10.240.0.119; InUse=False}
10.240.0.120 : @{ID=; Addr=10.240.0.120; InUse=False}
10.240.0.121 : @{ID=; Addr=10.240.0.121; InUse=False}
10.240.0.122 : @{ID=; Addr=10.240.0.122; InUse=False}
10.240.0.123 : @{ID=; Addr=10.240.0.123; InUse=False}
10.240.0.124 : @{ID=; Addr=10.240.0.124; InUse=False}
10.240.0.125 : @{ID=; Addr=10.240.0.125; InUse=True}
10.240.0.126 : @{ID=; Addr=10.240.0.126; InUse=False}
10.240.0.97 : @{ID=; Addr=10.240.0.97; InUse=False}
10.240.0.98 : @{ID=; Addr=10.240.0.98; InUse=False}
10.240.0.99 : @{ID=; Addr=10.240.0.99; InUse=True}
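For a quick count, the same data can be filtered programmatically. A minimal sketch, run on the node, assuming the same file path and pool key as the query above:
# Count and list the addresses the IPAM store still marks as InUse
$addrs = (Get-Content c:\k\azure-vnet-ipam.json | ConvertFrom-Json).IPAM.AddressSpaces.local.Pools.'10.240.0.0/12'.Addresses
$leaked = $addrs.PSObject.Properties.Value | Where-Object { $_.InUse }
"InUse count: $($leaked.Count)"
$leaked | Select-Object Addr, ID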
What you expected to happen:
No IP address leaks: addresses should be released back to the pool when a container fails to start or a pod is deleted.
How to reproduce it (as minimally and precisely as possible):
# cordon all Windows nodes except 1
kubectl apply -f https://raw.githubusercontent.com/PatrickLang/Windows-K8s-Samples/master/HyperVExamples/whoami-1803.yaml
kubectl scale deploy whoami-1803 --replicas=6
# wait some time; not all 6 will start successfully
kubectl scale deploy whoami-1803 --replicas=1
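To confirm the leak after scaling back down, compare what Kubernetes still schedules on the node against what the IPAM store holds. A sketch, assuming the node name from the output above:
# On the management box: pods Kubernetes still places on the node
kubectl get pod -o wide --all-namespaces --field-selector spec.nodeName=13955k8s9001
# On the node: re-run the InUse count from the snippet above; any addresses
# beyond the pods listed here have leaked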
Anything else we need to know:
Found this while testing the fix for #195.