This addresses a bug where deleting a completed pod would wind up
deleting the service map entry for the corresponding running pod on
the same node, due to the hostPort mapping key being the same for
the old and the new pod.
Ideally, we want to validate that the completed pod "owns" the
host port service before deleting it, thus preventing breakage of
host port connectivity for any running pod that uses the same
service frontend.
This commit adds such a validation.
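
As a rough illustration of the ownership check described above (not the
actual code in this commit), the sketch below models the service map as an
in-memory map keyed by host port and protocol. The names `hostPortKey`,
`hostPortMap`, `Upsert`, `DeleteIfOwner` and `Has` are hypothetical, as is
the assumption that ownership is tracked by pod UID.

```go
package main

import "fmt"

// hostPortKey identifies a host port mapping on a node. Two pods that
// request the same host port and protocol produce the same key.
// (Hypothetical type, for illustration only.)
type hostPortKey struct {
	Port     uint16
	Protocol string
}

// hostPortMap is a hypothetical stand-in for the service map. It records
// which pod (by UID) currently owns each host port entry.
type hostPortMap struct {
	owners map[hostPortKey]string
}

func newHostPortMap() *hostPortMap {
	return &hostPortMap{owners: make(map[hostPortKey]string)}
}

// Upsert installs the entry for key, or takes it over, on behalf of podUID.
func (m *hostPortMap) Upsert(key hostPortKey, podUID string) {
	m.owners[key] = podUID
}

// DeleteIfOwner removes the entry only when podUID is the recorded owner.
// It reports whether an entry was removed.
func (m *hostPortMap) DeleteIfOwner(key hostPortKey, podUID string) bool {
	owner, ok := m.owners[key]
	if !ok || owner != podUID {
		return false
	}
	delete(m.owners, key)
	return true
}

// Has reports whether an entry exists for key.
func (m *hostPortMap) Has(key hostPortKey) bool {
	_, ok := m.owners[key]
	return ok
}

func main() {
	m := newHostPortMap()
	key := hostPortKey{Port: 8081, Protocol: "TCP"}

	m.Upsert(key, "old-pod-uid") // first pod, later marked Completed
	m.Upsert(key, "new-pod-uid") // replacement pod reuses the host port

	// Deleting the completed pod must not remove the running pod's entry.
	removed := m.DeleteIfOwner(key, "old-pod-uid")
	fmt.Println(removed, m.Has(key)) // prints: false true
}
```

Without the owner comparison, the delete for the completed pod would remove
the entry that the running pod depends on, which is the bug described above.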
Testing:
Automated (Control Plane):
This fix is captured by a control plane test case that does the
following:
1. Create a hostport pod, and terminate its running containers to mark
it as "Completed".
2. Create another hostport pod using the same port as the "Completed" pod.
3. Delete the "Completed" pod, and verify that the hostport service has
not been deleted in the Datapath.
Manual Testing:
1. Enable the GracefulNodeShutdown feature gate in the kubelet config on
all nodes by modifying the configuration in `/var/lib/kubelet/config.yaml`:
```
featureGates:
  GracefulNodeShutdown: true
shutdownGracePeriod: 30s
shutdownGracePeriodCriticalPods: 10s
```
2. Run `sudo systemctl restart kubelet` on each node to apply the kubelet
config change
3. Deploy an nginx web server with hostPort set, as well as a
nodeSelector, so that its pods are scheduled on the same node after
the node restarts.
```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx
    spec:
      nodeSelector:
        kubernetes.io/hostname: <node-name>
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
          hostPort: 8081
```
4. Run `systemctl reboot` on the worker node to restart the machine.
5. After the reboot, observe that the old pod is in the `Completed` state,
while the new pod is `Running`.
```
$ kubectl get pods
NAME                                READY   STATUS      RESTARTS   AGE
nginx-deployment-645797c867-8p2hp   0/1     Completed   0          13m
nginx-deployment-645797c867-dx2m8   1/1     Running     0          4m2s
```
6. Run `curl nodeIP:hostPort` and verify that it returns a result.
7. Manually delete the old pod, which is in the `Completed` state.
```
$ kubectl delete pod/nginx-deployment-645797c867-8p2hp
```
8. Rerun `curl nodeIP:hostPort` and verify that it returns a result again,
confirming that the hostPort service has been preserved.
Signed-off-by: Yash Shetty <yashshetty@google.com>