Validate "ownership" of hostPort service being deleted #22587
Conversation
Commit 704e71af6c06ac5edc5dae6f0babf71d4e012466 does not contain "Signed-off-by". Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin
Commits 704e71af6c06ac5edc5dae6f0babf71d4e012466, 16b827c2c54f8ff20a20658a61f685d2bbc6bd67 do not contain "Signed-off-by". Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin
Force-pushed from 16b827c to 3dddd15
Thank you for the PR. What happens in case there is only one pod running on the node and that pod is removed? If we are skipping the generation of service mappings, will the entries in the BPF maps be deleted?
Force-pushed from 3dddd15 to 46beb5f
👋 The fix doesn't look right to me, see the inline comment.
If the single pod on the node is in …
One possible option - though it's a pretty significant refactor - is to manage HostPort mappings via CNI capability arguments rather than via pod objects. This would then serialize hostPort mappings via the kubelet.
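For context, this is the CNI capability-arguments mechanism being referred to: a plugin declares support for `portMappings` in its network configuration, and the container runtime then passes the pod's hostPort mappings to the plugin as capability arguments. A minimal illustrative conflist (an assumption for illustration, not Cilium's shipped configuration) might look like:
```
{
  "cniVersion": "0.4.0",
  "name": "cilium",
  "plugins": [
    {
      "type": "cilium-cni",
      "capabilities": { "portMappings": true }
    }
  ]
}
```
With that capability declared, the runtime injects something like `"runtimeConfig": {"portMappings": [{"hostPort": 8081, "containerPort": 80, "protocol": "tcp"}]}` into the plugin's stdin config on CNI ADD, so the CNI plugin (rather than a pod watcher) could own the mapping lifecycle.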
I also wonder if we can elide the pod deletion entirely if the pod is newly "Completed" when we first see it?
@squeed IIUC, the issue isn't really caused by a race condition, so serializing the hostPort mappings wouldn't help us, I believe. I think what we really need is a way to know whether the hostPort mappings "belong" to the "completed" pod being deleted or not. One way to do that would be to inspect the hostPort mappings for the service in question, and verify that the IP of the "completed" pod is one of the backends. But I wasn't able to find any APIs to do a simple lookup for a service's hostPort mappings to verify this (…
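A rough sketch of the ownership check being described here, assuming a hypothetical lookup helper (`lookupHostPortBackends`) since the thread doesn't show Cilium's actual service-manager API: before removing the hostPort service for a completed pod, confirm the pod's IP is still among the frontend's backends.
```
package hostport

import "net"

// lookupHostPortBackends is a stand-in for whatever service-manager lookup
// would list a frontend's current backend IPs; the real Cilium API is not
// shown in this thread, so this stub is purely hypothetical.
func lookupHostPortBackends(frontend string) ([]net.IP, error) {
	return nil, nil // placeholder
}

// validateHostPortOwnership reports whether the completed pod still "owns"
// the hostPort service, i.e. whether its pod IP appears among the service's
// current backends. If it does not, the frontend has been taken over by a
// newer pod and the deletion should be skipped.
func validateHostPortOwnership(frontend string, podIP net.IP) (bool, error) {
	backends, err := lookupHostPortBackends(frontend)
	if err != nil {
		return false, err
	}
	for _, b := range backends {
		if b.Equal(podIP) {
			return true, nil // the completed pod is still a backend: safe to delete
		}
	}
	return false, nil // another pod owns the frontend: preserve the service
}
```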
Commit b5e248597f70f568cc41b78a9bcbd79bbf61a9a0 does not contain "Signed-off-by". Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin |
Force-pushed from b5e2485 to cf7dc17
Commit b5e248597f70f568cc41b78a9bcbd79bbf61a9a0 does not contain "Signed-off-by". Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin |
Force-pushed from cf7dc17 to 5ee2ca0
Force-pushed from eef7982 to 9a6623e
LGTM, two nits only about the empty line at the end of two files.
nice work!
Thanks for the PR! LGTM overall, please add the missing unit test, otherwise it's good to go!
This addresses a bug where deleting a completed pod would wind up deleting the service map entry for the corresponding running pod on the same node, due to the hostPort mapping key being the same for the old and the new pod. Ideally we want to validate whether the completed pod "owns" the hostPort service before deleting it, thus preventing breakage of hostPort connectivity for any running pods with the same service as frontend. This commit adds such a validation.

Testing:

Automated (Control Plane): This fix is captured by a control plane test case that does the following:
1. Create a hostPort pod, and terminate its running containers to mark it as "Completed".
2. Create another hostPort pod using the same port as the "Completed" pod.
3. Delete the "Completed" pod, and verify that the hostPort service has not been deleted in the datapath.

Manual Testing:
1. Enable the GracefulNodeShutdown feature gate in the kubelet config on all nodes by modifying the configuration in `/var/lib/kubelet/config.yaml`:
```
featureGates:
  GracefulNodeShutdown: true
shutdownGracePeriod: 30s
shutdownGracePeriodCriticalPods: 10s
```
2. Run `sudo systemctl restart kubelet` on each node to apply the kubelet config change.
3. Deploy an nginx web server with hostPort set, as well as a nodeSelector, so pods get scheduled on the same node after node restarts:
```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx
    spec:
      nodeSelector:
        kubernetes.io/hostname: <node-name>
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
          hostPort: 8081
```
4. Run `systemctl reboot` on the worker node to restart the machine.
5. After the reboot, the old pod is in `Completed` state, while the new pod is `Running`:
```
$ kubectl get pods
NAME                                READY   STATUS      RESTARTS   AGE
nginx-deployment-645797c867-8p2hp   0/1     Completed   0          13m
nginx-deployment-645797c867-dx2m8   1/1     Running     0          4m2s
```
6. `curl nodeIP:hostPort` successfully gets the result.
7. Manually delete the old pod which is in `Completed` state:
```
$ kubectl delete pod/nginx-deployment-645797c867-8p2hp
```
8. Redo the `curl nodeIP:hostPort`, and successfully get the result again: the hostPort service has been preserved.

Fixes: #22460

Signed-off-by: Yash Shetty <yashshetty@google.com>
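For readers skimming the thread, here is a minimal, self-contained Go illustration of the failure mode the commit message describes: when the service map is keyed only by the hostPort frontend, the completed pod and its replacement collide on the same key, so an unconditional delete on pod removal drops the live entry. The types and values below are invented for illustration, not Cilium's actual map structures.
```
package main

import "fmt"

// frontend is a simplified stand-in for a hostPort service key: the
// completed pod and its replacement map to the same key because they
// use the same node IP, hostPort, and protocol.
type frontend struct {
	hostIP   string
	hostPort int
	proto    string
}

func main() {
	services := map[frontend]string{} // frontend -> backend pod IP

	key := frontend{"10.0.0.1", 8081, "tcp"}
	services[key] = "10.244.0.5" // old pod (now Completed)
	services[key] = "10.244.0.9" // new Running pod overwrites the same key

	// Naive cleanup for the Completed pod deletes by key alone and
	// removes the Running pod's entry as collateral damage.
	delete(services, key)
	fmt.Println(len(services)) // 0: hostPort connectivity for the new pod is broken
}
```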
/test
@yasz24 looks like the PR broke Go code precheck test on master, could you take a look and fix? Thanks https://github.com/cilium/cilium/actions/runs/5483316043/jobs/9989522587