
Validate "ownership" of hostPort service being deleted #22587

Merged
merged 1 commit on Jul 7, 2023

Commits on Jun 28, 2023

  1. Validate "ownership" of hostPort service being deleted

    This addresses a bug where deleting a completed pod would wind up
    deleting the service map entry for the corresponding running pod on
    the same node, due to the hostPort mapping key being the same for
    the old and the new pod.
    
    Ideally, we should validate that the completed pod actually "owns"
    the hostPort service before deleting it, preventing breakage of
    hostPort connectivity for any running pods fronted by the same
    service.
    
    This commit adds such a validation.
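    
    For illustration, here is a minimal Go sketch of the idea (the types and
    names are hypothetical, not Cilium's actual service map code): the map
    records which pod installed each hostPort entry, and a delete request is
    honored only when the requesting pod matches the recorded owner.
    
    ```
    // Hypothetical sketch only: models "ownership" of a hostPort entry.
    package main
    
    import "fmt"
    
    type hostPortKey struct {
        IP   string
        Port uint16
    }
    
    // serviceMap maps a hostPort key to the name of the pod that owns it.
    type serviceMap map[hostPortKey]string
    
    // deleteHostPortService removes the entry for key only if pod owns it,
    // so deleting a Completed pod cannot tear down the entry that has been
    // re-installed by a Running pod reusing the same hostPort.
    func (m serviceMap) deleteHostPortService(key hostPortKey, pod string) bool {
        owner, ok := m[key]
        if !ok || owner != pod {
            return false // not present, or owned by another pod: keep it
        }
        delete(m, key)
        return true
    }
    
    func main() {
        m := serviceMap{}
        key := hostPortKey{IP: "10.0.0.1", Port: 8081}
        m[key] = "nginx-old" // Completed pod installed the entry first
        m[key] = "nginx-new" // Running pod re-installs the same key
        fmt.Println(m.deleteHostPortService(key, "nginx-old")) // false: entry preserved
        fmt.Println(m.deleteHostPortService(key, "nginx-new")) // true: owner may delete
    }
    ```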
    
    Testing:
    Automated (Control Plane):
    This fix is covered by a control-plane test case that does the
    following (an illustrative sketch follows the list):
    1. Create a hostPort pod, and terminate its running containers to mark
       it as "Completed".
    2. Create another hostPort pod using the same port as the "Completed" pod.
    3. Delete the "Completed" pod, and verify that the hostPort service has
       not been deleted in the Datapath.
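    
    As a rough sketch only, assuming the hypothetical serviceMap types above
    live in the same package (this is not Cilium's actual control-plane test
    harness), the flow maps to something like:
    
    ```
    package main
    
    import "testing"
    
    // TestCompletedPodDeleteKeepsHostPortService mirrors steps 1-3 above:
    // deleting the Completed pod must not remove the entry now owned by the
    // Running pod that reuses the same hostPort.
    func TestCompletedPodDeleteKeepsHostPortService(t *testing.T) {
        m := serviceMap{}
        key := hostPortKey{IP: "10.0.0.1", Port: 8081}
    
        m[key] = "pod-completed" // step 1: first pod installs the hostPort entry
        m[key] = "pod-running"   // step 2: second pod reuses the same port
    
        // step 3: deleting the Completed pod must leave the entry intact
        if m.deleteHostPortService(key, "pod-completed") {
            t.Fatal("hostPort service was deleted by a non-owner pod")
        }
        if owner := m[key]; owner != "pod-running" {
            t.Fatalf("expected entry owned by pod-running, got %q", owner)
        }
    }
    ```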
    
    Manual Testing:
    1. Enable the GracefulNodeShutdown feature gate in the kubelet config on
       all nodes by modifying the configuration in `/var/lib/kubelet/config.yaml`:
       ```
        featureGates:
          GracefulNodeShutdown: true
        shutdownGracePeriod: 30s
        shutdownGracePeriodCriticalPods: 10s
       ```
    2. Run `sudo systemctl restart kubelet` on each node to apply the kubelet
       config change
    3. Deploy an nginx web server with hostPort set, as well as a
       nodeSelector, so pods get scheduled on the same node after the node
       restarts:
       ```
       apiVersion: apps/v1
       kind: Deployment
       metadata:
         name: nginx-deployment
       spec:
         selector:
           matchLabels:
             app: nginx
         replicas: 1
         template:
           metadata:
             labels:
               app: nginx
           spec:
             nodeSelector:
               kubernetes.io/hostname: <node-name>
             containers:
             - name: nginx
               image: nginx:1.14.2
               ports:
               - containerPort: 80
                 hostPort: 8081
       ```
    4. Run `systemctl reboot` on the worker node to restart the machine.
    5. After the reboot, observe the old pod in `Completed` state, while the
       new pod is `Running`.
       ```
       $ kubectl get pods
       NAME                                READY   STATUS      RESTARTS    AGE
       nginx-deployment-645797c867-8p2hp   0/1     Completed   0           13m
       nginx-deployment-645797c867-dx2m8   1/1     Running     0           4m2s
       ```
    6. Run `curl nodeIP:hostPort`; the request succeeds.
    7. Manually delete the old pod which is in `Completed` state.
       ```
       $ kubectl delete pod/nginx-deployment-645797c867-8p2hp
       ```
    8. Rerun `curl nodeIP:hostPort`; the request succeeds again, confirming the hostPort service has been preserved.
    
    Signed-off-by: Yash Shetty <yashshetty@google.com>
    yasz24 committed Jun 28, 2023
    c9aba72