Metallb should use Endpointslices `serving` property, not `ready` #2074

zerkms · 2023-09-12T02:55:03Z

MetalLB Version

0.13.9

Deployment method

Manifests

Main CNI

kube-router

Kubernetes Version

1.26.4

Cluster Distribution

No response

Describe the bug

UPD: all of this is mostly relevant to services with ExternalTrafficPolicy: Local, but in my opinion all uses of eps.ready should be replaced.

At the moment throughout the code

metallb/speaker/layer2_controller.go

Lines 67 to 71 in a5f74ed

    
    for _, slice := range eps.SlicesVal {
        for _, ep := range slice.Endpoints {
            if !epslices.IsConditionReady(ep.Conditions) {
                continue
            }

the "readiness" of an endpoint in an endpoint slice is determined by its ready condition.

Now let's have a look at the EPS conditions documentation:

FIELDS:
   ready	<boolean>
     ready indicates that this endpoint is prepared to receive traffic,
     according to whatever system is managing the endpoint. A nil value
     indicates an unknown state. In most cases consumers should interpret this
     unknown state as ready. For compatibility reasons, ready should never be
     "true" for terminating endpoints.

   serving	<boolean>
     serving is identical to ready except that it is set regardless of the
     terminating state of endpoints. This condition should be set to true for a
     ready endpoint that is terminating. If nil, consumers should defer to the
     ready condition.

   terminating	<boolean>
     terminating indicates that this endpoint is terminating. A nil value
     indicates an unknown state. Consumers should interpret this unknown state
     to mean that the endpoint is not terminating.

So, as you can see the better suited field is in fact serving.
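The fallback rules in the field documentation above can be sketched in Go. This is only an illustration, not MetalLB's code: `EndpointConditions` here is a local stand-in for the Kubernetes type with the same three fields, and `isConditionServing` and `boolPtr` are hypothetical helper names.

```go
package main

import "fmt"

// EndpointConditions is a local stand-in (an assumption for this sketch)
// mirroring the ready/serving/terminating fields described above.
type EndpointConditions struct {
	Ready       *bool
	Serving     *bool
	Terminating *bool
}

// isConditionReady treats a nil ready value as ready, as the field docs advise.
func isConditionReady(c EndpointConditions) bool {
	if c.Ready == nil {
		return true
	}
	return *c.Ready
}

// isConditionServing (hypothetical helper) prefers the serving condition and
// defers to the ready condition when serving is nil.
func isConditionServing(c EndpointConditions) bool {
	if c.Serving == nil {
		return isConditionReady(c)
	}
	return *c.Serving
}

// boolPtr returns a pointer to b, for building conditions in examples.
func boolPtr(b bool) *bool { return &b }

func main() {
	// A pod that is shutting down but still able to handle requests.
	terminating := EndpointConditions{
		Ready:       boolPtr(false),
		Serving:     boolPtr(true),
		Terminating: boolPtr(true),
	}
	fmt.Println("ready:", isConditionReady(terminating))
	fmt.Println("serving:", isConditionServing(terminating))
}
```

With a check like this, an endpoint that is terminating but still serving would not be skipped by the loop shown earlier.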

Why it's important: with the current implementation it's impossible to implement a service that uses any graceful shutdown strategy whatsoever. As soon as the pod enters the terminating state, its eps.ready condition becomes False.

If eps.serving were used instead, then on shutdown the pod would have a chance to keep running and finish serving its clients gracefully.

To Reproduce

1. Create a pod.
2. Expose it via a service (with externalTrafficPolicy: Local).
3. Describe the endpointslice for the service and see the status of all 3 condition fields.
4. Now delete the pod.
5. Quickly describe the endpointslice again.

Expected Behavior

Metallb should not immediately stop announcing the pod that is currently being terminated, but instead refer to the serving field.

So the flow should look like this:

1. The pod receives a delete request. The CRI sends a signal to the process to shut down. At this point the pod is still in the Ready state.
2. Endpointslice.ready=False, Endpointslice.serving=True.
3. MetalLB continues announcing the node IP with the pod.
4. The process completes the graceful shutdown ceremony and quits; the endpoint is entirely removed.
5. MetalLB stops announcing the node IP, as the endpoint does not exist anymore.

With the current implementation it all breaks at step 3.
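To make the difference concrete, here is a small Go simulation of the flow above. It is a sketch under simplified assumptions, not MetalLB code: `endpointState`, `announce`, and `lifecycle` are hypothetical names, and the announcement decision is reduced to a single endpoint on a single node.

```go
package main

import "fmt"

// endpointState is a simplified, hypothetical model of one endpoint
// at one stage of the pod lifecycle described above.
type endpointState struct {
	stage   string
	exists  bool // false once the endpoint is removed from the endpointslice
	ready   bool
	serving bool
}

// announce reports whether the node would keep being announced for this
// endpoint, based either on the serving condition or on the ready condition.
func announce(s endpointState, useServing bool) bool {
	if !s.exists {
		return false
	}
	if useServing {
		return s.serving
	}
	return s.ready
}

// lifecycle returns the three stages of the flow: running, gracefully
// terminating, and removed.
func lifecycle() []endpointState {
	return []endpointState{
		{stage: "running", exists: true, ready: true, serving: true},
		{stage: "terminating", exists: true, ready: false, serving: true},
		{stage: "removed", exists: false},
	}
}

func main() {
	for _, s := range lifecycle() {
		fmt.Printf("%-12s ready-based=%-5t serving-based=%t\n",
			s.stage, announce(s, false), announce(s, true))
	}
}
```

The only stage where the two strategies disagree is the terminating one, which is exactly the window a graceful shutdown needs.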

Additional Context

N/a

I've read and agree with the following

I've checked all open and closed issues and my request is not there.
I've checked all open and closed pull requests and my request is not there.

I've read and agree with the following

I've checked all open and closed issues and my issue is not there.
This bug is reproducible when deploying MetalLB from the main branch
I have read the troubleshooting guide and I am still not able to make it work
I checked the logs and MetalLB is not discarding the configuration as not valid
I enabled the debug logs, collected the information required from the cluster using the collect script and will attach them to the issue
I will provide the definition of my service and the related endpoint slices and attach them to this issue

leppeK · 2023-09-12T05:27:23Z

Please note that this behaviour only affects services with ExternalTrafficPolicy: Local

zerkms · 2023-09-12T05:51:36Z

@leppeK that's true, I implied it, sorry. Now added it explicitly to the repro steps.

Also thanks to the slack user Merijn Keppel who mentioned the same.

fedepaol · 2023-09-19T07:23:32Z

Yup, makes sense! Thanks for raising this.

Fixes metallb#2074

zerkms · 2023-09-20T01:28:59Z

It was an easy fix, hopefully I didn't miss anything :-)

Fixed metallb#2074 Signed-off-by: Ivan Kurnosov <zerkms@zerkms.com>

It is necessary because the `.ready` flag is set to `False` immediately when the pod starts shutting down, while the pod is still alive, whereas the `.serving` flag stays `True` while the pod is healthy (even while shutting down). So this change unlocks the ability to implement graceful shutdown for pods.

Sample scenario:

1. The pod is healthy and running, then it receives a shutdown signal (e.g. the pod is deleted, or the node is drained).
2. The pod handles the kill signal and starts shutting down gracefully. In this state `.ready = False, .serving = True`.
3. With the new implementation, because `.serving == True`, the pod's IP is still announced, which allows traffic for already established connections to keep flowing to it.
4. Then the pod completes its graceful shutdown ceremony and quits. At this point the endpoint is removed from the endpointslice, and the IP is no longer announced.

Fixed metallb#2074

Signed-off-by: Ivan Kurnosov <zerkms@zerkms.com>


zerkms added the bug label Sep 12, 2023

fedepaol added the help wanted and good first issue labels Sep 19, 2023

zerkms pushed a commit to zerkms/metallb that referenced this issue Sep 20, 2023

Switched from eps.ready to eps.serving

910281b

Fixes metallb#2074

zerkms mentioned this issue Sep 20, 2023

Switched from eps.ready to eps.serving #2088

Merged

zerkms added a commit to zerkms/metallb that referenced this issue Sep 20, 2023

Switched from eps.ready to eps.serving

ffd7eff

Fixed metallb#2074 Signed-off-by: Ivan Kurnosov <zerkms@zerkms.com>

zerkms added a commit to zerkms/metallb that referenced this issue Sep 21, 2023

Switched from eps.ready to eps.serving

928b7f9

Fixed metallb#2074 Signed-off-by: Ivan Kurnosov <zerkms@zerkms.com>

fedepaol closed this as completed in #2088 Sep 26, 2023
