
MetalLB should use the EndpointSlice serving property, not ready #2074

@zerkms

Description

MetalLB Version

0.13.9

Deployment method

Manifests

Main CNI

kube-router

Kubernetes Version

1.26.4

Cluster Distribution

No response

Describe the bug

UPD: all of this is mostly relevant to externalTrafficPolicy: Cluster services, but in my opinion every use of eps.ready should be replaced.

At the moment, throughout the code

    for _, slice := range eps.SlicesVal {
        for _, ep := range slice.Endpoints {
            if !epslices.IsConditionReady(ep.Conditions) {
                continue
            }

the "readiness" of an endpoint slice is determined by the ready property.

Now let's have a look at the EPS conditions documentation:

FIELDS:
   ready	<boolean>
     ready indicates that this endpoint is prepared to receive traffic,
     according to whatever system is managing the endpoint. A nil value
     indicates an unknown state. In most cases consumers should interpret this
     unknown state as ready. For compatibility reasons, ready should never be
     "true" for terminating endpoints.

   serving	<boolean>
     serving is identical to ready except that it is set regardless of the
     terminating state of endpoints. This condition should be set to true for a
     ready endpoint that is terminating. If nil, consumers should defer to the
     ready condition.

   terminating	<boolean>
     terminating indicates that this endpoint is terminating. A nil value
     indicates an unknown state. Consumers should interpret this unknown state
     to mean that the endpoint is not terminating.

So, as you can see, the better-suited field is in fact serving.
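The documented semantics can be expressed directly: prefer serving, fall back to ready when serving is nil, and treat a nil ready as ready. A minimal, self-contained sketch of that check (the EndpointConditions struct here just mirrors k8s.io/api/discovery/v1, and isServing is a hypothetical helper, not an existing epslices function):

```go
package main

import "fmt"

// EndpointConditions mirrors k8s.io/api/discovery/v1.EndpointConditions;
// nil pointers mean "unknown state".
type EndpointConditions struct {
	Ready       *bool
	Serving     *bool
	Terminating *bool
}

// isServing is a hypothetical serving-aware replacement for the ready
// check: prefer serving; if serving is nil, defer to ready; a nil ready
// should be interpreted as ready, per the API docs quoted above.
func isServing(c EndpointConditions) bool {
	if c.Serving != nil {
		return *c.Serving
	}
	if c.Ready != nil {
		return *c.Ready
	}
	return true
}

func boolPtr(b bool) *bool { return &b }

func main() {
	// A terminating endpoint that can still drain connections:
	// ready=false, serving=true, terminating=true.
	terminating := EndpointConditions{
		Ready:       boolPtr(false),
		Serving:     boolPtr(true),
		Terminating: boolPtr(true),
	}
	fmt.Println(isServing(terminating)) // prints "true": keep announcing
}
```

With this check, the terminating-but-still-serving endpoint above keeps counting toward the announcement decision, while a terminated endpoint disappears once it is removed from the slice.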

Why it's important: with the current implementation it's impossible to run a service that implements any graceful shutdown strategy whatsoever. As soon as a pod enters the Terminating state, its eps.ready condition becomes False.

If eps.serving were used instead, then on shutdown the pod would have a chance to stay running and finish serving its clients gracefully.

To Reproduce

  1. Create a pod
  2. Expose it via a service (with externalTrafficPolicy: Local)
  3. Describe the endpointslice for the service and see the status of all 3 condition fields.
  4. Now delete the pod
  5. Quickly describe the endpointslice again

Expected Behavior

MetalLB should not immediately stop announcing a pod that is being terminated; instead it should refer to the serving field.

So the flow should look like this:

  1. The pod receives a delete request; the CRI sends a signal to the process to shut down. At this point the pod is still in the Ready state
  2. EndpointSlice ready=False, EndpointSlice serving=True
  3. MetalLB continues announcing the node IP with the pod
  4. The process completes the graceful shutdown ceremony and quits; the endpoint is removed entirely
  5. MetalLB stops announcing the node IP, as the endpoint no longer exists

With the current implementation it all breaks at step 3.
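The divergence at step 3 can be shown by comparing the two decision policies over that timeline. A small sketch with illustrative helper names (byReady mimics the current ready-based check, byServing the proposed one; neither is an actual MetalLB function):

```go
package main

import "fmt"

// Conditions is a cut-down stand-in for the EndpointSlice conditions.
type Conditions struct{ Ready, Serving *bool }

func bp(b bool) *bool { return &b }

// byReady mimics the current ready-based announce decision
// (nil ready is treated as ready).
func byReady(c Conditions) bool { return c.Ready == nil || *c.Ready }

// byServing is the proposed decision: serving first, then ready.
func byServing(c Conditions) bool {
	if c.Serving != nil {
		return *c.Serving
	}
	return byReady(c)
}

func main() {
	running := Conditions{Ready: bp(true), Serving: bp(true)}   // step 1
	draining := Conditions{Ready: bp(false), Serving: bp(true)} // steps 2-4

	fmt.Println("running: ", byReady(running), byServing(running))   // both true
	fmt.Println("draining:", byReady(draining), byServing(draining)) // false vs true
}
```

For the draining endpoint, byReady returns false (announcement withdrawn mid-shutdown) while byServing returns true (traffic keeps flowing until the endpoint is actually removed), which is exactly the behavior the expected flow above describes.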

Additional Context

N/a

I've read and agree with the following

  • I've checked all open and closed issues and my request is not there.
  • I've checked all open and closed pull requests and my request is not there.

I've read and agree with the following

  • I've checked all open and closed issues and my issue is not there.
  • This bug is reproducible when deploying MetalLB from the main branch
  • I have read the troubleshooting guide and I am still not able to make it work
  • I checked the logs and MetalLB is not discarding the configuration as not valid
  • I enabled the debug logs, collected the information required from the cluster using the collect script and will attach them to the issue
  • I will provide the definition of my service and the related endpoint slices and attach them to this issue
