Metallb should use Endpointslices `serving` property, not `ready` #2074
Please note that this behaviour only affects services with `ExternalTrafficPolicy: Local`
@leppeK that's true, I implied it, sorry. Now added it explicitly to the repro steps. Also thanks to the slack user
Yup, makes sense! Thanks for raising this
zerkms
pushed a commit
to zerkms/metallb
that referenced
this issue
Sep 20, 2023
It was an easy fix, hopefully I didn't miss anything :-)
zerkms
added a commit
to zerkms/metallb
that referenced
this issue
Sep 20, 2023
Fixed metallb#2074 Signed-off-by: Ivan Kurnosov <zerkms@zerkms.com>
zerkms
added a commit
to zerkms/metallb
that referenced
this issue
Sep 21, 2023
Fixed metallb#2074 Signed-off-by: Ivan Kurnosov <zerkms@zerkms.com>
zerkms
added a commit
to zerkms/metallb
that referenced
this issue
Sep 21, 2023
It is necessary because `.ready` flag is set to `False` immediately during shutdown of the pod, while pod is still alive. And `.serving` flag is `True` while pod is healthy (even while shutting down). So this change unlocks the ability to implement graceful shutdown for pods.

Sample scenario:

1. Pod is healthy and running, then it receives a shutdown signal (eg: pod is just deleted, or the node is drained)
2. Pod handles the kill signal and starts gracefully shutting down. At this state `.ready = False, .serving = True`
3. With new implementation - because `.serving == True` the pod's IP is still announced, which allows traffic for already established connections to freely flow to it
4. Then the pod completes its graceful shutdown ceremony and quits. At this point endpoint is removed from the endpointslice, and the IP is removed from announces.

Fixed metallb#2074

Signed-off-by: Ivan Kurnosov <zerkms@zerkms.com>
zerkms
added a commit
to zerkms/metallb
that referenced
this issue
Sep 21, 2023
github-merge-queue bot
pushed a commit
that referenced
this issue
Sep 26, 2023
fedepaol
pushed a commit
to fedepaol/metallb
that referenced
this issue
Oct 19, 2023
novad03
pushed a commit
to novad03/k8s-meta
that referenced
this issue
Nov 25, 2023
MetalLB Version
0.13.9
Deployment method
Manifests
Main CNI
kube-router
Kubernetes Version
1.26.4
Cluster Distribution
No response
Describe the bug
UPD: all this is mostly relevant to `ExternalTrafficPolicy: Local` services, but in my opinion all `eps.ready` uses should be replaced.

At the moment, throughout the code

metallb/speaker/layer2_controller.go
Lines 67 to 71 in a5f74ed

the "readiness" of an endpoint slice is determined by the `ready` property.

Now let's have a look at the EPS conditions documentation:

So, as you can see, the better suited field is in fact `serving`.

Why it's important: with the current implementation it's impossible to implement a service that uses any graceful shutdown strategy whatsoever. As soon as the pod enters the terminating state, its `eps.ready` condition becomes `False`.

If `eps.serving` were used instead, then on pod shutdown the pod would have a chance to stay running and finish serving clients gracefully.

To Reproduce
- ([...] `externalTrafficPolicy: Local`)
- [...] `endpointslice` for the service and see the status of all 3 condition fields.
- [...] `endpointslice` again

Expected Behavior
Metallb should not immediately stop announcing the pod that is currently being terminated, but should instead refer to the `serving` field.

So the flow should look like this:

With the current implementation it all breaks at step 3.
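To make the difference concrete, here is a minimal sketch of the two checks side by side. It is not MetalLB's actual code: `EndpointConditions` is a local stand-in for the conditions carried by an EndpointSlice endpoint, and `announceByReady`, `announceByServing` and `ptr` are hypothetical names made up for illustration. Treating an unset `Ready` as ready, and falling back to `Ready` when `Serving` is unset, are assumptions of this sketch.

```go
package main

import "fmt"

// EndpointConditions is a local, hypothetical stand-in for the conditions
// of an EndpointSlice endpoint. It is not the real Kubernetes client type.
type EndpointConditions struct {
	Ready       *bool
	Serving     *bool
	Terminating *bool
}

// announceByReady models the behaviour this issue describes: the endpoint
// counts only while Ready is true. An unset Ready is treated as ready
// (an assumption of this sketch).
func announceByReady(c EndpointConditions) bool {
	return c.Ready == nil || *c.Ready
}

// announceByServing models the requested behaviour: the endpoint counts
// while Serving is true, even if it is terminating. When Serving is unset
// the sketch falls back to the Ready-based check.
func announceByServing(c EndpointConditions) bool {
	if c.Serving == nil {
		return announceByReady(c)
	}
	return *c.Serving
}

// ptr is a small helper for building *bool values.
func ptr(b bool) *bool { return &b }

func main() {
	stages := []struct {
		name string
		cond EndpointConditions
	}{
		{"running", EndpointConditions{ptr(true), ptr(true), ptr(false)}},
		{"terminating, still healthy", EndpointConditions{ptr(false), ptr(true), ptr(true)}},
		{"terminating, unhealthy", EndpointConditions{ptr(false), ptr(false), ptr(true)}},
	}
	for _, s := range stages {
		fmt.Printf("%-28s ready-based=%-5t serving-based=%t\n",
			s.name, announceByReady(s.cond), announceByServing(s.cond))
	}
}
```

The middle stage is the one this issue is about: a pod that is shutting down gracefully is dropped by the `ready`-based check but kept by the `serving`-based one.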
Additional Context
N/a
I've read and agree with the following