ALB sending requests to pods after ingress controller deregisters them leading to 504s #1064
I should add that I've implemented a preStop delay on my pod to allow deregistration to occur prior to pod shutdown. This seems to be functioning correctly: I can see the deregistration in the controller logs, and several seconds later I see a SIGTERM on my pod. The issue seems to be between the ALB and the ingress controller.
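For reference, a preStop delay like the one described is just a sleep in the container lifecycle hooks. A minimal sketch (container name, image, and duration are illustrative, not the commenter's actual values):

```yaml
# Pod template fragment: delay SIGTERM so the controller has time to
# deregister the target from the ALB before the app begins shutting down.
spec:
  containers:
    - name: app                     # hypothetical container name
      image: example/app:latest     # hypothetical image
      lifecycle:
        preStop:
          exec:
            # Requires a sleep binary in the image.
            command: ["sleep", "15"]
```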
I'm definitely experiencing the same thing. PowerShell preStop commands don't seem to like sleeping first either, so I had to really rig that up. Is there any insight as to why terminated pods still get sent requests? It happens whether the target type is IP or instance.
@jorihardman is this happening every single time you update a deployment, or only sometimes?
@nicholasgcoles To be honest, when I was investigating this issue my sample size was only about 5 deployments, each rolling 4 pods. I saw at least one 504 for each of these. I didn't have constant traffic for each deploy, so honestly it wasn't a perfect experimental setup. Still, the logs seemed to me to be a smoking gun for what I would expect to be an impossible situation, so I figured it was worth raising the question. Happy to investigate further with some guidance.
This looks similar to #905
We are able to reproduce this 100% of the time with target-type ip.
Checking back in here. I think I was able to resolve this issue with the preStop/termination settings below. Couple of things I changed from my initial config:

```yaml
# Extend the pod's shutdown grace period from the default of 30s to 60s.
# This goes in the pod template spec.
terminationGracePeriodSeconds: 60
# Increase the sleep before SIGTERM to 25s. I had this at 5s previously and it wasn't enough.
lifecycle:
  preStop:
    exec:
      command: ["sleep", "25"]
```

Extending the sleep time lets the pod keep answering the few requests the ALB still sends after the deregistration time, instead of rejecting them. My pod needs at most 30s to gracefully answer all requests and terminate, so the 25s sleep + 30s = 55s, which fits inside the extended 60s grace period. I used https://github.com/JoeDog/siege to send constant concurrent requests to the load balancer while deploying and achieved 100% availability.
I'm going to close this issue as it seems to be an AWS problem and not related to the controller. The workaround is to add a sufficient preStop sleep to allow the ALB to complete deregistration. I did 25s, which might be overkill, but it seemed to mitigate the problem.
Previous testing was with our staging environment. When I shipped the changes to production, I still see 504s from the ALB sending requests to terminating pods.
Agreed. I'm running Windows containers, but all of the theory is the same. Actually using a 60s sleep at this point and still getting 504s from the ALB during rollout.
Strangely, I haven't gotten a 504 during deploys for the last 48 hours, and I've done 3 or 4 deploys at peak traffic times. I did tweak the readiness checks on my pod to make them ready quicker.
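A readiness probe tuned to mark new pods ready sooner might look like the following. All field values here are illustrative assumptions, not the commenter's actual settings:

```yaml
# Container fragment: faster readiness so new pods register with the
# ALB target group sooner during a rollout.
readinessProbe:
  httpGet:
    path: /healthz          # hypothetical health endpoint
    port: 8080
  initialDelaySeconds: 2    # lowered so the pod is probed soon after start
  periodSeconds: 5          # probe frequently to shorten time-to-ready
  failureThreshold: 2
```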
Naturally, as soon as I signal success, I see another spurt of 502s. There's definitely still a deregistration race condition here. This time I got 502s due to a pod terminating prior to being deregistered. This seems to be pretty non-deterministic at this point: sometimes alb-ingress-controller keeps up with changing targets, and sometimes it doesn't.
Any ETA on this?
For those who use Linkerd, here's an example of how to avoid the problem: linkerd/linkerd2#3747 (comment). I am not an expert in ALB internals, but I suspect this happens because of the load balancer's distributed (and, maybe, multi-tenant) nature. It takes time to propagate changes to all of its instances, so the ALB will keep sending some traffic for a while even after a target has been asked to deregister. The overall "sleep" configuration for the preStop hook should therefore be calculated with that propagation delay in mind.
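A rough way to express that budget in the pod spec, assuming ~20s for the ALB to propagate deregistration and ~30s for in-flight requests to drain (both numbers are assumptions, not measured values):

```yaml
# Pod template fragment (illustrative numbers only):
spec:
  # Must cover the preStop sleep PLUS the app's drain time: 20s + 30s.
  terminationGracePeriodSeconds: 50
  containers:
    - name: app                      # hypothetical container name
      image: example/app:latest      # hypothetical image
      lifecycle:
        preStop:
          exec:
            # sleep >= assumed ALB deregistration propagation time (~20s)
            command: ["sleep", "20"]
```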
Have you guys tried disabling WAF? After disabling WAF, the longest-running process was down to a max of 5 sec (it still needs a sleep of 10 sec in the preStop hook). I have verified that the issue still exists.
@aditya-pr, yes, and it is not enough to get a smooth rolling update.
@aditya-pr Amazing dashboard you've got there. Wish I had that when I was digging into this issue. To clarify, are you saying that the deregistration lag persists even with WAF disabled?

Yes, there is lag even after disabling WAF. There are other AWS API calls which are still being retried. Adding all endpoints of an ingress to the queue whenever any endpoint updates must be making this worse. A deregistration delay of even 5 seconds means the ELB is sending to a terminated pod, i.e. a 504 Gateway Timeout error.
Issues go stale after 90d of inactivity.
/remove-lifecycle stale
Rotten issues close after 30d of inactivity.
@fejta-bot: Closing this issue.
Having the same issue here. I have a preStop hook with a 60s sleep, but I still get 504s from the ALB after deleting a pod manually or scaling down the deployment. The timed-out requests are always received 8s after the deregistration started, and time out 10s after that, so 18s after the deregistration started. At that time, the pod still works and replies correctly to requests that I send to it directly... Has anyone here found a solution to this?
@Elyahou I don't feel too bad, because this has been an ongoing issue for over a year at this point, but I moved on to kong-ingress-controller and have been very happy with it. The things that ALB handled nicely (for me, primarily the AWS certs) have now all been handled by adding a couple more pieces, namely cert-manager and external-dns, so I no longer worry about the ALB lifecycle conflicts. ALB ingress seemed like an awesome idea and was dead simple to use, but the inability to sync up the lifecycles between pod and ALB delivery just became too much.
Any update on this? This should be reopened given the comments here, IMO.
Is it possible that the LB controller removes the security group entries even though the target is still deregistering?
I think I might be currently encountering this issue. I'll have to do further tests to verify, but it seems like this issue should be reopened.
Still an issue, by the way. For those who come here after me: it looks like the solution AWS suggests is to rely on health checks rather than on target deregistration: https://aws.amazon.com/blogs/containers/how-to-rapidly-scale-your-application-with-alb-on-eks-without-losing-traffic/
@123BLiN According to the sample from AWS, a preStop hook is still required. I am curious whether the "health-check + sleep xx" hook has any benefit over a simple "sleep xx" hook. According to my tests, there is no difference.
This is an issue for us too. A BIG one. It's incredible to me how this has been allowed to go on this long without a concrete fix or resolution 😂. Talk about "Bias for Action" and "Ownership" 🎃. AWS philosophizes and writes literature on the 6 Well-Architected Pillars, and also releases a Load Balancer Ingress Controller that regularly throws these errors... and then lets the issue just kinda linger and languish for years, and then tells the customer that the "fix" is a bunch of workarounds that impact the customer's Kubernetes cluster in other ways, like delayed pod termination, which then impacts the scale-speed and cost characteristics of the cluster. 👍🏼
I guess it isn't the health check but another setting described there that matters. I believe that the remaining 50x errors are caused by this, as the author wrote in the sentence just before the Conclusion chapter.
We found a useful pattern for this, although it's fairly bespoke (using s6-overlay). I'm happy to provide a sample implementation once I test it a little while longer. Anyone on this thread using s6-overlay, or at least willing to? :) +1 so I know whether I should waste my breath or not.
Could you share more details?
/remove-lifecycle rotten
/reopen
@leonardocaylent: You can't reopen an issue/PR unless you authored it or you are a collaborator.
Why was this issue marked as closed if it is still happening after 5 years?
It definitely continues to happen. The posted blog/article regarding how to do this uses an example with a preStop sleep. We worked around it a different way, but it required non-trivial effort and a change to the way we fundamentally configure and manage pods and the application(s) inside them.
Hi @armenr, is the article you're talking about this one? My concern is that this is an AWS ALB issue that should be in "Open" status, not closed. I saw some workarounds that move to the NGINX ingress class just because of this issue, which doesn't seem like a real solution; it's just a change of controller. In the article I shared they apply two different changes together: 1) a preStop sleep, and 2) usage of pod readinessGate injection. You mentioned that you managed it differently from the application side. Can you share some details if possible? (i.e. some NestJS apps need to call app.enableShutdownHooks() in the code, others change the sleep 15 to sleep 60)
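For context, the readinessGate injection mentioned above is enabled per namespace in the newer aws-load-balancer-controller. A minimal sketch, assuming the v2.x controller rather than the v1 alb-ingress-controller this issue was originally filed against (namespace name is hypothetical):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: my-app        # hypothetical namespace
  labels:
    # aws-load-balancer-controller (v2.x) injects target-group readiness
    # gates into pods created in namespaces carrying this label, so a pod
    # is only marked Ready once it is healthy in the ALB target group.
    elbv2.k8s.aws/pod-readiness-gate-inject: enabled
```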
What we do is the following: we run the application as a "multi-process" container under s6-overlay. For us, resource budgeting and management (limits on mem/cpu) is imperative, so shoveling in additional crap like sidecars is just a headache to have to worry about. As such, a "multi-process" container with s6-overlay was (contrary to our skepticism and resistance toward it) a really lightweight and useful solution.


I have the following ingress defined; the relevant bits are that it is target-type ip with a 30-second deregistration_delay.
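The actual manifest didn't survive in this thread. A minimal ingress carrying the two bits described, using the v1 alb-ingress-controller's annotation scheme, might look like this (names and ports are hypothetical):

```yaml
apiVersion: extensions/v1beta1        # API group in use by Ingress at the time (2019)
kind: Ingress
metadata:
  name: my-app                        # hypothetical name
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/target-group-attributes: deregistration_delay.timeout_seconds=30
spec:
  rules:
    - http:
        paths:
          - path: /*
            backend:
              serviceName: my-app     # hypothetical service
              servicePort: 8080
```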
When I delete a pod, either manually or as part of a rolling deploy, I see 504s returned from ALB. 504s are returned when ALB cannot form a connection to its target within 10s. Here is one such message from the ALB logs:
There's a lot going on there, but the important part is that the request is received at 2019-11-06T01:35:29.436000Z and the error is emitted 10s later at 2019-11-06T01:35:39.438256Z.
I investigated the ingress controller logs and I can see that the pod in question, 172.18.158.197:8080, is deregistered at 2019-11-06T01:35:25.891546Z, 4 seconds prior to when the above request is received.
My understanding is that once a target is set to "deregistering", the ALB will not forward it any more requests. It's unclear to me how this request seems to be breaking that rule - any thoughts?