Application experiences connection refused for all outbound requests during startup #2704
Comments
Thanks @dwj300. We're hopeful that Kubernetes will provide a better primitive so that there's a principled way to achieve this, but I have a workaround you can try in the meantime. A few weeks ago, I wrote a little tool that you can use to prevent your application from running until the proxy is ready: https://github.com/olix0r/linkerd-await

I've tested it myself and it seems to work well, but since it needs to be added to your application container images, it might not be an ideal solution. Ideally, we'd improve this to only block on readiness when Linkerd is injected, but we don't yet add environment variables to non-proxy containers.

Let us know if this workaround is feasible for you and if you think it would be useful to pull into the project more officially.
Also noticing this issue. It's not urgent, but fixing it would make the deployment cleaner. Currently my container boots, fails to connect to a RabbitMQ server, and then restarts. The second attempt usually succeeds.
@olix0r sorry for the slow reply, finally got around to trying this and it works great! Maybe just add a snippet to the README about recommended uses in Kubernetes. For us, we now build our images with
|
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions. |
Hi @olix0r, I was wondering if there is any update on this. We experience the same connection reset errors in our apps. I reviewed the workaround, but I really do not want to put the binary in all of our container images.
@poochwashere unfortunately, there's not much Linkerd can do here. The options are:
|
Would it not be possible for the |
When a pod is launched, Init-containers like |
Would it be possible to add a post-start lifecycle hook to the linkerd-proxy container that blocks until the proxy is ready to serve requests?
@Tolsto The issue is that we need to block application containers from starting until the proxy has started. I don't think post-start hooks can do this. Per the docs:
So, even if we modified application containers with a hook that depends on the proxy, I don't think this could reliably block the container from starting. That said, if you find something that works, please share an example!
@olix0r I didn't mean to use post-start hooks for the application containers but only for the linkerd-proxy container. Kubernetes starts containers sequentially and a post-start hook will block that process until the post-start hook has completed. This post here describes the idea: https://medium.com/@marko.luksa/delaying-application-start-until-sidecar-is-ready-2ec2d21a7b74 The relevant code is: https://github.com/kubernetes/kubernetes/blob/537a602195efdc04cdf2cb0368792afad082d9fd/pkg/kubelet/kuberuntime/kuberuntime_manager.go#L827-L830 and |
Bug Report
What is the issue?
(More of a question, or a request to improve docs): During the initialization of our application, we make a handful of HTTP calls to external services. While the proxy is initializing (acquiring its cert, etc.), these calls all receive a connection refused error. We do have retries in place, but we run out of attempts and then fail fatally. We could simply add more retries, but we were wondering what the recommended pattern is for waiting for the proxy to be ready before making these calls. Ideally, we could instruct Kubernetes not to start our application container until the proxy gives the OK, but I don't believe that is supported today. Should we probe the pod's health endpoint until it is ready? Is there a better way to do this?
How can it be reproduced?
Deploy a pod that makes some HTTP outbound calls (in our case, to the open internet).
Logs, error output, etc
Get https://<URL>: dial tcp <IP>:<PORT>: connect: connection refused
linkerd check output
Environment
Possible solution
Additional context