App container unable to connect to network before sidecar is fully running #11130
Comments
|
@esnible pls feel free to add things I missed. cc @GregHanson |
|
Currently I tell people to put the following into their .yaml: It would be better if networking was ready when the app container started. A novel approach would be to slow down the app container until networking was available. A hook could set the CPU for containers other than the sidecar to use Another idea is to have the init container include pilot-agent and fetch /etc/istio |
|
This may be a duplicate of #9454 |
|
This may be a duplicate of #4341 |
|
Hey, we here at Monzo have open sourced our solution to this sequencing problem: |
|
Removing myself because I am not a sidecar networking guru. That is what we need for this item. |
|
Long term fix is #11366 or maybe kubernetes/kubernetes#65502 |
|
As we have |
|
Consider configuring a postStart hook on the app container to check envoy status, such as: |
|
I think this makes sense. |
|
esnible's solution worked for us for a long period. Unfortunately the issue has started to occur again, and it is even worse. Could anyone shed light on this? |
|
This issue has been automatically marked as stale because it has not had activity in the last 90 days. It will be closed in the next 30 days unless it is tagged "help wanted" or other activity occurs. Thank you for your contributions. |
|
It's been a year since this issue (or related issues) was opened. It would be nice if deployments did not have to know that they need to wait for the mesh's sidecar to be ready. That link should not exist; waiting for the sidecar is becoming a best practice, and it's a known, common problem that leads to weak UX and weak onboarding of new users. |
|
FYI - this may be useful for those out there running Istio older than v1.7 which has |
|
If the pod/app is very sensitive to network connectivity at startup, I would recommend adding the following annotation to the application, as @AntonySmirnoff stated:
    annotations:
      proxy.istio.io/config: '{ "holdApplicationUntilProxyStarts": true }'
It ensures that the proxy is started up and functioning before the application starts. It solved the problem for me. |
|
+1 |
|
Yes, this should be closed now I think! :) |
|
I don't think this is done, it still doesn't support init containers or use stable APIs. Can we keep this open to track the long term solution? |
|
This may be a FAQ but how does |
|
Containers do *start* sequentially. Starting a container is usually nearly
instant, though, so it looks parallel. The issue you linked is about waiting
until previous ones are *ready*, not started.
The trick, which is really a hack relying on implementation details, is that
a container is not considered "started" until its postStart hook has
finished. So the hook waits for envoy to be ready.
On Wed, Apr 21, 2021 at 3:17 PM Laurent Demailly wrote:
This may be a FAQ, but how does holdApplicationUntilProxyStarts work?
Looking at the code, it seems what it does is put the envoy proxy container
first in the pod (instead of last); how is that holding the application?
AFAIK containers aren't guaranteed to start sequentially
(kubernetes/kubernetes#65502 is still open), or am I missing something?
Am I missing some probe interactions that achieve the goal of starting the
app only after envoy is fully up and ready?
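The ordering trick discussed in this comment can be pictured with a toy model. The sketch below is only an illustration, not the kubelet's actual code: `Container` and `start_in_order` are hypothetical names, and the model assumes nothing more than what the comment says, namely that containers are started one after another and that a container with a blocking start hook (postStart in Kubernetes) delays the start of the next one until the hook returns.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Container:
    name: str
    # Optional blocking hook, standing in for a postStart handler.
    post_start: Optional[Callable[[], None]] = None


def start_in_order(containers: List[Container]) -> List[str]:
    """Toy model: start containers one by one, and do not move on to the
    next container until the current one's start hook has returned."""
    events = []
    for c in containers:
        events.append(f"start {c.name}")
        if c.post_start is not None:
            c.post_start()  # blocks here, e.g. while waiting for envoy
            events.append(f"postStart done {c.name}")
    return events


log = []
# The sidecar goes first and carries the hook; the app has no hook.
proxy = Container("istio-proxy", post_start=lambda: log.append("envoy ready"))
app = Container("app")
print(start_in_order([proxy, app]))
# ['start istio-proxy', 'postStart done istio-proxy', 'start app']
```

In this model, putting the sidecar first only helps because its hook blocks; without the hook the app would start right after the proxy process is launched, which is the race the original question describes.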
|
|
Thanks a lot @howardjohn. So if I understand correctly, the istio-proxy container has a hook where the readiness check is disguised as an am-I-started hook. (I didn't know about the hack where k8s waits for some feedback before starting the next container, so I thought this was just changing/improving the race condition.) Good to know it can be relied upon! Any known downside (aside from the kubectl exec thing, which anyway seems to ask for a pod)? If there is no downside, why isn't it the default? (I guess some apps may do a lot of internal, maybe disk-based, work at startup and thus don't need to wait for the network and can benefit from parallel execution without waiting for pilot/envoy, but it seems they could turn that on themselves; most apps/services probably need to contact other systems during startup.) |
|
|
|
It's not just in-cluster, it's all networking, and it helps adoption to not have gotchas. |
Describe the feature request
We had users who spent a very long time debugging why their app container initially stops working when a sidecar is used in Istio. They found that, before the envoy proxy is ready and running, the app container could not reach the network for simple things like cloning a file from GitHub. It is hard to debug this because when they exec into the container after the deployment is running, everything works fine.
Describe alternatives you've considered
The current workaround used by folks is to put a big sleep, like 20 or 30 seconds, in their app container to give envoy enough time to start up.
This is fine once they discover the issue and understand how istio works better, but it can take days for them to discover the issue.
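As an alternative to a fixed sleep, an app's entrypoint could poll the sidecar until it reports ready and give up after a bounded wait. The following is a minimal Python sketch of that idea, not an Istio-provided API: `sidecar_ready` and `wait_for_sidecar` are hypothetical helper names, and the readiness URL `http://localhost:15021/healthz/ready` is an assumption that depends on the Istio version and configuration in use.

```python
import http.client
import time
import urllib.request

# Assumed sidecar readiness URL; check the port and path for your Istio version.
READY_URL = "http://localhost:15021/healthz/ready"


def sidecar_ready(url=READY_URL, timeout=1.0):
    """Return True if the endpoint answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeouts, and non-2xx responses all land here
        # (urllib's URLError and HTTPError are subclasses of OSError).
        return False


def wait_for_sidecar(probe=sidecar_ready, max_wait=30.0, interval=0.5,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll `probe` until it returns True or `max_wait` seconds elapse.

    Returns True as soon as the probe succeeds, False on timeout.
    """
    deadline = clock() + max_wait
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

An entrypoint might call `wait_for_sidecar()` before its first outbound request and exit with an error if it returns `False`. Compared with a fixed 20 or 30 second sleep, this returns as soon as the proxy is ready and makes the failure explicit when it is not.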
How can we make the experience better?
Can we provide some startup hook so the app container won't start until the envoy sidecar is ready, for cases where the app container starts very fast and requires network connectivity?