Network not available first milliseconds of the pod #9454
Comments
Yes, we are well aware of this issue and are working on a fix, but it will take some time. The root problem is that the sidecar and the app start at the same time. There are a few workarounds and ways to
A workaround until this functionality is implemented: put the following into your .yaml:
You have two options
This issue has been automatically marked as stale because it has not had activity in the last 90 days. It will be closed in the next 30 days unless it is tagged "help wanted" or other activity occurs. Thank you for your contributions.
Please add the label to prevent closing; it's still an issue.
Network-specific problem, so removing the environments label.
This is a duplicate of #11130. It is definitely something we want to fix and are working on fixing, but let's keep track of it in a single issue to make things simpler.
Describe the bug
When I deploy Istio 1.0.2 on GKE, I notice that networking is unavailable during the first millisecond or so after my main application container starts.
This is causing issues with some of the libraries I use. For example, if I import the Stackdriver Tracing exporter library, it tries to make a network call to the GCE metadata server (169.254.169.254). Because that call fails, the trace exporter code assumes I'm running outside GCE, so it never exports the metrics.
(For me, this is primarily an issue with the Stackdriver client libraries for Tracing, Profiler, Debugger, etc., but I assume it would impact other programs as well.)
I think this is happening because there's no ordering between the istio-proxy container and my main application container.
Expected behavior
Network should be available when my program starts executing.
Steps to reproduce the bug
I don't have a minimal repro at the moment, but I could spend a few hours to get one if the problem is not clear or not reproducible. We ended up adding retries to such network calls at the beginning of the programs in the https://github.com/GoogleCloudPlatform/microservices-demo/ repository because of this.
This is noticeable only in Go programs (probably because they start up super fast). The other services I have (Java, Python, C#) don't expose this problem, I assume because those programs take longer to start up and the first millisecond makes all the difference. That said, I haven't tried C/C++/Rust etc., but I assume it would happen in those languages with low startup overhead, too.
This problem doesn't happen on GKE without Istio. It only shows up with Istio, and it's pretty reproducible (>50%). I guess process startup order is non-deterministic, so sometimes istio-proxy starts faster than my main process, the library detects it's on GCE, and everything works correctly. But half of the time it doesn't.
Version
Kubernetes 1.9, Istio 1.0.2
Installation
Create vanilla GKE 1.9 cluster, apply istio-demo.yaml from release tarball.
Environment
Google Kubernetes Engine
Cluster state
Since I applied the generic istio-demo.yaml on an empty/stock cluster, omitting cluster dump for now. Let me know if it is needed.