
Excessive pod startup times #1297

Closed
jldec opened this issue Jun 20, 2018 · 26 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@jldec

jldec commented Jun 20, 2018

Expected Behavior

Revisions scaled to zero should take under a second to start up and serve traffic
(assuming images are cached)

Actual Behavior

Observed startup times for the helloworld sample appear to be >10s

Steps to Reproduce the Problem

  1. Run helloworld, confirm that it handles requests, and then wait for it to scale to 0.
  2. Send a request and measure the response time (a rough timing sketch follows below).
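
A rough way to time step 2 (the ingress address and Host header below are placeholders for however your helloworld route happens to be exposed; this is just a sketch, not part of the repro):

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

func main() {
	// Placeholder ingress address and host header; substitute the values
	// your helloworld route actually uses.
	req, err := http.NewRequest("GET", "http://127.0.0.1:32380/", nil)
	if err != nil {
		panic(err)
	}
	req.Host = "helloworld.default.example.com"

	start := time.Now()
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Printf("status=%d cold-start response time=%v\n", resp.StatusCode, time.Since(start))
}
```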

Additional Info

I'm running a development build on localhost in Docker for Mac; I expect the performance of release builds on GKE to be different.

Specific issues

@markusthoemmes
Contributor

Any idea where the time is spent? I've been looking into the pod startup latency, which might account for a fair share of the observed latency.

@jldec
Author

jldec commented Jun 21, 2018

No specifics other than the ideas (Istio init container and sidecar injection) discussed in Slack. It would be great to get some metrics to dig into this. Cold-start times are a concern for FaaS use cases.

/assign @georgeharley

@josephburnett
Contributor

josephburnett commented Jun 21, 2018

@nikkithurmond and I have been deep-diving on this. The pod gets scheduled and put into the Endpoints object as unready within a second. Surprisingly, it takes a long time for the readiness checks to go healthy (6 seconds: 3 retries at 2-second intervals). The app under test starts in 600 ms, so the latency is somewhere in 1) starting the Docker containers, 2) initialization of Envoy, or 3) Docker networking joining the containers together into the pod.
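
(For reference, the cadence described above corresponds to a readiness probe's periodSeconds/failureThreshold. A minimal client-go sketch, assuming a recent k8s.io/api and purely illustrative values, not what Knative or Istio actually configure:)

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
)

func main() {
	// Illustrative only: a probe polled every 2s with up to 3 allowed failures
	// can add roughly 6s before the pod is marked Ready; a tighter
	// PeriodSeconds shrinks that window at the cost of more probe traffic.
	probe := corev1.Probe{
		ProbeHandler: corev1.ProbeHandler{
			HTTPGet: &corev1.HTTPGetAction{
				Path: "/healthz", // hypothetical health endpoint
				Port: intstr.FromInt(8080),
			},
		},
		PeriodSeconds:    1,
		TimeoutSeconds:   1,
		FailureThreshold: 3,
	}
	fmt.Printf("probe every %ds, up to %d failures allowed\n",
		probe.PeriodSeconds, probe.FailureThreshold)
}
```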

@josephburnett
Contributor

Another part of the latency is 2-5 seconds for iptables programming. Once the pod goes healthy, the Endpoints object updates the ready set and kube-proxy notices the change. Then kube-proxy programs iptables with the pod's IP address behind the service.

We will also be digging into this time interval to figure out where we're spending our time. Is it in batched updates from the kubelet? Is it in the actual iptables programming?
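
(One way to put timestamps on this interval is to watch the revision's Endpoints object and log when the pod's IP moves into the ready address set; the gap between that event and the first successful request is roughly the kube-proxy/iptables propagation. A sketch using client-go, with the namespace and Endpoints name as placeholders:)

```go
package main

import (
	"context"
	"fmt"
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from the local kubeconfig; the Endpoints name and
	// namespace below are placeholders for the revision's actual Service.
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	cs := kubernetes.NewForConfigOrDie(cfg)

	w, err := cs.CoreV1().Endpoints("default").Watch(context.TODO(), metav1.ListOptions{
		FieldSelector: "metadata.name=helloworld-00001-service",
	})
	if err != nil {
		panic(err)
	}
	// Timestamp every transition so the "pod Ready" -> "ready endpoint" gap
	// can be compared with when requests actually start succeeding.
	for ev := range w.ResultChan() {
		ep, ok := ev.Object.(*corev1.Endpoints)
		if !ok {
			continue
		}
		ready, notReady := 0, 0
		for _, ss := range ep.Subsets {
			ready += len(ss.Addresses)
			notReady += len(ss.NotReadyAddresses)
		}
		fmt.Printf("%s %s ready=%d notReady=%d\n",
			time.Now().Format(time.RFC3339Nano), ev.Type, ready, notReady)
	}
}
```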

@josephburnett
Contributor

josephburnett commented Jun 21, 2018

@nikkithurmond, are any of your results in a state where they would be useful to share? It would be nice to get a second opinion on our findings. And maybe to divide the work.

@thockin

thockin commented Jun 21, 2018

The iptables programming should happen very quickly on average (some burst is allowed), but it is rate-limited and batched if the burst rate is exceeded. If there are a lot of updates happening, it could hit the batch backoff, which I think is 10 seconds, so 5 seconds on average makes sense.
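
(A toy back-of-the-envelope check of that "5 on average": if an endpoint update lands at a uniformly random point inside a 10-second sync backoff window, the expected wait is about half the window. This only simulates the behaviour described above; it is not kube-proxy's code, and the 10s backoff is the value assumed from the comment:)

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

func main() {
	const backoff = 10 * time.Second // assumed batch backoff
	const samples = 100000

	var total time.Duration
	for i := 0; i < samples; i++ {
		// Each update waits for whatever remains of the current backoff window.
		total += time.Duration(rand.Int63n(int64(backoff)))
	}
	fmt.Printf("average extra wait for the iptables sync: %v\n", total/samples)
}
```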

@thockin

thockin commented Jun 21, 2018

This could add fuel to the fire for changes in kube-proxy, or even for recommending IPVS mode rather than iptables mode (which should program faster).

@jldec
Author

jldec commented Jun 21, 2018

@thockin thanks for the info - do you know if the iptables programming backoff/retry wait time is configurable? For some use cases (e.g. user-facing functions) a 10s wait will feel like a denial of service.

@nikkithurmond
Contributor

https://docs.google.com/spreadsheets/d/1QfhSPvNu_LXzTx3cMjLUkpdr1UGT7krUAKbmU1WLPO4/edit?usp=sharing

Here's a detailed timeline of a single call to a scaled-to-zero revision.

@josephburnett
Contributor

Thanks @nikkithurmond, this is really great data!

@evankanderson
Member

/kind bug

@google-prow-robot google-prow-robot added the kind/bug Categorizes issue or PR as related to a bug. label Jun 22, 2018
@jeremyeder

@nikkithurmond @josephburnett hey - can you provide me access to that spreadsheet? jeder@redhat.com.

And also, can you share any tooling used to generate the timeline?

What was the environment setup? (hardware/software versions)

@glyn
Contributor

glyn commented Jun 26, 2018

This seems related to the more specific bug #1345, "Envoy adds 5-6 seconds to pod startup".

@josephburnett
Contributor

@jeremyeder, done. And I shared with elafros-dev@googlegroups.com as well.

@glyn, #1345 is part of this. But there are other issues as well, such as the 3-second delay propagating the ready pod's IP address to all nodes. We'll cut a separate issue for that specifically as we get more details.

How about we keep this issue for discussing the overall problem?

@glyn
Contributor

glyn commented Jun 27, 2018

@josephburnett

How about we keep this issue for discussing the overall problem?

+1. Let's post links to the more specific issue here so we can keep track of what's going on.

@josephburnett
Contributor

@glyn, roger. I've started a list in the top-level issue description.

@georgeharley

Analysis of istio-proxy container logs taken from a number of cold-start requests reveals a consistent 5-second pause during the proxy's startup phase. During those 5 seconds all Envoy initialization activities appear to wait before eventually resuming to reach the state where the container can proxy pod traffic. The duration of the pause is hard-coded.

In a little more detail: when the Envoy GrpcMuxImpl is started, it tries to open a new gRPC bidi stream to request resource discovery details. That initial invocation of GrpcMuxImpl::establishNewStream() fails. Part of the error handling for that failure is to start a retry timer which, when it expires, makes a further call to GrpcMuxImpl::establishNewStream(). The duration of that timer is fixed at 5 seconds (the actual constant used is RETRY_DELAY_MS, defined in grpc_mux_impl.h). The retry timer actually gets started twice in quick succession around this time (within a millisecond or two) because the GrpcMuxImpl::onRemoteClose() callback is also invoked by another actor in the system; that just resets the timer, so it makes little difference to this story.
Until a call to GrpcMuxImpl::establishNewStream() succeeds, no call to GrpcMuxImpl::sendDiscoveryRequest() can succeed.
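
(A rough sketch of that behaviour, in Go rather than Envoy's C++, just to make the cost explicit: once the first attempt fails, nothing can be sent until the fixed-delay retry fires, even if istio-pilot becomes reachable a few milliseconds later. retryDelay stands in for RETRY_DELAY_MS; this is a toy model, not Envoy's implementation.)

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

const retryDelay = 5 * time.Second // stands in for Envoy's RETRY_DELAY_MS

// establishStreamWithRetry is a toy model of the logic described above,
// not Envoy's actual code.
func establishStreamWithRetry(establish func() error) {
	for attempt := 1; ; attempt++ {
		if err := establish(); err == nil {
			fmt.Printf("stream established on attempt %d\n", attempt)
			return // discovery requests can be sent from here on
		}
		// A failed attempt always waits the full fixed delay before retrying.
		time.Sleep(retryDelay)
	}
}

func main() {
	calls := 0
	establishStreamWithRetry(func() error {
		calls++
		if calls == 1 {
			return errors.New("no healthy upstream") // pilot not reachable yet
		}
		return nil
	})
}
```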

Below is a typical log snippet from an istio-proxy starting up alongside an application container. Look closely near the bottom and you will see the 5-second gap between the line announcing “No stream available to sendDiscoveryRequest for type.googleapis.com/envoy.api.v2.Cluster” and the next line, “Establishing new gRPC bidi stream for rpc StreamAggregatedResources(stream .envoy.api.v2.DiscoveryRequest) returns (stream .envoy.api.v2.DiscoveryResponse);”. That “Establishing new gRPC bidi stream…” line is written by the 5-second retry timer’s call to GrpcMuxImpl::establishNewStream(). The timer was started earlier in the trace (twice, as noted above) around the time of the entries “gRPC config stream closed: 14, no healthy upstream” and “Unable to establish new stream”.

. . .

[2018-06-28 09:54:45.372][19][debug][upstream] external/envoy/source/common/upstream/cluster_manager_impl.cc:903] no healthy host for HTTP connection pool
[2018-06-28 09:54:45.375][19][debug][http] external/envoy/source/common/http/async_client_impl.cc:92] async http request response headers (end_stream=true):
':status', '200'
'content-type', 'application/grpc'
'grpc-status', '14'
'grpc-message', 'no healthy upstream'

[2018-06-28 09:54:45.375][19][warning][upstream] external/envoy/source/common/config/grpc_mux_impl.cc:235] gRPC config stream closed: 14, no healthy upstream
[2018-06-28 09:54:45.375][19][trace][main] external/envoy/source/common/event/dispatcher_impl.cc:126] item added to deferred deletion list (size=1)
[2018-06-28 09:54:45.375][19][warning][upstream] external/envoy/source/common/config/grpc_mux_impl.cc:39] Unable to establish new stream
[2018-06-28 09:54:45.375][19][info][config] external/envoy/source/server/configuration_impl.cc:53] loading 0 listener(s)
[2018-06-28 09:54:45.375][19][info][config] external/envoy/source/server/configuration_impl.cc:87] loading tracing configuration
[2018-06-28 09:54:45.375][19][info][config] external/envoy/source/server/configuration_impl.cc:96]   loading tracing driver: envoy.zipkin
[2018-06-28 09:54:45.375][19][info][config] external/envoy/source/server/configuration_impl.cc:109] loading stats sink configuration
[2018-06-28 09:54:45.375][19][debug][config] external/envoy/source/extensions/stat_sinks/statsd/config.cc:26] statsd UDP ip address: 10.109.226.144:9125
[2018-06-28 09:54:45.376][19][info][main] external/envoy/source/server/server.cc:396] starting main dispatch loop
[2018-06-28 09:54:45.376][19][trace][main] external/envoy/source/common/event/dispatcher_impl.cc:52] clearing deferred deletion list (size=1)
[2018-06-28 09:54:45.376][19][debug][upstream] external/envoy/source/common/network/dns_impl.cc:147] Setting DNS resolution timer for 3124 milliseconds
[2018-06-28 09:54:45.377][19][debug][upstream] external/envoy/source/common/network/dns_impl.cc:147] Setting DNS resolution timer for 3124 milliseconds
[2018-06-28 09:54:45.377][19][debug][upstream] external/envoy/source/common/network/dns_impl.cc:147] Setting DNS resolution timer for 3436 milliseconds
[2018-06-28 09:54:45.377][19][debug][upstream] external/envoy/source/common/network/dns_impl.cc:147] Setting DNS resolution timer for 3436 milliseconds
[2018-06-28 09:54:45.378][19][debug][upstream] external/envoy/source/common/network/dns_impl.cc:147] Setting DNS resolution timer for 3124 milliseconds
[2018-06-28 09:54:45.378][19][debug][upstream] external/envoy/source/common/network/dns_impl.cc:147] Setting DNS resolution timer for 3436 milliseconds
[2018-06-28 09:54:45.378][19][debug][upstream] external/envoy/source/common/network/dns_impl.cc:147] Setting DNS resolution timer for 3436 milliseconds
[2018-06-28 09:54:45.378][19][debug][upstream] external/envoy/source/common/upstream/upstream_impl.cc:853] async DNS resolution complete for istio-pilot.istio-system
[2018-06-28 09:54:45.378][19][debug][upstream] external/envoy/source/common/upstream/upstream_impl.cc:872] DNS hosts have changed for istio-pilot.istio-system
[2018-06-28 09:54:45.378][19][debug][upstream] external/envoy/source/common/upstream/cluster_manager_impl.cc:767] membership update for TLS cluster xds-grpc
[2018-06-28 09:54:45.378][19][debug][upstream] external/envoy/source/common/upstream/cluster_manager_impl.cc:88] cm init: init complete: cluster=xds-grpc primary=1 secondary=0
[2018-06-28 09:54:45.378][19][debug][upstream] external/envoy/source/common/network/dns_impl.cc:147] Setting DNS resolution timer for 3436 milliseconds
[2018-06-28 09:54:45.379][19][debug][upstream] external/envoy/source/common/network/dns_impl.cc:147] Setting DNS resolution timer for 3749 milliseconds
[2018-06-28 09:54:45.379][19][debug][upstream] external/envoy/source/common/upstream/upstream_impl.cc:853] async DNS resolution complete for zipkin.istio-system
[2018-06-28 09:54:45.379][19][debug][upstream] external/envoy/source/common/upstream/cluster_manager_impl.cc:88] cm init: init complete: cluster=zipkin primary=0 secondary=0
[2018-06-28 09:54:45.379][19][info][upstream] external/envoy/source/common/upstream/cluster_manager_impl.cc:128] cm init: initializing cds
[2018-06-28 09:54:45.379][19][debug][upstream] external/envoy/source/common/config/grpc_mux_impl.cc:103] gRPC mux subscribe for type.googleapis.com/envoy.api.v2.Cluster
[2018-06-28 09:54:45.379][19][debug][upstream] external/envoy/source/common/config/grpc_mux_impl.cc:51] No stream available to sendDiscoveryRequest for type.googleapis.com/envoy.api.v2.Cluster
[2018-06-28 09:54:50.376][19][debug][upstream] external/envoy/source/common/config/grpc_mux_impl.cc:36] Establishing new gRPC bidi stream for rpc StreamAggregatedResources(stream .envoy.api.v2.DiscoveryRequest) returns (stream .envoy.api.v2.DiscoveryResponse);
[2018-06-28 09:54:50.376][19][debug][router] external/envoy/source/common/router/router.cc:237] [C0][S7713306701516234573] cluster 'xds-grpc' match for URL '/envoy.service.discovery.v2.AggregatedDiscoveryService/StreamAggregatedResources'
[2018-06-28 09:54:50.376][19][debug][router] external/envoy/source/common/router/router.cc:284] [C0][S7713306701516234573] router decoding headers:
':method', 'POST'
':path', '/envoy.service.discovery.v2.AggregatedDiscoveryService/StreamAggregatedResources'
':authority', 'xds-grpc'
'content-type', 'application/grpc'
'te', 'trailers'
'x-envoy-internal', 'true'
'x-forwarded-for', '172.17.0.19'
':scheme', 'http'
. . .

Not sure if there are any plans to make the Envoy retry delay configurable in the near future. It certainly seems like it would be useful to do so.

GrpcMuxImpl source available here.

@josephburnett
Contributor

josephburnett commented Jun 28, 2018

Nice find George!

That initial invocation of GrpcMuxImpl::establishNewStream() fails.

Any idea why?

@tcnghia
Contributor

tcnghia commented Jun 28, 2018

@georgeharley this is a very nice finding. I've forwarded it to the Istio perf team to see if they can assist in customizing this retry.

@georgeharley

@josephburnett no idea just yet. Will post here if I find anything.

@josephburnett
Contributor

@georgeharley, actually will you post on #1345? We'll keep this issue to track the whole end-to-end startup latency. Thanks!

@dewitt
Contributor

dewitt commented Jun 28, 2018

@josephburnett:

And I shared with elafros-dev@googlegroups.com as well.

Minor nit: we should be using knative-dev@googlegroups.com for ACLs now.

@tcnghia
Contributor

tcnghia commented Jun 28, 2018

For the 3-5 seconds between the time the Pod becomes Ready and the time the revision's Service is accessible: I found that if we switch to using an Istio virtual service instead of a real K8s Service for the revisions, most of the time I see a propagation delay of less than 1 second.
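
(For illustration, routing a revision through an Istio VirtualService rather than relying on every node's kube-proxy programming a ClusterIP Service looks roughly like the sketch below, using current istio.io/api Go types; the host names are placeholders and not what Knative actually creates:)

```go
package main

import (
	"fmt"

	networkingv1alpha3 "istio.io/api/networking/v1alpha3"
)

func main() {
	// Sketch only: Pilot pushes this config to the Envoy sidecars/gateways,
	// so the route itself does not wait on per-node iptables propagation.
	vs := &networkingv1alpha3.VirtualService{
		Hosts: []string{"helloworld.default.example.com"}, // placeholder host
		Http: []*networkingv1alpha3.HTTPRoute{{
			Route: []*networkingv1alpha3.HTTPRouteDestination{{
				Destination: &networkingv1alpha3.Destination{
					// Placeholder destination for the revision's pods.
					Host: "helloworld-00001-service.default.svc.cluster.local",
				},
				Weight: 100,
			}},
		}},
	}
	fmt.Println("route hosts:", vs.Hosts)
}
```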

@josephburnett josephburnett added this to the Serving Sprint #1 milestone Jul 2, 2018
google-prow-robot pushed a commit that referenced this issue Jul 3, 2018
* Set the activator timeout to 60s.

There is a bug in Istio 0.8 that prevents the timeout from being set to more
than 15 seconds.  The bug is now fixed at HEAD but not yet released.  15
seconds is too short for our 0->1 case (see #1297).  This applies the
workaround suggested in http://github.com/istio/istio/issues/6230, setting
the activator timeout to 60s.

* Update typo
@mdemirhan
Contributor

@josephburnett Let's make sure that there are tracking issues coming out of this work so that we can close this uber issue.

@josephburnett josephburnett modified the milestones: Serving Sprint #1, Scaling: Sub-Second Cold Start Nov 14, 2018
@mattmoor mattmoor modified the milestones: Performance: Sub-Second Cold Start, Ice Box May 6, 2019
@josephburnett josephburnett removed their assignment May 20, 2019
@markusthoemmes
Contributor

I think we have since cut this down into several pieces and can close this overarching issue. Please reopen if you feel differently.

/close

@knative-prow-robot
Contributor

@markusthoemmes: Closing this issue.

In response to this:

I think we have since cut this down into several pieces and can close this overarching issue. Please reopen if you feel differently.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@dprotaso dprotaso removed this from the Ice Box milestone Oct 6, 2021