Update expiration timeout based on observed latencies #7628
Conversation
Also @lavalamp, though I don't think he's around today.
```diff
@@ -42,20 +42,28 @@ var (

 const (
 	// We'll attempt to recompute the required replicas of all replication controllers
-	// the have fulfilled their expectations at least this often.
+	// the have fulfilled their expectations at least this often. This recomputation
```
typo: s/the/that
While basing the timeout on observed latency is definitely pragmatic at the moment, a 6-minute timeout to create and observe 3000 pods is excessive (3000 pods / 360 s ≈ 8.3 pods/second). LGTM for now, but I'd like @davidopp and @lavalamp to confirm what the end goal is here w.r.t. watch latency, and to file an issue to reduce this timeout once the seemingly excessive watch latency has been reduced.

LGTM
It's not 6 minutes, it's 3.5. 6 minutes is how long we wait before declaring the network unreliable. You can't get much better than ~2 minutes with 30 qps and 3000 pods.
I don't understand how you arrive at 3.5 minutes. Also, where does the 30 QPS come from?
https://github.com/GoogleCloudPlatform/kubernetes/pull/7628/files#diff-9d91a215f035d4101d00028b6a0bc08eR63 and rate limiting. Actually it gets even better: we allow a smooth 20 qps and bursts of 30. Anyway, the scheduler will go at around 10 pods/s.
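For reference, the arithmetic behind those figures, as a standalone sketch: a token bucket with sustained rate `qps` and capacity `burst` admits the first `burst` requests immediately and paces the rest at `qps`. The `timeToCreate` helper is hypothetical, not controller code.

```go
package main

import "fmt"

// timeToCreate is a hypothetical helper: seconds needed to issue `pods`
// creation requests through a token bucket with the given sustained qps
// and burst capacity. The first `burst` requests pass immediately; the
// remainder are paced at `qps`.
func timeToCreate(pods, qps, burst float64) float64 {
	if pods <= burst {
		return 0
	}
	return (pods - burst) / qps
}

func main() {
	// Smooth 20 qps with bursts of 30, as described above: ~148 s (~2.5 min).
	fmt.Printf("3000 pods @ 20 qps, burst 30: %.0f s\n", timeToCreate(3000, 20, 30))
	// A flat 30 qps gives the "~2 minutes" floor mentioned above: 99 s.
	fmt.Printf("3000 pods @ 30 qps:           %.0f s\n", timeToCreate(3000, 30, 30))
}
```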
Instead of increasing this timeout, I'd suggest capping/limiting the number of pods created at once, e.g. don't let more than 500 creation/deletion requests per RC be outstanding at once.
A very good suggestion, and one that should be relatively easy to implement given the current system of expectations. We'll just create 500 and put the RC to sleep; when it wakes up it can create another 500 (see the sketch below).
+1 I agree it's a very good idea |
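A minimal sketch of that capping idea under the existing expectations scheme; all names here (`maxOutstanding`, `syncRC`, `expectations`) are hypothetical illustrations, not the controller's actual code.

```go
package main

import "fmt"

// maxOutstanding caps in-flight creation requests per RC; the 500 figure
// is the suggestion above, not a tuned value.
const maxOutstanding = 500

// expectations counts creations requested but not yet observed via watch.
type expectations struct{ pending int }

func (e *expectations) fulfilled() bool { return e.pending <= 0 }

// syncRC issues at most maxOutstanding creations, records them as
// expectations, and returns (the RC "sleeps"). Watch events decrement
// pending; once fulfilled, the RC is requeued for the next batch.
func syncRC(e *expectations, desired, actual int, create func(int)) {
	if !e.fulfilled() {
		return // still waiting on an earlier batch
	}
	diff := desired - actual
	if diff > maxOutstanding {
		diff = maxOutstanding
	}
	if diff > 0 {
		e.pending = diff
		create(diff)
	}
}

func main() {
	e, actual := &expectations{}, 0
	for actual < 3000 {
		// Simulate the watch observing every creation immediately.
		syncRC(e, 3000, actual, func(n int) { actual += n; e.pending -= n })
	}
	fmt.Println("created", actual, "pods in batches of at most", maxOutstanding)
}
```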
Does #6866 (comment)
To clarify, the latency of creating/watching a single pod isn't in minutes (and won't be, even with 100 nodes), but creating 3000 pods at once and being notified of them all via watch is. The expiration timeout was set based on the rate limit qps; this PR bumps it up to the latency observed in a larger cluster.
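A sketch of how such a timeout can hang off the expectations themselves, so a dropped watch event can't keep the controller asleep forever. Names and the ~3.5-minute value are assumed here purely for illustration, matching the figure discussed above rather than the merged code.

```go
package main

import (
	"fmt"
	"time"
)

// expectationsTimeout: assumed ~3.5 min for illustration, per the
// discussion above; not necessarily the value this PR merges.
const expectationsTimeout = 210 * time.Second

type expectations struct {
	pending int
	set     time.Time // when these expectations were recorded
}

// fulfilled reports whether the RC may sync again: either every expected
// event has been observed, or the expectations have expired. Expiry is
// the safety valve for dropped watch events.
func (e *expectations) fulfilled(now time.Time) bool {
	return e.pending <= 0 || now.Sub(e.set) > expectationsTimeout
}

func main() {
	stale := &expectations{pending: 5, set: time.Now().Add(-4 * time.Minute)}
	fmt.Println(stale.fulfilled(time.Now())) // true: expired, so the RC wakes up anyway
}
```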
Filed #7561.
@davidopp @quinton-hoole @wojtek-t, whichever one of you is around today.