Increase the max wait time for EndpointCreate API requests #25805

squeed · 2023-05-31T21:04:27Z

In the event that a node has a burst of PodSandboxCreate requests, API requests will pile up. However, we should wait longer for the queue to clear before short-circuiting and returning failure.

This is because the kubelet has a relatively relaxed timeout for PodSandbox creation -- 4 minutes. Furthermore, if we return a failure here, it is propagated all the way back through containerd to kubelet, which will tear down the entire PodSandbox and try again, which can be expensive.

So, increase the maximum queue duration time to 1 minute. That should hopefully give enough time for the queue to clear.

Fixes: #24361

Cilium now waits longer before returning a failure in the event of a pod creation burst.

squeed · 2023-06-01T08:49:01Z

/test

thorn3r

nice investigation on this one 👍

squeed · 2023-06-09T07:52:32Z

/test

In the event that a node has a burst of PodSandboxCreate requests, API requests will pile up. However, we should wait longer for the queue to clear before short-circuiting and returning failure.

This is because the kubelet has a relatively relaxed timeout for PodSandbox creation -- 4 minutes. Furthermore, if we return a failure here, it is propagated all the way back through containerd to kubelet, which will tear down the entire PodSandbox and try again, which can be expensive.

So, increase the maximum queue duration time to 1 minute. That should hopefully give enough time for the queue to clear.

Signed-off-by: Casey Callendrello <cdc@isovalent.com>

squeed · 2023-06-19T14:03:30Z

/test

squeed · 2023-06-20T11:55:06Z

With regard to the feature freeze, I'd argue this fixes a known bug: it tweaks a few config values rather than introducing any sort of new behavior.

squeed added area/cni Impacts the Container Networking Interface between Cilium and the orchestrator. release-note/minor This PR changes functionality that users may find relevant to operating Cilium. labels May 31, 2023

squeed requested a review from a team as a code owner May 31, 2023 21:04

squeed requested a review from thorn3r May 31, 2023 21:04

thorn3r approved these changes Jun 2, 2023

squeed force-pushed the endpointcreate-wait-longer branch from 5a9f0af to cd0b6d9 Compare June 9, 2023 07:52

christarazi added kind/enhancement This would improve or streamline existing functionality. area/daemon Impacts operation of the Cilium daemon. labels Jun 15, 2023

christarazi mentioned this pull request Jun 15, 2023

CI: flakes due to putEndpointIdTooManyRequests #24361

Closed

squeed force-pushed the endpointcreate-wait-longer branch from cd0b6d9 to 4ff1ef8 Compare June 19, 2023 14:03

maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Jun 19, 2023

squeed added the kind/bug This is a bug in the Cilium logic. label Jun 20, 2023

borkmann merged commit c8eeebe into cilium:main Jun 20, 2023
61 of 62 checks passed

joestringer mentioned this pull request Jun 28, 2023

Prepare for release v1.14.0-rc.0 #26544

Merged
