Paginate watch cache->etcd List calls & reflector init/resync List calls not served by watch cache #75389
Paginate requests from the kube-apiserver watch cache to etcd in chunks. Also, paginate reflector init and resync List calls that are not served by watch cache.
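To illustrate the chunking idea, here is a minimal sketch of a paginated list loop. It is not the kube-apiserver or etcd client code: `FakeStore`, `list_page`, and `list_all_paginated` are hypothetical names, and the store is an in-memory stand-in for a sorted key-value backend that returns a continue key with each page.

```python
import bisect
from typing import Dict, List, Optional, Tuple

Item = Tuple[str, str]


class FakeStore:
    """Hypothetical in-memory stand-in for a sorted key-value backend."""

    def __init__(self, items: Dict[str, str]) -> None:
        self._items = dict(items)
        self._keys = sorted(self._items)

    def list_page(
        self, limit: int, continue_key: Optional[str] = None
    ) -> Tuple[List[Item], Optional[str]]:
        """Return up to `limit` items starting at `continue_key` (inclusive),
        plus the key to continue from, or None when the range is exhausted."""
        start = 0 if continue_key is None else bisect.bisect_left(self._keys, continue_key)
        page_keys = self._keys[start:start + limit]
        next_index = start + limit
        next_key = self._keys[next_index] if next_index < len(self._keys) else None
        return [(k, self._items[k]) for k in page_keys], next_key


def list_all_paginated(store: FakeStore, page_size: int) -> Tuple[List[Item], int]:
    """Fetch every item in chunks of `page_size` instead of one large request.

    Returns the collected items and the number of page requests issued."""
    results: List[Item] = []
    pages = 0
    token: Optional[str] = None
    while True:
        page, token = store.list_page(page_size, token)
        results.extend(page)
        pages += 1
        if token is None:  # no continue key means this was the last page
            break
    return results, pages


if __name__ == "__main__":
    store = FakeStore({f"pod-{i:03d}": f"spec-{i}" for i in range(10)})
    items, pages = list_all_paginated(store, page_size=3)
    print(len(items), "items fetched in", pages, "pages")
```

Each request is bounded by the page size, so no single call has to carry the full result set, at the cost of more round trips.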
Fixing this will allow runaway processes that create resources indefinitely to reach even higher resource counts. At some point, listing the resources will take longer than the etcd compaction interval allows (5 minutes by default), and the list operations will fail with "resource expired" errors. This is arguably a more graceful failure mode than today's, where the kube-apiserver becomes entirely unavailable, but it can still result in broken controllers (or the watch cache consuming excessive memory?). We can add a metric for the duration of watch cache list operations, which could be used to monitor and alert when list operations take too long. We are testing to see which failure mode actually occurs with this fix in place when resources are added indefinitely.
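The failure mode and the proposed duration metric can be sketched together in a small simulation. This is a rough model under stated assumptions, not the actual etcd or kube-apiserver behavior: `FakeClock`, `ResourceExpired`, and `timed_paginated_list` are hypothetical names, each page is assumed to take a fixed simulated time, and the list is assumed to expire as soon as it has run longer than the 5 minute compaction interval.

```python
from typing import List

# Default compaction interval mentioned in the description (5 minutes).
COMPACTION_INTERVAL_SECONDS = 300.0


class ResourceExpired(Exception):
    """Hypothetical error for a list whose snapshot has been compacted away."""


class FakeClock:
    """Deterministic clock so the simulation does not depend on wall time."""

    def __init__(self) -> None:
        self.now = 0.0

    def advance(self, seconds: float) -> None:
        self.now += seconds


def timed_paginated_list(
    total_items: int,
    page_size: int,
    seconds_per_page: float,
    clock: FakeClock,
    durations: List[float],
) -> int:
    """Simulate a paginated list and record how long it ran in `durations`.

    Raises ResourceExpired once the list has been running longer than the
    compaction interval (a simplifying assumption of this sketch)."""
    start = clock.now
    fetched = 0
    try:
        while fetched < total_items:
            if clock.now - start > COMPACTION_INTERVAL_SECONDS:
                raise ResourceExpired("list outlived the compaction interval")
            clock.advance(seconds_per_page)  # simulated cost of one page
            fetched += min(page_size, total_items - fetched)
        return fetched
    finally:
        # Record the duration whether the list succeeded or failed, so an
        # alert can fire before or when lists start expiring.
        durations.append(clock.now - start)


if __name__ == "__main__":
    clock, durations = FakeClock(), []
    print(timed_paginated_list(1000, 500, 1.0, clock, durations), durations)
```

Recording the duration in the `finally` clause means failed lists are measured too, which is the case an alert on this metric would most need to see.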
changed the title
[WIP] Paginate watch cache->etcd List calls & reflector init/resync List calls not served by watch cache
Mar 19, 2019
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: jpbetz