New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ClusterMesh/KVStoreMesh Readiness and Bootstrap QPS #30361
Conversation
185578a
to
aacf0d6
Compare
/test |
aaaa15e
to
518cf09
Compare
/test |
518cf09
to
91c9767
Compare
/test |
install/kubernetes/cilium/templates/clustermesh-apiserver/deployment.yaml
Outdated
Show resolved
Hide resolved
install/kubernetes/cilium/templates/clustermesh-apiserver/deployment.yaml
Show resolved
Hide resolved
Add a 'synced' field to remoteCluster in kvstoremesh to track its synchronization to the local kvstore. This will be used in a following commit to add a ReadinessProbe to the kvstoremesh container. A timeout on remote cluster connection is implemented to prevent blocking readiness forever if we are unable to establish a connection to a remote cluster. By default this is set to 15s and can be configured via the option "per-cluster-ready-timeout". Signed-off-by: Tim Horner <timothy.horner@isovalent.com>
Clustermesh-apiserver's Health API Server has been moved to a new 'health' package and modularized for reuse with KVStoreMesh. A slice of EndpointFunc must be provided to the HealthAPIServerCell to be registered as endpoints the Health API will serve. Signed-off-by: Tim Horner <timothy.horner@isovalent.com>
The /healthz endpoint for ClusterMesh is not currently used. If someone were to configure a livenessProbe targeting this endpoint, it would cause clustermesh-apiserver to fail if the kube-apiserver was unhealthy. This would prevent new agents in other clusters from being able to synchronize data. For now, we will remove this endpoint to prevent this scenario. Signed-off-by: Tim Horner <timothy.horner@isovalent.com>
Make common.ClusterMesh an interface so that it can be mocked in integration tests. The original type is now an exported struct, common.clusterMesh. Signed-off-by: Tim Horner <timothy.horner@isovalent.com>
Register a '/readyz' endpoint for KVStoreMesh that exposes the status of the initial synchronization of remote clusters to the local kvstore. The endpoint will return a 500 until the synchronization has completed, at which point it will respond with 200. A new cell added to KVStoreMesh executes a OneShot job to wait for remote cluster synchronization and signal completion to the HealthAPIServer via the shared SyncState. There are 2 configurable timeouts to prevent an unreachable cluster or synchronization error from blocking readiness forever and potentially causing a cascading failure as more KVStoreMesh instances restart: - 'per-cluster-ready-timeout' is the duration to wait for a remote cluster connection before removing it from the readiness check. - 'global-ready-timeout' is the duration to wait for all remote clusters to synchronize before cancelling the wait and reporting ready. Signed-off-by: Tim Horner <timothy.horner@isovalent.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, lgtm!
Add a readinessProbe to the kvstoremesh container in the clustermesh-apiserver deployment utilizing the '/readyz' endpoint of the health API. Signed-off-by: Tim Horner <timothy.horner@isovalent.com>
The following commit will add new functionality to APILimiter and coverage with tests using testify. To avoid dropping in new tests that don't match the rest of the suite, all of the pkg/rate tests have been refactored to use the testify framework. Signed-off-by: Tim Horner <timothy.horner@isovalent.com>
Add 2 new methods to APILimiter that allow the RateLimit and BurstLimit to be adjusted externally. Signed-off-by: Tim Horner <timothy.horner@isovalent.com>
Add a new kvstore-opt 'etcd.bootstrapQps' to configure an initial client rate limit to use until the caller signals that bootstrap is complete. This is used in ClusterMesh and KVStoreMesh to allow setting a higher rate limit during the initial synchronization to etcd. To prevent changing behavior for other components using etcd clients, this option will only affect components providing a channel via extraOptions.BootstrapComplete, which should be closed to signal to the client that bootstrap is complete. Signed-off-by: Tim Horner <timothy.horner@isovalent.com>
Remove duplicate import of clustermesh/types from clustermesh/common Signed-off-by: Tim Horner <timothy.horner@isovalent.com>
Similar to KVStoreMesh, the settings for clustermesh-apiserver's readinessProbe should be configurable via helm values. Signed-off-by: Tim Horner <timothy.horner@isovalent.com>
/test |
Should this be |
Looks like there's still three unresolved conversations that are blocking the merge. The interface at the bottom also reports that tophat review is necessary, but the codeowners checks in the top right don't specify that. Not sure if it's required, but I can check back later once the unresolved conversations are resolved and provide a rubber stamp if necessary. |
Thanks @joestringer, resolved those 👍 Looks like it is still requiring tophat review, and interestingly:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Providing the required 0 reviews to satisfy our GitHub overlords.
(We may want to update CODEOWNERS for pkg/rate
if we expect further changes there, would skip tophat requirement in future)
This adds a readiness probe to the kvstoremesh container which will report ready once all remote clusters have been synchronized. A per-cluster timeout on initial sync prevents a deadlock in a situation where the pod is restarted simultaneously on multiple clusters. In addition to the per-cluster timeout, a global timeout is also implemented to catch any other failures.
To increase the speed of the initial synchronization for both clustermesh-apiserver and kvstoremesh, a new kvstore-opt was added,
etcd.bootstrapQps
, which allows an initial rate-limit to be set for the etcd client. Once the container is ready, the standard rate-limit configured viaetcd.qps
is applied.etcd.bootstrapQps
is set to 10,000 for both containers in the chart.Some rough benchmarking with simulated load provided promising results:
For clustermesh-apiserver,
time-to-ready
dropped from ~500s to ~14s (roughly a 60x improvement)For kvstoremesh,
time-to-ready
dropped from ~100s to 13s (roughly a 10x improvement)Additional details are in each commit message.