🐛fix ClusterCache doesn't pick latest kubeconfig secret proactively #12400
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED. Approval is needed from an approver in each of the relevant OWNERS files; approvers can indicate their approval by writing /approve in a comment. The full list of commands accepted by this bot can be found here.

Welcome @mogliang!

Hi @mogliang. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Force-pushed from 4142631 to 49bbe5d.
added unit test
/ok-to-test
@@ -48,6 +49,14 @@ type createConnectionResult struct {
	Cache *stoppableCache
}

func (ca *clusterAccessor) getKubeConfigSecret(ctx context.Context) (*corev1.Secret, error) {
	kubeconfigSecret, err := secret.Get(ctx, ca.config.SecretClient, ca.cluster, secret.Kubeconfig)
This leads to a huge number of get-secret calls to the apiserver if someone uses an uncached client.
Yes, that would be a problem. Do you have any suggestions? Shall we make this configurable?
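For illustration, here is a minimal sketch of one way to bound those apiserver calls: re-read the kubeconfig Secret at most once per configurable interval and compare a hash of its content against the kubeconfig used for the current connection. Apart from secret.Get, secret.Kubeconfig, and secret.KubeconfigDataName (from the diff above and the cluster-api util/secret package), all type, field, and helper names here are assumptions, not the code in this PR.

```go
package clustercache

import (
	"context"
	"crypto/sha256"
	"time"

	"sigs.k8s.io/cluster-api/util/secret"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// kubeConfigChecker (hypothetical) re-reads the kubeconfig Secret at most once
// per interval and reports whether its content differs from the kubeconfig the
// current connection was created with.
type kubeConfigChecker struct {
	reader   client.Reader    // e.g. the accessor's SecretClient, possibly uncached
	cluster  client.ObjectKey // namespace/name of the Cluster
	interval time.Duration    // hypothetical "kubeconfig recheck" knob, e.g. 5m

	lastCheck time.Time
	lastHash  [sha256.Size]byte // hash of the kubeconfig backing the current connection
}

// changed returns true when the kubeconfig Secret no longer matches the cached
// connection. While the interval has not elapsed it returns early without
// touching the apiserver, which limits the cost for uncached clients.
func (c *kubeConfigChecker) changed(ctx context.Context) (bool, error) {
	if time.Since(c.lastCheck) < c.interval {
		return false, nil
	}
	c.lastCheck = time.Now()

	kubeconfigSecret, err := secret.Get(ctx, c.reader, c.cluster, secret.Kubeconfig)
	if err != nil {
		return false, err
	}

	currentHash := sha256.Sum256(kubeconfigSecret.Data[secret.KubeconfigDataName])
	if currentHash == c.lastHash {
		return false, nil
	}
	c.lastHash = currentHash
	return true, nil
}
```

A real integration would seed lastHash when the connection is created so the first check does not report a spurious change; afterwards the checker adds at most one extra GET per cluster per interval, which keeps the load predictable even with an uncached SecretClient.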
@@ -511,6 +512,16 @@ func (cc *clusterCache) Reconcile(ctx context.Context, req reconcile.Request) (r
	requeueAfterDurations = append(requeueAfterDurations, accessor.config.HealthProbe.Interval)
You are hitting tooManyConsecutiveFailures in your scenario, right?
Would it also be enough for your use case to make HealthProbe.Timeout/Interval/FailureThreshold configurable?
Not really.
We have a proxy between the mgmt cluster and the target clusters.
After the kubeconfig is updated (the proxy address changed), the existing connection (the clustercache probe) still works, so the clustercache doesn't refetch the kubeconfig and keeps caching the old one, while new connections (e.g. the etcd client) fail.
So if the health check opened a new connection, it would detect this, right?
That's correct. Another idea is to disconnect after a given period (e.g. 5m) to force a connection refresh. Would that be better?
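To make that idea concrete, here is a minimal sketch under the assumption of a hypothetical max-connection-age knob (the name and wiring are not from this PR): once a connection is older than the configured period, it is torn down so the next reconcile re-reads the kubeconfig Secret and dials the possibly new proxy address.

```go
package clustercache

import "time"

// shouldRecreateConnection (hypothetical) reports whether the connection to the
// workload cluster is older than the configured maximum age and should be torn
// down so it gets recreated from the latest kubeconfig Secret.
func shouldRecreateConnection(connectionCreatedAt time.Time, maxConnectionAge time.Duration) bool {
	if maxConnectionAge <= 0 {
		// A non-positive value disables the periodic forced refresh.
		return false
	}
	return time.Since(connectionCreatedAt) >= maxConnectionAge
}
```

The trade-off compared to checking the Secret itself is that every workload cluster pays a periodic reconnect even when its kubeconfig has not changed.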
What this PR does / why we need it:
Which issue(s) this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged):
Fixes #12399
/area clustercache