Add kube credentials lockfile to prevent possibility of excessive login attempts #26102
// Take a lock while we're trying to issue certificate and possibly relogin
unlock, err := utils.FSTryWriteLockTimeout(ctx, kubeCredLockfilePath, 5*time.Second)
Are you supposed to hold an fs lock while doing remote I/O? utils.FSTryWriteLockTimeout tries to grab the lock every 10 milliseconds, which seems like a lot.
From my measurements, it adds only a small amount of CPU load, around 0.5-1%. It's also not the happy path: it's relevant only when we need to reissue a cert. So I think it's OK in this case.
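To make the trade-off in this thread concrete, here is a minimal sketch of a polling try-lock with a timeout. It is not the implementation of utils.FSTryWriteLockTimeout: the names tryLockTimeout, errLockTimeout, and runDemo are hypothetical, and lock acquisition is modeled with exclusive file creation (O_EXCL) rather than a real advisory file lock, purely to keep the example stdlib-only.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// errLockTimeout is a hypothetical sentinel error for this sketch.
var errLockTimeout = errors.New("timed out waiting for lock")

// tryLockTimeout (hypothetical) retries an exclusive create of path every
// interval until it succeeds or timeout elapses. On success it returns an
// unlock function that removes the file.
func tryLockTimeout(ctx context.Context, path string, interval, timeout time.Duration) (func() error, error) {
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()

	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	for {
		// O_EXCL makes creation fail if the file already exists, so only
		// one caller can "hold" the lock at a time (assumption of this sketch).
		f, err := os.OpenFile(path, os.O_CREATE|os.O_EXCL|os.O_WRONLY, 0o600)
		if err == nil {
			f.Close()
			return func() error { return os.Remove(path) }, nil
		}
		if !errors.Is(err, os.ErrExist) {
			return nil, err // unexpected filesystem error, do not retry
		}
		select {
		case <-ctx.Done():
			return nil, errLockTimeout
		case <-ticker.C: // wait one interval, then retry
		}
	}
}

// runDemo acquires the lock, shows a second attempt timing out while the
// lock is held, and shows a third attempt succeeding after release.
func runDemo() (first, second, third error) {
	dir, err := os.MkdirTemp("", "lockdemo")
	if err != nil {
		return err, err, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "demo.lock")

	var unlock func() error
	unlock, first = tryLockTimeout(context.Background(), path, 10*time.Millisecond, 100*time.Millisecond)
	if first != nil {
		return first, nil, nil
	}
	_, second = tryLockTimeout(context.Background(), path, 10*time.Millisecond, 100*time.Millisecond)
	if err := unlock(); err != nil {
		return first, second, err
	}
	unlock, third = tryLockTimeout(context.Background(), path, 10*time.Millisecond, 100*time.Millisecond)
	if third == nil {
		unlock()
	}
	return first, second, third
}

func main() {
	first, second, third := runDemo()
	fmt.Println("first:", first)
	fmt.Println("second:", second)
	fmt.Println("third:", third)
}
```

With a 10 ms interval, a caller that waits the full 5 seconds makes on the order of a few hundred attempts, which is the cost being weighed against how rarely the reissue path is taken.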
@@ -662,6 +714,9 @@ func (c *kubeCredentialsCommand) run(cf *CLIConf) error {
		return trace.Wrap(err)
	}

	// Unlock and remove the lockfile so subsequent tsh kube credentials calls don't exit early
	unlockKubeCred(true)

	return c.writeKeyResponse(cf.Stdout(), k, c.kubeCluster)
}
Instead of calling the unlock function multiple times, you can capture the deleteKubeCredsLock var in a deferred closure. The closure reads the variable's current value when it executes at function return, just before calling unlockKubeCred:
var deleteKubeCredsLock = false
defer func() {
	unlockKubeCred(deleteKubeCredsLock)
}()
tc, err := makeClient(cf, true)
if err != nil {
	return trace.Wrap(err)
@@ -628,6 +677,9 @@ func (c *kubeCredentialsCommand) run(cf *CLIConf) error {
	}
	if crt != nil && time.Until(crt.NotAfter) > time.Minute {
		log.Debugf("Re-using existing TLS cert for Kubernetes cluster %q", c.kubeCluster)
		// Unlock and remove the lockfile so subsequent tsh kube credentials calls don't exit early
		deleteKubeCredsLock = true
		return c.writeKeyResponse(cf.Stdout(), k, c.kubeCluster)
	}
	// Otherwise, cert for this k8s cluster is missing or expired. Request
@@ -662,6 +714,9 @@ func (c *kubeCredentialsCommand) run(cf *CLIConf) error {
		return trace.Wrap(err)
	}
	// Unlock and remove the lockfile so subsequent tsh kube credentials calls don't exit early
	deleteKubeCredsLock = true
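The reason the suggestion wraps the call in a closure can be shown with a small standalone sketch. Go evaluates the arguments of a deferred call at the defer statement, so deferring the call directly would always pass the initial false. A deferred closure reads the variable when it runs. The function names below (runWithClosure, runWithEagerArg) are hypothetical and stand in for the real command.

```go
package main

import (
	"errors"
	"fmt"
)

// runWithClosure (hypothetical) defers a closure, so removeLock is read when
// the function returns and reflects any later assignment.
func runWithClosure(fail bool, unlock func(remove bool)) error {
	removeLock := false
	defer func() {
		unlock(removeLock)
	}()
	if fail {
		// Early return on error: removeLock is still false.
		return errors.New("issuing certificate failed")
	}
	removeLock = true
	return nil
}

// runWithEagerArg defers the call directly. The argument is evaluated at the
// defer statement, so unlock always receives false here.
func runWithEagerArg(unlock func(remove bool)) error {
	removeLock := false
	defer unlock(removeLock)
	removeLock = true
	return nil
}

func main() {
	report := func(label string) func(bool) {
		return func(remove bool) { fmt.Printf("%s: remove=%v\n", label, remove) }
	}
	_ = runWithClosure(false, report("closure, success"))
	_ = runWithClosure(true, report("closure, failure"))
	_ = runWithEagerArg(report("eager argument"))
}
```

Only the successful closure case reports remove=true. The failure case and the eager-argument case both report remove=false, which is why the flag must be captured by a closure for this pattern to work.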
return nil, trace.Wrap(err)
}
// Take a lock while we're trying to issue certificate and possibly relogin
unlock, err := utils.FSTryWriteLockTimeout(ctx, kubeCredLockfilePath, 5*time.Second)
If the operation fails because another instance already locked the file, should we return ErrKubeCredLockfileFound?
57538bb to 23fc31d
f7052e7 to 56bbd44
ec210db to bca65b0
862b37c to 59fba9e
f7570ad to d03d33e
This PR adds a locking mechanism for calls to tsh kube credentials. Previously, the command could open a large number of browser tabs when a GUI tool like Lens repeatedly ran kube commands after the user's session expired and SSO login attempts did not succeed.

We add a lockfile that serves two purposes. It synchronizes parallel calls, but its main use is that we don't delete it if there was an error. When a subsequent tsh kube credentials run finds this file, it aborts to avoid excessive login attempts and returns an error asking the user to log in manually.

Closes #22494
Closes #9450
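The "leave the lockfile behind on error" behavior described above can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: credentials, errLockfileFound, errLoginFailed, and runDemo are hypothetical names, the synchronization of parallel calls is omitted, and the demo clears the stale lockfile by hand because how it is cleared in practice is not described here.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// Hypothetical sentinel errors for this sketch.
var (
	errLockfileFound = errors.New("previous attempt failed; please log in manually")
	errLoginFailed   = errors.New("SSO login failed")
)

// credentials (hypothetical) aborts early if a lockfile from a failed run
// exists. Otherwise it creates the lockfile, runs issueCert, and removes the
// lockfile only on success.
func credentials(lockPath string, issueCert func() error) error {
	_, err := os.Stat(lockPath)
	if err == nil {
		return errLockfileFound // a previous run failed: do not retry login
	}
	if !errors.Is(err, os.ErrNotExist) {
		return err
	}
	if err := os.WriteFile(lockPath, nil, 0o600); err != nil {
		return err
	}
	if err := issueCert(); err != nil {
		return err // lockfile is deliberately left in place
	}
	return os.Remove(lockPath)
}

// runDemo runs a failing attempt, a blocked attempt, and a successful attempt
// after the stale lockfile is cleared. calls counts issueCert invocations.
func runDemo() (calls int, first, second, third error) {
	dir, err := os.MkdirTemp("", "kubecred")
	if err != nil {
		return 0, err, err, err
	}
	defer os.RemoveAll(dir)
	lockPath := filepath.Join(dir, "kube-cred.lock")

	failing := func() error { calls++; return errLoginFailed }
	working := func() error { calls++; return nil }

	first = credentials(lockPath, failing)
	second = credentials(lockPath, failing) // blocked before issueCert runs
	// Assumption: something (for example a manual login) clears the lockfile.
	if err := os.Remove(lockPath); err != nil {
		return calls, first, second, err
	}
	third = credentials(lockPath, working)
	return calls, first, second, third
}

func main() {
	calls, first, second, third := runDemo()
	fmt.Println("first:", first)
	fmt.Println("second:", second)
	fmt.Println("third:", third)
	fmt.Println("issueCert calls:", calls)
}
```

In this sketch the second call never reaches issueCert, which is the property that keeps a GUI tool's repeated invocations from triggering one login attempt each.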