Machine ID: Share a single bot identity auth client rather than creating multiple clients #38398
@@ -292,7 +299,10 @@ func (s *identityService) renew(
	if s.cfg.Onboarding.RenewableJoinMethod() {
		// When using a renewable join method, we use GenerateUserCerts to
		// request a new certificate using our current identity.
		authClient, err := clientForIdentity(ctx, s.log, s.cfg, currentIdentity, s.resolver)
		// We explicitly create a new client here to ensure that the latest
This is the one place where we need a client that is guaranteed to use the latest identity, so we explicitly create the client here. The X.509 cert stores a generation counter, which is compared with the one on the user during the certificate renewal process. If a client is still using an older identity, the bot will be locked.
Overall this looks good to me. Just one question about client ownership: should we prevent the shared client from being closable? In the tsh client handling code we have a pseudo-implementation of auth.ClientI that looks something like the following to prevent this:
type sharedClient struct {
	auth.ClientI
}

// Close is a no-op so that borrowers cannot close the shared client.
func (s sharedClient) Close() error {
	return nil
}
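To make the behaviour of this wrapper concrete, here is a self-contained sketch. It assumes a hypothetical `clientI` interface and a toy `realClient` in place of Teleport's `auth.ClientI`, so it only illustrates the embedding pattern, not the tsh code itself.

```go
package main

import "fmt"

// clientI is a hypothetical stand-in for auth.ClientI, reduced to the two
// methods this example needs.
type clientI interface {
	Ping() error
	Close() error
}

// realClient is a toy implementation that records whether it was closed.
type realClient struct{ closed bool }

func (c *realClient) Ping() error {
	if c.closed {
		return fmt.Errorf("client is closed")
	}
	return nil
}

func (c *realClient) Close() error {
	c.closed = true
	return nil
}

// sharedClient embeds the interface so every method is forwarded to the
// underlying client, then overrides Close so that borrowers cannot shut
// down the shared connection.
type sharedClient struct {
	clientI
}

func (s sharedClient) Close() error { return nil }

func main() {
	owner := &realClient{}
	borrowed := sharedClient{clientI: owner}

	_ = borrowed.Close() // no-op: the underlying client stays open
	fmt.Println("ping after borrowed close:", borrowed.Ping())

	_ = owner.Close() // only the owner really closes the client
	fmt.Println("ping after owner close:", borrowed.Ping())
}
```

The trade-off is visible in `main`: the borrower's `Close` call returns without error but does nothing, which protects the shared client at the cost of hiding the misplaced call.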
Hmm - I'm not sure I'm the biggest fan of this. If something is calling .Close when it shouldn't, that's a bug that should be caught and addressed, not silently swallowed and hidden behind a .Close call that does nothing. That seems the opposite of predictable behaviour for an engineer, and could lead to resource leaks in future when someone expects .Close to close the client but it doesn't.
I agree the proposed solution is hacky, but in practice the compiler can't detect places where a shared client is unintentionally closed. We should at least add a comment to GetClient stating that the shared client should not be closed.
Going to spin up the tbot test ground to stability- and performance-test this against the latest 15 release - I'll leave this cooking for about a week before merging.
Going to merge a PR to fix the otel memory leak first before resuming this one.
I'm happy with how this has run over the weekend, so I've rebased out the release commits and added some godocs. PTAL @rosstimothy
@strideynet See the table below for backport results.
For a while, we have handled the regular renewal of the tbot bot identity by creating a short-lived client with the current identity whenever a client is needed. This hurts performance and reliability, as many new connections to the Auth Server must be established and maintained. This PR switches most usages of the bot identity to share a single client that handles the rotation of the identity internally. One place remains where the client must be short-lived: the bot identity renewal itself, due to the generation counter stored within the bot identity.
This should reduce the resource consumption of tbot and improve the overall performance.
changelog: Improved reliability and performance of tbot.