Hold Auth init lock for the duration of initialization #29593
Auth now uses `RunWhileLocked` instead of `AcquireLock` to ensure that the initialization lock is held until the bootstrapping process is completed. Previously, Auth only held the lock for 30s, which could allow multiple Auth instances to attempt bootstrapping simultaneously. Initialization should complete within 30s in most cases, but that is not guaranteed, especially on first boot, when CAs are being generated and a large data migration may be needed.
	Backend:  cfg.Backend,
	LockName: domainName,
	TTL:      30 * time.Second,
},
Does `RunWhileLocked` release the lock as soon as the operation is completed, or only after the TTL elapses?
If the former, what's the harm in making this (for example) 60s?
Yes, the lock is released before returning from `RunWhileLocked`:
teleport/lib/backend/helpers.go
Lines 223 to 229 in 7889f56
// lock.Release should be called with separate ctx. If someone cancels via ctx
// RunWhileLocked method, we want to at least try releasing lock.
releaseLockCtx, releaseLockCancel := context.WithTimeout(context.Background(), cfg.ReleaseCtxTimeout)
defer releaseLockCancel()
if err := lock.Release(releaseLockCtx, cfg.Backend); err != nil {
	return trace.NewAggregate(fnErr, err)
}
If the Auth instance holding the lock is terminated during initialization, a longer TTL would block all other instances from initializing until the lock expired and they could acquire it. A longer TTL would, however, result in less backend activity caused by lengthy migrations.
@@ -320,7 +333,7 @@ func Init(ctx context.Context, cfg InitConfig, opts ...ServerOption) (*Server, e
// singletons). However, we need to keep them around while Telekube uses them.
What is Telekube?
🤷 - I have intentionally left the legacy/odd things in `Init` alone while focusing on improving start up time and reliability of initialization. I was planning to circle back to `cfg.Roles`, `cfg.Authorities`, `cfg.ReverseTunnels` and `migrateLegacyResources` to determine if they can be eliminated.
I think that was the name of Teleport when it was running as a component of Gravity. Some deep cuts.
Friendly ping @ravicious @Tener
@rosstimothy See the table below for backport results.
* Add configuration with optional timeout to AcquireLock (#24559)
* Add configuration with optional timeout to AcquireLock
* rename to RetryInterval
* backport RunWhileLocked changes from #25639
* Hold Auth init lock for the duration of initialization (#29593)

Auth now uses `RunWhileLocked` instead of `AcquireLock` to ensure that the initialization lock is held until the bootstrapping process is completed. Prior, Auth only held the lock for 30s which could allow multiple Auths to attempt bootstrapping simultaneously. Initialization should complete prior to 30s in most cases, but it is not guaranteed, especially on first boot when CAs are being generated and a large data migration may be needed.

Co-authored-by: Tobiasz Heller <14020794+tobiaszheller@users.noreply.github.com>