Do not N^2 loading webhook configurations #114794
I'll add corresponding changes to the mutating version of this file once someone looks at this.
/sig api-machinery
The test failure is real: the goroutine doesn't get cleaned up.
)

// validatingWebhookConfigurationManager collects the validating webhook objects so that they can be called.
type validatingWebhookConfigurationManager struct {
	configuration *atomic.Value
	lister        admissionregistrationlisters.ValidatingWebhookConfigurationLister
	hasSynced     func() bool
	delayer       *synctrack.Delayer
Why did you make this a pointer?
For consistency with the configuration member above, which also doesn't need to be a pointer...
go func() {
	defer wg.Done()
	for i := 0; i < adds; i++ {
		d.Add()
	}
}()
I don't think it makes a difference, but you're documenting that Add may be called in parallel, so maybe you could have called them in parallel?
Sure
Done
But the only known failure mode is actually the lister.List call itself: https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/apiserver/pkg/admission/configuration/validating_webhook_manager.go#L77-L84 With the new ability to ensure that processing of the initial list has happened (your previous hasSync'd fix) and the synchronous processing done in the add/update/delete func, is there still value in having the
I think it was probably just done this way for simplicity? But yes, it could be changed to not list. I think that would be more invasive? This change is pretty small. And waiting and then listing once is probably more efficient on startup than doing N inserts (not that it matters much for this code -- it's about the precedent for me).
Which precedent do we want? Prior to your PR that tracks "have all event handlers handled", this was the state of the art. After your PR, it is possible to build reliable tracking without the list, and I think that approach, combined with a periodic consistency checker, is likely how I would suggest people handle it. Would you suggest using a list plus a delay instead?
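For readers following this thread, the alternative raised above (maintaining state from the add/update/delete handlers and checking it periodically against a full list) might look roughly like the sketch below. Everything in it is hypothetical: `hookConfig`, `incrementalStore`, and the method names are illustrative and do not come from this PR or from client-go, and listed names are assumed to be unique.

```go
package main

import (
	"sort"
	"sync"
)

// hookConfig is a hypothetical stand-in for a webhook configuration object.
type hookConfig struct {
	Name string
	URL  string
}

// incrementalStore keeps state up to date from individual events instead of
// re-listing everything on each notification.
type incrementalStore struct {
	mu    sync.RWMutex
	items map[string]hookConfig
}

func newIncrementalStore() *incrementalStore {
	return &incrementalStore{items: map[string]hookConfig{}}
}

// Upsert handles both add and update events: O(1) work per event.
func (s *incrementalStore) Upsert(c hookConfig) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.items[c.Name] = c
}

// Delete handles delete events.
func (s *incrementalStore) Delete(name string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.items, name)
}

// Snapshot returns the current contents sorted by name.
func (s *incrementalStore) Snapshot() []hookConfig {
	s.mu.RLock()
	defer s.mu.RUnlock()
	out := make([]hookConfig, 0, len(s.items))
	for _, c := range s.items {
		out = append(out, c)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out
}

// Diverged is the periodic consistency check: it reports whether the
// incrementally maintained state differs from an authoritative list.
// It assumes names in listed are unique.
func (s *incrementalStore) Diverged(listed []hookConfig) bool {
	s.mu.RLock()
	defer s.mu.RUnlock()
	if len(listed) != len(s.items) {
		return true
	}
	for _, c := range listed {
		if cur, ok := s.items[c.Name]; !ok || cur != c {
			return true
		}
	}
	return false
}
```

The trade-off discussed in the thread still applies: N inserts at startup cost more than one list after sync, but each later event is handled without touching the rest of the state.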
/remove-sig instrumentation
limitations under the License.
*/

package synctrack
Consider making the test external to make it clear that you're not using anything internal for the test.
done!
func (z *Lazy[T]) Get() (T, error) {
	e := z.cache.Load()
	if e == nil {
		// Since we don't force a constructor, nil is a possible value.
Good catche! (typo intended 😂 )
/lgtm
LGTM label has been added. Git tree hash: c71b82421fff3463a7870a49e2f3c4cbb662f5bd
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: apelisse, lavalamp
Add a "lazy" type to track when an update is needed. It uses a nested locking technique to avoid extra evaluation calls.
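A minimal sketch of what such a lazy value could look like follows. It is not the code merged in this PR: `lazySketch`, `entry`, and `newLazySketch` are hypothetical names, and the layout is an assumption. The two levels are an atomic pointer to the current entry and a read/write lock inside each entry. Invalidating swaps in a fresh entry, so a burst of notifications costs no evaluations, and concurrent readers of one entry evaluate at most once.

```go
package main

import (
	"sync"
	"sync/atomic"
)

// entry holds the result of one evaluation. A fresh entry is installed on
// every invalidation, so an in-flight evaluation cannot overwrite newer state.
type entry[T any] struct {
	mu     sync.RWMutex
	result *T // nil until evaluated successfully
}

// lazySketch is a hypothetical illustration of a lazily evaluated value.
type lazySketch[T any] struct {
	evaluate func() (T, error)
	current  atomic.Pointer[entry[T]]
}

func newLazySketch[T any](evaluate func() (T, error)) *lazySketch[T] {
	l := &lazySketch[T]{evaluate: evaluate}
	l.current.Store(&entry[T]{})
	return l
}

// Notify marks the value stale. It does no evaluation, so calling it once
// per event is cheap.
func (l *lazySketch[T]) Notify() {
	l.current.Store(&entry[T]{})
}

// Get returns the cached value, evaluating only if the current entry is empty.
func (l *lazySketch[T]) Get() (T, error) {
	e := l.current.Load()

	// Fast path: shared lock, no evaluation.
	e.mu.RLock()
	cached := e.result
	e.mu.RUnlock()
	if cached != nil {
		return *cached, nil
	}

	// Slow path: exclusive lock, then re-check, because another caller may
	// have evaluated while we waited.
	e.mu.Lock()
	defer e.mu.Unlock()
	if e.result != nil {
		return *e.result, nil
	}
	v, err := l.evaluate()
	if err == nil {
		e.result = &v // errors are not cached, so the next Get retries
	}
	return v, err
}
```

With this shape, N notifications followed by one `Get` trigger a single evaluation rather than N.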
/lgtm
LGTM label has been added. Git tree hash: 2d1ecb82975186122e94b77b4c2d206724b92e7c
/hold cancel
This adds a processing delay to deduplicate notifications to reload webhook configurations. Solves N^2 behavior on startup and prevents useless work when webhooks change in rapid succession.
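To illustrate the deduplication idea in isolation, here is a rough sketch of a notification coalescer. It is an assumption-laden illustration, not the PR's implementation: `coalescer`, `newCoalescer`, and the `reload` callback are hypothetical names. The first notification in a burst schedules one delayed reload; further notifications that arrive before it fires are absorbed.

```go
package main

import (
	"sync"
	"time"
)

// coalescer collapses bursts of notifications into a single delayed callback.
type coalescer struct {
	mu      sync.Mutex
	delay   time.Duration
	pending bool   // true while a reload is already scheduled
	reload  func() // hypothetical: rebuilds the webhook configuration
}

func newCoalescer(delay time.Duration, reload func()) *coalescer {
	return &coalescer{delay: delay, reload: reload}
}

// Notify may be called from many goroutines. Only the first call in a burst
// schedules work; the rest return immediately.
func (c *coalescer) Notify() {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.pending {
		return
	}
	c.pending = true
	time.AfterFunc(c.delay, func() {
		// Clear the flag before reloading, so a notification that arrives
		// during the reload schedules another one instead of being lost.
		c.mu.Lock()
		c.pending = false
		c.mu.Unlock()
		c.reload()
	})
}
```

Without coalescing, N add events at startup each trigger a reload over up to N objects, which is the N^2 behavior described above; with it, the same burst costs one reload.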
What type of PR is this?
/kind bug
/kind cleanup
What this PR does / why we need it:
Fixes N^2 behavior I noticed while working on #113985.
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: