Wait for all informers to sync in /readyz. #92644
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: wojtek-t
To copy some context from the original PR:
|
3c8fedc to 0b662f4
/cc @mborsz - for posterity
@wojtek-t: GitHub didn't allow me to request PR reviews from the following users: -, for, posterity. Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs. In response to this:
/retest
/lgtm
just a couple minor comments.
if err := s.addReadyzChecks(healthz.NewInformerSyncHealthz(c.SharedInformerFactory)); err != nil {
	return nil, err
}
This should probably be moved up between lines 616-617, since we only want to add the readyz check once.
done
var status int
res.StatusCode(&status)
raw, err := res.Raw()
if err != nil {
If we're going to do this, then we should probably just fully convert this function so that it doesn't return the bool.
if err != nil || status != http.StatusOK {
t.Fatalf("got %v but wanted 200, error: %v", status, err)
}
And then the calls on 121 and 124 don't have to be in conditional form.
changed the function to return error actually
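A minimal sketch of the shape discussed in this thread: a test helper that returns an error instead of a bool, so callers need no conditional form. The names `checkReadyz` and `newStubServer` are hypothetical and not taken from the PR; the real integration test goes through the Kubernetes REST client rather than plain `net/http`.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// checkReadyz is a hypothetical helper: it returns nil only when
// GET <baseURL>/readyz answers 200, and a descriptive error otherwise.
func checkReadyz(client *http.Client, baseURL string) error {
	resp, err := client.Get(baseURL + "/readyz")
	if err != nil {
		return fmt.Errorf("GET /readyz failed: %w", err)
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("got %d from /readyz but wanted %d", resp.StatusCode, http.StatusOK)
	}
	return nil
}

// newStubServer starts a local test server whose /readyz always answers
// with the given status. It stands in for a real kube-apiserver here.
func newStubServer(status int) *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/readyz", func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(status)
	})
	return httptest.NewServer(mux)
}

func main() {
	srv := newStubServer(http.StatusInternalServerError)
	defer srv.Close()
	// The caller only has to look at the returned error.
	fmt.Println(checkReadyz(srv.Client(), srv.URL))
}
```

In a test, the caller can then write `if err := checkReadyz(...); err != nil { t.Fatal(err) }` with no separate status comparison.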
started := i.sharedInformerFactory.WaitForCacheSync(stopCh)
klog.V(4).Infof("SharedInformerFactory.WaitForCacheSync finished with: %v", started)
var notStarted []string
for informType, started := range started {
	if !started {
		klog.Warningf("informer for %v has not synced yet", informType.String())
		notStarted = append(notStarted, informType.String())
	}
}
if len(notStarted) != 0 {
	return fmt.Errorf("%d informers not started yet: %v", len(notStarted), notStarted)
}
this is a huge nit (sorry):
stopCh := make(chan struct{})
// Close stopCh to force checking if informers are synced now.
close(stopCh)
// The map must be initialized with make: assigning into a nil map panics.
informersByStarted := make(map[bool][]string)
for informType, started := range i.sharedInformerFactory.WaitForCacheSync(stopCh) {
	informersByStarted[started] = append(informersByStarted[started], informType.String())
}
if notStarted := informersByStarted[false]; len(notStarted) != 0 {
	return fmt.Errorf("%d informers have not started yet: %v (but %v informers have started)", len(notStarted), notStarted, informersByStarted[true])
}
done
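The grouping idea from this thread can be tried without client-go. The sketch below assumes a hypothetical `syncWaiter` interface and `fakeFactory` type in place of the real `SharedInformerFactory`, whose `WaitForCacheSync` returns `map[reflect.Type]bool`; string keys are used here only to keep the example dependency-free.

```go
package main

import (
	"fmt"
	"sort"
)

// syncWaiter is a hypothetical stand-in for the single method of the
// shared informer factory that the check relies on.
type syncWaiter interface {
	WaitForCacheSync(stopCh <-chan struct{}) map[string]bool
}

// fakeFactory reports a fixed sync state per informer and never blocks.
type fakeFactory map[string]bool

func (f fakeFactory) WaitForCacheSync(_ <-chan struct{}) map[string]bool { return f }

// checkInformersSynced returns nil only when every informer reports synced.
func checkInformersSynced(w syncWaiter) error {
	stopCh := make(chan struct{})
	// As in the suggestion above, stopCh is closed up front so that the
	// call is meant to report the current state instead of waiting.
	close(stopCh)

	// make is required: assigning into a nil map would panic.
	byStarted := make(map[bool][]string)
	for name, started := range w.WaitForCacheSync(stopCh) {
		byStarted[started] = append(byStarted[started], name)
	}
	// Map iteration order is random, so sort for a stable error message.
	sort.Strings(byStarted[false])
	if n := len(byStarted[false]); n != 0 {
		return fmt.Errorf("%d informers have not started yet: %v", n, byStarted[false])
	}
	return nil
}

func main() {
	f := fakeFactory{"*v1.Node": true, "*v1.Pod": false}
	fmt.Println(checkInformersSynced(f))
}
```

Sorting is an addition of this sketch, not something the thread asks for; it only makes the message deterministic for tests.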
fa2850c to 494f2a6
/lgtm
@logicalhan sorry - fixed one more typo - if you could re-lgtm
/hold cancel
494f2a6 to c2019d9
c2019d9 to 252500f
/lgtm
/retest Review the full test history for this PR. Silence the bot with an |
/retest
252500f to 3f68000
New changes are detected. LGTM label has been removed.
Just fixed the minor compile error in integration test - reapplying the label.
/retest
@wojtek-t: The following test failed, say
…#92644-origin-release-1.16 [1.16] Cherry pick of #92644: Wait for all informers to sync in /readyz.
…#92644-origin-release-1.18 [1.18] Cherry pick of #92644: Wait for all informers to sync in /readyz.
…#92644-origin-release-1.17 [1.17] Cherry pick of #92644: Wait for all informers to sync in /readyz.
Based on #92508 by @mborsz
What type of PR is this?
/kind bug
What this PR does / why we need it:
Context: #92506
In large clusters, informers in kube-apiserver may need significant time (30-50s) to initialize. Until they have synced, kube-apiserver cannot answer some requests (e.g. the node authorizer cannot accept any request).
This PR adds a way to determine whether the informers in kube-apiserver have synced. The check is added to /readyz, which can be used to decide whether traffic can be sent to a particular kube-apiserver.
Ref: #92506
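To illustrate how a readiness endpoint can gate traffic on such a check, here is a simplified sketch using only the standard library. `namedCheck`, `readyzHandler`, `informerSyncCheck`, and `probe` are hypothetical names for this example; the real kube-apiserver wires its checks through its own healthz machinery, which this does not reproduce.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// namedCheck is a hypothetical, simplified readiness check.
type namedCheck struct {
	name  string
	check func() error
}

// readyzHandler answers 200 only when every check passes and 500 otherwise,
// listing the outcome of each check in the response body.
func readyzHandler(checks ...namedCheck) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		status := http.StatusOK
		var lines []string
		for _, c := range checks {
			if err := c.check(); err != nil {
				status = http.StatusInternalServerError
				lines = append(lines, fmt.Sprintf("%s failed: %v", c.name, err))
				continue
			}
			lines = append(lines, c.name+" ok")
		}
		w.WriteHeader(status)
		fmt.Fprintln(w, strings.Join(lines, "\n"))
	})
}

// informerSyncCheck fails until *synced becomes true. The flag is an
// assumption standing in for the real informer state.
func informerSyncCheck(synced *bool) namedCheck {
	return namedCheck{name: "informer-sync", check: func() error {
		if !*synced {
			return errors.New("informers not synced yet")
		}
		return nil
	}}
}

// probe returns the status code the handler gives for GET /readyz,
// evaluated in-process without opening a network socket.
func probe(h http.Handler) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/readyz", nil))
	return rec.Code
}

func main() {
	synced := false
	h := readyzHandler(informerSyncCheck(&synced))
	fmt.Println(probe(h)) // while informers are still syncing
	synced = true
	fmt.Println(probe(h)) // once informers have synced
}
```

A load balancer or health prober that treats any non-200 answer as "not ready" would then hold traffic back until the informers have synced.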