Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

webhook config manager: HasSynced returns true when the manager is synced with existing webhookconfig objects at startup #95783

Merged

Conversation

roycaihw
Copy link
Member

@roycaihw roycaihw commented Oct 22, 2020

What type of PR is this?
/kind bug

What this PR does / why we need it:
Fix a webhook unittest flake. The test waits for informers to be synced, which guarantees the watch events being sent (the delta fifo queue in the informer has no initial object left), but it doesn't guarantee the webhook configuration manager finishing processing those watch events. Adding a marker to wait for webhook registration, like what we do in the e2e tests.

The original race can be reliably reproduced by adding a sleep to the configuration managers.

What happened: webhook config manager HasSynced() function returned true if the informer was synced, which could cause requests being allowed without existing webhook configurations being effective at startup.

[EDIT] Thanks to @liggitt who pointed out that this isn't just a test issue and suggested the proper fix-- on API server startup, the webhook waits to admit anything until it is synced with existing webhookconfig objects, meaning validatingWebhookConfigurationManager#HasSynced and mutatingWebhookConfigurationManager#HasSynced shouldn't return true until the informer is synced and either has no items or updateConfiguration() has completed. This fix guarantees that at no point should requests be allowed into a restarted API server without those previously created webhook configurations being effective.

Which issue(s) this PR fixes:

Fixes #94810, #99911, #100070

Does this PR introduce a user-facing change?:

Fixed a race condition on API server startup ensuring previously created webhook configurations are effective before the first write request is admitted.

/sig api-machinery
/cc @sttts

@k8s-ci-robot k8s-ci-robot requested a review from sttts Oct 22, 2020
@k8s-ci-robot k8s-ci-robot added release-note-none kind/bug size/L sig/api-machinery cncf-cla: yes needs-triage needs-priority area/apiserver labels Oct 22, 2020
@fedebongio
Copy link
Contributor

@fedebongio fedebongio commented Oct 22, 2020

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted and removed needs-triage labels Oct 22, 2020
pacoxu
pacoxu approved these changes Dec 17, 2020
@BenTheElder
Copy link
Member

@BenTheElder BenTheElder commented Feb 12, 2021

/cc @liggitt

@k8s-ci-robot k8s-ci-robot requested a review from liggitt Feb 12, 2021
@roycaihw roycaihw force-pushed the flake/wait-for-webhook-registration branch 2 times, most recently from d3a21d8 to e124292 Compare Mar 17, 2021
@k8s-ci-robot k8s-ci-robot added size/M release-note and removed size/L release-note-none labels Mar 17, 2021
@roycaihw roycaihw changed the title webhook unittest: wait for webhook registration before running test cases webhook config manager: HasSynced returns true when the manager is synced with existing webhookconfig objects at startup Mar 17, 2021
@roycaihw roycaihw force-pushed the flake/wait-for-webhook-registration branch from e124292 to e0bc203 Compare Mar 18, 2021
@roycaihw
Copy link
Member Author

@roycaihw roycaihw commented Mar 18, 2021

/retest

@liggitt
Copy link
Member

@liggitt liggitt commented Mar 18, 2021

one comment on making this more efficient for servers without any webhooks configured, functionally lgtm otherwise.

…nced with existing webhookconfig objects at startup
@roycaihw roycaihw force-pushed the flake/wait-for-webhook-registration branch from e0bc203 to 37d171e Compare Mar 19, 2021
@roycaihw
Copy link
Member Author

@roycaihw roycaihw commented Mar 19, 2021

/retest

@liggitt
Copy link
Member

@liggitt liggitt commented Mar 23, 2021

/lgtm
/approve

/milestone v1.21
Fixes 3 of the top 10 unit test flakes

@k8s-ci-robot k8s-ci-robot added this to the v1.21 milestone Mar 23, 2021
@k8s-ci-robot k8s-ci-robot added the lgtm label Mar 23, 2021
@k8s-ci-robot
Copy link
Contributor

@k8s-ci-robot k8s-ci-robot commented Mar 23, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: liggitt, pacoxu, roycaihw

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved label Mar 23, 2021
@k8s-ci-robot k8s-ci-robot merged commit ae6ceaa into kubernetes:master Mar 23, 2021
14 checks passed
@roycaihw roycaihw deleted the flake/wait-for-webhook-registration branch Mar 23, 2021
k8s-ci-robot added a commit that referenced this issue Mar 23, 2021
…783-upstream-release-1.19

Automated cherry pick of #95783: webhook config manager: HasSynced returns true when the
k8s-ci-robot added a commit that referenced this issue Mar 23, 2021
…783-upstream-release-1.20

Automated cherry pick of #95783: webhook config manager: HasSynced returns true when the
k8s-ci-robot added a commit that referenced this issue Mar 23, 2021
…783-upstream-release-1.18

Automated cherry pick of #95783: webhook config manager: HasSynced returns true when the
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved area/apiserver cncf-cla: yes kind/bug lgtm needs-priority release-note sig/api-machinery size/M triage/accepted
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants