Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

421 test webhook #525

Merged
merged 8 commits into from
Dec 11, 2019
Merged

421 test webhook #525

merged 8 commits into from
Dec 11, 2019

Conversation

realshuting
Copy link
Member

This PR fix #421, to register the webhookconfigurations for policy and resource after verifying the webhook status. The check is performed against LastReqTime introduced in PR #418. It adds an additional check before creating the webhookconfigurations to verify the webhook status:

  • if the duration since LastReqTime is less than the deadline(60s by default), considered webhook is active, create webhookcfgs immediately
  • if the duration is greater than the deadline, it waits for a single cycle of webhook checkers(60s) to let the checker updates LastReqTime:
    • if updated duration is less than deadline, creates webhookcfgs
    • if time out, which means LastReqTime is not updated during last 60s, indicate the webhook is not active, return error

Copy link
Contributor

@shivdudhani shivdudhani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion 1: Decouple the client and the caller
The policy controller is watching on webhookconfiguration resource, to do processing but now the registration client also watches on the resource. Moving the code inside webhookconfig adds dependency between policy, but this is not needed, as these checks can be performed in the policy controller and just inform the client to create or remove. Keeps the packages decoupled and cleaner.
Also if using a lister, we should always wait for the cache to sync the first time before using it as the behavior of the list before cache sync is not deterministic.

The checker should not be a part of the webhook registration client, but the checker logic is different from the client, so the client is responsible for CRUD operations and should avoid additional checks. While passing in the server we only pass the LastReqTime from the webhook registration client.

Suggestions 2: Policy controller should not be blocked for the verify webhook

With the flow in the main,

  • webhookRegistrationClient.Register
  • Then we start all the components(policy controller, webhook server)

While creating webhook we use waitForWebhookSuccess which uses exponential backoff mechanism to create resources(we start with 60 seconds and the factor is 2, so it doubles every step. and we have 7 steps) so backoff is 60, 120, 240, 480, 960,1920, seconds and so on.
This is blocking call so in the worst case can take 3840 seconds before any other components can start working. Is this exponential backoff needed here?

This verify is added to assist the admission control initialize cleanly, but there are 2 components the webhooks for requests and policy controller. With this block, the controller also has to wait for webhook registration to complete, while it's not dependent on it. So the policy controller cannot run in scanner mode. This needs to be decoupled.

…okinformer from policy controller and webhookregistrationclient
@realshuting
Copy link
Member Author

/hold

@realshuting
Copy link
Member Author

/hold cancel

@shivdudhani
Regards to suggestion 1, now the caller and the client are decoupled:

  • the caller (webhook server and policy controller) push to a queue when it sees the need to create the resource webhookconfiguration
  • there is a background task watches for the queue, when a creation request comes in, this task checks the webhook status:
    • if active: checks the existing webhookconfiguration, creates one if there's no webhookconfiguration; re-queue the request if error creating
    • if not active: drop the request as there's no point to create the webhookconfiguration when webhook is not up

The above logic runs the background in a separate gorountine, and when the caller calls to create the webhookconfiguration, all it does is push the request to the queue, won't block anything anymore.

@realshuting
Copy link
Member Author

@shivdudhani Thanks for pointing out the flaw!

I removed the channel and expose a method instead to create the webhookcfg, now there is only a single gorountine at max to create the cfg.
Also as discussed, we have this flag pendingCreation to indicate the creation status.

@realshuting realshuting merged commit f06b19b into v1.1.0 Dec 11, 2019
@realshuting realshuting deleted the 421_test_webhook branch December 11, 2019 19:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants