Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consolidate webhooks #14082

Merged
merged 4 commits into from Jul 7, 2023
Merged

Consolidate webhooks #14082

merged 4 commits into from Jul 7, 2023

Conversation

skonto
Copy link
Contributor

@skonto skonto commented Jun 7, 2023

Part of the work for #13965

Proposed Changes

Release Note

Domain mapping controller logic is now merged with the Serving controller. Domainmapping webhook is merged with the Serving webhook. 

@knative-prow knative-prow bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. area/API API objects and controllers labels Jun 7, 2023
@knative-prow knative-prow bot requested review from jsanin-vmw and mgencur June 7, 2023 11:27
@knative-prow knative-prow bot added area/test-and-release It flags unit/e2e/conformance/perf test issues for product features approved Indicates a PR has been approved by an approver from all required OWNERS files. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jun 7, 2023
@skonto skonto requested review from dprotaso and removed request for mgencur and jsanin-vmw June 7, 2023 11:31
@codecov
Copy link

codecov bot commented Jun 7, 2023

Codecov Report

Patch coverage has no change and project coverage change: +0.04 🎉

Comparison is base (bde2f42) 86.23% compared to head (40a9e9f) 86.27%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #14082      +/-   ##
==========================================
+ Coverage   86.23%   86.27%   +0.04%     
==========================================
  Files         199      199              
  Lines       14795    14795              
==========================================
+ Hits        12759    12765       +6     
+ Misses       1733     1730       -3     
+ Partials      303      300       -3     
Impacted Files Coverage Δ
cmd/controller/main.go 0.00% <ø> (ø)

... and 2 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@skonto skonto requested a review from nak3 June 7, 2023 14:01
@@ -19,6 +19,10 @@ package main
import (
"context"

servingv1alpha1 "knative.dev/serving/pkg/apis/serving/v1alpha1"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

domainmapping v1alpha1 is deprecated #14059 and will be removed after v1.11 cut #14060.
We should add v1alpha1 if this PR will be merged before v1.11 cut but it will be dropped very soon 😅

@skonto skonto changed the title [wip] Consolidate webhooks Consolidate webhooks Jun 14, 2023
@knative-prow knative-prow bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 14, 2023
@@ -167,7 +242,9 @@ func main() {
sharedmain.WebhookMainWithContext(ctx, "webhook",
certificates.NewController,
newDefaultingAdmissionController,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can just add domain mapping types to the newDefaultingAdmissionController and newValidationAdmissionController and you don't need to create new ones

Copy link
Contributor Author

@skonto skonto Jun 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dprotaso I thought about that. My idea was to keep a separate controller per concern for now, to be able to have some independent configuration in the future eg. number of workers per queue, logging etc.
If we merge then should not we increase the concurrency for the common controller (default is 2)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we merge then should not we increase the concurrency for the common controller (default is 2)?

There really isn't an advantange here - this admission controller is reconciling a single resource

Copy link
Contributor Author

@skonto skonto Jun 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes but is not the case that the number of reconciled objects can be equal to the number of services, since a domain mapping can be set per service? So the load for the domain reconciler is not low as far as I can tell eg. during a restart. Should we merge the controllers and keep the concurrency defaults? Anyway I will do a quick test.

Copy link
Member

@dprotaso dprotaso Jun 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused - this controller just mutates this one resource here https://github.com/knative/serving/blob/main/config/core/webhooks/defaulting.yaml

So the load for the domain reconciler is not low as far as I can tell eg. during a restart. Should we merge the controllers and keep the concurrency defaults? Anyway I will do a quick test.

Increasing the # of controllers will bump the total QPS & Burst (it's code in the sharedmain) for the process. As for the reconciler there are two go routines each - so this doesn't change when moving a controller/reconciler from one go cmd to another.

Copy link
Contributor Author

@skonto skonto Jun 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for the reconciler there are two go routines each - so this doesn't change when moving a controller/reconciler from one go cmd to another.

I was referring to the reconciliation work done and the #of workqueue workers, not the rate with which we hit the api server (the latter is fine as you said). Check here: https://github.com/knative/pkg/blob/main/controller/controller.go#L209. The default concurrency there is 2.
When a controller is started that value is passed to it https://github.com/knative/pkg/blob/main/controller/controller.go#L446. So if we move the domain mapping types for defaulting to the default controller then we will still have 2 routines but with more types to handle in case of reconciliation, no?
In detail, controller will do reconciliation in c.processNextWorkItem here: https://github.com/knative/pkg/blob/main/controller/controller.go#L487-L494, that function is called in a separate thread and total threads equals concurrency:

	for i := 0; i < threadiness; i++ {
		sg.Add(1)
		go func() {
			defer sg.Done()
			for c.processNextWorkItem() {
			}
		}()
	}

That function will call reconcile at some point: https://github.com/knative/pkg/blob/main/controller/controller.go#L542.
In other words, let's say we initially have two controllers, let's say: newDefaultingAdmissionController,
newDefaultingDomainMappingAdmissionController, each has 2 routines for its reconciler by default and a separate work queue for each.
If we merge these controllers then we have 2 routines by default but with potentially more keys enqueued and more work for the common work queue, no?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if we move the domain mapping types for defaulting to the default controller then we will still have 2 routines but with more types to handle in case of reconciliation, no?

More types in the webhook != more resources that need reconciliation. We should only have one defaulting webhook and one validation webhook.

It's this one here

- apiGroups:
- autoscaling.internal.knative.dev
- networking.internal.knative.dev
- serving.knative.dev
apiVersions:
- "*"
operations:
- CREATE
- UPDATE
scope: "*"
resources:
- metrics
- podautoscalers
- certificates
- ingresses
- serverlessservices
- configurations
- revisions
- routes
- services

Actually we should be moving these values from this webhook over to the new one and just exclude the /status

resources:
- domainmappings
- domainmappings/status

If we merge these controllers then we have 2 routines by default but with potentially more keys enqueued and more work for the common work queue, no?

Generally we haven't cared about this but it's not an issue for the webhook controllers since it's only reconciling a single resource. There are better gains to make elsewhere - for example the revision controller still has a concurrency of two - maybe the serverlessservice (since it deals with endpoint updates) could be bumped higher.

But I'd say that's out of scope for this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok let's postpone this discussion for another PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess to clarify I'm referring to the 'threadiness' per controller out of scope for this PR. But I still think it makes sense to consolidate domain mapping into the default webhook

@@ -24,7 +24,7 @@ webhooks:
- admissionReviewVersions: ["v1", "v1beta1"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can delete these webhooks since we can merge them into the default validation and defaulting ones

Copy link
Contributor Author

@skonto skonto Jun 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes if we move everything to the default ones then we can delete it, also there will be no need for the special queue name. Should we? Still thinking about it. Check my comment above.

@skonto
Copy link
Contributor Author

skonto commented Jul 6, 2023

@dprotaso gentle ping. I updated the PR, now all webhooks are served by one controller.

@dprotaso
Copy link
Member

dprotaso commented Jul 7, 2023

/lgtm
/approve

thanks @skonto

@knative-prow knative-prow bot added the lgtm Indicates that a PR is ready to be merged. label Jul 7, 2023
@knative-prow
Copy link

knative-prow bot commented Jul 7, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dprotaso, skonto

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow knative-prow bot merged commit fc166ac into knative:main Jul 7, 2023
62 checks passed
skonto added a commit to skonto/serving that referenced this pull request Oct 5, 2023
* consolidate webhooks

* pass the queue name update

* merge types, clean up webhook configs

* remove unused
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/API API objects and controllers area/test-and-release It flags unit/e2e/conformance/perf test issues for product features lgtm Indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants