Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: enqueue a single request for all resources #1813

Merged
merged 3 commits into from
Aug 22, 2023

Conversation

dboslee
Copy link
Contributor

@dboslee dboslee commented Aug 21, 2023

What this PR does / why we need it:

The current behavior of the controller is to run Reconcile for each resource that is updated. This gets quite expensive when you have 100s-1000s of resources.

This change allows for multiple resource updates to result in a single run of Reconcile.

This greatly reduces the cpu/memory consumption at startup and during periodic re-syncs of all resources.

Which issue(s) this PR fixes:

Fixes #1797

Signed-off-by: David Boslee <david@goteleport.com>
@dboslee dboslee requested a review from a team as a code owner August 21, 2023 22:51
Signed-off-by: David Boslee <david@goteleport.com>
@dboslee dboslee changed the title feat: enqueue a single request for all resources fix: enqueue a single request for all resources Aug 21, 2023
predicate.NewPredicateFuncs(r.httpRoutesForAuthenticationFilter)); err != nil {
return err
}

// Watch RateLimitFilter CRUDs and enqueue associated HTTPRoute objects.
if err := c.Watch(
source.Kind(mgr.GetCache(), &egv1a1.RateLimitFilter{}),
&handler.EnqueueRequestForObject{},
handler.EnqueueRequestsFromMapFunc(r.enqueueClass),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we also do this for L1318 ?

@@ -147,7 +147,7 @@ func newResourceMapping() *resourceMappings {
}

func (r *gatewayAPIReconciler) Reconcile(ctx context.Context, request reconcile.Request) (reconcile.Result, error) {
r.log.WithName(request.Name).Info("reconciling object", "namespace", request.Namespace, "name", request.Name)
r.log.WithName(request.Name).Info("reconciling gateways")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we add a comment / doc string mentioning that all watched resources trigger this single Reconcile . Also can we discard _ reconcile.Request with a comment mentioning that the requests content dont map to a specific resource.

@codecov
Copy link

codecov bot commented Aug 21, 2023

Codecov Report

Merging #1813 (aaa25af) into main (4285f4e) will increase coverage by 0.10%.
The diff coverage is 90.32%.

@@            Coverage Diff             @@
##             main    #1813      +/-   ##
==========================================
+ Coverage   65.11%   65.21%   +0.10%     
==========================================
  Files          86       86              
  Lines       12316    12318       +2     
==========================================
+ Hits         8019     8033      +14     
+ Misses       3782     3774       -8     
+ Partials      515      511       -4     
Files Changed Coverage Δ
internal/provider/kubernetes/controller.go 48.84% <90.32%> (+1.20%) ⬆️

... and 2 files with indirect coverage changes

@arkodg
Copy link
Contributor

arkodg commented Aug 21, 2023

this is a neat idea, thanks for implementing this @dboslee, trying to understand this better, by using EnqueueRequestsFromMapFunc we've managed to create a specific Reconcile request with the same name for all Watch cases, which get coalesced / rate limited by the controller (since the name is same ) so we end up getting 1 reconcile request within a specific duration ?

Signed-off-by: David Boslee <david@goteleport.com>
@dboslee
Copy link
Contributor Author

dboslee commented Aug 22, 2023

this is a neat idea, thanks for implementing this @dboslee, trying to understand this better, by using EnqueueRequestsFromMapFunc we've managed to create a specific Reconcile request with the same name for all Watch cases, which get coalesced / rate limited by the controller (since the name is same ) so we end up getting 1 reconcile request within a specific duration ?

I think you've got it more or less. There isn't a specific duration here other than the rate limiting provided by the controller. Here are a few examples describing the behavior.

At startup a single call to reconcile is executed since the controller will wait for all caches to initialized and events are coalesced.

In general, during a re-sync of many resources, any events that are processed before Reconcile is called are coalesced. If additional resource events arrive while the call to Reconcile is still running they will be coalesced in the next call to Reconcile.

The time in between calls to Reconcile is enforced by the rate limiters but its pretty quick by default if not immediate when events occur.

Copy link
Contributor

@arkodg arkodg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for introducing this design pattern to improve performance at scale !

@arkodg
Copy link
Contributor

arkodg commented Aug 22, 2023

this is a neat idea, thanks for implementing this @dboslee, trying to understand this better, by using EnqueueRequestsFromMapFunc we've managed to create a specific Reconcile request with the same name for all Watch cases, which get coalesced / rate limited by the controller (since the name is same ) so we end up getting 1 reconcile request within a specific duration ?

I think you've got it more or less. There isn't a specific duration here other than the rate limiting provided by the controller. Here are a few examples describing the behavior.

At startup a single call to reconcile is executed since the controller will wait for all caches to initialized and events are coalesced.

In general, during a re-sync of many resources, any events that are processed before Reconcile is called are coalesced. If additional resource events arrive while the call to Reconcile is still running they will be coalesced in the next call to Reconcile.

The time in between calls to Reconcile is enforced by the rate limiters but its pretty quick by default if not immediate when events occur.

ah makes sense, thanks for the explanation !

@arkodg arkodg requested review from a team, kflynn, qicz and LanceEa and removed request for a team August 22, 2023 00:21
Copy link
Contributor

@zirain zirain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@zirain zirain merged commit 453d0c2 into envoyproxy:main Aug 22, 2023
18 checks passed
@dboslee dboslee deleted the david/reduce-reconciles branch August 22, 2023 15:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Reduce resource utilization during startup and resyncs
3 participants