
[v0.9] Reduce BundleDeployment triggering on deployed resources updates #2031

Merged
merged 13 commits into rancher:release/v0.9 on Jan 17, 2024

Conversation

aruiz14
Contributor

@aruiz14 aruiz14 commented Dec 20, 2023

Refers to #2058

These changes aim to reduce the number of times the BundleDeployment reconciler is executed. The Fleet Agent configures a watcher on every deployed resource, and each watch event triggers a requeue of the BundleDeployment handler in the agent. Since this handler refreshes the condition timestamps in the BundleDeployment status, it also has a direct effect on the Fleet controller, which reacts to every update of the BundleDeployment object.

This patch consists of the following changes:

  • The TriggerSleep delay has been removed, since it could cause modification events occurring within that window to be missed (this was the case for the integration tests). This is connected to the next change:
  • ADDED watch events are now ignored:
    • The informer emits an initial ADDED event for every existing object, and triggering on objects that were just created serves no purpose.
    • The only interesting events are MODIFIED and DELETED, which are the ultimate interest of the trigger (detecting modified or deleted resources). If an object is deleted and later recreated by the user, the controller may still need to redeploy in order to reach a stable state.
  • MODIFIED events are now filtered on the Generation field, for those kinds that make use of it:
    • Kinds that do not implement this metadata field (e.g. ConfigMaps) keep the existing behavior.
    • For resources that do set Generation, the last observed value is tracked per UID: if Generation did not change since the previous event, the spec is assumed unchanged and the trigger is skipped, effectively ignoring status-only updates.
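The filtering described above can be sketched as follows. This is a minimal, self-contained illustration, not Fleet's actual code: the real implementation lives in internal/cmd/agent/trigger/watcher.go and operates on client-go watch events, while all type and function names below are made up for the example.

```go
package main

import "fmt"

type eventType string

const (
	added    eventType = "ADDED"
	modified eventType = "MODIFIED"
	deleted  eventType = "DELETED"
)

// event is a simplified stand-in for a client-go watch event.
// Generation == 0 models kinds that do not use metadata.generation.
type event struct {
	Type       eventType
	UID        string
	Generation int64
}

// triggerFilter remembers the last observed Generation per object UID and
// decides whether a watch event should trigger the BundleDeployment handler.
type triggerFilter struct {
	lastGen map[string]int64
}

func newTriggerFilter() *triggerFilter {
	return &triggerFilter{lastGen: map[string]int64{}}
}

func (f *triggerFilter) shouldTrigger(e event) bool {
	switch e.Type {
	case added:
		// Informers emit an initial ADDED for every existing object,
		// and newly created objects don't indicate drift: skip, but
		// remember the generation for later MODIFIED events.
		if e.Generation != 0 {
			f.lastGen[e.UID] = e.Generation
		}
		return false
	case deleted:
		delete(f.lastGen, e.UID)
		return true
	case modified:
		if e.Generation == 0 {
			// Kinds without metadata.generation (e.g. ConfigMaps)
			// keep the existing behavior: always trigger.
			return true
		}
		if f.lastGen[e.UID] == e.Generation {
			// Generation unchanged: status-only update, skip.
			return false
		}
		f.lastGen[e.UID] = e.Generation
		return true
	}
	return true
}

func main() {
	f := newTriggerFilter()
	fmt.Println(f.shouldTrigger(event{Type: added, UID: "a", Generation: 1}))    // → false (initial ADDED ignored)
	fmt.Println(f.shouldTrigger(event{Type: modified, UID: "a", Generation: 1})) // → false (status-only update)
	fmt.Println(f.shouldTrigger(event{Type: modified, UID: "a", Generation: 2})) // → true (spec changed)
	fmt.Println(f.shouldTrigger(event{Type: modified, UID: "cm", Generation: 0})) // → true (kind without generation)
	fmt.Println(f.shouldTrigger(event{Type: deleted, UID: "a", Generation: 2}))  // → true (deletion always triggers)
}
```

Note that recording the Generation on ADDED is one possible design choice; it avoids a spurious trigger on the first MODIFIED event that carries the same generation.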

@aruiz14 aruiz14 changed the title Reduce BundleDeployment triggering on deployed resources updates [v0.9] Reduce BundleDeployment triggering on deployed resources updates Jan 8, 2024
@raulcabello
Contributor

I like the approach. This introduces two improvements:

  • Ignoring ADDED events removes the initial enqueue that happens for every watched resource, since the informer creates an initial ADDED event for each one.
  • Filtering MODIFIED events on the Generation field removes many enqueues that happen when only status fields change. We already ignore status changes when detecting drift in wrangler.

I don't see any issue with this approach. However, these extra enqueues might be needed in some edge cases we don't see at the moment, so I wouldn't add this to the stable release (0.8).

@aruiz14 aruiz14 marked this pull request as ready for review January 11, 2024 14:38
@aruiz14 aruiz14 requested a review from a team as a code owner January 11, 2024 14:38
Contributor

@weyfonk weyfonk left a comment


LGTM, thanks for taking the time to add integration tests for this ☺️

Review threads (resolved): internal/cmd/agent/trigger/watcher.go, integrationtests/agent/watcher_trigger_test.go
The watcher is stopped and started again in a short amount of time,
but the update in the test happens too fast for it to start watching
again.
@manno manno merged commit 2e5ef6c into rancher:release/v0.9 Jan 17, 2024
10 checks passed
aruiz14 added a commit to aruiz14/fleet that referenced this pull request Mar 22, 2024