Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add ignore-resources-tracking annotation to ignore resources update #18343

Open
wants to merge 24 commits into
base: master
Choose a base branch
from

Conversation

kahoulei
Copy link

@kahoulei kahoulei commented May 21, 2024

This is the enhancement request to allow user ignore any resource update.

Previously we have a feature ignoreResourceUpdatesEnabled which allow user to ignore certain part of the resource manifest. But that feature does not fulfill the use case of newly created object.

When an dependent object is created, there is no old manifest to compare and therefore we always refresh in such case. This is wasting a lot of compute cycle if a cluster has a lot of dependent objects (e.g. jobs and pods created by cronjobs).

Instead, this PR introduces an annotation argocd.argoproj.io/ignore-resources-tracking so that user can use it on a specific resource and argocd can completely ignore it.

Also, this annotation will only work behind the same ignoreResourceUpdatesEnabled flag.

Checklist:

  • Either (a) I've created an enhancement proposal and discussed it with the community, (b) this is a bug fix, or (c) this does not need to be in the release notes.
  • The title of the PR states what changed and the related issues number (used for the release note).
  • The title of the PR conforms to the Toolchain Guide
  • I've included "Closes [ISSUE #]" or "Fixes [ISSUE #]" in the description to automatically close the associated issue.
  • I've updated both the CLI and UI to expose my feature, or I plan to submit a second PR with them.
  • Does this PR require documentation updates?
  • I've updated documentation as required by this PR.
  • I have signed off all my commits as required by DCO
  • I have written unit and/or e2e tests for my change. PRs without these are unlikely to be merged.
  • My build is green (troubleshooting builds).
  • My new feature complies with the feature status guidelines.
  • I have added a brief description of why this PR is necessary and/or what this PR solves.
  • Optional. My organization is added to USERS.md.
  • Optional. For bug fixes, I've indicated what older releases this fix should be cherry-picked into (this may or may not happen depending on risk/complexity).

@kahoulei kahoulei requested a review from a team as a code owner May 21, 2024 21:42
@kahoulei
Copy link
Author

kahoulei commented May 21, 2024

cc @agaudreault

@kahoulei kahoulei requested a review from a team as a code owner May 22, 2024 15:02
@@ -67,6 +68,9 @@ const (

// EnvClusterCacheRetryUseBackoff is the env variable to control whether to use a backoff strategy with the retry during cluster cache sync
EnvClusterCacheRetryUseBackoff = "ARGOCD_CLUSTER_CACHE_RETRY_USE_BACKOFF"

// AnnotationIgnoreResourcesTracking is a Kubernetes annotation for a Kubernetes resource to ignore any resources tracking
AnnotationIgnoreResourcesTracking = "argocd.argoproj.io/ignore-resources-tracking"
Copy link
Member

@agaudreault agaudreault May 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would keep the same terminology. resource tracking means it does not show up in Argo UI and this would lead to confusion

Suggested change
AnnotationIgnoreResourcesTracking = "argocd.argoproj.io/ignore-resources-tracking"
AnnotationIgnoreResourcesUpdate = "argocd.argoproj.io/ignore-resources-update"

Comment on lines 169 to 170
// annotations stores all the ObjectRef annotations
annotations map[string]string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can save memory here and just have a bool instead. annotations like kubectl.kubernetes.io/last-applied-configuration can end up consuming a lot if you consider the number of resources argo watches

@@ -342,6 +349,14 @@ func skipAppRequeuing(key kube.ResourceKey) bool {
}

func skipResourceUpdate(oldInfo, newInfo *ResourceInfo) bool {
if val, ok := newInfo.annotations[AnnotationIgnoreResourcesTracking]; ok {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should ignore only if the health status did not change.

@@ -549,7 +564,7 @@ func (c *liveStateCache) getCluster(server string) (clustercache.ClusterCache, e
"name": ref.Name,
"api-version": ref.APIVersion,
"kind": ref.Kind,
}).Debug("Ignoring change of object because none of the watched resource fields have changed")
}).Debugf("Ignoring change of object because none of the watched resource fields have changed or annotation %v is set to true", AnnotationIgnoreResourcesTracking)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the debug message adds much information. I would either leave it as-is, or perhaps add a log field with the values of the annotation. This way we can query logs to know what was ignored because of annotations.

Suggested change
}).Debugf("Ignoring change of object because none of the watched resource fields have changed or annotation %v is set to true", AnnotationIgnoreResourcesTracking)
}).Debug("Ignoring change of object because none of the watched resource fields have changed")

@agaudreault
Copy link
Member

agaudreault commented May 31, 2024

@kahoulei Something I was thinking but had not the time to complete was to do the same, but with an "allow" annotation instead of a "ignore". Basically, argocd.argoproj.io/apply-resources-updates: "true". When this annotation is specified, it performs the same logic as we have today: generating the hash based on configuration.

The advantage was that you dont have to always ignore the full resources, but you can laverage the ignoreResourceUpdates configurations.

@kahoulei
Copy link
Author

kahoulei commented Jun 4, 2024

@agaudreault thanks for the review.

If I understand correctly, for the resource that we want to ignore we should set argocd.argoproj.io/apply-resources-updates: "false". Is it correct?

If argocd.argoproj.io/apply-resources-updates: "true", we will just compare the generated hash as is.

@agaudreault
Copy link
Member

agaudreault commented Jun 4, 2024

@agaudreault thanks for the review.

If I understand correctly, for the resource that we want to ignore we should set argocd.argoproj.io/apply-resources-updates: "false". Is it correct?

If argocd.argoproj.io/apply-resources-updates: "true", we will just compare the generated hash as is.

Correct, if the annotation is there and is true, we would generate a hash and compare it later on. If the annotation is false or missing, then we do nothing (like what we have today)

@kahoulei
Copy link
Author

kahoulei commented Jun 4, 2024

If the annotation is false or missing, then we do nothing (like what we have today)

I thought we will generate hash if ignoreResourceUpdatesEnabled is true. So if the annotation is missing, we will generate the hash anyway? (for backward compatibility)

see https://github.com/argoproj/argo-cd/blob/master/controller/cache/cache.go#L507-L514

@kahoulei
Copy link
Author

kahoulei commented Jun 5, 2024

@agaudreault PTAL if you get a chance. Thanks!

Comment on lines 52 to 58
if k == AnnotationApplyResourcesUpdate {
value, err := strconv.ParseBool(v)
if err != nil {
value = false
}
res.applyResourcesUpdate = value
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code can be simplified by just checking this at the end of shouldHashManifest method.
If the shouldHashManifest return true because the annotations is present for an untracked resource, then it will take the same codepath as current behavior for tracked resources.

No need to update the ResourceInfo, store booleans or annotations.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@agaudreault updated with your suggestion.

One additional change in skipResourceUpdate: since we are not going to generate hash and therefore the logic of isSameManifest need to change. So PTAL.

Copy link
Member

@agaudreault agaudreault Jun 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We simply cannot generate a hash by default or skip the resources updates for all the resources. When the annotation is not defined, the hash should not be generated. If you generate a hash by default for all resource updates, the controller will use a lot more CPU generating hashes than what you will save skipping reconcile.

I also think we should not go in a direction where non-argocd resources must have argocd annotations.

That is why I think the best approach is:

  • Generate the hash only if annotation is present and true.
  • (already done) Skip the resource updates if the hashes exist and the hashes match OR if the resource does not belong to any application

kahoulei added 12 commits June 6, 2024 12:31
…date

Signed-off-by: kahoulei <kahou.lei@okta.com>
Signed-off-by: kahoulei <kahou.lei@okta.com>
Signed-off-by: kahoulei <kahou.lei@okta.com>
Signed-off-by: kahoulei <kahou.lei@okta.com>
Signed-off-by: kahoulei <kahou.lei@okta.com>
Signed-off-by: kahoulei <kahou.lei@okta.com>
Signed-off-by: kahoulei <kahou.lei@okta.com>
Signed-off-by: kahoulei <kahou.lei@okta.com>
Signed-off-by: kahoulei <kahou.lei@okta.com>
Signed-off-by: kahoulei <kahou.lei@okta.com>
Signed-off-by: kahoulei <kahou.lei@okta.com>
Signed-off-by: kahoulei <kahou.lei@okta.com>
@kahoulei kahoulei force-pushed the kahou/ignore-resources-tracking-16415 branch from 42fb0bb to 140c3d8 Compare June 6, 2024 19:31
Copy link

codecov bot commented Jun 6, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 47.65%. Comparing base (355ec75) to head (e4b3837).
Report is 75 commits behind head on master.

Current head e4b3837 differs from pull request most recent head bec8ffe

Please upload reports for the commit bec8ffe to get more accurate results.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #18343      +/-   ##
==========================================
+ Coverage   45.01%   47.65%   +2.64%     
==========================================
  Files         354      332      -22     
  Lines       48018    46835    -1183     
==========================================
+ Hits        21613    22317     +704     
+ Misses      23597    21669    -1928     
- Partials     2808     2849      +41     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@kahoulei
Copy link
Author

kahoulei commented Jun 7, 2024

@agaudreault I have added unit tests and they all passed. But the e2e tests are failing and looks like they are not related. Can you help me to take a look? (Sorry I am still not familiar with the e2e tests yet)

controller/cache/cache.go Outdated Show resolved Hide resolved
@@ -348,20 +352,37 @@ func skipResourceUpdate(oldInfo, newInfo *ResourceInfo) bool {
return false
}
isSameHealthStatus := (oldInfo.Health == nil && newInfo.Health == nil) || oldInfo.Health != nil && newInfo.Health != nil && oldInfo.Health.Status == newInfo.Health.Status
isSameManifest := oldInfo.manifestHash != "" && newInfo.manifestHash != "" && oldInfo.manifestHash == newInfo.manifestHash
isSameManifest := oldInfo.manifestHash == newInfo.manifestHash
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This condition should not change. If a resource does not have a hash, that means the annotation was not there (or not true), therefore we should not ignore updates.

Copy link
Author

@kahoulei kahoulei Jun 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I misunderstood your intention. So you want is the dependent object set argocd.argoproj.io/apply-resources-update=true and then use the existing ignore resources config to skip refresh. I will update the PR.

controller/cache/cache.go Outdated Show resolved Hide resolved
controller/cache/cache_test.go Outdated Show resolved Hide resolved
docs/operator-manual/reconcile.md Outdated Show resolved Hide resolved
docs/user-guide/annotations-and-labels.md Outdated Show resolved Hide resolved
kahoulei added 2 commits June 10, 2024 09:30
Signed-off-by: kahoulei <kahou.lei@okta.com>
Signed-off-by: kahoulei <kahou.lei@okta.com>
kahoulei and others added 2 commits June 10, 2024 10:36
@kahoulei kahoulei force-pushed the kahou/ignore-resources-tracking-16415 branch from 8f0225d to 0cc1287 Compare June 10, 2024 19:15
Signed-off-by: kahoulei <kahou.lei@okta.com>
kahoulei and others added 2 commits June 10, 2024 17:27
@kahoulei
Copy link
Author

@agaudreault can you take a look if anything else I need to change?

Copy link
Member

@agaudreault agaudreault left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logic looks good. Most comments are about terminology and user documentation to polish.

Comment on lines 74 to 76
// AnnotationApplyResourcesUpdate when set to true on a resource that is not tracked under an app, argocd will generate a
// hash and apply `ignoreResourceUpdate` configuration on it. If the annotation is set to false (or not presented) on a resource
// that is not tracked under an app, ignoreResourceUpdates configuration will not be applied.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaking implementation details on the API.

Suggested change
// AnnotationApplyResourcesUpdate when set to true on a resource that is not tracked under an app, argocd will generate a
// hash and apply `ignoreResourceUpdate` configuration on it. If the annotation is set to false (or not presented) on a resource
// that is not tracked under an app, ignoreResourceUpdates configuration will not be applied.
// AnnotationApplyResourcesUpdate when set to true on an untracked resource,
// argo will apply `ignoreResourceUpdates` configuration on it.

// AnnotationApplyResourcesUpdate when set to true on a resource that is not tracked under an app, argocd will generate a
// hash and apply `ignoreResourceUpdate` configuration on it. If the annotation is set to false (or not presented) on a resource
// that is not tracked under an app, ignoreResourceUpdates configuration will not be applied.
AnnotationApplyResourcesUpdate = "argocd.argoproj.io/apply-resources-update"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that the code is written, I think it will be much easier for users to have the same terminology with the existing feature. wdyt?

Suggested change
AnnotationApplyResourcesUpdate = "argocd.argoproj.io/apply-resources-update"
AnnotationIgnoreResourceUpdates = "argocd.argoproj.io/ignore-resource-updates"

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@agaudreault I think AnnotationApplyResourcesUpdate is more logical to me. AnnotationIgnoreResourceUpdates seems the invert of what we are doing and therefore we need to flip our logic if we rename it.

// Best - Only hash for resources that are part of an app or their dependencies
// (current) - Only hash for resources that are part of an app + all apps that might be from an ApplicationSet
// Orphan - If orphan is enabled, hash should be made on all resource of that namespace and a config to disable it
// Worst - Hash all resources watched by Argo
return appName != "" || (gvk.Group == application.Group && gvk.Kind == application.ApplicationKind)
isTrackedResources := appName != "" || (gvk.Group == application.Group && gvk.Kind == application.ApplicationKind)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only one resource

Suggested change
isTrackedResources := appName != "" || (gvk.Group == application.Group && gvk.Kind == application.ApplicationKind)
isTrackedResource := appName != "" || (gvk.Group == application.Group && gvk.Kind == application.ApplicationKind)

Comment on lines 114 to 121

## Tracking Dependent Resources

Dependent resources by default are not being tracked. Therefore, we cannot generate any hash of those objects and utilize
the `ignoreResourceUpdates` configuration.

If you want to track the dependent object and apply the `ignoreResourceUpdates` configuration, you can add
`argocd.argoproj.io/apply-resources-update=true` annotation in the dependent resources manifest:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the concepts of tracked resources and watched resources here are mixed up. Also the user facing documentation is leaking the implementation.

Suggested change
## Tracking Dependent Resources
Dependent resources by default are not being tracked. Therefore, we cannot generate any hash of those objects and utilize
the `ignoreResourceUpdates` configuration.
If you want to track the dependent object and apply the `ignoreResourceUpdates` configuration, you can add
`argocd.argoproj.io/apply-resources-update=true` annotation in the dependent resources manifest:
## Ignoring updates for untracked resources
ArgoCD will only apply `ignoreResourceUpdates` configuration to tracked resources of an application. This means dependant resources, such as a `ReplicaSet` and `Pod` created by a `Deployment`, will not ignore any updates and trigger a reconcile of the application for any changes.
If you want to apply the `ignoreResourceUpdates` configuration to an untracked resource, you can add the
`argocd.argoproj.io/ignore-resource-updates=true` annotation in the dependent resources manifest.

restartPolicy: OnFailure
```

Then you can update `argocd-cm` configMap to ignore the dependent resources:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Then you can update `argocd-cm` configMap to ignore the dependent resources:
The resource updates will be ignored based on your the `ignoreResourceUpdates` configuration in the `argocd-cm` configMap:

Comment on lines 168 to 169
Note: If you set `argocd.argoproj.io/apply-resources-update: "false"`, no hash will be generated and `ignoreResourceUpdates`
cannot be applied on those resources.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementation details.

Suggested change
Note: If you set `argocd.argoproj.io/apply-resources-update: "false"`, no hash will be generated and `ignoreResourceUpdates`
cannot be applied on those resources.

Signed-off-by: kahoulei <kahou.lei@okta.com>
@kahoulei
Copy link
Author

@agaudreault I updated the doc. Thanks.

Signed-off-by: kahoulei <kahou.lei@okta.com>
@kahoulei kahoulei force-pushed the kahou/ignore-resources-tracking-16415 branch from 7c0577c to c29e0fb Compare June 18, 2024 03:21

// AnnotationIgnoreResourcesUpdate when set to true on an untracked resource,
// argo will apply `ignoreResourceUpdates` configuration on it.
AnnotationIgnoreResourcesUpdate = "argocd.argoproj.io/ignore-resources-update"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make the name consistent. Make sure to validate all references

Suggested change
AnnotationIgnoreResourcesUpdate = "argocd.argoproj.io/ignore-resources-update"
AnnotationIgnoreResourceUpdates = "argocd.argoproj.io/ignore-resource-updates"

Signed-off-by: kahoulei <kahou.lei@okta.com>
Copy link
Member

@agaudreault agaudreault left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code LGTM. Waiting for a before/after graph on an instance with real load before merging

Signed-off-by: kahoulei <kahou.lei@okta.com>
@ronaknnathani
Copy link

Glad to see that this feature is being added. We have a case where we run large k8s clusters (~5k nodes) with tens of thousands of pods. Pod updates result in significant reconciliation activity and we are looking to leverage this feature.

A quick question about this change - if let's say updates on /status are ignored for an object like Pods, would the UI not show the latest Pod state in that case or would this only skip the Application reconciliation?

@kahoulei
Copy link
Author

A quick question about this change - if let's say updates on /status are ignored for an object like Pods, would the UI not show the latest Pod state in that case or would this only skip the Application reconciliation?

@ronaknnathani the behavior is same as ignoreResourceUpdates feature.

@kahoulei
Copy link
Author

Discussed with @agaudreault about load testing. I don't have environment to test it currently. If someone know how to load test it, please leave a comment here so that we can merge it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Backlog
Development

Successfully merging this pull request may close these issues.

None yet

4 participants