Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resources with tracking method annotation incorrectly marked as out-of-sync/requires pruning #8683

Closed
3 tasks
jannfis opened this issue Mar 4, 2022 · 22 comments · Fixed by #9791 or #10198
Closed
3 tasks
Labels
bug Something isn't working cherry-pick/2.4 Candidate for cherry picking into the 2.4 release branch
Milestone

Comments

@jannfis
Copy link
Member

jannfis commented Mar 4, 2022

Checklist:

  • I've searched in the docs and FAQ for my answer: https://bit.ly/argocd-faq.
  • I've included steps to reproduce the bug.
  • I've pasted the output of argocd version.

Describe the bug

One of the goals of using annotations for resource tracking is, that the resource should only be considered managed if the values of the annotation match the properties of the resource. However, this does not seem to be the case.

To Reproduce

  1. Set application.resourceTrackingMethod to annotation
  2. Create a new application, e.g. kustomize-guestbook from https://github.com/argoproj/argocd-example-apps, with a target namespace of guestbook
  3. Sync the application. Make sure that Argo CD has put on tracking annotations on the live resources instead of the label, e.g.
  annotations:
    argocd.argoproj.io/tracking-id: 'guestbook:/Service:guestbook/kustomize-guestbook-ui'
  1. Create a new dummy resource in namespace guestbook, for example a ConfigMap:
kubectl -n guestbook create configmap foobar
  1. Add annotations from another resource to dummy resource:
kubectl -n guestbook annotate configmap foobar argocd.argoproj.io/tracking-id='guestbook:/Service:guestbook/kustomize-guestbook-ui'
  1. Observe this resource being marked as out of sync and pruning required in the application:
    image

Expected behavior

The resource should be ignored because the properties of the resource don't match the values in the annotation. The resource should not be considered to be managed by Argo CD, and neither marked out of sync nor for be marked for pruning.

Screenshots

Version

Latest HEAD as of the time of issue creation:

argocd-server: v2.3.0+f9b7384
  BuildDate: 2022-03-04T07:39:01Z
  GitCommit: f9b7384bf4d5b0197e726ca19e40518c97fd05fd
  GitTreeState: clean
  GoVersion: go1.17.6
  Compiler: gc
  Platform: linux/amd64
  Kustomize Version: v4.4.1 2021-11-11T23:36:27Z
  Helm Version: v3.8.0+gd141386
  Kubectl Version: v0.23.1
  Jsonnet Version: v0.18.0

Logs

Paste any relevant application logs here.
@jannfis jannfis added the bug Something isn't working label Mar 4, 2022
@jessesuen
Copy link
Member

The resource should be ignored because the properties of the resource don't match the values in the annotation. The resource should not be considered to be managed by Argo CD, and neither marked out of sync nor for be marked for pruning.

When we introduced this feature, we intended it to handle the case where some parent resource created a child resource but:

  1. did not set owner reference on the child resource
  2. carried over labels/annotations (including our tracking label) to child resource

I think there might be two cases we need to consider:

  1. if the tracking reference on the child is referencing a resource which is still being managed by Argo CD (in your example, it is Service:guestbook/kustomize-guestbook-ui), then I agree that the resource should not be considered OutOfSync or needing pruning and I agree with this bug.

  2. However, if the tracking reference is referencing a resource that no longer exists, then I think it should be considered OutOfSync and required for pruning. This would handle the problem where the user deletes the parent object and is now left with an orphaned child object which they should see in the UI.

I think to fix this, we need to create an artificial parent-child relationship using the tracking-id reference, similar to what we do with Service->Endpoints, PVC->PV. Then in your example, the foobar ConfigMap would be considered a child of the Service:guestbook/kustomize-guestbook-ui.

@jessesuen
Copy link
Member

BTW, correct me if I'm wrong, but I think the annotation tracking method is no worse than the instance label method. But now that we have sufficient metadata in the annotation, we actually can do better about creating artificial parent/child relationships than the instance label.

In fact, I thought that was the entire point about incorporating the kind/namespace/name into the annotation.

@jannfis
Copy link
Member Author

jannfis commented Mar 4, 2022

When we introduced this feature, we intended it to handle the case where some parent resource created a child resource but:

  1. did not set owner reference on the child resource
  2. carried over labels/annotations (including our tracking label) to child resource

This is actually what the reproduction steps do (e.g. a new resource that does not have ownerReferences and carries copy of another resource's annotations).

@jannfis
Copy link
Member Author

jannfis commented Mar 4, 2022

if the tracking reference on the child is referencing a resource which is still being managed by Argo CD (in your example, it is Service:guestbook/kustomize-guestbook-ui), then I agree that the resource should not be considered OutOfSync or needing pruning and I agree with this bug.

Yes, this is the case.

However, if the tracking reference is referencing a resource that no longer exists, then I think it should be considered OutOfSync and required for pruning. This would handle the problem where the user deletes the parent object and is now left with an orphaned child object which they should see in the UI.

I'm not sure whether we should do that, because it could lead to some kind of loop hole. E.g. user with permissions to update a certain resource (e.g. to add some annotations) could now implicitly delete this resource through Argo CD.

@jessesuen
Copy link
Member

jessesuen commented Mar 4, 2022

I'm not sure whether we should do that, because it could lead to some kind of loop hole.

I share the same concern about the loophole. The same loophole would potentially already exist with our instance label tracking.

However, I seem to recall that we have some protection mechanism that excludes resources that are impossible to be part of the application (e.g. in a namespace outside of the project boundaries)

@jessesuen
Copy link
Member

jessesuen commented Apr 28, 2022

So this issue comes down to the following question: What should we do when we encounter a resource with our annotation-based tracking label, but that tracking label is not self-referential (it's pointing to some parent resource instead of itself)?

Here are some options/proposals.

  1. Ignore the resource completely. Don't even show it. This is my least favorite option
  2. Present the resource as a child of the parent resource (assuming the parent still exists and is part of the app). This is the same approach we do with Service -> Endpoints. Endpoints are not technically owned by Services, but functionally they are children of them
  3. Present the resource like a top-level managed object, but as an extraneous object which should not be pruned. We already have two annotations that replicate this behavior so implementation-wise, it shouldn't be hard to do.
metadata:
  annotations:
    argocd.argoproj.io/compare-options: IgnoreExtraneous
    argocd.argoproj.io/sync-options: Prune=false

In general, I do not think we should be considering objects to be pruned, if it has our tracking annotation and the value is not self-referential. Because it is pretty obvious that the resource was automatically created, and not a managed resource from git.

@jannfis
Copy link
Member Author

jannfis commented Jun 5, 2022

Thanks for your insight, @jessesuen.

I'm also not in favor of option 1 and think the object should be somehow displayed to the user.

The major use case for me is when an Operator creates resources and copies all labels and annotations from its Operand onto the resources it creates. So if you manage the Operand via Argo CD, and the Operator creates a ConfigMap during reconciliation, that ConfigMap will carry the same labels and annotations as the Operand, including Argo CD's tracking labels.

Now, not all Operators create ownerReferences on the resources they create, and that's where the original problem stems from. In some scenarios, the Operator even isn't able to create ownerReferences on such resources, for example when Operand and the resource are in different namespaces, or when one of Operand or resource are cluster scoped, but the other is not.

So I'm in favor of a combination of option 2 and 3 above:

  1. If the tracking label on a resource is not self-referencing, but instead references another resource in the same application, we create a parent/child relationship between the two.

  2. If the tracking label on a resource is not self-referencing and refers to an otherwise unknown resource (e.g. a resource not managed by current application, or a resource outside the application's scope), we display it as top-level resource without owner.

I would also be fine to first go for the simple approach (e.g. displaying it as top-level resource) and if there's demand, we can see how to establish parent/child relationship from the UI's point of view.

WDYT?

@jessesuen
Copy link
Member

I would also be fine to first go for the simple approach (e.g. displaying it as top-level resource) and if there's demand, we can see how to establish parent/child relationship from the UI's point of view.

I agree. I think showing it as a top-level resource is probably the most accurate and least surprising behavior. Option 2 (showing it as a child) might give a false impression that there is an ownership reference when there really is not. So my preference is for option 3 (top-level extraneous resource that will not be pruned or diffed).

Perhaps in the future, we might have a UI option/toggle to delineate between resources that are true owners of child objects, and ones that are pseudo-children by virtue of labels/annotations carried over to offshoots.

Sounds like we have a consensus about how to fix the issue. This bug/feature has become important to us since it affects our Crossplane UX, so I will prioritize fixing this for the next Argo CD release unless someone gets to it first.

@voidengineer
Copy link

Maybe those "pseudo-children" could be displayed slightly different? Dotted borders, another background color, anything like that?

@jannfis
Copy link
Member Author

jannfis commented Jun 27, 2022

Fix for this issue: #9791

@ralf-cestusio
Copy link

I have a question.
I am on 2.4.8 so the fix for this should be in
i have the following 2 objects (the network Claim "owns" the XNetwork) and from my understanding on how the tracking-id works XNetwork should not be considered for Pruning.
But it does get pruned on synch.
Is there something i misunderstand?

kind: NetworkClaim
metadata:
  annotations:
    argocd.argoproj.io/tracking-id: crossplane-sample:aws.ops.umanaia.org/NetworkClaim:ralf-test/sample-network
  name: sample-network
  namespace: ralf-test



kind: XNetwork
metadata:
  annotations:
    argocd.argoproj.io/tracking-id: crossplane-sample:aws.ops.umanaia.org/NetworkClaim:ralf-test/sample-network
  name: sample-network-lsq6d

@jessesuen
Copy link
Member

The fix doesn't seem to work in v2.4.8. Here is a repo which can reproduce the issue:
https://github.com/jessesuen/bug-self-ref-pruning

@jessesuen jessesuen reopened this Aug 4, 2022
@jessesuen
Copy link
Member

Video proof:

Screen.Recording.2022-08-03.at.5.43.43.PM.mov

@jessesuen jessesuen added this to the v2.5 milestone Aug 4, 2022
@jessesuen jessesuen added the cherry-pick/2.4 Candidate for cherry picking into the 2.4 release branch label Aug 4, 2022
@jannfis
Copy link
Member Author

jannfis commented Aug 4, 2022

I have found the bug and fixed it.

Will write some e2e test around it and then submit a PR.

@jannfis
Copy link
Member Author

jannfis commented Aug 4, 2022

In order to fix this, first argoproj/gitops-engine#436 should be merged. Then, I'll update #10198 to include new changes from gitops-engine.

After that, we can cherry-pick both changes into their appropriate release branches.

@jannfis
Copy link
Member Author

jannfis commented Aug 10, 2022

@ralf-cestusio Just as a heads-up, the upcoming v2.4.9 release will contain a proper fix for the pruning bug.

@ralf-cestusio
Copy link

Yes i saw the fix. Thank you for this really quick resolution!

@QuingKhaos
Copy link
Contributor

QuingKhaos commented Aug 12, 2022

I would also be fine to first go for the simple approach (e.g. displaying it as top-level resource) and if there's demand, we can see how to establish parent/child relationship from the UI's point of view.

I agree. I think showing it as a top-level resource is probably the most accurate and least surprising behavior. Option 2 (showing it as a child) might give a false impression that there is an ownership reference when there really is not. So my preference is for option 3 (top-level extraneous resource that will not be pruned or diffed).

From an operator and user viewpoint, this is more confusing. We would have preferred the non-selfreferencing resource is shown as pseudo-child. When we see a top-level resource we expect that this resource is defined in the application source, but the copied resource is basically not.

A pseudo-child relationship would have made it more clear, e.g. from which secret a copied secret originates. Which would be similar to the wanted behavior of #5082, so parent-child relationships are not strictly ownership references anymore.

@jannfis
Copy link
Member Author

jannfis commented Aug 12, 2022

When we see a top-level resource we expect that this resource is defined in the application source, but the copied resource is basically not.

The distinction can be made from the sync status - a resource not existing in Git does not have a sync status, and the corresponding icon is also not displayed. I agree that rendering a parent/child relationship would be better, though. I think that we can use the resource that is fully qualified in the annotation to map to the "parent", but I need to think this through for a while.

@QuingKhaos
Copy link
Contributor

QuingKhaos commented Aug 13, 2022

As an external operator which supports whole corporate teams with lesser experience, the distinction of the sync status on a top-level resource is not so obvious for most of the teams. They wouldn't expect a top-level resource to be non-synced resource at first look, and a small icon missing is not very clear from UX/accessibility standpoint vs. a clear parent-child relationship.

A clear pseudo parent-child relationship will not raise the "what is wrong with my deployment, where does this resource come from, what do I need to with this, how do I handle this" questions I got. While a parent-child relationship is what it is: a resource they to not care about it explicitly.

Maybe top-level non-synced resource vs. pseudo parent-child relationship can be configured as feature flag via argocd-cm config map or as an container argument/environment variable on the corresponding component. This would still support the use-case (by default), where operators would expext only real ownership references as parent-child relationships and can be revisited when the feature of configuring parent-child relationship by label selectors gets implemented.

@jannfis
Copy link
Member Author

jannfis commented Aug 13, 2022

Please feel free to open an enhancement request for this.

@leoluz
Copy link
Collaborator

leoluz commented Oct 21, 2022

@jannfis We found an issue with this new logic in our servers when users update to newer APIGroups (going from extensions/Ingress to networking/Ingress for example).

I described the issue with more details in my PR with a fix attempt (#11012)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working cherry-pick/2.4 Candidate for cherry picking into the 2.4 release branch
Projects
None yet
6 participants