Same CRDs deployed by two ArgoCD apps causes loop and memory exhaustion #18417

Closed
maggie44 opened this issue May 26, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@maggie44

Summary

I am deploying Prometheus and Loki as separate ArgoCD applications (argoproj.io/v1alpha1), each from a Helm chart. Both charts install CRDs, and some of those CRDs are shared between them. When I upgraded Prometheus, it upgraded the shared CRDs. The Loki application did not expect the new versions and kept trying to revert them to the versions it wanted. Prometheus did the same in the other direction, creating a reconciliation loop with high CPU and memory usage. Most concerning, it appeared to be making many requests per second to the control plane API, and the control planes eventually ran out of memory as well, resulting in an outage.

For now I have applied skipCrds: true to one of the applications to resolve the issue. However, this may cause problems later: although the two apps share some CRDs, others may be unique to each app, and those would not be upgraded in the app that skips CRDs. That means I have to move skipCrds: true from one app to the other every time I upgrade them.
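
For readers unfamiliar with the workaround, here is a minimal sketch of where skipCrds: true sits in an Application manifest. The application name, repository URL, chart name, version, and namespaces are placeholders and are not taken from this issue.

```yaml
# Hypothetical Application: every name, URL, and version here is a placeholder.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: loki                 # assumed application name
  namespace: argocd          # assumed Argo CD namespace
spec:
  project: default
  source:
    repoURL: https://example.com/helm-charts   # placeholder chart repository
    chart: loki                                 # placeholder chart name
    targetRevision: 1.2.3                       # placeholder chart version
    helm:
      # Render the chart without the CRDs it ships, so this app
      # no longer tries to manage them.
      skipCrds: true
  destination:
    server: https://kubernetes.default.svc
    namespace: monitoring    # assumed target namespace
```

As far as I understand it, this option corresponds to Helm's --skip-crds flag, so it only affects CRDs shipped in the chart's crds/ directory; CRDs rendered from regular templates would still be applied.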

Ideally there would be a feature that allows these CRDs shared by different apps to be reconciled without the loop. When loops do occur, there should be a limit on the number of reconciliation attempts, or a backoff to prevent the impact on the control planes.

@maggie44 maggie44 added the enhancement New feature or request label May 26, 2024
@agaudreault
Member

Hey @maggie44, you can take a look at the following feature: https://argo-cd.readthedocs.io/en/stable/user-guide/sync-options/#fail-the-sync-if-a-shared-resource-is-found
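
A minimal sketch of how the sync option from that page could be set on an Application. Only the spec.syncPolicy fragment is shown, and the rest of the manifest is assumed to exist already.

```yaml
# Fragment of an Application spec; add it to each app that might overlap.
spec:
  syncPolicy:
    syncOptions:
      # Fail the sync when a resource in this Application is already
      # applied in the cluster by another Application.
      - FailOnSharedResource=true
```

With this set, the overlap should surface as a failed sync that has to be resolved explicitly, rather than two applications silently overwriting each other's CRDs.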

However, since you have two different applications managing the same CRDs, I would highly recommend creating a third application that deploys only the CRDs. The same applies to any cluster-scoped resources that would conflict between two applications. This way, you can manage the update lifecycle of the CRDs independently of the two different controllers.
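
A rough sketch of what such a CRD-only application might look like, assuming the shared CRD manifests are kept in a directory of a Git repository. The repository URL, path, revision, and application name are hypothetical, and the ServerSideApply option is my own suggestion rather than something stated in this thread.

```yaml
# Hypothetical third Application that owns only the shared CRDs.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: monitoring-crds      # assumed application name
  namespace: argocd          # assumed Argo CD namespace
spec:
  project: default
  source:
    repoURL: https://example.com/org/platform-config.git  # placeholder repository
    path: crds/monitoring                                   # placeholder directory of CRD manifests
    targetRevision: main                                    # placeholder revision
  destination:
    server: https://kubernetes.default.svc
  syncPolicy:
    syncOptions:
      # Assumption: large CRDs can exceed the annotation size limit of
      # client-side apply, so server-side apply is often used for them.
      - ServerSideApply=true
```

The Prometheus and Loki applications would then both set skipCrds: true in their Helm source, so that this application is the only one managing the shared CRDs.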

@agaudreault
Member

Closing as a mechanism to prevent the sync loop is already implemented.

@agaudreault
Member

Related to #18565
