I am deploying Prometheus and Loki in separate ArgoCD applications (argoproj.io/v1alpha1), both using Helm charts. Both charts install CRDs, and some of those CRDs are shared between them. When I upgraded Prometheus, it upgraded the shared CRDs. The Loki application wasn't expecting these versions and kept trying to revert them to the ones it wanted; Prometheus then re-applied its own versions, creating a reconciliation loop with high CPU and memory usage. Most concerning, the loop appeared to issue many requests per second to the control plane API, and the control plane nodes eventually ran out of memory as well, resulting in an outage.
For now I have applied skipCrds: true to one of the applications to break the loop (see the sketch below). However, this is potentially problematic going forward: although the apps share some CRDs, others are likely unique to each app and would no longer be upgraded. This means I have to keep moving skipCrds: true between the two applications every time I upgrade one of them.
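For reference, a minimal sketch of the workaround described above. The skipCrds flag lives under spec.source.helm in the Application spec; the repo URL, chart version, and namespaces below are illustrative placeholders, not my exact setup:

```yaml
# Sketch: tell ArgoCD's Helm source to skip CRD installation for one
# of the two conflicting applications, so only the other one manages CRDs.
# repoURL, targetRevision, and namespaces are illustrative placeholders.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: loki
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://grafana.github.io/helm-charts
    chart: loki
    targetRevision: 5.41.4  # placeholder version
    helm:
      skipCrds: true  # stop this app from fighting over the shared CRDs
  destination:
    server: https://kubernetes.default.svc
    namespace: monitoring
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
```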
Ideally there would be a feature that allows CRDs shared by different apps to be reconciled without looping. When loops do occur, there should be a limit on the number of reconciliation attempts, or a backoff, to prevent the impact on the control plane.
Since you have two different applications managing the same CRDs, I would highly recommend creating a third application that deploys only the CRDs. The same applies to any cluster-scoped resources that would conflict between two applications. This way, you can manage the update lifecycle of the CRDs independently of the two different controllers.
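A minimal sketch of that pattern, assuming a CRD-only chart such as prometheus-community's prometheus-operator-crds as the source (a plain Git directory of CRD manifests works the same way); the chart name, repo URL, and version are assumptions, not a prescribed setup:

```yaml
# Sketch: a dedicated CRD-only Application, so neither the Prometheus
# nor the Loki app owns the shared CRDs.
# Chart name, repoURL, and targetRevision are illustrative assumptions.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: monitoring-crds
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://prometheus-community.github.io/helm-charts
    chart: prometheus-operator-crds  # assumed CRD-only chart
    targetRevision: 9.0.0  # placeholder version
  destination:
    server: https://kubernetes.default.svc
  syncPolicy:
    automated:
      selfHeal: true
    syncOptions:
      - ServerSideApply=true  # large CRDs can exceed client-side apply's annotation size limit
```

Both the Prometheus and Loki applications would then set skipCrds: true permanently, and only this application upgrades the CRDs.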