Summary
When a project or application is deleted, its series are not removed from the Prometheus metrics. They stay cached until the ArgoCD application controller is next restarted, for example by a rolling update.
Motivation
We use ArgoCD in our QA environment with temporary projects and associated applications.
Each time we create a custom branch on a repository, the corresponding resources are created in ArgoCD to bootstrap a dedicated QA environment. However, when the ArgoCD resources are deleted, their metrics are kept in Prometheus.
We have about 2000 to 4000 applications, and we delete and recreate applications about 500 times every day.
The Prometheus /metrics endpoint sometimes times out after 10 seconds because of the large number of metrics. The stale series also put pressure on metrics retention.
Proposal
I see three ways of addressing this:
1. from time to time within the app, resetting metrics with the Prometheus client
2. use a specific API endpoint to allow resetting the Prometheus metrics
3. add support on app/project deletion to also remove the associated metrics
There are probably other solutions as well. In the meantime, we mitigate the problem with a very naive workaround: a scheduled cronjob deletes the argocd-application-controller pod every night.
> from time to time within the app, resetting metrics with the Prometheus client
I spoke with @alexmt, and of the three proposals, Option 1 seems the best way. Option 2 seems like unnecessary integration effort, and Option 3 would be unreliable because it is easy to miss application delete events, so you end up having to implement some form of Option 1 anyway.
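To illustrate why a missed delete event makes Option 3 unreliable, here is a minimal sketch that uses only the Go standard library. The `appGauge` type and its methods are hypothetical stand-ins for a labelled gauge and are not the Argo CD or Prometheus client code. With the real Go Prometheus client, metric vectors such as `GaugeVec` expose a `Reset` method that drops all series. The sketch repopulates the series immediately for brevity, whereas a real controller would presumably set them again on its next reconciliation.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// appGauge is a hypothetical stand-in for a labelled gauge:
// one series per application name.
type appGauge struct {
	mu     sync.Mutex
	series map[string]float64
}

func newAppGauge() *appGauge {
	return &appGauge{series: make(map[string]float64)}
}

// Set creates or updates the series for one application.
func (g *appGauge) Set(app string, v float64) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.series[app] = v
}

// Delete removes one series. This models Option 3, which only works
// if a delete event is actually observed for every application.
func (g *appGauge) Delete(app string) {
	g.mu.Lock()
	defer g.mu.Unlock()
	delete(g.series, app)
}

// ResetTo drops every series and recreates one per live application.
// This models Option 1: stale series disappear even if events were missed.
func (g *appGauge) ResetTo(live []string) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.series = make(map[string]float64, len(live))
	for _, app := range live {
		g.series[app] = 1
	}
}

// Names returns the application names that currently have a series, sorted.
func (g *appGauge) Names() []string {
	g.mu.Lock()
	defer g.mu.Unlock()
	names := make([]string, 0, len(g.series))
	for app := range g.series {
		names = append(names, app)
	}
	sort.Strings(names)
	return names
}

func main() {
	g := newAppGauge()
	for _, app := range []string{"qa-branch-a", "qa-branch-b", "qa-branch-c"} {
		g.Set(app, 1)
	}

	// Both qa-branch-a and qa-branch-b were deleted, but only the
	// event for qa-branch-a was observed.
	g.Delete("qa-branch-a")
	fmt.Println("after event-driven delete:", g.Names())

	// A periodic reset against the live application list removes the leftover.
	g.ResetTo([]string{"qa-branch-c"})
	fmt.Println("after reset:", g.Names())
}
```

The first line printed still contains `qa-branch-b`, the application whose delete event was missed. After the reset only `qa-branch-c` remains.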
@victorboissiere would you like to contribute this change? We may not get around to this for v1.9.
@jessesuen thanks for the feedback. I saw that a cron package is already used for the sync window.
I'll try to reuse it to reset the Prometheus metrics every 24 hours. I'll submit the PR following the contribution guide in the documentation.