Support and document using HPA for repo-server #2559

Open
alexmt opened this issue Oct 24, 2019 · 5 comments
Labels
- component:config-management: Tools specific issues (helm, kustomize etc)
- enhancement: New feature or request
- type:supportability: Enhancements that help operators to run Argo CD

Comments

alexmt (Collaborator) commented Oct 24, 2019

Summary

Provide the ability to automate repo-server auto-scaling using an HPA.

Motivation

The repo server needs to be scaled up when Argo CD manages too many applications or when a lot of applications are defined in the same repo. In both cases, manifest generation takes too long and app reconciliation becomes slow.

Proposal

- Add a gauge Prometheus metric which represents the number of pending manifest requests.
- Add a sample HPA configuration which auto-scales the repo-server when the number of pending manifest requests is too high (a hedged sketch follows below).
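
For illustration, here is a minimal sketch of what such a sample configuration could look like, assuming the gauge ends up published as `argocd_repo_pending_request_total` and exposed to the HPA through a custom-metrics adapter such as prometheus-adapter; the namespace, replica bounds, and threshold are illustrative assumptions, not decisions from this issue:

```yaml
# Hypothetical sample HPA (not the final design from this issue):
# scale argocd-repo-server on the proposed pending-requests gauge,
# assuming a custom-metrics adapter exposes it as a per-pod metric.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: argocd-repo-server
  namespace: argocd
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: argocd-repo-server
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: argocd_repo_pending_request_total
      target:
        type: AverageValue
        averageValue: "3"  # add replicas once pods average more than ~3 pending requests
```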

alexmt added the enhancement label Oct 24, 2019
alexec (Contributor) commented Oct 28, 2019

One more job for #2468?

alexmt pushed a commit to alexmt/argo-cd that referenced this issue Nov 8, 2019
alexmt pushed commits that referenced this issue Nov 8, 2019
jannfis added the component:config-management and type:supportability labels May 14, 2020
maxbrunet (Contributor) commented:

So I have used the gauge and came up with this KEDA ScaledObject:

```yaml
apiVersion: keda.k8s.io/v1alpha1
kind: ScaledObject
metadata:
  name: argocd-repo-server
spec:
  scaleTargetRef:
    deploymentName: argocd-repo-server
  maxReplicaCount: 30
  minReplicaCount: 3
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus-k8s.monitoring.svc.cluster.local:9090
      metricName: argocd_repo_pending_request_total
      query: avg(sum(argocd_repo_pending_request_total{namespace="argocd", job="argocd-repo-server"}) by (instance))
      threshold: '3'
```

But something bothers me: I think scale-up is triggered too late. By then, all the requests are already queued on the existing repo-server replicas, so the added replicas can only process subsequent requests, which may not arrive for some time; we then scale back down, and the scale-up was pointless.

Here is how sum(argocd_repo_pending_request_total) graphs for us:

[screenshot: graph of sum(argocd_repo_pending_request_total) over time]

It is mostly spikes of ~50-60 requests. Have you considered using a work queue? Maybe Redis could be used as a FIFO queue and to pass the manifests (the controller would read directly from what is currently the cache)?
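
One partial mitigation for the flapping described above (an illustrative sketch, not something settled in this thread) is to slow down scale-down so replicas added for one spike are still around for the next. If the HPA is driven directly from the custom metric rather than through KEDA v1, an autoscaling/v2 behavior stanza can do this; all values below are assumptions:

```yaml
# Illustrative scale-down damping for the HPA sketch above. This does not
# fix the underlying issue that already-queued requests cannot be handed
# to new replicas; it only keeps spare capacity around longer.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: argocd-repo-server
  namespace: argocd
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: argocd-repo-server
  minReplicas: 3
  maxReplicas: 30
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 600  # wait 10 minutes of low load before shrinking
      policies:
      - type: Pods
        value: 1
        periodSeconds: 120             # then shed at most one replica every 2 minutes
  metrics:
  - type: Pods
    pods:
      metric:
        name: argocd_repo_pending_request_total
      target:
        type: AverageValue
        averageValue: "3"
```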

musabmasood commented:

We're seeing these CPU spikes as well; something should definitely buffer them out. I haven't tried scaling up the repo server much yet, but from your comment it seems it wouldn't really help.

maxbrunet (Contributor) commented:

For now, we are using a Cron scaler for business hours:

```yaml
apiVersion: keda.k8s.io/v1alpha1
kind: ScaledObject
metadata:
  name: argocd-repo-server
spec:
  scaleTargetRef:
    deploymentName: argocd-repo-server
  minReplicaCount: 3
  triggers:
  - type: cron
    metadata:
      timezone: America/Toronto
      start: 0 9 * * 1-5
      end: 0 18 * * 1-5
      desiredReplicas: "14"
```
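
(For context: with KEDA's cron trigger, the deployment is held at desiredReplicas between the start and end schedules and, if I understand the scaler correctly, falls back to minReplicaCount outside them, so this trades metric-driven reactivity for predictable business-hours capacity. The schedule and replica counts above are of course site-specific.)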

On Slack, Alexander also pointed out another area of improvement: the repo-server should reuse the cloned Git repositories; right now it clones them on each request.

PatTheSilent commented:

@maxbrunet I can imagine that could lead to problems if repo cleanups aren't implemented properly. Because of how the repo-server currently works, it's not a problem at all to use force-pushing, floating tags, or custom plugins that may create new files or change existing ones during manifest generation (for example, decrypting secrets encrypted with SOPS).
