Hi,
In short, the bug happens when an operator-managed VMAgent runs as a StatefulSet and has HPA enabled (a recent feature). When running as a Deployment, everything works correctly: the operator knows not to sync the replica count and just lets HPA do its job. However, when running as a StatefulSet, both the operator and HPA try to control the replica count. This results in pods being created and deleted constantly, and HPA is unable to scale with the workload.
Operator version: 0.62.1 (Helm chart version), v0.69.0 (app version)
VMAgent image version: v1.136.0
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMAgent
metadata:
  name: dummy
spec:
  replicaCount: 1
  hpa:
    minReplicas: 2
    maxReplicas: 3
    metrics:
      - type: Resource
        resource:
          name: cpu
          target:
            type: Utilization
            averageUtilization: 80
  statefulMode: true
Operator logs:
{"level":"info","ts":"2026-05-19T10:22:24Z","logger":"controller.VMAgent","msg":"updating Statefulset name=vmcluster/vmagent-dummy, is_prev_nil=false","vmagent":"dummy","namespace":"vmcluster","spec_diff":{"spec.replicas":{"--":2,"++":1}}}
{"level":"info","ts":"2026-05-19T10:22:39Z","logger":"controller.VMAgent","msg":"updating Statefulset name=vmcluster/vmagent-dummy, is_prev_nil=false","vmagent":"dummy","namespace":"vmcluster","spec_diff":{"spec.replicas":{"--":2,"++":1}}}
{"level":"info","ts":"2026-05-19T10:22:54Z","logger":"controller.VMAgent","msg":"updating Statefulset name=vmcluster/vmagent-dummy, is_prev_nil=false","vmagent":"dummy","namespace":"vmcluster","spec_diff":{"spec.replicas":{"--":2,"++":1}}}
HPA events:
Normal SuccessfulRescale 2m53s (x27 over 19m) horizontal-pod-autoscaler New size: 2; reason: Current number of replicas below Spec.MinReplicas
In addition to fixing the bug, would it be possible to add an option to disable replicaCount sync by the operator?
It looks like the fix is to have this code look something like this code.
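To illustrate the behavior I would expect, here is a minimal sketch in Go. It is not the operator's actual code: `agentSpec`, `hpaSpec`, and `desiredReplicas` are hypothetical names made up for this example. The idea is that when HPA is configured, the reconcile step keeps whatever replica count is already on the live StatefulSet instead of overwriting it with `replicaCount`.

```go
package main

// All names below are hypothetical and only illustrate the idea;
// they are not the operator's real types or functions.

// hpaSpec mirrors the two HPA bounds used in the VMAgent example above.
type hpaSpec struct {
	MinReplicas int32
	MaxReplicas int32
}

// agentSpec holds only the fields relevant to replica handling.
type agentSpec struct {
	ReplicaCount *int32
	HPA          *hpaSpec
}

// desiredReplicas returns the value to write into spec.replicas of the
// StatefulSet. live is the replica count on the existing object, or nil
// if the object has not been created yet.
func desiredReplicas(spec agentSpec, live *int32) *int32 {
	if spec.HPA == nil {
		// No HPA: the operator owns the replica count.
		return spec.ReplicaCount
	}
	if live != nil {
		// HPA is enabled: keep what HPA has set, do not revert it.
		return live
	}
	// First creation: start from the configured count and let HPA adjust.
	return spec.ReplicaCount
}
```

With the spec from this report (`replicaCount: 1`, `minReplicas: 2`), a check like this would leave the StatefulSet at 2 replicas instead of producing the repeated `2 -> 1` diff seen in the operator logs.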