Load between controllers (argocd-application-controller) is not evenly distributed #6125
I think the load is distributed/sharded by cluster, so this might be more of an improvement request than an actual bug.
Having an option to shard by something other than cluster would be much appreciated. Because I really dislike having single points of failure, I run an ArgoCD stack in each of my clusters, and due to this limitation I can only vertically scale the Application Controller, which is far from ideal.
Use case: we deploy a number of QA/Staging/Preview environments for one product in a single cluster to save costs. In addition to having many more environments (ArgoCD Applications), these environments are much higher churn, being updated anywhere from a few times a day to many times per hour at peak. Ideally applications would be evenly distributed across shards to balance load, so we aren't stuck over-allocating resources to underutilized shards.

Update 8/12: ArgoCD 2.1.0-rc3 reduced resource use significantly, but the issue of highly imbalanced shards remains.
I have written a script that parses the cluster stats: https://gist.github.com/maxbrunet/373374690b5064203e5e714de97d37fa. The script currently works offline, but we could imagine integrating such logic into the application controller, or building a dedicated service around it; that version would likely read the cluster stats from Redis directly. The algorithm is not perfect, as I note in the caveats.
We have the same use case as well, where we usually have 3k to 6k ArgoCD applications in our staging environment. The fact that the sharding is on a per-cluster basis instead of per-app is not helping much, because we deploy everything into the same cluster (in staging).
We have the same issue with load: only one shard is doing work, and under high load it fails while the others do not pick up tasks.
Same scenario here: we have a very heavy application deployed on a single cluster, and adding another application controller replica does not distribute the load evenly.
I have the same issue. |
ArgoCD 2.6.3 still has the same issue.
How are you guys working around this?
A new feature in 2.8 attempts to mitigate the issue by letting folks decide their own cluster placement algorithms: #13018. There's also ongoing work on a dynamic, runtime rebalancing mechanism: #13221. Fact is, if you have 3 clusters over 3 shards, and one cluster is super active while the others are inactive, the imbalance is inevitable. There's no current effort to split the load of a single cluster across multiple shards. But if you're managing 30 clusters, and 3 of them are "hot," the new balancing algorithms might help place those 3 clusters on different shards so that overall load is more evenly placed.
This is clearly stated, but it's sad news. Having one "hot" cluster very often overloads the single application-controller replica handling it, while the other replicas are idle. Scaling up resources for that single application-controller replica will also beef up the other replicas, since each replica has the same resource request. It would be great to balance the load of a single cluster, other than dedicating a completely separate ArgoCD installation to it / to each "hot" cluster.
There are efforts and features which may be able to "cool down" a cluster by cutting out unnecessary work. But splitting a single cluster across multiple shards will require deep knowledge of core Argo CD code and a proposal describing how that work could be effectively split. I expect that efforts to minimize unnecessary work on hot clusters will take priority, at least in the short- to medium-term. |
This could be done via hash-mod sharding on the cluster+app name string. This is similar to what NoSQL systems have been doing for well over a decade to figure out whether they should manage a shard of data, without needing negotiation with other nodes. It's pretty easy in terms of code, too: each application controller can compute the same hash assignment in about a second (it doesn't require an expensive cryptographic algorithm), knows its own zero-indexed ID, and manages only the apps whose hash-mod of cluster + app name matches that ID.

The assignment should be cached and only needs to be recalculated if the number of replicas changes, so checking the modulus value in use against the replica count requested in the StatefulSet once a minute is trivially cheap. Note that you may want to switch the application controllers from Deployments to StatefulSets so that you get the zero-indexed ID for free: it's just the suffix of the pod name.
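A minimal sketch of the hash-mod idea above (illustrative only: the function names are hypothetical, CRC32 stands in for any cheap non-cryptographic hash, and this is not Argo CD's actual sharding code, which assigns whole clusters):

```python
import zlib

def shard_for(cluster: str, app: str, replicas: int) -> int:
    """Deterministically map an app to a shard by hashing cluster+app name.

    Every replica computes the same assignment with no coordination."""
    return zlib.crc32(f"{cluster}/{app}".encode()) % replicas

def my_apps(apps, my_ordinal: int, replicas: int):
    """Return only the (cluster, app) pairs this replica should manage.

    my_ordinal is the replica's zero-indexed ID, e.g. the StatefulSet
    pod-name suffix."""
    return [(c, a) for c, a in apps if shard_for(c, a, replicas) == my_ordinal]
```

Since every replica filters the full app list the same way, the shards partition the apps exactly once, with no negotiation between controllers.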
Yes, assigning applications to different controllers is easy. Actually reducing controller workload in the process is hard. Much of the work on each controller has to do with maintaining a "cluster cache" for each managed cluster. Maintaining that cache has nothing (directly) to do with applications. By splitting applications targeting a single cluster across multiple controllers, you duplicate the work of maintaining that cluster cache across the controllers. So the problem isn't as easy as "spread the applications." It's "spread the applications in a way that significantly reduces each controller's workload." |
@crenshaw-dev ah that explains a lot, thanks. What is the cluster cache, a cached dump of all live objects on the cluster to compare application manifests to? Since most applications are in their own specific namespace, sharding by application could allow for the cluster cache to be optimized to only contain the namespaces for objects in those apps in most cases, therefore potentially reducing the cluster cache each application controller has to maintain and reconcile against? |
Yep, exactly!
I do like that idea.... dynamically establishing and dropping watches as the set of managed namespaces changes would require some overhead (both in terms of code and processing), but it would be possible. I think you still hit the problem of there being significant overlap and of "hot" namespaces - i.e. when one namespace accounts for 95% of the controller load. I think the time spent building this highly-dynamic system is probably better spent just designing and implementing an agent model, like Red Hat's or Akuity's. That lets you scale on a per-cluster basis by scaling the controller which is already dedicated to that cluster. |
If people are putting everything into one namespace, per-namespace sharding won't help them. For bespoke apps, people may put many applications into the same namespace out of laziness, but as they grow it's a simple doc to tell them not to do that, for performance reasons. Yes, an agent-based approach might make it easier to offload work from the application controllers to each cluster; this would be similar to sharding at the cluster level at first. Perhaps then a second level of sharding over multiple agent pods within the cluster, sharding on app names within the agent replicas?
Not always... it depends on how heterogeneous the apps are. For example, I might have a simple "hello world" API sharing a namespace with a much heavier application. Not saying this is common, just that it's not entirely uncommon, nor entirely the fault of the app designer.
Yep! You'd get agents tuned on a per-cluster basis, and then you should shard within that cluster using any of the existing sharding techniques or new ones. But it significantly limits the problem space, now that you're dealing with just one cluster. |
I think if you have a large number of apps then a random hash-mod sharding distribution within each cluster agent should on average level out a mix of large and small apps between different agent pods. Statistically the more apps the more the spread should even out due to the natural bell curve distribution, and since this scaling problem is going to be caused by more apps, this should be fine in practice. I guess we'll see when it's implemented! |
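A quick simulation of that statistical claim (a sketch under an assumed lognormal per-app load distribution, not measured Argo CD data):

```python
import random
import zlib

def imbalance(num_apps: int, replicas: int = 3, seed: int = 42) -> float:
    """Hash-assign apps with random weights to shards and return the
    heaviest-to-lightest shard load ratio (1.0 = perfectly even).

    Assumes enough apps that every shard receives at least one."""
    rng = random.Random(seed)
    load = [0.0] * replicas
    for i in range(num_apps):
        weight = rng.lognormvariate(0, 1)  # mix of small and large apps
        load[zlib.crc32(f"app-{i}".encode()) % replicas] += weight
    return max(load) / min(load)

# The ratio tends toward 1.0 as the number of apps grows.
```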
@lukaspj in my case the workload is quite distributed, but definitely not ideal yet...
Did you configure anything to achieve that or is it just out of the box? We're trying to figure out if we're hit by this or if it's because our architecture isn't optimal use of Argo (AppSet creating an App creating an App of Apps). |
@lukaspj, I might be wrong but, from memory, the last time we investigated this in my team, we found that each cluster is assigned to exactly one application-controller. If you have 2 application-controllers and 2 clusters with the same number/size/weight of apps, you will probably have no issues. If you have 2 application-controllers and multiple clusters with a different number/size/weight of apps per cluster, then it's impossible to have them evenly distributed among the controllers.
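To illustrate why whole-cluster assignment can never even out a hot cluster, here is a small brute-force sketch (the load numbers are hypothetical: one cluster carrying 90% of the work):

```python
from itertools import product

def best_split(cluster_loads, replicas):
    """Brute-force the most even assignment of whole clusters to
    controllers; a cluster cannot be split across controllers.
    Exponential in the number of clusters -- illustration only."""
    best_spread, best_shards = None, None
    for assign in product(range(replicas), repeat=len(cluster_loads)):
        shards = [0.0] * replicas
        for load, s in zip(cluster_loads, assign):
            shards[s] += load
        spread = max(shards) - min(shards)
        if best_spread is None or spread < best_spread:
            best_spread, best_shards = spread, shards
    return best_shards

print(best_split([90, 5, 5], 2))  # [90.0, 10.0] -- the best you can do
```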
@lukaspj unfortunately @Linutux42 is right... Anyway, I have 4 clusters, 4 app-controller replicas, and the `ARGOCD_CONTROLLER_REPLICAS` env-var set to match.
Ah yeah, we are running on factories, so we have to use ArgoCD in a pull-based fashion: each cluster has its own ArgoCD instance that deploys to the local cluster. Which is a shame, because we would like to distribute the workload a bit more, especially because of issues where a single node dies in a way where the pod reports healthy but is unable to perform its workloads, so failover to the standby might not happen as quickly as we'd like. Thanks for the input though! It's incredibly valuable for us.
I wonder if we could in theory load balance this by creating many cluster configs to the same local cluster. |
@rouke-broersma maybe such trick would work... if you test it, please let us know! |
Just did a quick test, and it seems that ArgoCD uses the cluster URL as a unique attribute; when I created a new cluster using the in-cluster config URL, it just merged them. To make something like this work, it would need at least a few more Services pointing towards the api-server, or maybe a cluster-external URL/IP to connect to.
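For anyone who wants to experiment: Argo CD registers clusters via labeled Secrets, so a sketch of a second entry for the same cluster might look like the following (the alternate `server` URL here is an assumption; it just has to differ as a string while still resolving to the same API server):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: in-cluster-shard-b
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: in-cluster-shard-b
  # Fully-qualified form of the in-cluster API server address, so it
  # differs from the default https://kubernetes.default.svc entry.
  server: https://kubernetes.default.svc.cluster.local
  config: |
    {
      "tlsClientConfig": {
        "insecure": false
      }
    }
```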
From 2.8.0 and later releases we can use the round-robin sharding algorithm. How to configure the Argo CD Application Controller to use it:
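The algorithm is selected via the `controller.sharding.algorithm` key in the `argocd-cmd-params-cm` ConfigMap, e.g.:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cmd-params-cm
  namespace: argocd
data:
  # "legacy" (default) shards by cluster ID; "round-robin" distributes
  # clusters evenly across controller replicas.
  controller.sharding.algorithm: round-robin
```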
After updating the ConfigMap successfully, restart the Argo CD Application Controller StatefulSet using the following command:
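For example (assuming the default `argocd` namespace and StatefulSet name):

```shell
kubectl -n argocd rollout restart statefulset argocd-application-controller
```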
Now, to verify that the Argo CD Application Controller is using a round-robin sharding algorithm, run the following command:
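One way to check is to grep a controller replica's logs for the sharding setting (the exact log wording may vary between Argo CD versions):

```shell
kubectl -n argocd logs argocd-application-controller-0 | grep -i "round-robin"
```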
Describe the bug
I have an ArgoCD High Availability setup where I have also scaled the number of replicas of the `argocd-application-controller` StatefulSet, as shown in the documentation.

To Reproduce
Scale the `argocd-application-controller` as below:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: argocd-application-controller
spec:
  replicas: 3
  template:
    spec:
      containers:
        - name: argocd-application-controller
          env:
            - name: ARGOCD_CONTROLLER_REPLICAS
              value: "3"
```
Expected behavior
I was expecting the controller to distribute the load across all three replicas, but only one took up all the load; the other two sit idle.
Screenshots
All pods running in HA mode
Screenshot of the pods resources
![2021-04-29_15-46-09](https://user-images.githubusercontent.com/6998650/116521943-f124e100-a917-11eb-9b2a-ec44452dca03.png)
Version