
Parallelize ApplicationSet Reconciliation Loop #10952

Closed
rumstead opened this issue Oct 14, 2022 · 30 comments · Fixed by #12714
Labels
component:applications-set Bulk application management related enhancement New feature or request type:scalability Issues related to scalability and performance related issues

Comments

@rumstead
Contributor

rumstead commented Oct 14, 2022

Summary

The ApplicationSet controller appears to perform most, if not all, of its tasks sequentially. This can cause slow ApplicationSet reconciliation when the controller is stuck waiting on slow tasks, like network IO from a Git generator.

For instance,

time="2022-10-14T13:49:05Z" level=info msg=Trace args="[git fetch origin HEAD --tags --force]" dir=/tmp/https_super_secret_repo="exec git" time_ms=3122.340543

Motivation

Speeding up the reconciliation loop will allow ApplicationSet changes to be applied faster, providing a quicker feedback loop.

I have an appset controller managing about 100 appsets, generating about 1800 applications. I am seeing large latency between a new ApplicationSet being deployed and the controller picking it up. I use the app-of-apps pattern for deploying the ApplicationSets. An example timeline:

  1. PR merge for a new appset
  2. About 2 minutes later, Argo CD syncs the appset via polling
  3. 14 minutes later, the appset controller picks up the new appset

I am on version 2.4.14.

Proposal

I haven't gotten this far yet, but some sort of worker/consumer pattern :).

Another idea would be to integrate Git interactions with the repo-server.

@rumstead rumstead added the enhancement New feature or request label Oct 14, 2022
@crenshaw-dev crenshaw-dev added type:scalability Issues related to scalability and performance related issues component:applications-set Bulk application management related labels Oct 14, 2022
@hcelaloner
Contributor

Hi, we are also experiencing slowness on the ApplicationSet controller side when the number of ApplicationSets increases. We opened #9002 and tried to bring a solution with #9568. I am not fully familiar with the code base, but as mentioned in #9002, the Application controller has some parallelism and sharding support, while the ApplicationSet controller has neither. We could bring the same options to the ApplicationSet controller.

By the way, I tried to perform a small experiment about sharding. It may not be an ideal experiment, but it can provide some insights.

In an Argo CD instance at my company, I created ~807 ApplicationSets and ~1614 Applications (2 apps generated by each ApplicationSet). With one pod running for the ApplicationSet controller, I added the 808th ApplicationSet. It took approximately 30 minutes to create its apps; hence, the 808th deployment took 30 minutes. In the meantime, the ApplicationSet controller's "workqueue_depth" metric was always high.

Then I applied the sharding solution with a custom build, deployed with 9 replicas, and added the 809th ApplicationSet. It took approximately 1 minute, and the "workqueue_depth" metric was zero for each replica.

@rumstead
Contributor Author

Thanks for sharing @hcelaloner! Did you determine why your applicationset controller was taking so long to process the application sets? I.e., network IO?

@rumstead
Contributor Author

@crenshaw-dev, should we look to make the git requests concurrent first? I am happy to submit a PR/work with @hcelaloner if we have an agreement on the next steps.

@crenshaw-dev
Collaborator

@rumstead I guess it depends on how the git requests are made concurrent. Making them concurrent per-ApplicationSet would probably be relatively easy, but it would only be helpful for ApplicationSets which make multiple slow git requests.

Sharding is kind of on the other end of the spectrum, parallelizing at the highest level possible. It solves the problem, but at the cost of spinning up a lot of pods.

I feel like the best solution is going to be to figure out why the work queue depth is always so big. It looks to me as if the work queue is meant to be processable in parallel, and we simply aren't doing that now.

@jgwest since you know the history of this controller, do you know if parallelizing reconciliation was a TBD that just hasn't been done yet?

@rumstead
Contributor Author

rumstead commented Oct 17, 2022

MaxConcurrentReconciles looks like it can be leveraged here.

https://pkg.go.dev/sigs.k8s.io/controller-runtime/pkg/controller#Options

return ctrl.NewControllerManagedBy(mgr).
	WithOptions(controller.Options{
		MaxConcurrentReconciles: n, // some # bigger than 1 :)
	}).
	Complete(r)

@hcelaloner
Contributor

Thanks for sharing @hcelaloner! Did you determine why your applicationset controller was taking so long to process the application sets? I.e., network IO?

No, I did not. Is there a way to investigate that? I guess the applicationset controller has some metrics exposed by controller-runtime, but it does not expose any metrics for git operations as far as I know. As discussed in the old issue, I guess an OpenTelemetry integration could be useful for observability.

MaxConcurrentReconciles looks like it can be leveraged here.

https://pkg.go.dev/sigs.k8s.io/controller-runtime/pkg/controller#Options


I am not too familiar with the internals of controller-runtime. Does this option cause any concurrency problems? For example, if we set it to 2, will two goroutines try to reconcile the same ApplicationSet, and will it create some sort of race condition? Is a code change needed to handle such a possibility?

If there is no such race condition possibility, I guess we can just make it configurable via a flag, somewhat similar to the "--operation-processors" or "--status-processors" options in the Argo CD application controller. Basically, users who need to increase the number of concurrent reconciles can tune it using the flag (or modify a configmap that holds the configuration for the applicationset controller).
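
For illustration, a minimal sketch of what that flag wiring might look like. The flag name --concurrent-reconciles is hypothetical (mirroring --operation-processors / --status-processors), and the ApplicationSet import path is assumed to be the one used by Argo CD v2:

package main

import (
	"flag"

	appsetv1alpha1 "github.com/argoproj/argo-cd/v2/pkg/apis/application/v1alpha1" // assumed import path
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/controller"
	"sigs.k8s.io/controller-runtime/pkg/reconcile"
)

// Hypothetical flag, not an existing Argo CD option.
var concurrentReconciles = flag.Int("concurrent-reconciles", 1,
	"number of ApplicationSets reconciled in parallel")

// setupAppSetController wires the flag into controller-runtime's
// MaxConcurrentReconciles option; flag.Parse() is assumed to run in main().
func setupAppSetController(mgr ctrl.Manager, r reconcile.Reconciler) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&appsetv1alpha1.ApplicationSet{}).
		WithOptions(controller.Options{
			MaxConcurrentReconciles: *concurrentReconciles,
		}).
		Complete(r)
}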

@hcelaloner
Contributor

hcelaloner commented Oct 18, 2022

I feel like the best solution is going to be to figure out why the work queue depth is always so big. It looks to me as if the work queue is meant to be processable in parallel, and we simply aren't doing that now.

As I said previously, I do not know the internals of controller-runtime, so sorry if my guess is wrong. However, could the following Owns statement cause unnecessary reconciles?

Owns(&argov1alpha1.Application{}).

Let me try to explain with an example. Say we have an ApplicationSet that generates an application. Deploying a new image to the application probably causes a change in its status, since the status stores its health, operation history, etc. Would that change in the application status trigger a reconcile of the owner ApplicationSet? If so, isn't it unnecessary?
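
If so, one possible mitigation (a sketch, not necessarily the right fix) would be to attach a controller-runtime predicate to that Owns call. GenerationChangedPredicate drops updates that only touch status, because status writes do not bump metadata.generation, while create and delete events still pass through:

// Requires sigs.k8s.io/controller-runtime/pkg/builder and pkg/predicate.
Owns(&argov1alpha1.Application{}, builder.WithPredicates(predicate.GenerationChangedPredicate{})).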

@rumstead
Contributor Author

Controllers should be idempotent. The ApplicationSet would drive any changes to the Applications, and the Reconcile loop would be called to update the Applications with their new image, or more specifically, their targetRevision. Owns sets the ownerReference for managing the Applications, and yes, updates to any owned Application would trigger a Reconcile.

From my understanding, the queue that backs the Reconcile objects is thread-safe, and increasing that value wouldn't cause race conditions. That is my understanding after reading the not-so-great godocs; additionally, the code backs it up.

@jgwest
Member

jgwest commented Oct 20, 2022

since you know the history of this controller, do you know if parallelizing reconciliation was a TBD that just hasn't been done yet?

I wasn't around for the early, early days of the ApplicationSet controller, but I don't recall much discussion of parallelizing it (or otherwise sharding multiple instances of the ApplicationSet controller), and when PRs were merged I don't recall much discussion around ensuring that contributed generators were thread-safe. When enabling multithreaded reconciliation, one needs to be careful about locking shared resources and preventing race conditions, deadlocks, etc. For example, ensuring that different goroutines each running Reconcile() (and calling shared generator objects) do not step on each other's toes.

@rumstead
Contributor Author

When enabling multithreaded reconciliation, one needs to be careful about locking shared resources and preventing race conditions, deadlocks, etc. For example, ensuring that different goroutines each running Reconcile() (and calling shared generator objects) do not step on each other's toes.

The /tmp directory where the git repositories are checked out is a prime example.
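
For illustration, per-repo locking could look something like this minimal sketch (illustrative names, not the controller's actual API): one mutex per repository URL, so concurrent Reconciles serialize only when they share a checkout directory.

package main

import "sync"

// repoLocks hands out one mutex per repository URL so that concurrent
// Reconciles only serialize when they share a /tmp checkout directory.
var (
	mu        sync.Mutex
	repoLocks = map[string]*sync.Mutex{}
)

func lockFor(repoURL string) *sync.Mutex {
	mu.Lock()
	defer mu.Unlock()
	l, ok := repoLocks[repoURL]
	if !ok {
		l = &sync.Mutex{}
		repoLocks[repoURL] = l
	}
	return l
}

// withRepoLock runs fn while holding the lock for repoURL.
func withRepoLock(repoURL string, fn func() error) error {
	l := lockFor(repoURL)
	l.Lock()
	defer l.Unlock()
	return fn()
}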

@rumstead
Contributor Author

@hcelaloner did you have to do anything special to see the workqueue_depth metric? I am on 2.4.14 and do not see that metric when I query the endpoint. I do see the controller-runtime metrics:

# TYPE controller_runtime_active_workers gauge
# TYPE controller_runtime_max_concurrent_reconciles gauge
# TYPE controller_runtime_reconcile_errors_total counter
# TYPE controller_runtime_reconcile_time_seconds histogram
# TYPE controller_runtime_reconcile_total counter
# TYPE go_gc_duration_seconds summary
# TYPE go_goroutines gauge
# TYPE go_info gauge
# TYPE go_memstats_alloc_bytes gauge
# TYPE go_memstats_alloc_bytes_total counter
# TYPE go_memstats_buck_hash_sys_bytes gauge
# TYPE go_memstats_frees_total counter
# TYPE go_memstats_gc_cpu_fraction gauge
# TYPE go_memstats_gc_sys_bytes gauge
# TYPE go_memstats_heap_alloc_bytes gauge
# TYPE go_memstats_heap_idle_bytes gauge
# TYPE go_memstats_heap_inuse_bytes gauge
# TYPE go_memstats_heap_objects gauge
# TYPE go_memstats_heap_released_bytes gauge
# TYPE go_memstats_heap_sys_bytes gauge
# TYPE go_memstats_last_gc_time_seconds gauge
# TYPE go_memstats_lookups_total counter
# TYPE go_memstats_mallocs_total counter
# TYPE go_memstats_mcache_inuse_bytes gauge
# TYPE go_memstats_mcache_sys_bytes gauge
# TYPE go_memstats_mspan_inuse_bytes gauge
# TYPE go_memstats_mspan_sys_bytes gauge
# TYPE go_memstats_next_gc_bytes gauge
# TYPE go_memstats_other_sys_bytes gauge
# TYPE go_memstats_stack_inuse_bytes gauge
# TYPE go_memstats_stack_sys_bytes gauge
# TYPE go_memstats_sys_bytes gauge
# TYPE go_threads gauge
# TYPE process_cpu_seconds_total counter
# TYPE process_max_fds gauge
# TYPE process_open_fds gauge
# TYPE process_resident_memory_bytes gauge
# TYPE process_start_time_seconds gauge
# TYPE process_virtual_memory_bytes gauge
# TYPE process_virtual_memory_max_bytes gauge
# TYPE rest_client_requests_total counter

@hcelaloner
Contributor

@hcelaloner did you have to do anything special to see the workqueue_depth metric? I am on 2.4.14 and do not see that metric when I query the endpoint. I do see the controller-runtime metrics.

No, I did not do anything special. I just applied the following ServiceMonitor in our deployments, and the metric was available in our Prometheus.

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    release: prometheus-operator
  name: argocd-applicationset-controller-metrics
spec:
  endpoints:
  - port: metrics
  selector:
    matchLabels:
      app.kubernetes.io/name: argocd-applicationset-controller

@rumstead
Contributor Author

rumstead commented Nov 2, 2022

@crenshaw-dev / @jgwest what are your thoughts on someone taking a stab at increasing MaxConcurrentReconciles and making sure the current generators are thread-safe?

I still think the repo server should probably be used for git operations in the future, but in the meantime /tmp/ can be locked, along with identifying other shared state.

@crenshaw-dev
Collaborator

@rumstead I'd be interested to know the results... the ApplicationSet controller is using a copy/pasta of old repo-server code, so it might have locking logic in place already.

@hcelaloner
Contributor

Hi everyone,

I wanted to share something with you since it may be something to consider on this subject. At our company, we are testing the sharding approach: we deployed 10 pods of the appset controller in one Argo CD instance to speed up appset processing.

Yesterday, this resulted in an increase in requests sent to the kube API server. Some master nodes momentarily became unavailable because of the load. Logs show that the requests target secrets ("argocd.argoproj.io/secret-type=cluster") and that the source is the applicationset controller. I did not check the source code in detail, but I suspect this line is the source of these requests:

clusterList, err := utils.ListClusters(ctx, r.KubeClientset, applicationSet.Namespace)

This may be something to consider when increasing parallelism. I do not know whether it is something to worry about or not. If it is, maybe the applicationset controller could maintain some sort of cluster cache to reduce the number of requests made for clusters?
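
For illustration, a sketch of that caching idea using client-go's shared informers (assumed machinery, not what the controller does today): back the cluster-secret lookup with an informer cache, so each reconcile lists from local memory instead of hitting the API server.

package main

import (
	"time"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/labels"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
)

// newClusterSecretLister backs the cluster-secret lookup with a shared
// informer so each reconcile lists from a local cache instead of the
// API server. A sketch of the idea above, not Argo CD code.
func newClusterSecretLister(cs kubernetes.Interface, ns string) func() ([]*corev1.Secret, error) {
	factory := informers.NewSharedInformerFactoryWithOptions(
		cs, 10*time.Minute, informers.WithNamespace(ns))
	lister := factory.Core().V1().Secrets().Lister().Secrets(ns)
	factory.Start(nil)            // real code should pass a stop channel
	factory.WaitForCacheSync(nil) // and wait with the same channel

	sel := labels.SelectorFromSet(labels.Set{
		"argocd.argoproj.io/secret-type": "cluster",
	})
	return func() ([]*corev1.Secret, error) {
		return lister.List(sel)
	}
}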

@rumstead
Contributor Author

rumstead commented Dec 21, 2022

Git Generator

The ApplicationSet controller creates a git client here. The git client creates a temporary directory named after the git repo URL and does all git operations under that directory.

Multiple Reconciles

It wouldn't be safe to run multiple Reconciles if multiple ApplicationSets were targeting the same repo with different revisions. The remaining generators use APIs and seem to be stateless.

Repo Server observations

When the repo server creates a git client, it creates a directory using a "hashing strategy", with the hash tied to the repo URL. Additionally, the repo server uses the Lock struct to handle concurrent actions.

$ ls -lrt /tmp/_argocd-repo/
drwxr-xr-x  7 argocd argocd 4096 Dec 17 15:44 250eb2cf-2448-4306-ac34-53add308aa37

vs the ApplicationSet controller

$ ls -lrt /tmp
drwxr-xr-x 4 argocd argocd 4096 Dec 17 06:00 https___foo@ev.azure.com_repo_two
drwxr-xr-x 7 argocd argocd 4096 Dec 17 06:00 https___foo@dev.azure.com_repo_one

The repo server also, obviously, uses a Redis cache. That more or less shelters against intermittent, slow IO requests once a repo is cached.
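
For illustration, the hashed-directory idea amounts to something like the below (a sketch; the repo server's actual normalization and naming scheme may differ):

package main

import (
	"crypto/sha256"
	"encoding/hex"
	"path/filepath"
)

// repoCheckoutPath derives a stable directory name from a hash of the repo
// URL instead of embedding the URL text in the path, as the appset
// controller does today.
func repoCheckoutPath(root, repoURL string) string {
	sum := sha256.Sum256([]byte(repoURL))
	return filepath.Join(root, hex.EncodeToString(sum[:8]))
}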

Existing, semi-beneficial locking strategy

We could wrap the git interactions with the below, though I don't know how it would work for git interactions that return data, like LsFiles (one possibility is sketched after the snippet).

	closer, err := a.repoLock.Lock(gitRepoClient.Root(), revision, false, func() (io.Closer, error) {
		return util_io.NopCloser, checkoutRepo(gitRepoClient, revision, a.submoduleEnabled)
	})
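
For the LsFiles case, one possibility (a sketch, assuming the same surrounding context as the snippet above, a hypothetical pattern argument, and an LsFiles method on the git client) is to capture the result in a variable declared outside the closure:

	var files []string
	closer, err := a.repoLock.Lock(gitRepoClient.Root(), revision, false, func() (io.Closer, error) {
		if err := checkoutRepo(gitRepoClient, revision, a.submoduleEnabled); err != nil {
			return nil, err
		}
		var lsErr error
		files, lsErr = gitRepoClient.LsFiles(pattern) // result escapes via the captured variable
		return util_io.NopCloser, lsErr
	})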

New, semi-beneficial locking strategy

#11881

I feel like there are two high-level bodies of work.

  1. Make generators thread-safe
    • git generator is certainly not
  2. Enable caching on generators that have IO operations
    • targeting the git generator first as there are known patterns there (a minimal cache sketch follows below)
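
For item 2, a minimal TTL-cache sketch keyed by repo URL and revision (hypothetical names, not Argo CD's API); reconciles that land within the TTL skip the network round trip:

package gencache

import (
	"sync"
	"time"
)

type entry struct {
	params  []map[string]interface{} // generated parameters
	fetched time.Time
}

// Cache is a minimal TTL cache for generator results, keyed by repo URL
// and revision.
type Cache struct {
	mu  sync.RWMutex
	ttl time.Duration
	m   map[string]entry
}

func New(ttl time.Duration) *Cache {
	return &Cache{ttl: ttl, m: map[string]entry{}}
}

func (c *Cache) Get(repoURL, revision string) ([]map[string]interface{}, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	e, ok := c.m[repoURL+"@"+revision]
	if !ok || time.Since(e.fetched) > c.ttl {
		return nil, false
	}
	return e.params, true
}

func (c *Cache) Put(repoURL, revision string, params []map[string]interface{}) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.m[repoURL+"@"+revision] = entry{params: params, fetched: time.Now()}
}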

Would love your thoughts @crenshaw-dev

@rumstead
Contributor Author

rumstead commented Feb 7, 2023

Continuing the investigation here: I found that we are being throttled by our Git provider, which makes the Git requests take even more time. I think this points to a need for caching in the Git-related generators.

@wanghong230
Member

The proper solution is to move to using the Repo Server instead of having another cache.
AppSets started as an ecosystem project, so it did all the git operations on its own.

@rumstead
Contributor Author

rumstead commented Feb 9, 2023

The proper solution is to move to using the Repo Server instead of having another cache. AppSets started as an ecosystem project, so it did all the git operations on its own.

Hard agree.

@juangb

juangb commented Feb 24, 2023

Integrating the ApplicationSet controller with the repo server would also provide better observability: the argocd_git_request_total metric could be used. Currently there is no visibility into the number of requests it makes to Git.

@rumstead
Contributor Author

rumstead commented Mar 3, 2023

I threw together a POC of having the applicationset controller use the repo server. I would want to add more caching and obviously need tests, but I wanted to see what folks thought.

#12714

EDIT: added some caching as well

@rumstead
Contributor Author

The PR is ready for review. E2E tests exercise the applicationset controller using the repo server as a cache, and unit tests have been updated.

@rumstead
Contributor Author

@ishitasequeira just wanted to see if you had any review capacity in the foreseeable future?

@ishitasequeira
Member

@rumstead Apologies for the delay. I will give it a review today.

@naikn

naikn commented Mar 27, 2023

I have a use case where I need to deploy over 10,000 ApplicationSets, all of them using Helm as a source. I faced some of the issues discussed here, which are being fixed in #12480. The redundant reconciliation was causing great delays, sometimes over 20 minutes to deploy a new ApplicationSet, sometimes even more depending on the type of generators being used. But with the fix, it comes down considerably. Thank you @rumstead

But I still see that adding MaxConcurrentReconciles as an option does help improve performance. I understand that it currently causes issues for git generators with the locking of shared resources, but what about the other generators: do they still have any locking issues? In my case, adding MaxConcurrentReconciles made it super quick.

@rumstead
Contributor Author

rumstead commented Mar 27, 2023

But I still see that adding MaxConcurrentReconciles as an option does help improve performance. I understand that it currently causes issues for git generators with the locking of shared resources, but what about the other generators: do they still have any locking issues? In my case, adding MaxConcurrentReconciles made it super quick.

I have been laser-focused on the git generator. However, I haven't seen anything in the other generators that would lead me to believe that concurrency in applicationset processing would cause any race conditions. I plan to increase the worker threads once this PR is merged as well.

IMO, my other PR which reduces the reconciles is going to have the largest impact on performance.

Out of curiosity, what value did you set MaxConcurrentReconciles to?

@naikn

naikn commented Mar 27, 2023

@rumstead Absolutely agree that your PR reducing the redundant reconciliation has improved performance greatly. I deployed a version of Argo CD built from your PR branch and saw deployment time go down from 20 minutes to about 2 minutes.

My setup had about 200 clusters with 10,000 appsets (targeting one cluster per appset) and 20 appsets deploying to all clusters, with around 10 deployments happening every few seconds, plus some forced failures on some apps causing them to go out of sync, triggering reconciles in the appset controller.

So the idea was not to keep the appset controller calm. When it was very calm, with your PR an appset was able to generate its apps in 1 second: pretty good! But when it was busier, with reconciliations already happening, processing everything sequentially meant I had to wait for my newly deployed appset to be picked up from the queue.

This is where MaxConcurrentReconciles had to be varied based on the situation. When there was less load on appsets, keeping it under 5-10 was fine for me. With that, I was able to get the apps generated in a matter of seconds for every new appset.

But when I added/deleted a cluster, with all appsets using the cluster generator, it was mayhem, as all 10,020 appsets got triggered!

Also, increasing MaxConcurrentReconciles did not improve the startup time of the appset controller on a cluster with 10,000 appsets.

@crenshaw-dev
Collaborator

@naikn excuse the slightly off-topic question, but I'm really curious: what's the use case requiring 10k appsets? 😄

@naikn

naikn commented Mar 27, 2023

Hi @crenshaw-dev, we have lots and lots of applications and lots of clusters 😄. As for the use case: it's mainly to use the generator to identify the clusters rather than setting the server URL in the Argo CD Application. Most appsets generate one application placed on a single cluster, but the app teams need not know where the cluster is hosted or what its URL is. The AppSet generator is a very good tool for this; it makes it easy to abstract cluster placement away from the app development teams.

@rumstead
Contributor Author

rumstead commented Mar 28, 2023

But when I added/deleted a cluster, with all appsets using the cluster generator, it was mayhem, as all 10,020 appsets got triggered!

Also, increasing MaxConcurrentReconciles did not improve the startup time of the appset controller on a cluster with 10,000 appsets.

@naikn yeah, I believe the cluster event handler requeues all appsets with a cluster generator, which in your case is all of them.

alexmt pushed a commit that referenced this issue Apr 28, 2023

feat(appset): applicationset controller use repo server (#10952) (#12714)

Signed-off-by: rumstead <37445536+rumstead@users.noreply.github.com>

yyzxw pushed a commit to yyzxw/argo-cd that referenced this issue Aug 9, 2023

feat(appset): applicationset controller use repo server (argoproj#10952) (argoproj#12714)

Signed-off-by: rumstead <37445536+rumstead@users.noreply.github.com>