Description
The member synchronization process is currently quite slow for our use case, and I would like to improve it.
Expected Behavior
Synchronization should be as fast as possible, preferably with a maximum wait time of 1 hour in the worst case.
Current Behavior
Currently synchronizing 100 000 members (60k add and 40k update) takes ~10 min.
Possible Solution
Running a profiler on the sync process shows that the majority of the sync time is spent executing database queries. The sync process seems to issue 4 queries per member update and 7 queries per member for remove and add.
I pushed a draft PR (#1184) in which I moved a couple of queries out of loops, do some of the checks in code instead, and group some of the queries in transactions. Here are the benchmarks I ran with the changes:
Add 20 000, remove 20 000, update 20 000 members
| | Before | After |
|---|---|---|
| Time (sec) | 263 | 67.2 |
| Calls to do_prepared_query | 240 032 | 120 032 |
Add 60 000, update 40 000 members
| | Before | After |
|---|---|---|
| Time (sec) | 569 | 141 |
| Calls to do_prepared_query | 580 032 | 260 032 |
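To illustrate the general idea behind these changes (one query hoisted out of the loop, membership checks done in code, writes grouped in a single transaction), here is a minimal sketch in Python using `sqlite3` as a stand-in database. It is not the code from the PR: the `subscriber` table, its columns, and the `sync_members` function are hypothetical names chosen for the example.

```python
import sqlite3


def sync_members(conn, list_name, source):
    """Sync one list against `source`, a dict of email -> display name.

    Hypothetical schema: subscriber(list, email, name).
    """
    # One query loads all current members, instead of one lookup per member.
    rows = conn.execute(
        "SELECT email, name FROM subscriber WHERE list = ?", (list_name,)
    ).fetchall()
    current = dict(rows)

    # Decide add / update / remove in code rather than with per-member queries.
    to_add = [(list_name, e, n) for e, n in source.items() if e not in current]
    to_update = [
        (n, list_name, e)
        for e, n in source.items()
        if e in current and current[e] != n
    ]
    to_remove = [(list_name, e) for e in current if e not in source]

    # Group all writes in one transaction: committed together, or rolled back
    # together if any statement fails.
    with conn:
        conn.executemany(
            "INSERT INTO subscriber (list, email, name) VALUES (?, ?, ?)", to_add
        )
        conn.executemany(
            "UPDATE subscriber SET name = ? WHERE list = ? AND email = ?", to_update
        )
        conn.executemany(
            "DELETE FROM subscriber WHERE list = ? AND email = ?", to_remove
        )
    return len(to_add), len(to_update), len(to_remove)


# Small demo with an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE subscriber (list TEXT, email TEXT, name TEXT,"
    " PRIMARY KEY (list, email))"
)
conn.executemany(
    "INSERT INTO subscriber VALUES (?, ?, ?)",
    [("news", "a@example.org", "A"), ("news", "b@example.org", "B")],
)
conn.commit()

result = sync_members(
    conn, "news", {"b@example.org": "Bee", "c@example.org": "C"}
)
print(result)  # (added, updated, removed)
```

With this shape, the number of read queries no longer grows with the number of members, and the writes share a single commit rather than paying commit overhead per member.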
Context
We need to synchronize around 4 million memberships in total across all lists. We would like to keep our list memberships in sync with our data source as closely as possible while minimizing the wait time for each sync.