
Synapse is hammering the database with user directory updates #7154

Closed
turt2live opened this issue Mar 26, 2020 · 8 comments

@turt2live (Member) commented Mar 26, 2020

After upgrading from 1.11.0 to 1.12.0 the following is appearing quite a lot in the logs:

homeserver_1 - 2020-03-26 20:41:32,656 - synapse.storage.database - 418 - WARNING - user_directory.notify_new_event-1077 - [TXN OPERROR] {update_user_directory_stream_pos-27fe} could not serialize access due to concurrent update
homeserver_1 - 2020-03-26 20:41:32,659 - synapse.storage.database - 418 - WARNING - user_directory.notify_new_event-1077 - [TXN OPERROR] {update_user_directory_stream_pos-27fe} could not serialize access due to concurrent update
homeserver_1 - 2020-03-26 20:41:32,664 - synapse.storage.database - 418 - WARNING - user_directory.notify_new_event-1077 - [TXN OPERROR] {update_user_directory_stream_pos-27fe} could not serialize access due to concurrent update
homeserver_1 - 2020-03-26 20:41:47,025 - synapse.storage.database - 418 - WARNING - user_directory.notify_new_event-1149 - [TXN OPERROR] {update_user_directory_stream_pos-29ed} could not serialize access due to concurrent update
homeserver_1 - 2020-03-26 20:41:47,576 - synapse.storage.database - 418 - WARNING - user_directory.notify_new_event-1154 - [TXN OPERROR] {update_user_directory_stream_pos-2a33} could not serialize access due to concurrent update
homeserver_1 - 2020-03-26 20:41:47,579 - synapse.storage.database - 418 - WARNING - user_directory.notify_new_event-1154 - [TXN OPERROR] {update_user_directory_stream_pos-2a33} could not serialize access due to concurrent update
homeserver_1 - 2020-03-26 20:41:52,060 - synapse.storage.database - 418 - WARNING - user_directory.notify_new_event-1163 - [TXN OPERROR] {update_user_directory_stream_pos-2a81} could not serialize access due to concurrent update
homeserver_1 - 2020-03-26 20:41:52,078 - synapse.storage.database - 418 - WARNING - user_directory.notify_new_event-1163 - [TXN OPERROR] {update_user_directory_stream_pos-2a81} could not serialize access due to concurrent update
homeserver_1 - 2020-03-26 20:41:52,093 - synapse.storage.database - 418 - WARNING - user_directory.notify_new_event-1163 - [TXN OPERROR] {update_user_directory_stream_pos-2a81} could not serialize access due to concurrent update
homeserver_1 - 2020-03-26 20:41:52,122 - synapse.storage.database - 418 - WARNING - user_directory.notify_new_event-1163 - [TXN OPERROR] {update_user_directory_stream_pos-2a81} could not serialize access due to concurrent update
homeserver_1 - 2020-03-26 20:41:52,166 - synapse.storage.database - 418 - WARNING - user_directory.notify_new_event-1163 - [TXN OPERROR] {update_user_directory_stream_pos-2a81} could not serialize access due to concurrent update
homeserver_1 - 2020-03-26 20:41:52,187 - synapse.storage.database - 418 - WARNING - user_directory.notify_new_event-1163 - [TXN OPERROR] {update_user_directory_stream_pos-2a81} could not serialize access due to concurrent update
homeserver_1 - 2020-03-26 20:41:52,190 - synapse.metrics.background_process_metrics - 215 - ERROR - user_directory.notify_new_event-1163 - Background process 'user_directory.notify_new_event' threw an exception
  File "/home/synapse/.synapse-py3/lib/python3.6/site-packages/synapse/handlers/user_directory.py", line 109, in process
  File "/home/synapse/.synapse-py3/lib/python3.6/site-packages/synapse/handlers/user_directory.py", line 172, in _unsafe_process
    yield self.store.update_user_directory_stream_pos(max_pos)

The database is showing at least 5x the usual volume of rejected updates, which appear to be coming from all of the workers. ("could not serialize access due to concurrent update" is PostgreSQL's serialization-failure error, SQLSTATE 40001, raised when concurrent transactions at REPEATABLE READ or stricter isolation race on the same rows; here, multiple processes are updating the user directory stream position at once.) The user directory was supposed to be disabled on this server, but the wrong option was set; see the comments below.

@turt2live (Member Author)

Enabling the user directory has no effect on the frequency of the database requests.

@turt2live (Member Author)

In fact, it looks like disabling it has no effect either: the user directory is still being updated :/

@turt2live (Member Author)

Ah, update_user_directory: false turned it off. I was using user_directory.enabled: false instead.
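For anyone else who hits this, a minimal homeserver.yaml sketch of the two options being confused here (the comments reflect my reading of this thread, not official docs):

```yaml
# homeserver.yaml

# This only disables user directory *search* for clients; the
# background process that keeps the directory tables up to date
# keeps running, which is why the writes continued.
user_directory:
  enabled: false

# This is the option that actually stops this process from running
# the user directory updater.
update_user_directory: false
```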

@deepbluev7 (Contributor)

Yeah, a lot of the defaults were wrong for workers. In my case it also affected appservices, etc. This fixed it for me: #7133

Related: #7130

@airblag commented Mar 27, 2020

> Ah, update_user_directory: false turned it off. I was using user_directory.enabled: false instead.

As far as I understand #7133, we should put update_user_directory: false in every worker config to prevent them from updating the user directory in parallel, but the main Synapse process would still take care of it?

I had the impression this needed clarification, since you also wrote that you disabled the user directory globally.

@turt2live (Member Author)

You only need it in the main config, but it's a bit more severe than it looks on the surface. My server doesn't require the user directory because it's all bots and no users, but other public homeservers will need it. Unless those servers also deploy a dedicated user directory worker, their other workers will keep hammering the database.

If you do run a worker setup, it sounds like you'll also need a dedicated user directory worker now :(

@deepbluev7 (Contributor)

I don't think you need a dedicated user directory worker. You can just disable the user directory stuff for every worker, so that only the master updates the database, which will reduce the load. That's what #7133 does: it changes the default values for workers, so the master defaults the user directory (and other background tasks) to on, while the workers default to off.
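Roughly like this (the file name is illustrative, and exactly where these lines go is my assumption rather than something spelled out in #7133):

```yaml
# In each worker's config (e.g. worker-generic1.yaml), opt the worker
# out of user directory updates so it stops racing the other
# processes on update_user_directory_stream_pos:
update_user_directory: false

# The master's homeserver.yaml needs no change: with #7133 the
# updater defaults to on for the master and off for workers.
```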
