This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

621 - WARNING - persist_presence_changes-99641 - [TXN OPERROR] {update_presence-34728b} could not serialize access due to concurrent update #16734

Open
lea-aglz opened this issue Dec 7, 2023 · 1 comment
Labels
A-Database DB stuff like queries, migrations, new/remove columns, indexes, unexpected entries in the db A-Logging Synapse's logs (structured or otherwise). Not metrics. A-Presence O-Occasional Affects or can be seen by some users regularly or most users rarely S-Tolerable Minor significance, cosmetic issues, low or no impact to users.

Comments

@lea-aglz

lea-aglz commented Dec 7, 2023

Description

There are some warnings in the logs on Synapse 1.89.

Steps to reproduce

I see this error in the log.

Homeserver

self-hosted

Synapse Version

1.89

Installation Method

pip (from PyPI)

Database

PostgreSQL 11

Workers

Single process

Platform

CentOS7 VM Python installation

Configuration

No response

Relevant log output

2023-11-23 11:42:50.459	cd11-comm.on-gofast.com
2023-11-23 12:42:40,592 - synapse.storage.txn - 621 - WARNING - persist_presence_changes-99641 - [TXN OPERROR] {update_presence-34728b} could not serialize access due to concurrent update

Anything else that would be useful to know?

No response

@DMRobertson
Contributor

I think this is a specific instance of #4993.

This message is printed whenever we get an "operational error", in which case we roll back the transaction:

            except self.engine.module.OperationalError as e:
                # This can happen if the database disappears mid
                # transaction.
                transaction_logger.warning(
                    "[TXN OPERROR] {%s} %s %d/%d",
                    name,
                    e,
                    i,
                    N,
                )
                if i < N:
                    i += 1
                    try:
                        with opentracing.start_active_span("db.rollback"):
                            conn.rollback()
                    except self.engine.module.Error as e1:
                        transaction_logger.warning("[TXN EROLL] {%s} %s", name, e1)
                    continue
                raise

But the transaction will be retried (it lives within a loop that retries up to 5 times). This means that most of the time, these errors about concurrent access correct themselves.
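To illustrate the retry behaviour, here is a minimal, self-contained sketch of the pattern. It is not the actual Synapse code: `OperationalError`, `FakeConnection`, `run_with_retries` and `make_flaky` are hypothetical stand-ins I made up for the example, and the real loop does more (tracing, handling other exception types).

```python
import logging

logger = logging.getLogger("txn_sketch")


class OperationalError(Exception):
    """Stand-in for the database driver's OperationalError (assumption)."""


class FakeConnection:
    """Hypothetical connection that only counts how often it was rolled back."""

    def __init__(self):
        self.rollbacks = 0

    def rollback(self):
        self.rollbacks += 1


def run_with_retries(conn, name, func, max_retries=5):
    """Call func(conn); on OperationalError, log, roll back and retry.

    Once max_retries retries have been used up, the last error is re-raised.
    """
    attempt = 0
    while True:
        try:
            return func(conn)
        except OperationalError as e:
            # Logged on every failure, including ones that a retry then fixes.
            logger.warning(
                "[TXN OPERROR] {%s} %s %d/%d", name, e, attempt, max_retries
            )
            if attempt < max_retries:
                attempt += 1
                conn.rollback()
                continue
            raise


def make_flaky(failures):
    """Return a transaction function that fails `failures` times, then succeeds."""
    state = {"calls": 0}

    def txn(conn):
        state["calls"] += 1
        if state["calls"] <= failures:
            raise OperationalError(
                "could not serialize access due to concurrent update"
            )
        return state["calls"]

    return txn


conn = FakeConnection()
# Fails twice, succeeds on the third call: two warnings are logged,
# but the caller never sees an exception.
result = run_with_retries(conn, "update_presence", make_flaky(2))
```

In this sketch a warning is emitted for each failed attempt even when a later attempt succeeds, which is why a line like the one in the report can appear in the log without presence updates actually being lost.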

The specific example you highlight is

        async with stream_ordering_manager as stream_orderings:
            # Run the interaction with an isolation level of READ_COMMITTED to avoid
            # serialization errors(and rollbacks) in the database. This way it will
            # ignore new rows during the DELETE, but will pick them up the next time
            # this is run. Currently, that is between 5-60 seconds.
            await self.db_pool.runInteraction(
                "update_presence",
                self._update_presence_txn,
                stream_orderings,
                presence_states,
                isolation_level=IsolationLevel.READ_COMMITTED,
            )

The comment and choice of isolation level seem to come from #15826, but it looks like that change doesn't anticipate a concurrent update.

@DMRobertson DMRobertson added A-Presence A-Logging Synapse's logs (structured or otherwise). Not metrics. S-Tolerable Minor significance, cosmetic issues, low or no impact to users. A-Database DB stuff like queries, migrations, new/remove columns, indexes, unexpected entries in the db O-Occasional Affects or can be seen by some users regularly or most users rarely labels Dec 7, 2023