This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

background_drop_invalid_event_edges_rows failed #16193

Open
tusooa opened this issue Aug 28, 2023 · 5 comments
Labels
X-Needs-Info: This issue is blocked awaiting information from the reporter

Comments

@tusooa

tusooa commented Aug 28, 2023

Description

Background update to drop invalid event edge rows failed.

Steps to reproduce

  • Start Synapse

Homeserver

tusooa.xyz

Synapse Version

1.90.0

Installation Method

Docker (matrixdotorg/synapse)

Database

PostgreSQL 13, single server. Yes, portdb was used. Yes, the database has been restored once.

Workers

Multiple workers

Platform

Ubuntu 22.04, Kubernetes (4-node kubeadm cluster)

Configuration

No response

Relevant log output

2023-08-27 23:49:32,646 - synapse.storage.background_updates - 302 - ERROR - background_updates-0 - Error doing update
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/synapse/storage/background_updates.py", line 294, in run_background_updates
    result = await self.do_next_background_update(sleep)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/storage/background_updates.py", line 424, in do_next_background_update
    await self._do_background_update(desired_duration_ms)
  File "/usr/local/lib/python3.11/site-packages/synapse/storage/background_updates.py", line 467, in _do_background_update
    items_updated = await update_handler(progress, batch_size)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/storage/databases/main/events_bg_updates.py", line 1408, in _background_drop_invalid_event_edges_rows
    done = await self.db_pool.runInteraction(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/storage/database.py", line 924, in runInteraction
    return await delay_cancellation(_runInteraction())
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/twisted/internet/defer.py", line 1693, in _inlineCallbacks
    result = context.run(
             ^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/twisted/python/failure.py", line 518, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/storage/database.py", line 890, in _runInteraction
    result = await self.runWithConnection(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/storage/database.py", line 1019, in runWithConnection
    return await make_deferred_yieldable(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/twisted/python/threadpool.py", line 244, in inContext
    result = inContext.theWork()  # type: ignore[attr-defined]
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/twisted/python/threadpool.py", line 260, in <lambda>
    inContext.theWork = lambda: context.call(  # type: ignore[attr-defined]
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/twisted/python/context.py", line 117, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/twisted/python/context.py", line 82, in callWithContext
    return func(*args, **kw)
           ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/twisted/enterprise/adbapi.py", line 282, in _runWithConnection
    result = func(conn, *args, **kw)
             ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/storage/database.py", line 1012, in inner_func
    return func(db_conn, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/storage/database.py", line 752, in new_transaction
    r = func(cursor, *args, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/storage/databases/main/events_bg_updates.py", line 1403, in drop_invalid_event_edges_txn
    txn.execute(
  File "/usr/local/lib/python3.11/site-packages/synapse/storage/database.py", line 417, in execute
    self._do_execute(self.txn.execute, sql, parameters)
  File "/usr/local/lib/python3.11/site-packages/synapse/storage/database.py", line 469, in _do_execute
    return func(sql, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
psycopg2.errors.ForeignKeyViolation: insert or update on table "event_edges" violates foreign key constraint "event_edges_event_id_fkey"
DETAIL:  Key (event_id)=($ukE7gZZjUyJ6AWo3j-yziYIwtm5QBWh4_V-AwoChtqs) is not present in table "events".

Anything else that would be useful to know?

No response

@DMRobertson
Contributor

Relevant source:

logger.info("cleaned up event_edges; enabling foreign key")
txn.execute(
    "ALTER TABLE event_edges VALIDATE CONSTRAINT event_edges_event_id_fkey"
)
return True
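For context, PostgreSQL's `VALIDATE CONSTRAINT` checks every existing row, so it fails as soon as a single `event_edges.event_id` has no matching row in `events`, which is what the `ForeignKeyViolation` in the log reports. The following is only a rough sketch of how such orphaned rows could be listed. It runs against an in-memory SQLite database with a heavily reduced, hypothetical two-table schema (the real Synapse tables have more columns), and the helper names `find_orphaned_edges` and `build_demo_db` are made up for illustration.

```python
import sqlite3


def find_orphaned_edges(conn):
    """Return event_ids in event_edges that have no matching row in events."""
    rows = conn.execute(
        """
        SELECT DISTINCT ee.event_id
        FROM event_edges AS ee
        LEFT JOIN events AS ev ON ev.event_id = ee.event_id
        WHERE ev.event_id IS NULL
        ORDER BY ee.event_id
        """
    ).fetchall()
    return [row[0] for row in rows]


def build_demo_db():
    """Create a tiny demo database (assumed, simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE events (event_id TEXT PRIMARY KEY, room_id TEXT);
        CREATE TABLE event_edges (event_id TEXT, prev_event_id TEXT);
        INSERT INTO events VALUES ('$a', '!room1'), ('$b', '!room1');
        -- '$missing' has an edge row but no row in events.
        INSERT INTO event_edges VALUES ('$b', '$a'), ('$missing', '$b');
        """
    )
    return conn


if __name__ == "__main__":
    print(find_orphaned_edges(build_demo_db()))
```

The same `LEFT JOIN ... IS NULL` query shape should carry over to PostgreSQL, but it has not been checked against a real Synapse database here.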

> used portdb. yes

I fear this is a consequence of #13191 :(

The safest option would be to purge the room, using

  • SELECT room_id FROM events WHERE event_id = '$ukE7gZZjUyJ6AWo3j-yziYIwtm5QBWh4_V-AwoChtqs';
  • then use the admin API to purge the room.

Then restart Synapse and see if the error remains. Let us know if that solves the issue.
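As a rough illustration of the second step, the sketch below builds (but does not send) a request for the version 2 Delete Room admin API. It assumes the endpoint shape `DELETE /_synapse/admin/v2/rooms/<room_id>` with a JSON body, as I understand the admin API documentation; the base URL, room ID, token, and the helper name `build_purge_request` are placeholders, not values from this issue.

```python
import json
import urllib.parse
import urllib.request


def build_purge_request(base_url, room_id, access_token):
    """Build a DELETE request for the v2 room deletion admin API (assumed shape)."""
    # Room IDs contain '!' and ':', so percent-encode the whole ID.
    quoted_room_id = urllib.parse.quote(room_id, safe="")
    url = f"{base_url.rstrip('/')}/_synapse/admin/v2/rooms/{quoted_room_id}"
    body = json.dumps({"purge": True}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="DELETE",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Placeholder values; substitute your own homeserver, room ID and admin token.
    req = build_purge_request("https://hs.example", "!abc:example.org", "ADMIN_TOKEN")
    print(req.get_method(), req.full_url)
    # To actually send it (requires a server admin token):
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp))
```

Building the request separately from sending it makes it easy to inspect the URL and body before running a destructive operation.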

DMRobertson added the X-Needs-Info label Aug 29, 2023
@tusooa
Author

tusooa commented Sep 11, 2023

> Relevant source:
>
> logger.info("cleaned up event_edges; enabling foreign key")
> txn.execute(
>     "ALTER TABLE event_edges VALIDATE CONSTRAINT event_edges_event_id_fkey"
> )
> return True
>
> > used portdb. yes
>
> I fear this is a consequence of #13191 :(
>
> The safest option would be to purge the room, using
>
> * `SELECT room_id FROM events WHERE event_id = '$ukE7gZZjUyJ6AWo3j-yziYIwtm5QBWh4_V-AwoChtqs';`
> * then use the [admin API to purge the room](https://matrix-org.github.io/synapse/latest/admin_api/rooms.html#version-2-new-version).
>
> Then restart Synapse and see if the error remains. Let us know if that solves the issue.

When I am purging the room, it failed with another error:

{"status":"failed","shutdown_room":{"kicked_users":[],"failed_to_kick_users":[],"local_aliases":[],"new_room_id":null},"error":"canceling statement due to statement timeout\nCONTEXT:  SQL statement \"SELECT 1 FROM ONLY \"public\".\"room_memberships\" x WHERE $1 OPERATOR(pg_catalog.=) \"event_stream_ordering\" FOR KEY SHARE OF x\"\n"}
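The status response above is a single long JSON line, and the useful part (a PostgreSQL statement timeout) is buried in the `error` field. A minimal sketch of pulling out the status and the first line of the error follows; the helper name `summarize_delete_status` is hypothetical, and the sample payload in the usage example is an abbreviated stand-in with the same `status` and `error` fields as the response above.

```python
import json


def summarize_delete_status(payload):
    """Return (status, first line of the error or None) from a status JSON string."""
    data = json.loads(payload)
    status = data.get("status", "unknown")
    error = data.get("error") or ""
    # The error text may span several lines; the first one names the failure.
    first_line = error.splitlines()[0] if error else None
    return status, first_line


if __name__ == "__main__":
    # Abbreviated sample, not the full response from the thread.
    sample = (
        '{"status": "failed", '
        '"error": "canceling statement due to statement timeout\\nCONTEXT: ..."}'
    )
    print(summarize_delete_status(sample))
```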

@DMRobertson
Contributor

DMRobertson commented Sep 18, 2023

> When I am purging the room, it failed with another error:

This looks like a regression in #15853. When there's a fix for #16322, please try purging that room again.

@DMRobertson
Contributor

> When there's a fix for #16322, please try purging that room again.

A fix (#16455) landed in Synapse 1.95. Have you had the chance to try re-purging the room?

@tusooa
Author

tusooa commented Dec 13, 2023

> > When there's a fix for #16322, please try purging that room again.
>
> A fix (#16455) landed in Synapse 1.95. Have you had the chance to try re-purging the room?

It still failed (v1.97.0):

2023-12-12 06:49:39,218 - synapse.storage.txn - 780 - WARNING - task-shutdown_and_purge_room-0-lZuEfLZoWVFejzsJ-!YTvKGNlinIzlkMTVRl:matrix.org - [TXN OPERROR] {purge_room-47c} canceling statement due to statement timeout
2023-12-12 06:49:39,245 - synapse.util.task_scheduler - 362 - ERROR - task-shutdown_and_purge_room-0-lZuEfLZoWVFejzsJ - scheduled task lZuEfLZoWVFejzsJ failed
