Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

synapse_port_db produces unusable database due to numerous duplicate key constraint errors #13110

Closed
hutchisr opened this issue Jun 21, 2022 · 7 comments
Labels
A-Portdb Script for exporting sqlite data into postgres S-Minor Blocks non-critical functionality, workarounds exist. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. X-Needs-Info This issue is blocked awaiting information from the reporter

Comments

@hutchisr
Copy link

hutchisr commented Jun 21, 2022

Description

Porting a synapse homeserver from sqlite to postgresql results in an unusable server where any action (sending a message, leaving a room, etc) fails due to duplicate key errors in event_json, event_auth tables.

Steps to reproduce

  • stop down matrix-synapse.service
  • backup sqlite db "snapshot": cp homeserver.db homeserver-snapshot.db
  • create postgres database: createdb --encoding=UTF8 --locale=C --template=template0 --owner=matrix-synapse matrix
  • copy homeserver.yaml and configure to use postgresql database
  • run synapse_port_db --sqlite-database /var/lib/matrix-synapse/homeserver-snapshot.db --postgres-config homeserver-postgresql.yaml
  • update homeserver.yaml to use Postgres db
  • start matrix-synapse.service
  • start element and attempt to send a message

Homeserver

homunyan.com

Synapse Version

1.61.0

Installation Method

Debian packages from packages.matrix.org

Platform

Ubuntu 22.04 Virtual Machine

Relevant log output

Traceback (most recent call last):
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/handlers/federation_event.py", line 867, in _resolve_state_at_missing_prevs
    await self._get_state_ids_after_missing_prev_event(
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/handlers/federation_event.py", line 969, in _get_state_ids_after_missing_prev_event
    await self._get_state_and_persist(destination, room_id, event_id)
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/handlers/federation_event.py", line 1046, in _get_state_and_persist
    await self._auth_and_persist_outliers(
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/handlers/federation_event.py", line 1399, in _auth_and_persist_outliers
    await self._auth_and_persist_outliers_inner(room_id, roots)
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/handlers/federation_event.py", line 1465, in _auth_and_persist_outliers_inner
    await self.persist_events_and_notify(
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/handlers/federation_event.py", line 2015, in persist_events_and_notify
    ) = await self._storage_controllers.persistence.persist_events(
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/logging/opentracing.py", line 811, in _trace_inner
    return await func(*args, **kwargs)
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/controllers/persist_events.py", line 331, in persist_events
    ret_vals = await yieldable_gather_results(enqueue, partitioned.items())
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/util/async_helpers.py", line 291, in yieldable_gather_results
    raise dfe.subFailure.value from None
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py", line 1656, in _inlineCallbacks
    result = current_context.run(
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/python/failure.py", line 514, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/controllers/persist_events.py", line 327, in enqueue
    return await self._event_persist_queue.add_to_queue(
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/controllers/persist_events.py", line 201, in add_to_queue
    res = await make_deferred_yieldable(end_item.deferred.observe())
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/controllers/persist_events.py", line 244, in handle_queue_loop
    ret = await self._per_item_callback(
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/controllers/persist_events.py", line 630, in _persist_event_batch
    await self.persist_events_store._persist_events_and_state_updates(
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/databases/main/events.py", line 188, in _persist_events_and_state_updates
    await self.db_pool.runInteraction(
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/database.py", line 834, in runInteraction
    return await delay_cancellation(_runInteraction())
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py", line 1656, in _inlineCallbacks
    result = current_context.run(
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/python/failure.py", line 514, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/database.py", line 806, in _runInteraction
    result = await self.runWithConnection(
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/database.py", line 929, in runWithConnection
    return await make_deferred_yieldable(
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/python/threadpool.py", line 244, in inContext
    result = inContext.theWork()  # type: ignore[attr-defined]
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/python/threadpool.py", line 260, in <lambda>
    inContext.theWork = lambda: context.call(  # type: ignore[attr-defined]
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/python/context.py", line 117, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/python/context.py", line 82, in callWithContext
    return func(*args, **kw)
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/enterprise/adbapi.py", line 282, in _runWithConnection
    result = func(conn, *args, **kw)
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/database.py", line 922, in inner_func
    return func(db_conn, *args, **kwargs)
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/database.py", line 670, in new_transaction
    r = func(cursor, *args, **kwargs)
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/databases/main/events.py", line 406, in _persist_events_txn
    self._store_event_txn(txn, events_and_contexts=events_and_contexts)
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/databases/main/events.py", line 1393, in _store_event_txn
    self.db_pool.simple_insert_many_txn(
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/database.py", line 1068, in simple_insert_many_txn
    txn.execute_values(sql, values, fetch=False)
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/database.py", line 346, in execute_values
    return self._do_execute(
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/database.py", line 394, in _do_execute
    return func(sql, *args, **kwargs)
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/database.py", line 347, in <lambda>
    lambda the_sql: execute_values(self.txn, the_sql, values, fetch=fetch),
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/psycopg2/extras.py", line 1270, in execute_values
    cur.execute(b''.join(parts))
psycopg2.errors.UniqueViolation: duplicate key value violates unique constraint "event_json_event_id_key"
DETAIL:  Key (event_id)=($xd40t16UrIqo4_AvnHvYrXbSZAg_RtBUrVyC9IkirD8) already exists.

Anything else that would be useful to know?

Deleting the entries with the offending keys doesn't help since there are a seemingly unending number of keys that shouldn't exist.

@DMRobertson
Copy link
Contributor

DMRobertson commented Jun 22, 2022

What happens if you run these queries against the sqlite version of your DB?

SELECT COUNT(*) FROM events WHERE event_id = '$xd40t16UrIqo4_AvnHvYrXbSZAg_RtBUrVyC9IkirD8';
SELECT COUNT(*) FROM event_json WHERE event_id = '$xd40t16UrIqo4_AvnHvYrXbSZAg_RtBUrVyC9IkirD8';

@DMRobertson DMRobertson added the X-Needs-Info This issue is blocked awaiting information from the reporter label Jun 22, 2022
@hutchisr
Copy link
Author

hutchisr commented Jun 22, 2022

Both queries return 1. Doesn't appear that there are duplicate events already in the database but something is causing new events to try to use ids that already exist?

@DMRobertson DMRobertson added X-Needs-Discussion and removed X-Needs-Info This issue is blocked awaiting information from the reporter labels Jun 22, 2022
@anoadragon453
Copy link
Member

What happens if you try:

SELECT outlier FROM events WHERE event_id = '$xd40t16UrIqo4_AvnHvYrXbSZAg_RtBUrVyC9IkirD8';

against both the SQLite and PostgreSQL versions of the database?

From _resolve_state_at_missing_prev_events, Synapse is trying to check if it has certain events, and search for them by (event_id, outlier = False). If the events are in the database, but marked as outlier = True, then you'd end up attempting to request and store them again (and then run into a duplicate key violation).

It'd be interesting to see if the results from the query above were inconsistent somehow.

@anoadragon453 anoadragon453 added X-Needs-Info This issue is blocked awaiting information from the reporter and removed X-Needs-Discussion labels Jun 30, 2022
@DMRobertson DMRobertson added A-Portdb Script for exporting sqlite data into postgres S-Minor Blocks non-critical functionality, workarounds exist. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. labels Jul 6, 2022
@hutchisr
Copy link
Author

Hey, sorry for the delay. Unfortunately I cleared out the Postgres database to save space and trying to migrate again no longer succeeds, giving me a new set of errors:

2022-07-17 01:31:46,154 - synapse_port_db - 245 - ERROR - Failed to insert: event_edges
Traceback (most recent call last):
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/_scripts/synapse_port_db.py", line 243, in insert_many_txn
    txn.executemany(sql, rows)
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/database.py", line 356, in executemany
    self._do_execute(self.txn.executemany, sql, *args)
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/database.py", line 395, in _do_execute
    return func(sql, *args, **kwargs)
psycopg2.errors.ForeignKeyViolation: insert or update on table "event_edges" violates foreign key constraint "event_edges_event_id_fkey"
DETAIL:  Key (event_id)=($5QsUEAIlpUFlA7dpxlKijXdXK73HkL2t_y519rDuGT0) is not present in table "events".

2022-07-17 01:31:46,162 - synapse_port_db - 786 - ERROR -
Traceback (most recent call last):
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/_scripts/synapse_port_db.py", line 763, in run
    await make_deferred_yieldable(
twisted.internet.defer.FirstError: FirstError[#5, [Failure instance: Traceback: <class 'psycopg2.errors.ForeignKeyViolation'>: insert or update on table "event_edges" violates foreign key constraint "event_edges_event_id_fkey"
DETAIL:  Key (event_id)=($5QsUEAIlpUFlA7dpxlKijXdXK73HkL2t_y519rDuGT0) is not present in table "events".

/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py:700:errback
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py:763:_startRunCallbacks
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py:857:_runCallbacks
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py:1750:gotResult
--- <exception caught here> ---
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py:1656:_inlineCallbacks
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/python/failure.py:514:throwExceptionIntoGenerator
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/_scripts/synapse_port_db.py:483:handle_table
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/database.py:835:runInteraction
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py:1656:_inlineCallbacks
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/python/failure.py:514:throwExceptionIntoGenerator
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/database.py:807:_runInteraction
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/database.py:930:runWithConnection
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/python/threadpool.py:244:inContext
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/python/threadpool.py:260:<lambda>
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/python/context.py:117:callWithContext
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/python/context.py:82:callWithContext
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/enterprise/adbapi.py:282:_runWithConnection
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/database.py:923:inner_func
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/database.py:671:new_transaction
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/_scripts/synapse_port_db.py:471:insert
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/_scripts/synapse_port_db.py:243:insert_many_txn
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/database.py:356:executemany
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/database.py:395:_do_execute
]]
Traceback (most recent call last):
  File "/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/_scripts/synapse_port_db.py", line 763, in run
    await make_deferred_yieldable(
twisted.internet.defer.FirstError: FirstError[#5, [Failure instance: Traceback: <class 'psycopg2.errors.ForeignKeyViolation'>: insert or update on table "event_edges" violates foreign key constraint "event_edges_event_id_fkey"
DETAIL:  Key (event_id)=($5QsUEAIlpUFlA7dpxlKijXdXK73HkL2t_y519rDuGT0) is not present in table "events".

/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py:700:errback
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py:763:_startRunCallbacks
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py:857:_runCallbacks
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py:1750:gotResult
--- <exception caught here> ---
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py:1656:_inlineCallbacks
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/python/failure.py:514:throwExceptionIntoGenerator
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/_scripts/synapse_port_db.py:483:handle_table
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/database.py:835:runInteraction
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py:1656:_inlineCallbacks
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/python/failure.py:514:throwExceptionIntoGenerator
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/database.py:807:_runInteraction
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/database.py:930:runWithConnection
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/python/threadpool.py:244:inContext
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/python/threadpool.py:260:<lambda>
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/python/context.py:117:callWithContext
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/python/context.py:82:callWithContext
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/enterprise/adbapi.py:282:_runWithConnection
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/database.py:923:inner_func
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/database.py:671:new_transaction
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/_scripts/synapse_port_db.py:471:insert
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/_scripts/synapse_port_db.py:243:insert_many_txn
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/database.py:356:executemany
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/database.py:395:_do_execute
]]
FirstError[#5, [Failure instance: Traceback: <class 'psycopg2.errors.ForeignKeyViolation'>: insert or update on table "event_edges" violates foreign key constraint "event_edges_event_id_fkey"
DETAIL:  Key (event_id)=($5QsUEAIlpUFlA7dpxlKijXdXK73HkL2t_y519rDuGT0) is not present in table "events".

/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py:700:errback
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py:763:_startRunCallbacks
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py:857:_runCallbacks
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py:1750:gotResult
--- <exception caught here> ---
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py:1656:_inlineCallbacks
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/python/failure.py:514:throwExceptionIntoGenerator
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/_scripts/synapse_port_db.py:483:handle_table
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/database.py:835:runInteraction
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/internet/defer.py:1656:_inlineCallbacks
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/python/failure.py:514:throwExceptionIntoGenerator
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/database.py:807:_runInteraction
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/database.py:930:runWithConnection
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/python/threadpool.py:244:inContext
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/python/threadpool.py:260:<lambda>
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/python/context.py:117:callWithContext
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/python/context.py:82:callWithContext
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/twisted/enterprise/adbapi.py:282:_runWithConnection
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/database.py:923:inner_func
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/database.py:671:new_transaction
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/_scripts/synapse_port_db.py:471:insert
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/_scripts/synapse_port_db.py:243:insert_many_txn
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/database.py:356:executemany
/opt/venvs/matrix-synapse/lib/python3.10/site-packages/synapse/storage/database.py:395:_do_execute

@DMRobertson
Copy link
Contributor

Those new set of errors sound like #13191. The fix for that should be released in Synapse 1.63, due for release on Tuesday. There's also a release candidate 1.63.0rc1 available if you're keen to

@hutchisr: could you do the following:

  • Clear your postgres DB
  • Running the portdb script after Synapse is version 1.63.0rc1, 1.63.0 or higher
  • Let us know if it succeeds or fails. We can take it from there.

@aphick
Copy link

aphick commented Jul 28, 2022

I upgraded my instance from v1.60.0 to v1.63.1 and the script completed without issue however my postgres logs filled up with messages like

2022-07-28 18:05:29.261 UTC [2068] ERROR:  duplicate key value violates unique constraint "event_json_event_id_key"
2022-07-28 18:05:29.261 UTC [2068] DETAIL:  Key (event_id)=($yIGPjjEoqqUQOqcC4UoSaZyavjawQUCj3wPjtSuJ5H4) already exists.
2022-07-28 18:05:29.261 UTC [2068] STATEMENT:  INSERT INTO event_json (event_id, room_id, internal_metadata, json, format_version) VALUES ('$yIGPjjEoqqUQOqcC4UoSaZyavjawQUCj3wPjtSuJ5H4','NA','{}','{"auth_events":[],"content":{"creator":""},"","room_version":"9"},"depth":1,"hashes":{"sha256":"NA"},"origin":"matrix.org","origin_server_ts":1651383281966,"prev_events":[],"prev_state":[],"room_id":"NA","sender":"NA","state_key":"","type":"m.room.create","signatures":{"matrix.org":{"ed25519:a_RXGa":"NA"}},"unsigned":{"age_ts":1651383281966}}',3)

I then tried reverting to a backup and rerunning the script just once (took the server down and ported homeserver.db directly) and am no longer seeing these messages.

@erikjohnston
Copy link
Member

Sounds like #13191 fixed this issue then. Please shout if that isn't the case!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A-Portdb Script for exporting sqlite data into postgres S-Minor Blocks non-critical functionality, workarounds exist. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. X-Needs-Info This issue is blocked awaiting information from the reporter
Projects
None yet
Development

No branches or pull requests

5 participants