Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

synapse_port_db - Failed to insert: event_edges #13191

Closed
ToeiRei opened this issue Jul 5, 2022 · 15 comments · Fixed by #13226
Closed

synapse_port_db - Failed to insert: event_edges #13191

ToeiRei opened this issue Jul 5, 2022 · 15 comments · Fixed by #13226
Assignees
Labels
A-Portdb Script for exporting sqlite data into postgres S-Minor Blocks non-critical functionality, workarounds exist. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues.

Comments

@ToeiRei
Copy link

ToeiRei commented Jul 5, 2022

Description

Migrating the database from Sqlite3 to Postgres using the synapse_port_db script I run into an error as follows:

2022-07-05 23:32:18,631 - synapse_port_db - 245 - ERROR - Failed to insert: event_edges

Steps to reproduce

  • start with a sqlite 3 instance on debian 11
  • try to run the converter
  • watch the converter do its job until it hits the error
  • cry in pain

Homeserver

stargazer.at

Synapse Version

{"server":{"name":"Synapse","version":"1.62.0"}}

Installation Method

Debian packages from packages.matrix.org

Platform

Debian 11 in a VM, minimum install

Relevant log output

2022-07-05 23:32:18,631 - synapse_port_db - 245 - ERROR - Failed to insert: event_edges
Traceback (most recent call last):
  File "/opt/venvs/matrix-synapse/lib/python3.9/site-packages/synapse/_scripts/synapse_port_db.py", line 243, in insert_many_txn
    txn.executemany(sql, rows)
  File "/opt/venvs/matrix-synapse/lib/python3.9/site-packages/synapse/storage/database.py", line 356, in executemany
    self._do_execute(self.txn.executemany, sql, *args)
  File "/opt/venvs/matrix-synapse/lib/python3.9/site-packages/synapse/storage/database.py", line 395, in _do_execute
    return func(sql, *args, **kwargs)
psycopg2.errors.ForeignKeyViolation: insert or update on table "event_edges" violates foreign key constraint "event_edges_event_id_fkey"
DETAIL:  Key (event_id)=($70n7eZhMwy80aMWDetgskT9nZ4OnAMVb_NVARjMOHpM) is not present in table "events".

2022-07-05 23:32:18,635 - synapse_port_db - 786 - ERROR -
Traceback (most recent call last):
  File "/opt/venvs/matrix-synapse/lib/python3.9/site-packages/synapse/_scripts/synapse_port_db.py", line 763, in run
    await make_deferred_yieldable(
twisted.internet.defer.FirstError: FirstError[#6, [Failure instance: Traceback: <class 'psycopg2.errors.ForeignKeyViolation'>: insert or update on table "event_edges" violates foreign key constraint "event_edges_event_id_fkey"
DETAIL:  Key (event_id)=($70n7eZhMwy80aMWDetgskT9nZ4OnAMVb_NVARjMOHpM) is not present in table "events".

/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/internet/defer.py:700:errback
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/internet/defer.py:763:_startRunCallbacks
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/internet/defer.py:857:_runCallbacks
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/internet/defer.py:1750:gotResult
--- <exception caught here> ---
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/internet/defer.py:1656:_inlineCallbacks
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/python/failure.py:514:throwExceptionIntoGenerator
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/synapse/_scripts/synapse_port_db.py:483:handle_table
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/synapse/storage/database.py:835:runInteraction
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/internet/defer.py:1656:_inlineCallbacks
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/python/failure.py:514:throwExceptionIntoGenerator
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/synapse/storage/database.py:807:_runInteraction
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/synapse/storage/database.py:930:runWithConnection
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/python/threadpool.py:244:inContext
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/python/threadpool.py:260:<lambda>
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/python/context.py:117:callWithContext
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/python/context.py:82:callWithContext
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/enterprise/adbapi.py:282:_runWithConnection
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/synapse/storage/database.py:923:inner_func
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/synapse/storage/database.py:671:new_transaction
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/synapse/_scripts/synapse_port_db.py:471:insert
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/synapse/_scripts/synapse_port_db.py:243:insert_many_txn
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/synapse/storage/database.py:356:executemany
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/synapse/storage/database.py:395:_do_execute
]]
Traceback (most recent call last):
  File "/opt/venvs/matrix-synapse/lib/python3.9/site-packages/synapse/_scripts/synapse_port_db.py", line 763, in run
    await make_deferred_yieldable(
twisted.internet.defer.FirstError: FirstError[#6, [Failure instance: Traceback: <class 'psycopg2.errors.ForeignKeyViolation'>: insert or update on table "event_edges" violates foreign key constraint "event_edges_event_id_fkey"
DETAIL:  Key (event_id)=($70n7eZhMwy80aMWDetgskT9nZ4OnAMVb_NVARjMOHpM) is not present in table "events".

/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/internet/defer.py:700:errback
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/internet/defer.py:763:_startRunCallbacks
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/internet/defer.py:857:_runCallbacks
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/internet/defer.py:1750:gotResult
--- <exception caught here> ---
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/internet/defer.py:1656:_inlineCallbacks
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/python/failure.py:514:throwExceptionIntoGenerator
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/synapse/_scripts/synapse_port_db.py:483:handle_table
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/synapse/storage/database.py:835:runInteraction
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/internet/defer.py:1656:_inlineCallbacks
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/python/failure.py:514:throwExceptionIntoGenerator
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/synapse/storage/database.py:807:_runInteraction
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/synapse/storage/database.py:930:runWithConnection
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/python/threadpool.py:244:inContext
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/python/threadpool.py:260:<lambda>
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/python/context.py:117:callWithContext
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/python/context.py:82:callWithContext
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/enterprise/adbapi.py:282:_runWithConnection
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/synapse/storage/database.py:923:inner_func
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/synapse/storage/database.py:671:new_transaction
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/synapse/_scripts/synapse_port_db.py:471:insert
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/synapse/_scripts/synapse_port_db.py:243:insert_many_txn
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/synapse/storage/database.py:356:executemany
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/synapse/storage/database.py:395:_do_execute
]]
FirstError[#6, [Failure instance: Traceback: <class 'psycopg2.errors.ForeignKeyViolation'>: insert or update on table "event_edges" violates foreign key constraint "event_edges_event_id_fkey"
DETAIL:  Key (event_id)=($70n7eZhMwy80aMWDetgskT9nZ4OnAMVb_NVARjMOHpM) is not present in table "events".

/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/internet/defer.py:700:errback
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/internet/defer.py:763:_startRunCallbacks
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/internet/defer.py:857:_runCallbacks
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/internet/defer.py:1750:gotResult
--- <exception caught here> ---
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/internet/defer.py:1656:_inlineCallbacks
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/python/failure.py:514:throwExceptionIntoGenerator
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/synapse/_scripts/synapse_port_db.py:483:handle_table
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/synapse/storage/database.py:835:runInteraction
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/internet/defer.py:1656:_inlineCallbacks
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/python/failure.py:514:throwExceptionIntoGenerator
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/synapse/storage/database.py:807:_runInteraction
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/synapse/storage/database.py:930:runWithConnection
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/python/threadpool.py:244:inContext
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/python/threadpool.py:260:<lambda>
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/python/context.py:117:callWithContext
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/python/context.py:82:callWithContext
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/twisted/enterprise/adbapi.py:282:_runWithConnection
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/synapse/storage/database.py:923:inner_func
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/synapse/storage/database.py:671:new_transaction
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/synapse/_scripts/synapse_port_db.py:471:insert
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/synapse/_scripts/synapse_port_db.py:243:insert_many_txn
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/synapse/storage/database.py:356:executemany
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/synapse/storage/database.py:395:_do_execute


### Anything else that would be useful to know?

_No response_
@DMRobertson DMRobertson added A-Portdb Script for exporting sqlite data into postgres S-Minor Blocks non-critical functionality, workarounds exist. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. X-Needs-Info This issue is blocked awaiting information from the reporter labels Jul 6, 2022
@DMRobertson
Copy link
Contributor

I'm not sure if this is a bug in portdb or if your sqlite database is corrupted.

What happens if you stop Synapse and run this query against sqlite?

SELECT * from events where event_id='$70n7eZhMwy80aMWDetgskT9nZ4OnAMVb_NVARjMOHpM';

@ToeiRei
Copy link
Author

ToeiRei commented Jul 6, 2022

$ sqlite3 homeserver.db
SQLite version 3.34.1 2021-01-20 14:10:07
Enter ".help" for usage hints.
sqlite> SELECT * from events where event_id='$70n7eZhMwy80aMWDetgskT9nZ4OnAMVb_NVARjMOHpM';
-8558|2|$70n7eZhMwy80aMWDetgskT9nZ4OnAMVb_NVARjMOHpM|m.room.member|!UpbiadsgRndwvYKPLK:neko.dev|||1|1|2|1636376074991|1656923521381|@deepbluev7:neko.dev|0|master|@deepbluev7:neko.dev|
sqlite>

@DMRobertson
Copy link
Contributor

Thanks. Presumably a portdb bug then. I would guess that we don't make any effort to port over tables that are cited by foreign key constraints before the tables that cite them.

@ToeiRei
Copy link
Author

ToeiRei commented Jul 6, 2022

Feel free to use me as a crash test dummy on debugging. I run the importer again on a more recent snapshot and the event_id is the same.

@DMRobertson DMRobertson self-assigned this Jul 6, 2022
@DMRobertson
Copy link
Contributor

Proposed (short term) fix:

  • have port db maintain a list of foreign key dependencies between tables
  • ensure we port cited tables before citing tables
  • need to ensure this works in incremental mode: can't have a append-only table refer to a non-append only table

@DMRobertson
Copy link
Contributor

Although we seemingly already try to do this?

constraints = await self.get_table_constraints()
tables_ported = set() # type: Set[str]
while tables_to_port_info_map:
# Pulls out all tables that are still to be ported and which
# only depend on tables that are already ported (if any).
tables_to_port = [
table
for table in tables_to_port_info_map
if not constraints.get(table, set()) - tables_ported
]
await make_deferred_yieldable(
defer.gatherResults(
[
run_in_background(
self.handle_table,
table,
*tables_to_port_info_map.pop(table),
)
for table in tables_to_port
],
consumeErrors=True,
)
)
tables_ported.update(tables_to_port)

@DMRobertson DMRobertson removed the X-Needs-Info This issue is blocked awaiting information from the reporter label Jul 6, 2022
@ToeiRei
Copy link
Author

ToeiRei commented Jul 6, 2022

What info can I provide to help you out further?

I run this against a PostgreSQL (cluster) V12

@DMRobertson
Copy link
Contributor

DMRobertson commented Jul 6, 2022

A few questions:

  • Is Synapse running while you run the portdb script?

  • Have you ran the script before? Were there any other errors?

  • Could you also run this query against the postgres DB in its current state and let me know what you get?

                SELECT DISTINCT
                    tc.table_name,
                    ccu.table_name AS foreign_table_name
                FROM
                    information_schema.table_constraints AS tc
                    INNER JOIN information_schema.constraint_column_usage AS ccu
                    USING (table_schema, constraint_name)
                WHERE tc.constraint_type = 'FOREIGN KEY'
                  AND tc.table_name != ccu.table_name;
  • Do you get the same error if you rerun the edit: portdb script?

@ToeiRei
Copy link
Author

ToeiRei commented Jul 6, 2022

  • Synapse was shut down and I worked on a copy of the database
  • Tried the script before onto an empty database -> same error
  • Tried to rerun the script into the database from before -> same error
  • Tried to run it against a newer database -> same error
  • Query:
    image
  • Script: where do I get it from?

@DMRobertson
Copy link
Contributor

Script: where do I get it from?

Sorry, I meant the sqlite-to-postgres script, but as you say here

Tried to rerun the script into the database from before -> same error

you've already been running the script multiple times, which is all I wanted to know.

  • Query

Thanks, this confirms that we should be trying to port the edges table before the event_edges.

2022-07-05 23:32:18,631 - synapse_port_db - 245 - ERROR - Failed to insert: event_edges

Do you have any logs prior to this from running synapse_port_db? Ideally, could you provide the full logs please? If you're anxious that they contain sensitive data, you could send me them directly, e.g. by emailing davidr@element.io or contacting me at @dmrobertson:matrix.org.

@DMRobertson DMRobertson added the X-Needs-Info This issue is blocked awaiting information from the reporter label Jul 7, 2022
@ToeiRei
Copy link
Author

ToeiRei commented Jul 8, 2022

Logrotate already ate the logs from the beginning - so I did the next best thing:
grep -v INFO on the logs to check for warnings

Stuff I found in there wasn't interesting at all - mostly consisting of

Users not being found as in

ynapse.http.matrixfederationclient - 700 - WARNING - GET-21980 - {GET-O-12406} [some.matrix.server] Request failed: GET matrix://notyour.chat/_matrix/federation/v1/query/profile?user_id=someuser:somewheret&field=displayname: HttpResponseException('403: Forbidden')

or Servers down stuff like

synapse.replication.tcp.handler - 736 - WARNING - store_retry_timings-3 - Dropping command as not connected: 'REMOTE_SERVER_UP'

No Tracebacks or any other severe things showed up as non-info chatter.

@DMRobertson
Copy link
Contributor

Sorry, I mean the logs from running synapse_port_db script, not from running Synapse itself. That is: what comes before this line from your report?

2022-07-05 23:32:18,631 - synapse_port_db - 245 - ERROR - Failed to insert: event_edges

@DMRobertson
Copy link
Contributor

DMRobertson commented Jul 8, 2022

Thanks @ToeiRei for providing extra logs and a copy of your database privately. There is something very weird with the way that the portdb script is selecting rows to insert into postgres. It makes two "sweeps" based on sqlite rowids: a forwards and backwards sweep.

With some extra debug we can see this more easily.

│2022-07-08 14:06:26,112 - synapse_port_db - 395 - INFO - Table events: 0/9725 (rows 1-0) already ported                                                                                                                                                                                   │
│2022-07-08 14:06:26,150 - synapse_port_db - 469 - WARNING - 1000 frows (4 to 1028), 1000 brows (-106819 to -100366)                                                                                                                                                                       │
│2022-07-08 14:06:27,011 - synapse_port_db - 469 - WARNING - 1000 frows (1029 to 2028), 0 brows (None to None)                                                                                                                                                                             │
│2022-07-08 14:06:27,508 - synapse_port_db - 469 - WARNING - 1000 frows (2029 to 3028), 0 brows (None to None)                                                                                                                                                                             │
│2022-07-08 14:06:27,906 - synapse_port_db - 469 - WARNING - 454 frows (3029 to 3482), 0 brows (None to None)                                                                                                                                                                              │
│2022-07-08 14:06:28,162 - synapse_port_db - 469 - WARNING - 0 frows (None to None), 0 brows (None to None)

The reason I say this is weird is that the backward sweep seems to miss out a whole bunch of row ids:

sqlite> select max(rowid) from events where rowid < 4;
max(rowid)
-8557

And I note from #13191 (comment) that the missing event has rowid -8558.

DMRobertson pushed a commit that referenced this issue Jul 8, 2022
The `backward_select` query uses the `backward_chunk` variable as an
inclusive upper bound on the rowids it selects. It is initially 0 (see
`setup_table`). It is then set to

```
backward_chunk = min(row[0] for row in brows) - 1
```

where `brows` is the result of running the `backwards_select` query.
For this to make sense, we need to ensure that `backwards_select` picks
rows in descending order. Otherwise we'll jump right to the bottom of
the rowids, pick out the lowest batch only and discard everything we
skipped over. This is a Bad Thing.

I've tested this locally with the reproduction case reported in #13191.
Without the patch, I could reproduce the reported failure; with the
patch, the portdb script completes successfully.
@DMRobertson DMRobertson linked a pull request Jul 8, 2022 that will close this issue
@DMRobertson DMRobertson removed the X-Needs-Info This issue is blocked awaiting information from the reporter label Jul 8, 2022
@DMRobertson
Copy link
Contributor

Thanks again for your help in tracking this down. It looks there is an (old!) bug in the portdb script (from before Synapse 1.0.0). #13226 proposes a fix.

@ToeiRei
Copy link
Author

ToeiRei commented Jul 8, 2022

If you want me to test stuff, let me know

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A-Portdb Script for exporting sqlite data into postgres S-Minor Blocks non-critical functionality, workarounds exist. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants