{"payload":{"feedbackUrl":"https://github.com/orgs/community/discussions/53140","repo":{"id":401416812,"defaultBranch":"master","name":"dbmirror2","ownerLogin":"metabrainz","currentUserCanPush":false,"isFork":false,"isEmpty":false,"createdAt":"2021-08-30T16:44:02.000Z","ownerAvatar":"https://avatars.githubusercontent.com/u/293421?v=4","public":true,"private":false,"isOrgOwned":true},"refInfo":{"name":"","listCacheKey":"v0:1683823660.46409","currentOid":""},"activityList":{"items":[{"before":"9ee398737f5dfa533366096cf5079bf1b90220f9","after":null,"ref":"refs/heads/nix-pending-keys-insertion","pushedAt":"2023-05-11T16:47:40.464Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"mwiencek","name":"Michael Wiencek","path":"/mwiencek","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1056556?s=80&v=4"}},{"before":"570b8c64f820755dfedb66a9cfc9e38c967511c1","after":"eda9923f74e9eee81d9358dc7f0a1843a08c8c3d","ref":"refs/heads/master","pushedAt":"2023-05-11T16:47:26.744Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"mwiencek","name":"Michael Wiencek","path":"/mwiencek","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1056556?s=80&v=4"},"commit":{"message":"Nix pending_keys insertion (#2)\n\nDue to the way ExportAllTables works in musicbrainz-server, there's currently a\r\nrace condition preventing the pending_keys table from being updated in some\r\ncases. The result is that some replication packets are missing table names from\r\nthe pending_keys file which are clearly present in the pending_data file.\r\n\r\nImportant: there's no issue with pending_data or pending_ts, so the integrity\r\nof existing dbmirror2 packets is otherwise fine. Also, fortunately, there's\r\nnothing critical about the pending_keys data; it's not even needed to apply the\r\npackets on the mirror side, since primary keys can be found by querying the\r\nrunning database.\r\n\r\nBut back to the actual issue. When producing a replication packet (which always\r\nhappens within a serializable transaction), there are two steps with respect to\r\nthe pending_keys table:\r\n\r\n * `COPY (SELECT * FROM dbmirror2.pending_keys) TO STDOUT` (roughly)\r\n * `DELETE FROM dbmirror2.pending_keys`\r\n\r\nThere are also two situations where we produce replications packets: with or\r\nwithout a full export. Let's call these full-export and standalone packets\r\nrespectively.\r\n\r\n * full-export packets are only produced twice a week, on Wed and Sat, from\r\n daily.sh. They're dumped from within the same serializable transaction that\r\n the full export is, so that the data in the packet is consistent with the\r\n replication sequence of the dump.\r\n * standalone packets are produced hourly, from hourly.sh. They can't overlap\r\n with a full export; a file lock ensures only one\r\n `MusicBrainz::Script::DatabaseDump` instance is running at a time.\r\n\r\nThere's nothing intrinsically different about either of these types of packets.\r\nThey're just timed such that full exports can be logically consistent with\r\ntheir starting sequence.\r\n\r\nThe issue with pending_keys occurs while a full-export packet is being\r\nproduced, and manifests in the next hourly standalone packet. The sequence of\r\nevents that causes missing data is as follows:\r\n\r\n 1. A full export starts.\r\n 2. The pending_keys table is dumped for the full-export packet.\r\n 3. Concurrent recordchange calls (which apply to the next standalone packet)\r\n insert into the pending_keys table, but insertions for table names that\r\n already exist are skipped due to the \"ON CONFLICT DO NOTHING\" clause.\r\n 4. The pending_keys table is cleared within the full-export transaction.\r\n 5. The full-export transaction commits.\r\n\r\nAny table names inserted between steps 3 and 4 which are skipped due to the \"ON\r\nCONFLICT DO NOTHING\" clause won't appear in the next standalone packet's\r\npending_keys. (Concurrent pending_keys insertions after step 4 are not an\r\nissue, because they are blocking if they conflict.)\r\n\r\nIt's not necessary to log pending_keys data in recordchange. We can simply\r\ncalculate it from pending_data when producing the packet in ExportAllTables\r\n(from within its serializable transaction).","shortMessageHtmlLink":"Nix pending_keys insertion (#2)"}},{"before":null,"after":"9ee398737f5dfa533366096cf5079bf1b90220f9","ref":"refs/heads/nix-pending-keys-insertion","pushedAt":"2023-04-21T19:35:36.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"mwiencek","name":"Michael Wiencek","path":"/mwiencek","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1056556?s=80&v=4"},"commit":{"message":"Nix pending_keys insertion\n\nDue to the way ExportAllTables works in musicbrainz-server, there's currently a\nrace condition preventing the pending_keys table from being updated in some\ncases. The result is that some replication packets are missing table names from\nthe pending_keys file which are clearly present in the pending_data file.\n\nImportant: there's no issue with pending_data or pending_ts, so the integrity\nof existing dbmirror2 packets is otherwise fine. Also, fortunately, there's\nnothing critical about the pending_keys data; it's not even needed to apply the\npackets on the mirror side, since primary keys can be found by querying the\nrunning database.\n\nBut back to the actual issue. When producing a replication packet (which always\nhappens within a serializable transaction), there are two steps with respect to\nthe pending_keys table:\n\n * `COPY (SELECT * FROM dbmirror2.pending_keys) TO STDOUT` (roughly)\n * `DELETE FROM dbmirror2.pending_keys`\n\nThere are also two situations where we produce replications packets: with or\nwithout a full export. Let's call these full-export and standalone packets\nrespectively.\n\n * full-export packets are only produced twice a week, on Wed and Sat, from\n daily.sh. They're dumped from within the same serializable transaction that\n the full export is, so that the data in the packet is consistent with the\n replication sequence of the dump.\n * standalone packets are produced hourly, from hourly.sh. They can't overlap\n with a full export; a file lock ensures only one\n `MusicBrainz::Script::DatabaseDump` instance is running at a time.\n\nThere's nothing intrinsically different about either of these types of packets.\nThey're just timed such that full exports can be logically consistent with\ntheir starting sequence.\n\nThe issue with pending_keys occurs while a full-export packet is being\nproduced, and manifests in the next hourly standalone packet. The sequence of\nevents that causes missing data is as follows:\n\n 1. A full export starts.\n 2. The pending_keys table is dumped for the full-export packet.\n 3. Concurrent recordchange calls (which apply to the next standalone packet)\n insert into the pending_keys table, but insertions for table names that\n already exist are skipped due to the \"ON CONFLICT DO NOTHING\" clause.\n 4. The pending_keys table is cleared within the full-export transaction.\n 5. The full-export transaction commits.\n\nAny table names inserted between steps 3 and 4 which are skipped due to the \"ON\nCONFLICT DO NOTHING\" clause won't appear in the next standalone packet's\npending_keys. (Concurrent pending_keys insertions after step 4 are not an\nissue, because they are blocking if they conflict.)\n\nIt's not necessary to log pending_keys data in recordchange. We can simply\ncalculate it from pending_data when producing the packet in ExportAllTables\n(from within its serializable transaction).","shortMessageHtmlLink":"Nix pending_keys insertion"}},{"before":"7ba642243b34ac22e02fd2a0a15eaf1543386a35","after":"570b8c64f820755dfedb66a9cfc9e38c967511c1","ref":"refs/heads/master","pushedAt":"2023-04-21T19:24:10.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"mwiencek","name":"Michael Wiencek","path":"/mwiencek","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1056556?s=80&v=4"},"commit":{"message":"benchmark/OldReplicationSetup: PostgreSQL 14 compatibility\n\nSynced from\nhttps://github.com/metabrainz/musicbrainz-server/commit/38861c768d427970e9088cefc76604b05ee11bab","shortMessageHtmlLink":"benchmark/OldReplicationSetup: PostgreSQL 14 compatibility"}}],"hasNextPage":false,"hasPreviousPage":false,"activityType":"all","actor":null,"timePeriod":"all","sort":"DESC","perPage":30,"cursor":"Y3Vyc29yOnYyOpK7MjAyMy0wNS0xMVQxNjo0Nzo0MC40NjQwOTBazwAAAAMq45vu","startCursor":"Y3Vyc29yOnYyOpK7MjAyMy0wNS0xMVQxNjo0Nzo0MC40NjQwOTBazwAAAAMq45vu","endCursor":"Y3Vyc29yOnYyOpK7MjAyMy0wNC0yMVQxOToyNDoxMC4wMDAwMDBazwAAAAMd2Emh"}},"title":"Activity ยท metabrainz/dbmirror2"}