
Distributed deadlock after issuing a simple UPDATE statement #7477

Closed
lkral-navmatix opened this issue Feb 1, 2024 · 13 comments · Fixed by #7542
@lkral-navmatix

Hello, we have been running into a puzzling situation with distributed deadlocks, and after confirmation from multiple sources that this shouldn't be occurring in our case, I have decided to file this bug report.
The situation is that a fairly simple query can sporadically trigger deadlocks on one of our tables, resulting in two possible error messages: "canceling the transaction since it was involved in a distributed deadlock" and "deadlock detected".
The problem first surfaced when issuing an UPDATE (see below for details) in an explicit transaction, followed by a SELECT that reads from a different table.
While trying to work around the issue, we found that it can also be triggered by a single UPDATE statement in an implicit transaction, and even inside a transaction after SET LOCAL citus.multi_shard_modify_mode TO 'sequential';.
We managed to trigger this with different clients as well (our application, which uses libpq under the hood, and DBeaver).
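For reference, the sequential-mode attempt looked roughly like this (a sketch; the actual UPDATE statement is shown further below):

  BEGIN;
  SET LOCAL citus.multi_shard_modify_mode TO 'sequential';
  UPDATE table1 SET offending_column = '2024-01-31T00:00:00Z' WHERE id = ANY-VALID-ID;
  COMMIT;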

Citus: 12.1.1 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit
PostgreSQL: 16.1 (installed as 16.1, not upgraded from 16.0) + PostGIS 3.4.1
OS: Ubuntu 22.04.3 LTS
Installed packages:

  postgis-doc/jammy-pgdg,now 3.4.1+dfsg-1.pgdg22.04+1 all [installed,automatic]
  postgis/jammy-pgdg,now 3.4.1+dfsg-1.pgdg22.04+1 amd64 [installed]
  postgresql-16-citus-12.1/jammy,now 12.1.1.citus-1 amd64 [installed]
  postgresql-16-pg-checksums/jammy-pgdg,now 1.1-6.pgdg22.04+1 amd64 [installed]
  postgresql-16-postgis-3-scripts/jammy-pgdg,now 3.4.1+dfsg-1.pgdg22.04+1 all [installed,automatic]
  postgresql-16-postgis-3/jammy-pgdg,now 3.4.1+dfsg-1.pgdg22.04+1 amd64 [installed,automatic]
  postgresql-16/jammy-pgdg,now 16.1-1.pgdg22.04+1 amd64 [installed,automatic]
  postgresql-client-16/jammy-pgdg,now 16.1-1.pgdg22.04+1 amd64 [installed,automatic]
  postgresql-client-common/jammy-pgdg,now 256.pgdg22.04+1 all [installed,automatic]
  postgresql-common/jammy-pgdg,now 256.pgdg22.04+1 all [installed,automatic]
  postgresql-postgis-scripts/jammy-pgdg,now 3.4.1+dfsg-1.pgdg22.04+1 all [installed,automatic]
  postgresql-postgis/jammy-pgdg,now 3.4.1+dfsg-1.pgdg22.04+1 amd64 [installed,automatic]

Setup:
4 servers:
- 1 coordinator
- 2 workers (each in separate worker group)
- 1 HAProxy (used to load-balance over the two workers; we use the query-from-any-node feature for our workload)
This is more or less an initial testing cluster, so standby replicas are currently not involved

Note that the names of tables/columns/database/users have been adjusted, but the original names do NOT contain anything that should cause issues (like very long strings).

In our normal workload the query runs periodically in parallel for each row (so each row is updated roughly every 5 seconds), but every invocation is scoped to a single primary key, and the updates for a given PK are never run concurrently. The table is currently very small (20 rows), and we managed to reproduce the deadlock even by running the query manually, with no other workload touching that table.

Query triggering the deadlock:
UPDATE table1 SET offending_column = '2024-01-31T00:00:00Z' where id = ANY-VALID-ID;

Schema of the offending table:

  CREATE TABLE public.table1 (
      id int8 NOT NULL GENERATED ALWAYS AS IDENTITY,
      column1 int2 NOT NULL,
      column2 int2 NOT NULL,
      offending_column timestamptz NULL,
      column3 uuid NULL,
      column4 uuid NULL,
      column5 int2 NOT NULL DEFAULT 0,
      column6 int8 NOT NULL,
      column7 int8 NULL,
      column8 int2 NOT NULL DEFAULT 0,
      column9 text NOT NULL DEFAULT '',
      column10 int8 NOT NULL,
      CONSTRAINT table1_pk PRIMARY KEY (id),
      CONSTRAINT table1_table2_fk FOREIGN KEY (column6) REFERENCES public.table2(id),
      CONSTRAINT table1_table3_fk FOREIGN KEY (column10) REFERENCES public.table3(id)
  );
  SELECT create_reference_table('public.table1');

PostgreSQL log section with deadlocks occurring:

2024-02-01 08:34:09.679 UTC [1912956] LOG:  [2024-02-01 08:34:09.679735+00] Distributed deadlock found among the following distributed transactions:
2024-02-01 08:34:09.679 UTC [1912956] CONTEXT:  Citus maintenance daemon for database 16388 user 10
2024-02-01 08:34:09.680 UTC [1912956] LOG:  [2024-02-01 08:34:09.680026+00] [DistributedTransactionId: (1, 339134, 2024-02-01 08:34:09.056811+00)] = [WaitsFor transaction numbers: 448301][Backend Query: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)]
2024-02-01 08:34:09.680 UTC [1912956] CONTEXT:  Citus maintenance daemon for database 16388 user 10
2024-02-01 08:34:09.680 UTC [1912956] LOG:  [2024-02-01 08:34:09.680304+00] [DistributedTransactionId: (2, 448301, 2024-02-01 08:34:09.063957+00)] = [WaitsFor transaction numbers: 339134]
2024-02-01 08:34:09.680 UTC [1912956] CONTEXT:  Citus maintenance daemon for database 16388 user 10
2024-02-01 08:34:09.680 UTC [1912956] LOG:  [2024-02-01 08:34:09.680336+00] Cancelling the following backend to resolve distributed deadlock (transaction number = 339134, pid = 2013523)
2024-02-01 08:34:09.680 UTC [1912956] CONTEXT:  Citus maintenance daemon for database 16388 user 10
2024-02-01 08:34:09.680 UTC [2013523] dbowner@db ERROR:  canceling the transaction since it was involved in a distributed deadlock
2024-02-01 08:34:09.680 UTC [2013523] dbowner@db STATEMENT:  UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)
2024-02-01 08:34:09.782 UTC [1912956] LOG:  [2024-02-01 08:34:09.782398+00] Distributed deadlock found among the following distributed transactions:
2024-02-01 08:34:09.782 UTC [1912956] CONTEXT:  Citus maintenance daemon for database 16388 user 10
2024-02-01 08:34:09.782 UTC [1912956] LOG:  [2024-02-01 08:34:09.782631+00] [DistributedTransactionId: (1, 339135, 2024-02-01 08:34:09.200594+00)] = [WaitsFor transaction numbers: 448302][Backend Query: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)]
2024-02-01 08:34:09.782 UTC [1912956] CONTEXT:  Citus maintenance daemon for database 16388 user 10
2024-02-01 08:34:09.782 UTC [1912956] LOG:  [2024-02-01 08:34:09.782896+00] [DistributedTransactionId: (2, 448302, 2024-02-01 08:34:09.239911+00)] = [WaitsFor transaction numbers: 339135,339137]
2024-02-01 08:34:09.782 UTC [1912956] CONTEXT:  Citus maintenance daemon for database 16388 user 10
2024-02-01 08:34:09.783 UTC [1912956] LOG:  [2024-02-01 08:34:09.783086+00] [DistributedTransactionId: (1, 339137, 2024-02-01 08:34:09.556843+00)] = [WaitsFor transaction numbers: 339135][Backend Query: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)]
2024-02-01 08:34:09.783 UTC [1912956] CONTEXT:  Citus maintenance daemon for database 16388 user 10
2024-02-01 08:34:09.783 UTC [1912956] LOG:  [2024-02-01 08:34:09.783116+00] Cancelling the following backend to resolve distributed deadlock (transaction number = 339137, pid = 2013871)
2024-02-01 08:34:09.783 UTC [1912956] CONTEXT:  Citus maintenance daemon for database 16388 user 10
2024-02-01 08:34:09.783 UTC [2013871] dbowner@db ERROR:  canceling the transaction since it was involved in a distributed deadlock
2024-02-01 08:34:09.783 UTC [2013871] dbowner@db STATEMENT:  UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)
2024-02-01 08:34:09.796 UTC [2013865] dbowner@db LOG:  duration: 595.865 ms  execute s613888: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)
2024-02-01 08:34:09.796 UTC [2013865] dbowner@db DETAIL:  parameters: $1 = '2024-02-01 08:34:09.099239+00', $2 = '562949953421325'

We are not experiencing any issues with other tables (60+ tables, both reference and distributed) and everything else in the cluster seems to work correctly.
I understand that the info provided is very likely not enough to find the issue and would appreciate guidance on collecting more relevant data. It has been recommended to use pg-lock-tracer, so we will attempt to use that to hopefully gain more insight.

@JelteF
Contributor

JelteF commented Feb 1, 2024

Do you have a log containing the "deadlock detected" error too? Afaik that's an error coming from Postgres itself, which would indicate that it's actually not just a distributed deadlock, but a deadlock on the node itself (sometimes we detect such a deadlock as a distributed one if our distributed deadlock detection triggers faster than Postgres's local deadlock detection).

It's definitely possible for a single UPDATE statement to cause deadlocks on one node, due to locking rows in different orders: https://stackoverflow.com/a/10246052/2570866

But it indeed sounds like that wouldn't apply to the situation you're describing. If I understand correctly, you are not updating multiple rows with one statement?
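For illustration, the single-node row-ordering case from that link boils down to something like this (a sketch with explicit per-row updates; a single multi-row UPDATE can hit the same pattern when two concurrent plans visit the rows in opposite orders):

  -- session A                          -- session B
  BEGIN;                                BEGIN;
  UPDATE t SET x = 0 WHERE id = 1;      UPDATE t SET x = 0 WHERE id = 2;
  -- each session now holds one row lock
  UPDATE t SET x = 0 WHERE id = 2;      UPDATE t SET x = 0 WHERE id = 1;
  -- each statement waits for the other session's row; Postgres aborts one with "deadlock detected"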

@JelteF
Contributor

JelteF commented Feb 1, 2024

In general, to get to the bottom of these issues it's important to know which exact queries deadlock on each other and why. One way to do that is to disable deadlock detection and look at the queries that block each other, but that is likely a bad idea on a production system. If you can somehow reproduce the issue in a testing environment, that would definitely be useful.
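If you do try that, a minimal sketch of disabling the detection (assuming the usual GUC names; double-check them for your versions before touching a real cluster) would be:

  -- Citus distributed deadlock detection: -1 disables it
  ALTER SYSTEM SET citus.distributed_deadlock_detection_factor = -1;
  -- Postgres local deadlock detection can't be disabled, but a very large
  -- deadlock_timeout postpones it long enough to inspect the blocked backends
  ALTER SYSTEM SET deadlock_timeout = '1h';
  SELECT pg_reload_conf();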

@lkral-navmatix
Author

I can confirm that we are indeed updating a single row with a single statement.

This is the log from worker 1 containing "deadlock detected"; it refers to the deadlock occurring on worker 2 (its log is further below):

2024-02-01 11:06:00.057 UTC [2013870] dbowner@db LOG:  duration: 1002.934 ms  execute s708573: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)
2024-02-01 11:06:00.057 UTC [2013870] dbowner@db DETAIL:  parameters: $1 = '2024-02-01 11:05:59.051657+00', $2 = '562949953421320'
2024-02-01 11:06:00.233 UTC [2013864] dbowner@db ERROR:  deadlock detected
2024-02-01 11:06:00.233 UTC [2013864] dbowner@db DETAIL:  Process 2108878 waits for ExclusiveLock on advisory lock [16388,0,102074,5]; blocked by process 2108872.
        Process 2108872 waits for ExclusiveLock on advisory lock [16388,0,102008,5]; blocked by process 2099179.
        Process 2099179 waits for ExclusiveLock on advisory lock [16388,0,102074,5]; blocked by process 2108878.
2024-02-01 11:06:00.233 UTC [2013864] dbowner@db HINT:  See server log for query details.
2024-02-01 11:06:00.233 UTC [2013864] dbowner@db CONTEXT:  while executing command on (WORKER2-IP):5432
2024-02-01 11:06:00.233 UTC [2013864] dbowner@db STATEMENT:  UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)
2024-02-01 11:06:00.366 UTC [2013865] dbowner@db ERROR:  deadlock detected
2024-02-01 11:06:00.366 UTC [2013865] dbowner@db DETAIL:  Process 2108421 waits for ExclusiveLock on advisory lock [16388,0,102074,5]; blocked by process 2108872.
        Process 2108872 waits for ExclusiveLock on advisory lock [16388,0,102008,5]; blocked by process 2099179.
        Process 2099179 waits for ExclusiveLock on advisory lock [16388,0,102074,5]; blocked by process 2108421.
2024-02-01 11:06:00.366 UTC [2013865] dbowner@db HINT:  See server log for query details.
2024-02-01 11:06:00.366 UTC [2013865] dbowner@db CONTEXT:  while executing command on (WORKER2-IP):5432
2024-02-01 11:06:00.366 UTC [2013865] dbowner@db STATEMENT:  UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)
2024-02-01 11:06:00.558 UTC [2013536] dbowner@db ERROR:  deadlock detected
2024-02-01 11:06:00.558 UTC [2013536] dbowner@db DETAIL:  Process 2108875 waits for ExclusiveLock on advisory lock [16388,0,102074,5]; blocked by process 2108872.
        Process 2108872 waits for ExclusiveLock on advisory lock [16388,0,102008,5]; blocked by process 2099179.
        Process 2099179 waits for ExclusiveLock on advisory lock [16388,0,102074,5]; blocked by process 2108875.
2024-02-01 11:06:00.558 UTC [2013536] dbowner@db HINT:  See server log for query details.
2024-02-01 11:06:00.558 UTC [2013536] dbowner@db CONTEXT:  while executing command on (WORKER2-IP):5432
2024-02-01 11:06:00.558 UTC [2013536] dbowner@db STATEMENT:  UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)
2024-02-01 11:06:00.707 UTC [1912956] LOG:  [2024-02-01 11:06:00.707694+00] Distributed deadlock found among the following distributed transactions:
2024-02-01 11:06:00.707 UTC [1912956] CONTEXT:  Citus maintenance daemon for database 16388 user 10
2024-02-01 11:06:00.707 UTC [1912956] LOG:  [2024-02-01 11:06:00.707935+00] [DistributedTransactionId: (1, 362459, 2024-02-01 11:05:59.200145+00)] = [WaitsFor transaction numbers: 471676,471677,471678,471679][Backend Query: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)]
2024-02-01 11:06:00.707 UTC [1912956] CONTEXT:  Citus maintenance daemon for database 16388 user 10
2024-02-01 11:06:00.708 UTC [1912956] LOG:  [2024-02-01 11:06:00.708194+00] [DistributedTransactionId: (2, 471679, 2024-02-01 11:05:59.980572+00)] = [WaitsFor transaction numbers: 471676,471677,471678]
2024-02-01 11:06:00.708 UTC [1912956] CONTEXT:  Citus maintenance daemon for database 16388 user 10
2024-02-01 11:06:00.708 UTC [1912956] LOG:  [2024-02-01 11:06:00.708447+00] [DistributedTransactionId: (2, 471678, 2024-02-01 11:05:59.930214+00)] = [WaitsFor transaction numbers: 471676,471677]
2024-02-01 11:06:00.708 UTC [1912956] CONTEXT:  Citus maintenance daemon for database 16388 user 10
2024-02-01 11:06:00.708 UTC [1912956] LOG:  [2024-02-01 11:06:00.708698+00] [DistributedTransactionId: (2, 471677, 2024-02-01 11:05:59.891461+00)] = [WaitsFor transaction numbers: 471676]
2024-02-01 11:06:00.708 UTC [1912956] CONTEXT:  Citus maintenance daemon for database 16388 user 10
2024-02-01 11:06:00.708 UTC [1912956] LOG:  [2024-02-01 11:06:00.708948+00] [DistributedTransactionId: (2, 471676, 2024-02-01 11:05:59.815823+00)] = [WaitsFor transaction numbers: 362459,362463]
2024-02-01 11:06:00.708 UTC [1912956] CONTEXT:  Citus maintenance daemon for database 16388 user 10
2024-02-01 11:06:00.709 UTC [1912956] LOG:  [2024-02-01 11:06:00.709128+00] [DistributedTransactionId: (1, 362463, 2024-02-01 11:05:59.71205+00)] = [WaitsFor transaction numbers: 362459][Backend Query: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)]
2024-02-01 11:06:00.709 UTC [1912956] CONTEXT:  Citus maintenance daemon for database 16388 user 10
2024-02-01 11:06:00.709 UTC [1912956] LOG:  [2024-02-01 11:06:00.709161+00] Cancelling the following backend to resolve distributed deadlock (transaction number = 362463, pid = 2014278)
2024-02-01 11:06:00.709 UTC [1912956] CONTEXT:  Citus maintenance daemon for database 16388 user 10
2024-02-01 11:06:00.709 UTC [2014278] dbowner@db ERROR:  canceling the transaction since it was involved in a distributed deadlock
2024-02-01 11:06:00.709 UTC [2014278] dbowner@db STATEMENT:  UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)

Worker 2 log:


2024-02-01 11:06:00.052 UTC [2036734] dbowner@db ERROR:  deadlock detected
2024-02-01 11:06:00.052 UTC [2036734] dbowner@db DETAIL:  Process 2036734 waits for ExclusiveLock on advisory lock [16388,0,102074,5]; blocked by process 2108890.
        Process 2108890 waits for ExclusiveLock on advisory lock [16388,0,102008,5]; blocked by process 2036734.
        Process 2036734: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)
        Process 2108890: SELECT lock_shard_resources(7, ARRAY[102008, 102308, 102745, 102747])
2024-02-01 11:06:00.052 UTC [2036734] dbowner@db HINT:  See server log for query details.
2024-02-01 11:06:00.052 UTC [2036734] dbowner@db STATEMENT:  UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)
2024-02-01 11:06:00.052 UTC [2108890] dbowner@db LOG:  duration: 1000.283 ms  statement: SELECT lock_shard_resources(7, ARRAY[102008, 102308, 102745, 102747])
2024-02-01 11:06:00.058 UTC [2108872] dbowner@db LOG:  duration: 860.117 ms  statement: SELECT lock_shard_resources(7, ARRAY[102074])
2024-02-01 11:06:00.230 UTC [2108878] dbowner@db ERROR:  deadlock detected
2024-02-01 11:06:00.230 UTC [2108878] dbowner@db DETAIL:  Process 2108878 waits for ExclusiveLock on advisory lock [16388,0,102074,5]; blocked by process 2108872.
        Process 2108872 waits for ExclusiveLock on advisory lock [16388,0,102008,5]; blocked by process 2099179.
        Process 2099179 waits for ExclusiveLock on advisory lock [16388,0,102074,5]; blocked by process 2108878.
        Process 2108878: SELECT lock_shard_resources(7, ARRAY[102074])
        Process 2108872: SELECT lock_shard_resources(7, ARRAY[102008, 102308, 102745, 102747])
        Process 2099179: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)
2024-02-01 11:06:00.230 UTC [2108878] dbowner@db HINT:  See server log for query details.
2024-02-01 11:06:00.230 UTC [2108878] dbowner@db STATEMENT:  SELECT lock_shard_resources(7, ARRAY[102074])
2024-02-01 11:06:00.363 UTC [2108421] dbowner@db ERROR:  deadlock detected
2024-02-01 11:06:00.363 UTC [2108421] dbowner@db DETAIL:  Process 2108421 waits for ExclusiveLock on advisory lock [16388,0,102074,5]; blocked by process 2108872.
        Process 2108872 waits for ExclusiveLock on advisory lock [16388,0,102008,5]; blocked by process 2099179.
        Process 2099179 waits for ExclusiveLock on advisory lock [16388,0,102074,5]; blocked by process 2108421.
        Process 2108421: SELECT lock_shard_resources(7, ARRAY[102074])
        Process 2108872: SELECT lock_shard_resources(7, ARRAY[102008, 102308, 102745, 102747])
        Process 2099179: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)
2024-02-01 11:06:00.363 UTC [2108421] dbowner@db HINT:  See server log for query details.
2024-02-01 11:06:00.363 UTC [2108421] dbowner@db STATEMENT:  SELECT lock_shard_resources(7, ARRAY[102074])
2024-02-01 11:06:00.555 UTC [2108875] dbowner@db ERROR:  deadlock detected
2024-02-01 11:06:00.555 UTC [2108875] dbowner@db DETAIL:  Process 2108875 waits for ExclusiveLock on advisory lock [16388,0,102074,5]; blocked by process 2108872.
        Process 2108872 waits for ExclusiveLock on advisory lock [16388,0,102008,5]; blocked by process 2099179.
        Process 2099179 waits for ExclusiveLock on advisory lock [16388,0,102074,5]; blocked by process 2108875.
        Process 2108875: SELECT lock_shard_resources(7, ARRAY[102074])
        Process 2108872: SELECT lock_shard_resources(7, ARRAY[102008, 102308, 102745, 102747])
        Process 2099179: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)
2024-02-01 11:06:00.555 UTC [2108875] dbowner@db HINT:  See server log for query details.
2024-02-01 11:06:00.555 UTC [2108875] dbowner@db STATEMENT:  SELECT lock_shard_resources(7, ARRAY[102074])
2024-02-01 11:06:00.708 UTC [1931701] LOG:  [2024-02-01 11:06:00.708931+00] Distributed deadlock found among the following distributed transactions:
2024-02-01 11:06:00.708 UTC [1931701] CONTEXT:  Citus maintenance daemon for database 16388 user 10
2024-02-01 11:06:00.709 UTC [1931701] LOG:  [2024-02-01 11:06:00.709062+00] [DistributedTransactionId: (2, 471678, 2024-02-01 11:05:59.930214+00)] = [WaitsFor transaction numbers: 471676,471677][Backend Query: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)]
2024-02-01 11:06:00.709 UTC [1931701] CONTEXT:  Citus maintenance daemon for database 16388 user 10
2024-02-01 11:06:00.709 UTC [1931701] LOG:  [2024-02-01 11:06:00.709148+00] [DistributedTransactionId: (2, 471677, 2024-02-01 11:05:59.891461+00)] = [WaitsFor transaction numbers: 471676][Backend Query: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)]
2024-02-01 11:06:00.709 UTC [1931701] CONTEXT:  Citus maintenance daemon for database 16388 user 10
2024-02-01 11:06:00.709 UTC [1931701] LOG:  [2024-02-01 11:06:00.709388+00] [DistributedTransactionId: (2, 471676, 2024-02-01 11:05:59.815823+00)] = [WaitsFor transaction numbers: 362459,362463][Backend Query: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)]
2024-02-01 11:06:00.709 UTC [1931701] CONTEXT:  Citus maintenance daemon for database 16388 user 10
2024-02-01 11:06:00.709 UTC [2108874] dbowner@db ERROR:  canceling statement due to user request
2024-02-01 11:06:00.709 UTC [2108874] dbowner@db STATEMENT:  SELECT lock_shard_resources(7, ARRAY[102074])
2024-02-01 11:06:00.709 UTC [1931701] LOG:  [2024-02-01 11:06:00.70964+00] [DistributedTransactionId: (1, 362463, 2024-02-01 11:05:59.71205+00)] = [WaitsFor transaction numbers: 362459]
2024-02-01 11:06:00.709 UTC [1931701] CONTEXT:  Citus maintenance daemon for database 16388 user 10
2024-02-01 11:06:00.709 UTC [1931701] LOG:  [2024-02-01 11:06:00.709894+00] [DistributedTransactionId: (1, 362459, 2024-02-01 11:05:59.200145+00)] = [WaitsFor transaction numbers: 471676,471677,471678,471679]
2024-02-01 11:06:00.709 UTC [1931701] CONTEXT:  Citus maintenance daemon for database 16388 user 10
2024-02-01 11:06:00.710 UTC [1931701] LOG:  [2024-02-01 11:06:00.709997+00] [DistributedTransactionId: (2, 471679, 2024-02-01 11:05:59.980572+00)] = [WaitsFor transaction numbers: 471676,471677,471678][Backend Query: UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)]
2024-02-01 11:06:00.710 UTC [1931701] CONTEXT:  Citus maintenance daemon for database 16388 user 10
2024-02-01 11:06:00.710 UTC [1931701] LOG:  [2024-02-01 11:06:00.710028+00] Cancelling the following backend to resolve distributed deadlock (transaction number = 471679, pid = 2036746)
2024-02-01 11:06:00.710 UTC [1931701] CONTEXT:  Citus maintenance daemon for database 16388 user 10
2024-02-01 11:06:00.710 UTC [2036746] dbowner@db ERROR:  canceling the transaction since it was involved in a distributed deadlock
2024-02-01 11:06:00.710 UTC [2036746] dbowner@db STATEMENT:  UPDATE "table1" SET "offending_column" = $1 WHERE ("table1"."id" = $2)

As for disabling deadlock detection, it should be possible on the current cluster, but we will have to schedule a time window for it, which is likely to take a few days.

@JelteF
Contributor

JelteF commented Feb 1, 2024

Hmm, okay the exclusive lock on the advisory lock suggests that this is related to the way we serialize updates to reference tables. @agedemenli can you take a closer look?

@agedemenli
Contributor

create table table2 (id int primary key);
select create_reference_table('table2');
alter table table2 add constraint self_fkey foreign key (id) references table2(id);
insert into table2 values (1);

create table table3 (id int primary key);
SELECT create_reference_table('table3');
insert into table3 values (1);

CREATE TABLE public.table1 (
      id int8 NOT NULL GENERATED ALWAYS AS IDENTITY,
      column1 int2 NOT NULL,
      column2 int2 NOT NULL,
      offending_column timestamptz NULL,
      column3 uuid NULL,
      column4 uuid NULL,
      column5 int2 NOT NULL DEFAULT 0,
      column6 int8 NOT NULL,
      column7 int8 NULL,
      column8 int2 NOT NULL DEFAULT 0,
      column9 text NOT NULL DEFAULT '',
      column10 int8 NOT NULL,
      CONSTRAINT table1_pk PRIMARY KEY (id),
      CONSTRAINT table1_table2_fk FOREIGN KEY (column6) REFERENCES public.table2(id),
      CONSTRAINT table1_table3_fk FOREIGN KEY (column10) REFERENCES public.table3(id)
  );
  SELECT create_reference_table('public.table1');

insert into table1 values (default,1,1,null,'1740a945-d9a1-4bf9-ba48-50620b5109ed','1740a945-d9a1-4bf9-ba48-50620b5109ed',1,1,1,1,'str',1);

I'll run this insert statement frequently for a while to see if I can reproduce it

@lkral-navmatix
Author

lkral-navmatix commented Feb 5, 2024

After disabling deadlock detection, we can observe that there are indeed several transactions that are completely stuck. All of them are running the exact same UPDATE (with different parameter values).

Below is the output of pg_locks from both workers, filtered to include only locks whose PID matches one of the stuck queries.
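The filter was roughly equivalent to something like the following (a sketch; the exact predicate depends on how the stuck backends were identified in pg_stat_activity):

  SELECT l.*
  FROM pg_locks l
  WHERE l.pid IN (
      SELECT a.pid
      FROM pg_stat_activity a
      WHERE a.query LIKE 'UPDATE "table1"%'               -- the stuck UPDATE backends
         OR a.query LIKE 'SELECT lock_shard_resources%'   -- and the Citus lock helper sessions
  );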

locktype  |database|relation|page|tuple|virtualxid|transactionid|classid|objid |objsubid|virtualtransaction|pid    |mode            |granted|fastpath|waitstart                    |
----------+--------+--------+----+-----+----------+-------------+-------+------+--------+------------------+-------+----------------+-------+--------+-----------------------------+
relation  |   16388|   16518|    |     |          |             |       |      |        |48/211263         |2466353|AccessShareLock |true   |true    |                             |
relation  |   16388|   18918|    |     |          |             |       |      |        |48/211263         |2466353|AccessShareLock |true   |true    |                             |
relation  |   16388|   18918|    |     |          |             |       |      |        |48/211263         |2466353|RowExclusiveLock|true   |true    |                             |
virtualxid|        |        |    |     |48/211263 |             |       |      |        |48/211263         |2466353|ExclusiveLock   |true   |true    |                             |
relation  |   16388|   16518|    |     |          |             |       |      |        |62/75245          |2466365|AccessShareLock |true   |true    |                             |
relation  |   16388|   18918|    |     |          |             |       |      |        |62/75245          |2466365|AccessShareLock |true   |true    |                             |
relation  |   16388|   18918|    |     |          |             |       |      |        |62/75245          |2466365|RowExclusiveLock|true   |true    |                             |
virtualxid|        |        |    |     |62/75245  |             |       |      |        |62/75245          |2466365|ExclusiveLock   |true   |true    |                             |
relation  |   16388|   16518|    |     |          |             |       |      |        |50/283761         |2466356|AccessShareLock |true   |true    |                             |
relation  |   16388|   18918|    |     |          |             |       |      |        |50/283761         |2466356|AccessShareLock |true   |true    |                             |
relation  |   16388|   18918|    |     |          |             |       |      |        |50/283761         |2466356|RowExclusiveLock|true   |true    |                             |
virtualxid|        |        |    |     |50/283761 |             |       |      |        |50/283761         |2466356|ExclusiveLock   |true   |true    |                             |
relation  |   16388|   16518|    |     |          |             |       |      |        |39/144041         |2466348|AccessShareLock |true   |true    |                             |
relation  |   16388|   18918|    |     |          |             |       |      |        |39/144041         |2466348|AccessShareLock |true   |true    |                             |
relation  |   16388|   18918|    |     |          |             |       |      |        |39/144041         |2466348|RowExclusiveLock|true   |true    |                             |
virtualxid|        |        |    |     |39/144041 |             |       |      |        |39/144041         |2466348|ExclusiveLock   |true   |true    |                             |
relation  |   16388|   16518|    |     |          |             |       |      |        |68/68793          |2466369|AccessShareLock |true   |true    |                             |
relation  |   16388|   18918|    |     |          |             |       |      |        |68/68793          |2466369|AccessShareLock |true   |true    |                             |
relation  |   16388|   18918|    |     |          |             |       |      |        |68/68793          |2466369|RowExclusiveLock|true   |true    |                             |
virtualxid|        |        |    |     |68/68793  |             |       |      |        |68/68793          |2466369|ExclusiveLock   |true   |true    |                             |
relation  |   16388|   16518|    |     |          |             |       |      |        |58/91978          |2466358|AccessShareLock |true   |true    |                             |
relation  |   16388|   18918|    |     |          |             |       |      |        |58/91978          |2466358|AccessShareLock |true   |true    |                             |
relation  |   16388|   18918|    |     |          |             |       |      |        |58/91978          |2466358|RowExclusiveLock|true   |true    |                             |
virtualxid|        |        |    |     |58/91978  |             |       |      |        |58/91978          |2466358|ExclusiveLock   |true   |true    |                             |
relation  |   16388|   16518|    |     |          |             |       |      |        |47/431707         |2466352|AccessShareLock |true   |true    |                             |
relation  |   16388|   18918|    |     |          |             |       |      |        |47/431707         |2466352|AccessShareLock |true   |true    |                             |
relation  |   16388|   18918|    |     |          |             |       |      |        |47/431707         |2466352|RowExclusiveLock|true   |true    |                             |
virtualxid|        |        |    |     |47/431707 |             |       |      |        |47/431707         |2466352|ExclusiveLock   |true   |true    |                             |
relation  |   16388|   16518|    |     |          |             |       |      |        |77/5719           |2466373|AccessShareLock |true   |true    |                             |
relation  |   16388|   18918|    |     |          |             |       |      |        |77/5719           |2466373|AccessShareLock |true   |true    |                             |
relation  |   16388|   18918|    |     |          |             |       |      |        |77/5719           |2466373|RowExclusiveLock|true   |true    |                             |
virtualxid|        |        |    |     |77/5719   |             |       |      |        |77/5719           |2466373|ExclusiveLock   |true   |true    |                             |
relation  |   16388|   16518|    |     |          |             |       |      |        |40/362508         |2466349|AccessShareLock |true   |true    |                             |
relation  |   16388|   18918|    |     |          |             |       |      |        |40/362508         |2466349|AccessShareLock |true   |true    |                             |
relation  |   16388|   18918|    |     |          |             |       |      |        |40/362508         |2466349|RowExclusiveLock|true   |true    |                             |
virtualxid|        |        |    |     |40/362508 |             |       |      |        |40/362508         |2466349|ExclusiveLock   |true   |true    |                             |
relation  |   16388|   16518|    |     |          |             |       |      |        |63/336885         |2466366|AccessShareLock |true   |true    |                             |
relation  |   16388|   18918|    |     |          |             |       |      |        |63/336885         |2466366|AccessShareLock |true   |true    |                             |
relation  |   16388|   18918|    |     |          |             |       |      |        |63/336885         |2466366|RowExclusiveLock|true   |true    |                             |
virtualxid|        |        |    |     |63/336885 |             |       |      |        |63/336885         |2466366|ExclusiveLock   |true   |true    |                             |
relation  |   16388|   16518|    |     |          |             |       |      |        |72/42189          |2466370|AccessShareLock |true   |true    |                             |
relation  |   16388|   18918|    |     |          |             |       |      |        |72/42189          |2466370|AccessShareLock |true   |true    |                             |
relation  |   16388|   18918|    |     |          |             |       |      |        |72/42189          |2466370|RowExclusiveLock|true   |true    |                             |
virtualxid|        |        |    |     |72/42189  |             |       |      |        |72/42189          |2466370|ExclusiveLock   |true   |true    |                             |
advisory  |   16388|        |    |     |          |             |      0|102074|       4|40/362508         |2466349|ShareLock       |true   |false   |                             |
advisory  |   16388|        |    |     |          |             |      0|102008|       5|68/68793          |2466369|ExclusiveLock   |false  |false   |2024-02-05 10:32:37.854 +0100|
advisory  |   16388|        |    |     |          |             |      0|102745|       5|39/144041         |2466348|ExclusiveLock   |true   |false   |                             |
advisory  |   16388|        |    |     |          |             |      0|102074|       4|58/91978          |2466358|ShareLock       |true   |false   |                             |
advisory  |   16388|        |    |     |          |             |      0|102008|       5|47/431707         |2466352|ExclusiveLock   |false  |false   |2024-02-05 10:32:27.837 +0100|
advisory  |   16388|        |    |     |          |             |      0|102008|       5|48/211263         |2466353|ExclusiveLock   |false  |false   |2024-02-05 10:33:04.702 +0100|
advisory  |   16388|        |    |     |          |             |      0|102074|       4|50/283761         |2466356|ShareLock       |true   |false   |                             |
advisory  |   16388|        |    |     |          |             |      0|102074|       4|63/336885         |2466366|ShareLock       |true   |false   |                             |
advisory  |   16388|        |    |     |          |             |      0|102008|       5|39/144041         |2466348|ExclusiveLock   |true   |false   |                             |
advisory  |   16388|        |    |     |          |             |      0|102008|       5|72/42189          |2466370|ExclusiveLock   |false  |false   |2024-02-05 10:32:32.847 +0100|
advisory  |   16388|        |    |     |          |             |      0|102008|       5|77/5719           |2466373|ExclusiveLock   |false  |false   |2024-02-05 10:32:22.824 +0100|
advisory  |   16388|        |    |     |          |             |      0|102074|       4|62/75245          |2466365|ShareLock       |true   |false   |                             |
advisory  |   16388|        |    |     |          |             |      0|102008|       5|62/75245          |2466365|ExclusiveLock   |false  |false   |2024-02-05 10:32:47.873 +0100|
advisory  |   16388|        |    |     |          |             |      0|102074|       4|77/5719           |2466373|ShareLock       |true   |false   |                             |
advisory  |   16388|        |    |     |          |             |      0|102074|       5|39/144041         |2466348|ExclusiveLock   |false  |false   |2024-02-05 10:33:02.948 +0100|
advisory  |   16388|        |    |     |          |             |      0|102074|       4|72/42189          |2466370|ShareLock       |true   |false   |                             |
advisory  |   16388|        |    |     |          |             |      0|102747|       5|39/144041         |2466348|ExclusiveLock   |true   |false   |                             |
advisory  |   16388|        |    |     |          |             |      0|102008|       5|63/336885         |2466366|ExclusiveLock   |false  |false   |2024-02-05 10:32:42.862 +0100|
advisory  |   16388|        |    |     |          |             |      0|102308|       5|39/144041         |2466348|ExclusiveLock   |true   |false   |                             |
advisory  |   16388|        |    |     |          |             |      0|102074|       4|39/144041         |2466348|ShareLock       |true   |false   |                             |
advisory  |   16388|        |    |     |          |             |      0|102074|       4|47/431707         |2466352|ShareLock       |true   |false   |                             |
advisory  |   16388|        |    |     |          |             |      0|102074|       4|48/211263         |2466353|ShareLock       |true   |false   |                             |
advisory  |   16388|        |    |     |          |             |      0|102008|       5|50/283761         |2466356|ExclusiveLock   |false  |false   |2024-02-05 10:32:57.900 +0100|
advisory  |   16388|        |    |     |          |             |      0|102008|       5|58/91978          |2466358|ExclusiveLock   |false  |false   |2024-02-05 10:32:52.882 +0100|
advisory  |   16388|        |    |     |          |             |      0|102074|       4|68/68793          |2466369|ShareLock       |true   |false   |                             |
advisory  |   16388|        |    |     |          |             |      0|102008|       5|40/362508         |2466349|ExclusiveLock   |false  |false   |2024-02-05 10:32:17.813 +0100|
locktype  |database|relation|page|tuple|virtualxid|transactionid|classid|objid |objsubid|virtualtransaction|pid    |mode            |granted|fastpath|waitstart|
----------+--------+--------+----+-----+----------+-------------+-------+------+--------+------------------+-------+----------------+-------+--------+---------+
relation  |   16388|   16518|    |     |          |             |       |      |        |63/99368          |2412085|AccessShareLock |true   |true    |         |
relation  |   16388|   16518|    |     |          |             |       |      |        |63/99368          |2412085|RowShareLock    |true   |true    |         |
relation  |   16388|   18918|    |     |          |             |       |      |        |63/99368          |2412085|AccessShareLock |true   |true    |         |
relation  |   16388|   18918|    |     |          |             |       |      |        |63/99368          |2412085|RowExclusiveLock|true   |true    |         |
virtualxid|        |        |    |     |63/99368  |             |       |      |        |63/99368          |2412085|ExclusiveLock   |true   |true    |         |
relation  |   16388|   16596|    |     |          |             |       |      |        |53/135275         |2412077|AccessShareLock |true   |true    |         |
relation  |   16388|   16455|    |     |          |             |       |      |        |53/135275         |2412077|AccessShareLock |true   |true    |         |
relation  |   16388|   16518|    |     |          |             |       |      |        |53/135275         |2412077|AccessShareLock |true   |true    |         |
relation  |   16388|   16518|    |     |          |             |       |      |        |53/135275         |2412077|RowShareLock    |true   |true    |         |
relation  |   16388|   18918|    |     |          |             |       |      |        |53/135275         |2412077|AccessShareLock |true   |true    |         |
relation  |   16388|   18918|    |     |          |             |       |      |        |53/135275         |2412077|RowExclusiveLock|true   |true    |         |
virtualxid|        |        |    |     |53/135275 |             |       |      |        |53/135275         |2412077|ExclusiveLock   |true   |true    |         |
relation  |   16388|   16518|    |     |          |             |       |      |        |56/340870         |2412080|AccessShareLock |true   |true    |         |
relation  |   16388|   16518|    |     |          |             |       |      |        |56/340870         |2412080|RowShareLock    |true   |true    |         |
relation  |   16388|   18918|    |     |          |             |       |      |        |56/340870         |2412080|AccessShareLock |true   |true    |         |
relation  |   16388|   18918|    |     |          |             |       |      |        |56/340870         |2412080|RowExclusiveLock|true   |true    |         |
virtualxid|        |        |    |     |56/340870 |             |       |      |        |56/340870         |2412080|ExclusiveLock   |true   |true    |         |
relation  |   16388|   16518|    |     |          |             |       |      |        |60/198745         |2412083|AccessShareLock |true   |true    |         |
relation  |   16388|   16518|    |     |          |             |       |      |        |60/198745         |2412083|RowShareLock    |true   |true    |         |
relation  |   16388|   18918|    |     |          |             |       |      |        |60/198745         |2412083|AccessShareLock |true   |true    |         |
relation  |   16388|   18918|    |     |          |             |       |      |        |60/198745         |2412083|RowExclusiveLock|true   |true    |         |
virtualxid|        |        |    |     |60/198745 |             |       |      |        |60/198745         |2412083|ExclusiveLock   |true   |true    |         |
relation  |   16388|   16518|    |     |          |             |       |      |        |65/111224         |2412086|AccessShareLock |true   |true    |         |
relation  |   16388|   16518|    |     |          |             |       |      |        |65/111224         |2412086|RowShareLock    |true   |true    |         |
relation  |   16388|   18918|    |     |          |             |       |      |        |65/111224         |2412086|AccessShareLock |true   |true    |         |
relation  |   16388|   18918|    |     |          |             |       |      |        |65/111224         |2412086|RowExclusiveLock|true   |true    |         |
virtualxid|        |        |    |     |65/111224 |             |       |      |        |65/111224         |2412086|ExclusiveLock   |true   |true    |         |
relation  |   16388|   16518|    |     |          |             |       |      |        |68/81351          |2412087|AccessShareLock |true   |true    |         |
relation  |   16388|   16518|    |     |          |             |       |      |        |68/81351          |2412087|RowShareLock    |true   |true    |         |
relation  |   16388|   18918|    |     |          |             |       |      |        |68/81351          |2412087|AccessShareLock |true   |true    |         |
relation  |   16388|   18918|    |     |          |             |       |      |        |68/81351          |2412087|RowExclusiveLock|true   |true    |         |
virtualxid|        |        |    |     |68/81351  |             |       |      |        |68/81351          |2412087|ExclusiveLock   |true   |true    |         |
relation  |   16388|   16518|    |     |          |             |       |      |        |44/136070         |2411900|AccessShareLock |true   |true    |         |
relation  |   16388|   16518|    |     |          |             |       |      |        |44/136070         |2411900|RowShareLock    |true   |true    |         |
relation  |   16388|   18918|    |     |          |             |       |      |        |44/136070         |2411900|AccessShareLock |true   |true    |         |
relation  |   16388|   18918|    |     |          |             |       |      |        |44/136070         |2411900|RowExclusiveLock|true   |true    |         |
virtualxid|        |        |    |     |44/136070 |             |       |      |        |44/136070         |2411900|ExclusiveLock   |true   |true    |         |
relation  |   16388|   16518|    |     |          |             |       |      |        |54/346888         |2412078|AccessShareLock |true   |true    |         |
relation  |   16388|   16518|    |     |          |             |       |      |        |54/346888         |2412078|RowShareLock    |true   |true    |         |
relation  |   16388|   18918|    |     |          |             |       |      |        |54/346888         |2412078|AccessShareLock |true   |true    |         |
relation  |   16388|   18918|    |     |          |             |       |      |        |54/346888         |2412078|RowExclusiveLock|true   |true    |         |
virtualxid|        |        |    |     |54/346888 |             |       |      |        |54/346888         |2412078|ExclusiveLock   |true   |true    |         |
relation  |   16388|   16518|    |     |          |             |       |      |        |58/108409         |2412081|AccessShareLock |true   |true    |         |
relation  |   16388|   16518|    |     |          |             |       |      |        |58/108409         |2412081|RowShareLock    |true   |true    |         |
relation  |   16388|   18918|    |     |          |             |       |      |        |58/108409         |2412081|AccessShareLock |true   |true    |         |
relation  |   16388|   18918|    |     |          |             |       |      |        |58/108409         |2412081|RowExclusiveLock|true   |true    |         |
virtualxid|        |        |    |     |58/108409 |             |       |      |        |58/108409         |2412081|ExclusiveLock   |true   |true    |         |
relation  |   16388|   16518|    |     |          |             |       |      |        |37/207814         |2411853|AccessShareLock |true   |true    |         |
relation  |   16388|   16518|    |     |          |             |       |      |        |37/207814         |2411853|RowShareLock    |true   |true    |         |
relation  |   16388|   18918|    |     |          |             |       |      |        |37/207814         |2411853|AccessShareLock |true   |true    |         |
relation  |   16388|   18918|    |     |          |             |       |      |        |37/207814         |2411853|RowExclusiveLock|true   |true    |         |
virtualxid|        |        |    |     |37/207814 |             |       |      |        |37/207814         |2411853|ExclusiveLock   |true   |true    |         |
relation  |   16388|   16518|    |     |          |             |       |      |        |59/106674         |2412082|AccessShareLock |true   |true    |         |
relation  |   16388|   16518|    |     |          |             |       |      |        |59/106674         |2412082|RowShareLock    |true   |true    |         |
relation  |   16388|   18918|    |     |          |             |       |      |        |59/106674         |2412082|AccessShareLock |true   |true    |         |
relation  |   16388|   18918|    |     |          |             |       |      |        |59/106674         |2412082|RowExclusiveLock|true   |true    |         |
virtualxid|        |        |    |     |59/106674 |             |       |      |        |59/106674         |2412082|ExclusiveLock   |true   |true    |         |
relation  |   16388|   16518|    |     |          |             |       |      |        |69/71340          |2412088|AccessShareLock |true   |true    |         |
relation  |   16388|   16518|    |     |          |             |       |      |        |69/71340          |2412088|RowShareLock    |true   |true    |         |
relation  |   16388|   18918|    |     |          |             |       |      |        |69/71340          |2412088|AccessShareLock |true   |true    |         |
relation  |   16388|   18918|    |     |          |             |       |      |        |69/71340          |2412088|RowExclusiveLock|true   |true    |         |
virtualxid|        |        |    |     |69/71340  |             |       |      |        |69/71340          |2412088|ExclusiveLock   |true   |true    |         |
relation  |   16388|   16518|    |     |          |             |       |      |        |55/341694         |2412079|AccessShareLock |true   |true    |         |
relation  |   16388|   16518|    |     |          |             |       |      |        |55/341694         |2412079|RowShareLock    |true   |true    |         |
relation  |   16388|   18918|    |     |          |             |       |      |        |55/341694         |2412079|AccessShareLock |true   |true    |         |
relation  |   16388|   18918|    |     |          |             |       |      |        |55/341694         |2412079|RowExclusiveLock|true   |true    |         |
virtualxid|        |        |    |     |55/341694 |             |       |      |        |55/341694         |2412079|ExclusiveLock   |true   |true    |         |
advisory  |   16388|        |    |     |          |             |      0|102074|       4|59/106674         |2412082|ShareLock       |true   |false   |         |
advisory  |   16388|        |    |     |          |             |      0|102074|       4|58/108409         |2412081|ShareLock       |true   |false   |         |
advisory  |   16388|        |    |     |          |             |      0|102074|       4|69/71340          |2412088|ShareLock       |true   |false   |         |
advisory  |   16388|        |    |     |          |             |      0|102074|       4|53/135275         |2412077|ShareLock       |true   |false   |         |
advisory  |   16388|        |    |     |          |             |      0|102074|       4|68/81351          |2412087|ShareLock       |true   |false   |         |
advisory  |   16388|        |    |     |          |             |      0|102074|       4|55/341694         |2412079|ShareLock       |true   |false   |         |
advisory  |   16388|        |    |     |          |             |      0|102074|       4|37/207814         |2411853|ShareLock       |true   |false   |         |
advisory  |   16388|        |    |     |          |             |      0|102074|       4|56/340870         |2412080|ShareLock       |true   |false   |         |
advisory  |   16388|        |    |     |          |             |      0|102074|       4|54/346888         |2412078|ShareLock       |true   |false   |         |
advisory  |   16388|        |    |     |          |             |      0|102074|       4|44/136070         |2411900|ShareLock       |true   |false   |         |
advisory  |   16388|        |    |     |          |             |      0|102074|       4|60/198745         |2412083|ShareLock       |true   |false   |         |
advisory  |   16388|        |    |     |          |             |      0|102074|       4|63/99368          |2412085|ShareLock       |true   |false   |         |
advisory  |   16388|        |    |     |          |             |      0|102074|       4|65/111224         |2412086|ShareLock       |true   |false   |         |

@agedemenli
Contributor

Thanks for the lock information!

I still can't reproduce the issue, and the best way to debug it is to first reproduce it. Any detail that would help with reproduction is appreciated. Could you please take a look at the schema snippet I shared above, in case you have a suggestion that would make it reproduce?

@lkral-navmatix
Author

Unfortunately I don't have anything more that could help with reproduction right now, as I am unable to reproduce this on another cluster with the same schema either.

But I have looked into the locks that lead to the deadlock, and one thing that seems very suspicious is the advisory ExclusiveLocks. Specifically, they lock not only the entire table2 (which seems pointless AFAIK, because we aren't touching that table in any way), but also other tables that are connected to table2 via foreign keys (simple reference tables that represent enums, as mentioned on Slack).

Assuming I understand these lock types correctly, the simple UPDATE leads to table-level locks on unrelated tables that allow only operations originating from that transaction (setting aside that the locks are only advisory, so I'm not sure exactly how Citus interprets them).

Could you confirm that this is NOT the intended locking behavior?
Is there a way to extract information from Citus that could explain why these locks are taken? Perhaps this information would allow us to better reproduce this issue.
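For what it's worth, the advisory locks can be mapped back to tables via pg_dist_shard, assuming their objid encodes a Citus shard ID (which values like 102008 and 102074 suggest); a sketch:

  SELECT l.pid, l.objid AS shardid,
         s.logicalrelid::regclass AS locked_table,
         l.mode, l.granted
  FROM pg_locks l
  JOIN pg_dist_shard s ON s.shardid = l.objid::bigint
  WHERE l.locktype = 'advisory'
  ORDER BY l.pid, shardid;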

@lkral-navmatix
Author

It would be possible to grant temporary access to the faulty cluster if that would help with the investigation. Let me know whether this is worth doing, or whether local reproduction is needed to continue.

@JelteF
Contributor

JelteF commented Feb 15, 2024

@lkral-navmatix

Okay, we were able to reproduce this locally and found a minimal reproducing
example (see below; it needs PG16 for load_balance_hosts in pgbench). So there
is no need for access to your cluster. This is definitely a bug. We'll assign
someone to fix it, but depending on the difficulty that might take a while.
Luckily there is a pretty simple workaround.

How to work around it

Don't run this problematic update query through your load balancer; only run
it on a specific single node (the coordinator is probably easiest, but any of
the workers would be fine too).

How to reproduce

Create two files

issue-7477.sql:

create table table2 (id int primary key);
select create_reference_table('table2');
insert into table2 values (1);

CREATE TABLE public.table1 (
      id int8 NOT NULL,
      offending_column timestamptz NULL,
      column6 int8 NOT NULL,
      CONSTRAINT table1_table2_fk FOREIGN KEY (column6) REFERENCES public.table2(id)
  );
  SELECT create_reference_table('public.table1');

insert into table1 values (1,null,1);

update.sql:

UPDATE table1 SET offending_column = '2024-01-31T00:00:00Z' where id = 1;

Then run:

./citus_dev make test --destroy --init-with issue-7477.sql
 pgbench -f update.sql -c 10 -j 2 -T 200 -P 1 'host=localhost,localhost port=9701,9702 load_balance_hosts=random'

Worker logs are then full of deadlock issues.

@lkral-navmatix
Author

@JelteF

That is great to hear, good job!

We will verify whether the workaround indeed works for us and will await a future version with the fix.

Thank you for your cooperation.

@lkral-navmatix
Author

I can confirm that we have not experienced any more deadlocks since applying the workaround.

Thank you.

@eaydingol
Contributor

The order in which the locks are acquired depends on which worker initiates the UPDATE statement (the first worker or another one), and that is what causes the deadlock.
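In other words, roughly (an illustrative sketch using the shard IDs and lock_shard_resources calls from the worker logs above, not the exact internal call sequence):

  -- UPDATE routed through the first worker: locks on worker 1 are requested as
  SELECT lock_shard_resources(7, ARRAY[102008, 102308, 102745, 102747]);  -- reference-table shards first
  SELECT lock_shard_resources(7, ARRAY[102074]);                          -- then the target table's shard

  -- UPDATE routed through another worker: the same locks are requested on worker 1 in the opposite order
  SELECT lock_shard_resources(7, ARRAY[102074]);                          -- target table's shard first
  SELECT lock_shard_resources(7, ARRAY[102008, 102308, 102745, 102747]);  -- then the reference-table shards

  -- Two such sessions each hold one advisory lock and wait for the other's,
  -- which is exactly the cycle visible in the worker 2 log.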

eaydingol added a commit that referenced this issue Mar 10, 2024
This PR changes the order in which the locks are acquired (for the
target and reference tables), when a modify request is initiated from a
worker node that is not the "FirstWorkerNode".


To prevent concurrent writes, locks are acquired on the first worker
node for the replicated tables. When the update statement originates
from the first worker node, it acquires the lock on the reference
table(s) first, followed by the target table(s). However, if the update
statement is initiated in another worker node, the lock requests are
sent to the first worker in a different order. This PR unifies the
modification order on the first worker node. With the third commit,
independent of the node that received the request, the locks are
acquired for the modified table and then the reference tables on the
first node.

The first commit shows a sample output for the test prior to the fix. 

Fixes #7477

---------

Co-authored-by: Jelte Fennema-Nio <jelte.fennema@microsoft.com>
emelsimsek pushed a commit that referenced this issue Mar 11, 2024 (same change as above)
JelteF pushed a commit that referenced this issue Apr 16, 2024 (same change, cherry picked from commit 8afa2d0)
JelteF pushed a commit that referenced this issue Apr 16, 2024 (same change, cherry picked from commit 8afa2d0)
JelteF pushed a commit that referenced this issue Apr 16, 2024 (same change, cherry picked from commit 8afa2d0)
JelteF pushed a commit that referenced this issue Apr 17, 2024 (same change, cherry picked from commit 8afa2d0)