[YSQL] Txn timeouts during large txns such as table rewrites

Jira Link: [DB-17288](https://yugabyte.atlassian.net/browse/DB-17288)

During large transactions, such as those resulting from table rewrites on large tables or on a large partition hierarchy, we run into errors such as:

`ysqlsh:alter_table.sql:1: ERROR:  could not serialize access due to concurrent update (query layer retry isn't possible, READ COMMITTED transaction was aborted  and some data was already sent to the user) DETAIL:  Heartbeat: Transaction 73389530-f34a-49e0-82f4-c408cfc6f770 expired or aborted by a conflict: YB001: . Errors from tablet servers: [Operation expired (yb/tablet/transaction_coordinator.cc:1766): Heartbeat: Transaction 73389530-f34a-49e0-82f4-c408cfc6f770 expired or aborted by a conflict: YB001 (pgsql error YB001) (transaction error 1)]`

 

One way to reproduce this is to trigger Raft leader failures while a table rewrite is running. It can be worked around by increasing the transaction timeout via `--transaction_max_missed_heartbeat_periods=60`, but it would be better to increase this timeout automatically for such transactions.
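
For a rough sense of what the workaround buys, here is a minimal sketch of the arithmetic, assuming the coordinator expires a transaction after roughly `transaction_max_missed_heartbeat_periods` times `transaction_heartbeat_usec`. The default values used below (10 periods, 500000 usec) are assumptions based on my reading of the yb-tserver flag reference and should be checked against the version in use; `expiry_window_seconds` is a hypothetical helper, not part of YugabyteDB.

```python
# Sketch only: approximate how long a transaction can go without a
# successful heartbeat before the coordinator expires it.
# ASSUMPTION: window ~= max_missed_periods * heartbeat_usec.
# ASSUMPTION: the defaults below match your yb-tserver version.

DEFAULT_HEARTBEAT_USEC = 500_000      # assumed default of transaction_heartbeat_usec
DEFAULT_MAX_MISSED_PERIODS = 10.0     # assumed default of transaction_max_missed_heartbeat_periods


def expiry_window_seconds(max_missed_periods=DEFAULT_MAX_MISSED_PERIODS,
                          heartbeat_usec=DEFAULT_HEARTBEAT_USEC):
    """Return the approximate expiry window in seconds (hypothetical helper)."""
    return max_missed_periods * heartbeat_usec / 1_000_000


if __name__ == "__main__":
    # Compare the assumed default with the value suggested in this issue.
    for periods in (DEFAULT_MAX_MISSED_PERIODS, 60):
        window = expiry_window_seconds(periods)
        print(f"max_missed_heartbeat_periods={periods}: ~{window} s")
```

Under these assumptions, raising the flag to 60 widens the window from about 5 seconds to about 30 seconds, which gives a long-running rewrite more room to survive a leader election.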

[DB-17288]: https://yugabyte.atlassian.net/browse/DB-17288?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ

[YSQL] Txn timeouts during large txns such as table rewrites #27688

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[YSQL] Txn timeouts during large txns such as table rewrites #27688

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions