Summary:
DDL atomicity works by having a background task to poll the transaction status
tablet to find out the status of the DDL transaction. Once the status indicates
the DDL transaction has completed, the background task will roll the DocDB part
of the DDL transaction forward if the DDL transaction has committed, or roll the
DocDB part of the DDL transaction backward if the DDL transaction has aborted.
However the background task itself fails and exits when the transaction status
tablet polling RPC fails. In that case the DocDB part of the DDL transaction
simply hangs there until a next DDL statement that touches the same table
restart a new background task. If there isn't a next DDL statement coming to
retrigger a new background task, the table remains in DDL verification state
indefinitely. This can cause a backup to fail because of the following like
error:
```
Table 0000410300003000800000000001090c is undergoing DDL verification, retry later
```
To fix the bug, I made a change so that the background task does not fail and
exit on a transaction status polling RPC failure. The background task will retry
the RPC after a delay (default 200ms).
A new unit test is added which would fail before this diff.
Jira: DB-14965
Test Plan: ./yb_build.sh --cxx-test pg_ddl_atomicity-test --gtest_filter PgDdlAtomicityTest.TestPollTransactionFuilure
Reviewers: hsunder
Reviewed By: hsunder
Subscribers: ybase, yql
Differential Revision: https://phorge.dev.yugabyte.com/D41388