Summary:
### Issue
Create a table:
```
CREATE TABLE keys(k1 INT, k2 INT, PRIMARY KEY(k1 ASC, k2 ASC));
```
The following batched `INSERT ... ON CONFLICT` statement can fail with a read restart error
```
INSERT INTO keys(k1, k2) VALUES (0, 0), (9, 9) ON CONFLICT DO NOTHING;
```
even when the only concurrent write touches a disjoint key:
```
INSERT INTO keys(k1, k2) VALUES (1, 1);
```
### Root Cause
At a high level, a read operation can receive a read restart error on an absent key because the DocDB iterator can traverse rows that are irrelevant to the scan condition. If such a row was written within the read time uncertainty interval, it triggers a read restart even though the scan would never have returned it. The issue exists because read restart detection happens at a lower layer of abstraction than row filtering.
#### Iterator Abstraction
DocDB reads go through multiple layers of iterators, each with a separate concern. The notable iterators, listed from the lowest layer of abstraction to the highest:
1. rocksdb::Iterator: traverses a RocksDB instance.
2. IntentAwareIterator: combines intentsdb and regulardb (both are RocksDB instances) and provides a unified interface on top of them. It also detects writes within the uncertainty interval.
3. DocRowwiseIterator: filters out rows irrelevant to the Index Scan, using scan choices to do so. It also skips tombstones.
Since rows are filtered in DocRowwiseIterator, and not in IntentAwareIterator where writes within the uncertainty interval are detected, read restart errors are raised on absent keys.
#### Detect writes within the uncertainty interval
IntentAwareIterator detects such writes using `max_seen_ht`. The iterator skips every record whose hybrid time is higher than the global limit (the logic is somewhat more involved when a local limit applies). Among the remaining records, it tracks the maximum hybrid time observed so far in `max_seen_ht`, which is therefore bounded by the global limit. If `max_seen_ht` is higher than the read time, then, as far as IntentAwareIterator can tell, some record lies within the uncertainty interval, and the read fails with a read restart error.
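The detection rule described above can be modeled in a few lines. This is a simplified sketch, not the actual IntentAwareIterator code: hybrid times are plain integers, the local limit is ignored, and the class name `UncertaintyTracker` and its methods are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UncertaintyTracker:
    """Toy model of max_seen_ht tracking (hypothetical, integers as hybrid times)."""
    read_time: int
    global_limit: int
    max_seen_ht: Optional[int] = None

    def observe(self, write_ht: int) -> bool:
        """Record one write; return True if it is visible at read_time."""
        if write_ht > self.global_limit:
            # Written after the global limit: skipped and never recorded.
            return False
        if self.max_seen_ht is None or write_ht > self.max_seen_ht:
            self.max_seen_ht = write_ht
        return write_ht <= self.read_time

    def needs_restart(self) -> bool:
        """A recorded write newer than read_time lies in the uncertainty interval."""
        return self.max_seen_ht is not None and self.max_seen_ht > self.read_time


tracker = UncertaintyTracker(read_time=100, global_limit=150)
tracker.observe(90)    # visible, recorded
tracker.observe(200)   # beyond the global limit, ignored
tracker.observe(120)   # inside (read_time, global_limit], recorded
print(tracker.max_seen_ht, tracker.needs_restart())
```

Note that the tracker has no notion of which keys the scan asked for; it only sees hybrid times.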
#### Example Control Flow
The `INSERT ... ON CONFLICT` example above sends the following DocDB read request:
```
I0103 19:04:06.112418 2088209 pg_session.cc:326] Session id 3: Applying operation: { READ active: 1 read_time: { read: <invalid> local_limit: <invalid> global_limit: <invalid> in_txn_limit: <invalid> serial_no: 0 } request: client: YQL_CLIENT_PGSQL stmt_id: 6300708499712 schema_version: 0 targets { column_id: -8 } targets { column_id: 0 } targets { column_id: 1 } column_refs { ids: 0 ids: 1 } is_forward_scan: 1 is_aggregate: 0 limit: 1024 return_paging_state: 1 table_id: "000034cb000030008000000000004000" condition_expr { condition { op: QL_OP_AND operands { condition { op: QL_OP_IN operands { tuple { elems { column_id: 0 } elems { column_id: 1 } } } operands { value { list_value { elems { tuple_value { elems { int32_value: 0 } elems { int32_value: 0 } } } elems { tuple_value { elems { int32_value: 9 } elems { int32_value: 9 } } } } } } } } } } upper_bound { key: "488000000948800000097E21" is_inclusive: 1 } col_refs { column_id: 0 attno: 1 } col_refs { column_id: 1 attno: 2 } partition_key: "4880000000488000000021" ysql_db_catalog_version: 1 ysql_db_oid: 13515 metrics_capture: PGSQL_METRICS_CAPTURE_NONE size_limit: 0 }
```
Hybrid scan choices are created with the condition `(k1, k2) IN ((0, 0), (9, 9))`. Scan choices examine row (1, 1) and determine that the scan is not interested in it. However, IntentAwareIterator does not know that scan choices will reject row (1, 1), so it updates max_seen_ht anyway. Because the write to (1, 1) falls within the uncertainty interval, the read is restarted even though neither requested key was touched.
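The control flow for this example can be reproduced with a toy model. The sketch below is an assumption-laden simplification: it walks rows linearly instead of seeking, uses made-up integer hybrid times, and the function name `scan_without_rollback` is hypothetical. It only illustrates the ordering of the two steps: the hybrid time is recorded before the row is filtered.

```python
def scan_without_rollback(rows, targets, read_time, global_limit):
    """Toy scan: the lower layer records hybrid times, the upper layer filters.

    rows    -- list of (key, write_ht) pairs
    targets -- set of keys requested by the IN condition
    Returns (visible_target_keys, restart_required).
    """
    max_seen_ht = 0
    visible = []
    for key, write_ht in sorted(rows):
        if write_ht > global_limit:
            continue  # beyond the global limit: skipped, not recorded
        # Lower layer (IntentAwareIterator in the text): record the hybrid time.
        max_seen_ht = max(max_seen_ht, write_ht)
        # Upper layer (scan choices in the text): reject irrelevant rows.
        if key not in targets:
            continue
        if write_ht <= read_time:
            visible.append(key)
    return visible, max_seen_ht > read_time


# Only the concurrent insert of (1, 1) exists, written inside the
# uncertainty interval (100, 150]. These numbers are illustrative.
rows = [((1, 1), 120)]
print(scan_without_rollback(rows, {(0, 0), (9, 9)}, read_time=100, global_limit=150))
```

The scan returns no rows, yet still reports that a restart is required, which matches the behavior described for the `(0, 0), (9, 9)` batch.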
### Proposal
To mitigate this problem, IntentAwareIterator should let upper layers roll back the last seen hybrid time. Whenever a row is filtered out by DocRowwiseIterator (or ScanChoices), the iterator can roll back the last seen hybrid time so that the filtered row no longer contributes to max_seen_ht. This rollback behavior is gated by the flag `disable_last_seen_ht_rollback`.
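A minimal model of the rollback idea follows. It is a sketch under stated assumptions, not the real implementation: hybrid times are integers, rows are visited linearly, and `scan_with_rollback` and its `disable_rollback` parameter are hypothetical stand-ins (the parameter merely mirrors the name of the flag mentioned above).

```python
def scan_with_rollback(rows, targets, read_time, global_limit,
                       disable_rollback=False):
    """Toy scan that undoes the max_seen_ht update for filtered rows.

    rows    -- list of (key, write_ht) pairs
    targets -- set of keys requested by the scan condition
    Returns (visible_target_keys, restart_required).
    """
    max_seen_ht = 0
    visible = []
    for key, write_ht in sorted(rows):
        if write_ht > global_limit:
            continue  # beyond the global limit: skipped, not recorded
        saved_ht = max_seen_ht                    # value to restore on rollback
        max_seen_ht = max(max_seen_ht, write_ht)  # lower layer records the write
        if key not in targets:
            if not disable_rollback:
                max_seen_ht = saved_ht            # upper layer rejects: roll back
            continue
        if write_ht <= read_time:
            visible.append(key)
    return visible, max_seen_ht > read_time


targets = {(0, 0), (9, 9)}
disjoint = [((1, 1), 120)]
print(scan_with_rollback(disjoint, targets, 100, 150))
print(scan_with_rollback(disjoint, targets, 100, 150, disable_rollback=True))
```

In this model, a write inside the uncertainty interval on a requested key, such as (9, 9), still forces a restart, because only rows rejected by the filter are rolled back.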
Jira: DB-14401
Test Plan:
Jenkins
1. Iterator-level tests
```
./yb_build.sh --cxx-test docrowwiseiterator-test --gtest_filter DocRowwiseIteratorTest.NoHtSeenOnAbsentKeys
./yb_build.sh --cxx-test docrowwiseiterator-test --gtest_filter DocRowwiseIteratorTest.NoHtSeenOnInvisibleKeys
./yb_build.sh --cxx-test docrowwiseiterator-test --gtest_filter DocRowwiseIteratorTest.HtSeenOnDeletedKeys
./yb_build.sh --cxx-test docrowwiseiterator-test --gtest_filter DocRowwiseIteratorTest.ExhaustiveHtRollbackTest
```
2. Insert on conflict test
```
./yb_build.sh --cxx-test pg_last_seen_ht_rollback-test
```
Backport-through: 2024.2
Reviewers: pjain, sergei, smishra
Reviewed By: sergei
Subscribers: yql, ybase
Differential Revision: https://phorge.dev.yugabyte.com/D40558