Skip to content

2.25.0.0-b277

@pao214 pao214 tagged this 08 Nov 16:40
Summary:
### Issue

DocDB assumes that the global limit is always greater than or equal to the read time. For example, DocDB skips all records greater than the global limit. This is incorrect if read time = 100 and global limit = 50. We should not skip records from 50 to 100. These records are skipped conditionally depending on the value of local limit, leading to an inconsistent read.

```
      auto max_allowed = value.compare(encoded_read_time_.local_limit.AsSlice()) > 0
          ? encoded_read_time_.global_limit.AsSlice()
          : encoded_read_time_.read.AsSlice();
```

However, requiring that read_time <= global_limit is reasonable and the call site must ensure this condition when constructing read time.

read_time can go past global_limit when the picked restart time is higher than the global limit. This can happen if restart time is picked as safe time on a node and that time is higher than global limit. This scenario is more likely to be triggered in low global limit scenarios, for example, when time_source=clockbound.

### Call sites

1. read_query.cc:FormRestartReadHybridTime
2. write_query.cc:DoCompleteExecute
3. sys_catalog.cc:ReadWithRestarts

1. ReadQuery::FormRestartReadHybridTime already prevents read time from going above global limit.
2. WriteQuery never uses safe time as restart time. It always picks RestartReadHt from the iterator. RestartReadHt returns max seen hybrid time that is within the uncertainty window, i.e. max_seen_ht < global_limit by construction.

SysCatalog can run into this issue. This revision extracts the logic from ReadQuery::FormRestartReadHybridTime to use it in SysCatalog::ReadWithRestarts.

### Issue first observed in

When restart read time exceeds global limit, the following check fails in debug and debug adjacent builds.

```name=running_transaction.cc
DCHECK_LE(request.read_ht, request.global_limit_ht);
```

Example error found in `PgLibPqTest.StaleMasterReads`

```
I0913 18:30:06.179769  9878 pg_libpq-test.cc:985] Fetching CatalogVersion. Expecting 2696
[m-1] I0913 18:30:06.180251  9926 sys_catalog.cc:929] vlog0: ReadWithRestarts restarting read with ht = { days: 19979 time: 18:30:06.177077 logical: 4095 } >= { days: 19979 time: 18:30:06.174689 }. Encountered read restart when reading at { read: { days: 19979 time: 18:30:06.174524 } local_limit: { days: 19979 time: 18:30:06.174524 } global_limit: { days: 19979 time: 18:30:06.176028 } in_txn_limit: <invalid> serial_no: 0 }
[m-1] F0913 18:30:06.180785  9926 running_transaction.cc:117] Check failed: request.read_ht <= request.global_limit_ht ({ days: 19979 time: 18:30:06.177077 logical: 4095 } vs. { days: 19979 time: 18:30:06.176028 })
[m-1] Fatal failure details written to ${BUILD_ROOT}/yb-test-logs/tests-pgwrapper__pg_libpq-test/PgLibPqTest_StaleMasterReads.fatal_failure_details.m-1.2024-09-13T18_30_06.pid9885.txt
[m-1] F20240913 18:30:06 ../../src/yb/tablet/running_transaction.cc:117] Check failed: request.read_ht <= request.global_limit_ht ({ days: 19979 time: 18:30:06.177077 logical: 4095 } vs. { days: 19979 time: 18:30:06.176028 })
```
Jira: DB-12903

Test Plan:
Jenkins

Added test MasterRestartReadPastGlobalLimit

```
./yb_build.sh --cxx-test pg_libpq-test --gtest_filter PgLibPqTest.MasterRestartReadPastGlobalLimit
```

Also, ensure that

```
./yb_build.sh --cxx-test pg_libpq-test --gtest_filter PgLibPqTest.StaleMasterReads -n 50
```

runs with `time_source=clockbound`.

Backport-through: 2024.2

Reviewers: pjain, amitanand, sergei

Reviewed By: pjain, sergei

Subscribers: yql, smishra, svc_phabricator, ybase

Differential Revision: https://phorge.dev.yugabyte.com/D38231
Assets 2
Loading