2.25.2.0-b138

@andrei-mart andrei-mart tagged this 11 Mar 14:29
Summary:
The size limit used to be a soft limit: when it was set, DocDB
routinely exceeded it. DocDB cannot reliably estimate the data size of
the row it is currently iterating over, so it checks the size after
the row is written into the write buffer and stops iterating if the
size is at or over the limit. Because each row is many bytes long, the
limit is far more likely to be exceeded than to be met exactly.

To make sure the limit is not exceeded, we are adding a function that
restores the write buffer position. The functionality to save the
current position already existed in the write buffer. So now we save
the write buffer position each time before writing a row, and if the
size limit is exceeded after the write, we restore the position.
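
The save/write/check/restore sequence above can be sketched roughly as
follows. This is an illustration only: ToyWriteBuffer, Position(),
Truncate() and WriteRowsWithHardLimit() are hypothetical names invented
for this sketch, not the actual write buffer API, and the oversized
single-row corner case is deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the write buffer (assumption, not the real class).
class ToyWriteBuffer {
 public:
  size_t Position() const { return data_.size(); }    // save the current position
  void Truncate(size_t pos) { data_.resize(pos); }     // restore a saved position
  void Append(const std::string& bytes) { data_ += bytes; }
  size_t size() const { return data_.size(); }

 private:
  std::string data_;
};

// Writes rows in order. If a row pushes the buffer past size_limit, the write
// is rolled back and iteration stops.
// Returns the index of the first row that was not kept (rows.size() if all fit).
size_t WriteRowsWithHardLimit(const std::vector<std::string>& rows,
                              size_t size_limit, ToyWriteBuffer* out) {
  for (size_t i = 0; i < rows.size(); ++i) {
    const size_t saved = out->Position();  // save before writing the row
    out->Append(rows[i]);
    if (out->size() > size_limit) {        // the size is known only after the write
      out->Truncate(saved);                // restore: the limit is not exceeded
      return i;
    }
  }
  return rows.size();
}
```

With three 4-byte rows and a limit of 10, the third row is written,
found to overflow, and rolled back, so the buffer ends at 8 bytes.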

The second problem to solve in order to support a hard size limit was
the returned paging info. Normally DocDB returns the key of the next
record. If it has exceeded the size limit and reverted the write, it
should return the key of the current record instead. The existing
iterator function GetNextReadSubDocKey() in fact did two things: it
advanced the iterator to the next position and took the key of the
current record. This diff refactors the second operation into a
separate function that can be called on its own.
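
A minimal sketch of that split, under stated assumptions: ToyIterator,
CurrentKey(), NextKey() and PagingKey() are hypothetical names made up
for this illustration, and std::optional stands in for the Result type.
It only shows why "take the key" must be callable without "advance".

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical iterator over row keys (assumption, not the DocDB iterator).
class ToyIterator {
 public:
  explicit ToyIterator(std::vector<std::string> keys) : keys_(std::move(keys)) {}

  // The factored-out operation: key of the current record, without moving.
  std::optional<std::string> CurrentKey() const {
    if (pos_ >= keys_.size()) return std::nullopt;
    return keys_[pos_];
  }

  // The combined operation: advance first, then take the key.
  std::optional<std::string> NextKey() {
    if (pos_ < keys_.size()) ++pos_;
    return CurrentKey();
  }

 private:
  std::vector<std::string> keys_;
  size_t pos_ = 0;
};

// Chooses the key the next page should start from.
std::optional<std::string> PagingKey(ToyIterator& it, bool last_write_reverted) {
  // A reverted row was not returned, so the next page must start at it.
  return last_write_reverted ? it.CurrentKey() : it.NextKey();
}
```

When the last write was reverted, the paging key is the current row;
otherwise it is the row after it.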

We now prefer functions that return a Result with a value over
functions with an output parameter. Hence we change the definition of
the GetNextReadSubDocKey function accordingly.

There is one corner case in which we still allow the size limit to be
exceeded: when a single row is larger than the limit. Reverting to the
initial position and returning an empty row set would create an
infinite loop, so we have to either go over the limit or error out.
The former seems better, as it tolerates a yb_fetch_size_limit that
was set too low by mistake. If the row is indeed too long, it may
still hit another limit, such as rpc_max_message_size, and error out.
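
The corner case can be illustrated with a small self-contained sketch.
PageResult and FillPage() are hypothetical names for this example, and a
plain std::string plays the role of the write buffer. The point is only
that the revert is skipped when the page would otherwise be empty.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical page of results (assumption for illustration).
struct PageResult {
  std::string payload;   // bytes written for this page
  size_t rows_kept = 0;  // number of rows returned in this page
};

// Fills one page starting at rows[start], honoring size_limit except when
// the very first row of the page is itself larger than the limit.
PageResult FillPage(const std::vector<std::string>& rows, size_t start,
                    size_t size_limit) {
  PageResult page;
  for (size_t i = start; i < rows.size(); ++i) {
    const size_t saved = page.payload.size();
    page.payload += rows[i];
    if (page.payload.size() > size_limit) {
      if (page.rows_kept > 0) {
        page.payload.resize(saved);  // normal case: revert the last row
      } else {
        ++page.rows_kept;            // oversized first row: keep it anyway
      }
      break;
    }
    ++page.rows_kept;
  }
  return page;
}
```

Because every page returns at least one row, a caller that resumes from
start + rows_kept always makes progress, even with a limit set too low.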

The change affects the existing fetch limit tests. In general, the
numbers of requests and rows read are higher: when the size limit
would be exceeded, the response is smaller, and the last scanned row
has to be rescanned by the next page request. The expected results are
updated accordingly. Also, the yb_fetch_limits test is cleaned up. The
plan node for the sequential scan is now the same with or without the
pg_hint_plan `SeqScan` directive, so the duplicate queries were
removed.
Jira: DB-13684

Test Plan:
./yb_build.sh --cxx-test util_write_buffer-test --gtest_filter WriteBufferTest.Truncate
./yb_build.sh --cxx-test util_write_buffer-test --gtest_filter WriteBufferTest.RandomTruncate
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressYbFetchLimits#testPgRegressYbFetchLimits'
./yb_build.sh --cxx-test pgwrapper_pg_mini-test --gtest_filter 'PgMiniTest.ReadHugeRows'

Reviewers: esheng, sergei, tnayak

Reviewed By: sergei, tnayak

Subscribers: ybase, ycdcxcluster, smishra, yql

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D39706