cloud_storage: Fix timequery bug that triggers full scan #16503

Lazin · 2024-02-06T20:22:24Z

A timequery can trigger a full partition scan if the query timestamp overshoots the last segment. In that case, if log_reader_config.start_offset is below the first segment in the manifest and log_reader_config.max_offset is above the last offset in the manifest, the scan is initialized from the beginning of the partition, which is very expensive.

This PR adds:

New partition-level metrics used to detect bytes skipped by a futile partition scan. These metrics are not exposed to the metrics endpoint and are only used internally (mostly for testing; they will also be used for alerts).
A new test suite in remote_partition_test that performs timequeries (TODO: add a fuzz test).
A one-line fix for the bug.
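
To make "overshoots the last segment" concrete, here is a minimal sketch of a timestamp lookup over segment metadata. The names segment_meta_sketch and find_segment_by_timestamp are hypothetical and are not part of the Redpanda code base; the sketch only assumes that each segment records the largest timestamp it contains.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical, simplified view of one segment's entry in a manifest.
struct segment_meta_sketch {
    int64_t base_offset;
    int64_t max_timestamp; // largest batch timestamp stored in the segment
};

// Returns the index of the first segment that may contain a batch with a
// timestamp >= query_ts. Returns std::nullopt when the query overshoots the
// last segment, i.e. every segment ends before query_ts.
std::optional<std::size_t> find_segment_by_timestamp(
  const std::vector<segment_meta_sketch>& segments, int64_t query_ts) {
    for (std::size_t i = 0; i < segments.size(); ++i) {
        if (segments[i].max_timestamp >= query_ts) {
            return i;
        }
    }
    return std::nullopt;
}
```

The "no segment found" result is the overshoot case: a timequery that ends up there should return nothing rather than restart from the first segment.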

Fixes #16479

Backports Required

Release Notes

Bug Fixes

Fix timequery error that triggered full partition scan

vbotbuildovich · 2024-02-06T22:33:32Z

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/44773#018d8045-7e81-4f48-81f3-a017d7715722

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/44773#018d806c-61f5-4915-a249-5045ba6164c6

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/44812#018d83f2-b0e3-491f-87a6-7e51fb2da7af

vbotbuildovich · 2024-02-06T23:17:55Z

new failures in https://buildkite.com/redpanda/redpanda/builds/44773#018d806c-61f2-4b10-a0ab-6e1faf16937f:

"rptest.tests.partition_movement_test.SIPartitionMovementTest.test_cross_shard.num_to_upgrade=2.cloud_storage_type=CloudStorageType.S3"

dotnwat · 2024-02-07T04:01:47Z

src/v/cloud_storage/remote_partition.cc

            auto so = manifest.get_start_kafka_offset().value_or(
              kafka::offset::min());
            if (
-              model::offset_cast(config.start_offset) < so
+              config.first_timestamp.has_value() == false


What is this part of the conjunction?

config.first_timestamp.has_value() == false

Is that referring to this from the commit message?

"didn't find the segment"

Lazin · 2024-02-07

It checks whether a timequery is being used; first_timestamp is set in that case.

src/v/cloud_storage/remote_partition.cc

abhijat

lgtm other than minor suggestion

The remote_partition_reader can reset to the beginning of the partition if it didn't find the segment and both config.start_offset and config.max_offset are outside of the manifest. When this logic is applied to a timequery whose timestamp overshoots the last segment in the manifest, we end up with a full partition scan. This commit fixes this by disabling the logic for timequeries.
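
The change in the reset condition can be sketched roughly as follows. This is an illustration only: reader_config_sketch, manifest_bounds_sketch and both functions are hypothetical stand-ins using plain integers, and the max_offset conjunct is reconstructed from the PR description rather than from the diff, which shows only the first part of the condition.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the reader configuration.
struct reader_config_sketch {
    int64_t start_offset;
    int64_t max_offset;
    std::optional<int64_t> first_timestamp; // set only when a timequery is used
};

// Hypothetical stand-in for the offset range covered by the manifest.
struct manifest_bounds_sketch {
    int64_t start_offset;
    int64_t last_offset;
};

// Before the fix: the requested range surrounds the manifest, so the reader
// restarts from the first segment, even for timequeries.
bool should_reset_to_start_before_fix(
  const reader_config_sketch& cfg, const manifest_bounds_sketch& m) {
    return cfg.start_offset < m.start_offset && cfg.max_offset > m.last_offset;
}

// After the fix: the same check, but skipped entirely for timequeries.
bool should_reset_to_start(
  const reader_config_sketch& cfg, const manifest_bounds_sketch& m) {
    const bool is_timequery = cfg.first_timestamp.has_value();
    return !is_timequery && cfg.start_offset < m.start_offset
           && cfg.max_offset > m.last_offset;
}
```

With this guard an offset-based read that spans the whole manifest still starts from the first segment, while a timequery with the same offsets no longer does.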

vbotbuildovich · 2024-02-07T16:11:05Z

/backport v23.3.x

vbotbuildovich · 2024-02-07T16:11:06Z

/backport v23.2.x

vbotbuildovich · 2024-02-07T16:12:01Z

Failed to create a backport PR to v23.2.x branch. I tried:

git remote add upstream https://github.com/redpanda-data/redpanda.git
git fetch --all
git checkout -b backport-pr-16503-v23.2.x-950 remotes/upstream/v23.2.x
git cherry-pick -x fff0d142ff856c5fce6e03231a5bbccc6d0fd1f1 6d18fadce66bcdc61b8ade6645a7fce36117893e 04ac537fa0df32bcd91424c7a50cbb4e703239dc adefd4cc22de0954bfe941005ae06e82b0e690d7 f9a9afe1a55d6dc51696d5880ab6fc047fe807c9 f4f67432587d6e7e8ae4369c414462622851e211

Workflow run logs.

Lazin added 5 commits February 6, 2024 14:52

cloud_storage: Add timestamp to the batch_t (tests)

fff0d14

Use timestamp to generate segments.

cloud_storage: Add scan_remote_partition overload

6d18fad

Add new overload of the 'scan_remote_partition' function used in tests. This overload uses timequery. Also, add timestamps to the metadata.

cloud_storage: Improve partition_probe

04ac537

Add extra fields without exposing them to the metrics endpoint. The fields are bytes_skipped and bytes_accepted. Also, add accessor methods so the probe values can be read programmatically.

cloud_storage: Update skip/accept partition metrics

adefd4c

Update bytes_skipped when batches are skipped in the segment reader. Update bytes_accepted when batches are accepted.

cloud_storage: Add new timequery test

f9a9afe

The new test uses remote_partition test code but performs timequery. It reproduces the issue from 16479 among other things.
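
The probe changes from commits 04ac537 and adefd4c could look roughly like the sketch below. Only the field names bytes_skipped and bytes_accepted come from the commit messages; the class partition_probe_sketch, its method names, batch_sketch and consume_batch are hypothetical, and the real probe and segment reader are more involved.

```cpp
#include <cstdint>

// Hypothetical, simplified probe: two internal counters with accessors,
// not registered with any metrics endpoint.
class partition_probe_sketch {
public:
    void add_bytes_skipped(uint64_t n) { _bytes_skipped += n; }
    void add_bytes_accepted(uint64_t n) { _bytes_accepted += n; }
    uint64_t get_bytes_skipped() const { return _bytes_skipped; }
    uint64_t get_bytes_accepted() const { return _bytes_accepted; }

private:
    uint64_t _bytes_skipped{0};
    uint64_t _bytes_accepted{0};
};

// Hypothetical batch header: just enough to decide skip versus accept.
struct batch_sketch {
    int64_t last_offset;
    uint64_t size_bytes;
};

// Accounts for one batch. Batches that end before start_offset are skipped;
// everything else is accepted. Returns true when the batch is accepted.
bool consume_batch(
  const batch_sketch& b, int64_t start_offset, partition_probe_sketch& probe) {
    if (b.last_offset < start_offset) {
        probe.add_bytes_skipped(b.size_bytes);
        return false;
    }
    probe.add_bytes_accepted(b.size_bytes);
    return true;
}
```

A test can then run a timequery and check the skipped-bytes counter through the accessor: a large value relative to accepted bytes points to a futile scan.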

Lazin requested a review from andijcr February 6, 2024 20:22

github-actions bot added the area/redpanda label Feb 6, 2024

Lazin requested a review from abhijat February 6, 2024 20:22

dotnwat reviewed Feb 7, 2024

abhijat reviewed Feb 7, 2024

abhijat previously approved these changes Feb 7, 2024

piyushredpanda added this to the v23.3.5 milestone Feb 7, 2024

Lazin dismissed abhijat’s stale review via f4f6743 February 7, 2024 12:47

Lazin force-pushed the pr/bytes-scanned-bytes-consumed branch from 0741a47 to f4f6743 Compare February 7, 2024 12:47

Lazin requested review from abhijat and dotnwat February 7, 2024 12:47

abhijat approved these changes Feb 7, 2024

Lazin merged commit 7ed235b into redpanda-data:dev Feb 7, 2024
17 checks passed

This was referenced Feb 7, 2024

[v23.2.x] cloud_storage: Fix timequery bug that triggers full scan #16516

Closed

[v23.3.x] cloud_storage: Timequery starts from the beginning of the read-replica #16517

Closed

[v23.3.x] cloud_storage: Fix timequery bug that triggers full scan #16518

Merged
