cloud_storage: Fix timequery bug that triggers full scan #16503

Merged: 6 commits, Feb 7, 2024

Lazin
Contributor

@Lazin Lazin commented Feb 6, 2024

A timequery can trigger a full partition scan if the query timestamp overshoots the last segment. In this case, if log_reader_config.start_offset is below the first segment in the manifest and log_reader_config.max_offset is above the last offset in the manifest, the scan is initialized from the beginning of the partition, which is very expensive.

This PR adds

  • New partition-level metrics that are used to detect skipped bytes generated by a futile partition scan. These metrics are not exposed to the metrics endpoint and are only used internally (mostly for testing; they will also be used for alerts).
  • New test suite in the remote_partition_test that performs timequery (TODO: add fuzz test).
  • One-line fix for the bug.

Fixes #16479

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.3.x
  • v23.2.x
  • v23.1.x

Release Notes

Bug Fixes

  • Fix timequery error that triggered full partition scan

Use timestamp to generate segments.
Add new overload of the 'scan_remote_partition' function used in tests.
This overload uses timequery. Also, add timestamps to the metadata.
Add extra fields without exposing them to the metrics endpoint. The
fields are bytes_skipped and bytes_accepted. Also, add accessor methods
so the probe values can be read programmatically.
Update bytes_skipped when batches are skipped in the segment reader.
Update bytes_accepted when batches are accepted.
The new test uses remote_partition test code but performs timequery.
It reproduces the issue from 16479 among other things.
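
The probe counters described in the commit messages above can be pictured roughly as follows. This is a minimal sketch, not the code from the PR: only the counter names bytes_skipped and bytes_accepted come from the commit messages. The class partition_probe_sketch, the struct batch_sketch, the accessor names, and the function scan_batches are hypothetical, and the skip rule (skip a batch whose max timestamp is below the query timestamp) is a simplifying assumption.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a partition-level probe. The counters are kept
// as plain fields and are only readable through accessors, so a test can
// inspect them without a metrics endpoint.
class partition_probe_sketch {
public:
    void add_bytes_skipped(uint64_t n) { _bytes_skipped += n; }
    void add_bytes_accepted(uint64_t n) { _bytes_accepted += n; }

    uint64_t get_bytes_skipped() const { return _bytes_skipped; }
    uint64_t get_bytes_accepted() const { return _bytes_accepted; }

private:
    uint64_t _bytes_skipped{0};
    uint64_t _bytes_accepted{0};
};

// Hypothetical batch descriptor: just a timestamp and a size.
struct batch_sketch {
    int64_t max_timestamp;
    uint64_t size_bytes;
};

// Toy reader loop (assumption): batches older than the query timestamp are
// skipped, all others are accepted. Each outcome updates one counter.
inline void scan_batches(
  const std::vector<batch_sketch>& batches,
  int64_t query_timestamp,
  partition_probe_sketch& probe) {
    for (const auto& b : batches) {
        if (b.max_timestamp < query_timestamp) {
            probe.add_bytes_skipped(b.size_bytes);
        } else {
            probe.add_bytes_accepted(b.size_bytes);
        }
    }
}
```

In this sketch, a query timestamp beyond every batch leaves bytes_accepted at zero while bytes_skipped equals the total size scanned, which is the kind of signal a test could use to flag a futile scan.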
@vbotbuildovich
Collaborator

new failures in https://buildkite.com/redpanda/redpanda/builds/44773#018d806c-61f2-4b10-a0ab-6e1faf16937f:

"rptest.tests.partition_movement_test.SIPartitionMovementTest.test_cross_shard.num_to_upgrade=2.cloud_storage_type=CloudStorageType.S3"

auto so = manifest.get_start_kafka_offset().value_or(
  kafka::offset::min());
if (
  model::offset_cast(config.start_offset) < so
  && config.first_timestamp.has_value() == false
Member

what is this part of the conjunction?

config.first_timestamp.has_value() == false

is that referring to this from the commit message?

didn't find the segment

Contributor Author

it checks whether a timequery is being used; first_timestamp is set in that case
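
To make the role of that condition concrete, here is a minimal standalone sketch of the guard. It is not the actual reader code: reader_config_sketch, manifest_range_sketch, and should_reset_to_start are hypothetical names, offsets are plain integers rather than the real offset types, and the max_offset check is taken from the PR description rather than from the snippet under review.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical, simplified reader config. In this sketch first_timestamp is
// set only when the read is a timequery.
struct reader_config_sketch {
    int64_t start_offset;
    int64_t max_offset;
    std::optional<int64_t> first_timestamp;
};

// Hypothetical offset range covered by the manifest.
struct manifest_range_sketch {
    int64_t start_offset;
    int64_t last_offset;
};

// Returns true if the reader should fall back to the start of the partition.
// The last term of the conjunction disables the fallback for timequeries.
inline bool should_reset_to_start(
  const reader_config_sketch& cfg, const manifest_range_sketch& m) {
    assert(m.start_offset <= m.last_offset);
    const bool is_timequery = cfg.first_timestamp.has_value();
    return cfg.start_offset < m.start_offset
           && cfg.max_offset > m.last_offset
           && !is_timequery;
}
```

With a manifest range of 100 to 200, a plain offset read over 0 to 1000 still resets to the start, while the same range with first_timestamp set does not, so an overshooting timequery no longer falls into the expensive path.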

abhijat
abhijat previously approved these changes Feb 7, 2024
Contributor

@abhijat abhijat left a comment

lgtm other than minor suggestion

@piyushredpanda piyushredpanda added this to the v23.3.5 milestone Feb 7, 2024
The remote_partition_reader can reset to the beginning of the partition
if it didn't find the segment and both config.start_offset and
config.max_offset are outside of the manifest. When this logic is
applied to a timequery whose timestamp overshoots the last segment in
the manifest, we end up with a full partition scan. This commit fixes
this by disabling the logic for timequeries.
@Lazin Lazin merged commit 7ed235b into redpanda-data:dev Feb 7, 2024
17 checks passed
@vbotbuildovich
Collaborator

/backport v23.3.x

@vbotbuildovich
Collaborator

/backport v23.2.x

@vbotbuildovich
Collaborator

Failed to create a backport PR to v23.2.x branch. I tried:

git remote add upstream https://github.com/redpanda-data/redpanda.git
git fetch --all
git checkout -b backport-pr-16503-v23.2.x-950 remotes/upstream/v23.2.x
git cherry-pick -x fff0d142ff856c5fce6e03231a5bbccc6d0fd1f1 6d18fadce66bcdc61b8ade6645a7fce36117893e 04ac537fa0df32bcd91424c7a50cbb4e703239dc adefd4cc22de0954bfe941005ae06e82b0e690d7 f9a9afe1a55d6dc51696d5880ab6fc047fe807c9 f4f67432587d6e7e8ae4369c414462622851e211

Workflow run logs.

Development

Successfully merging this pull request may close these issues.

cloud_storage: Timequery starts from the beginning of the read-replica
5 participants