[v22.3.x] cloud_storage: fix timequery edge cases #9876

vbotbuildovich · 2023-04-06T12:13:05Z

Backport of PR #9815
Fixes #9875,

The local GC test issues many requests in parallel, which can make it awkward to read logs. On failure, issue some queries sequentially to the offsets that failed, which also gives us a measurement of whether the failure was transient or permanent. Related: redpanda-data#9669 (cherry picked from commit 8ddab78)

_start_offset is only set once some background housekeeping happens: on a freshly created partition with a few records produced into it, it can be -1. In this case when we read the start timestamp, we should not try to find a segment containing this negative start offset. We may later refactor to avoid having non-empty logs carry a confusingly -1 start_offset. (cherry picked from commit f440c89)

... so that it may be executed against Apache Kafka to check the compatibility of the behavior. (cherry picked from commit 8693ad2)

This was incorrectly expecting a -1 when querying below the start of the log: the Kafka-compatible behavior is to return the first offset in the log with a timestamp greater than the query (i.e. the lwm of the log) (cherry picked from commit e2e4186)

...where the queried timestamp falls equal to the local log start_offset, or between the local log's start_timestamp() and start_offset. The existing code was incorrectly assuming that start_timestamp pointed to the same record as start_offset, whereas actually start_timestamp gives the timestamp at the start of the first segment, and start_offset may point to some location within a segment. This would result in timequery returning an offset too high if there was data in tiered storage that could have found the timestamp at a lower offset, or returning an offset when it should have returned null, if there was not tiered storage data available before start_offset. Fixes redpanda-data#9669 (cherry picked from commit f3f77f0)

BenPope · 2023-04-10T18:53:43Z

Ping

Methods 'rbegin()' and 'rend()' of the partition_manifest are only used to get last value in the manifest. The methods are replaced with 'last_segment' method that returns optional value which may contain a segment_meta value of the last uploaded segment in the partition. (cherry picked from commit 78a70b3)

piyushredpanda · 2023-04-13T21:56:07Z

Failure appears to be #10024

andrwng

Yep, the only failure is a scaling_up_test which isn't related to cloud storage bits

jcsp added 5 commits April 6, 2023 12:13

tests: refactor test_timequery_below_start_offset

c41022f

... so that it may be executed against Apache Kafka to check the compatibility of the behavior. (cherry picked from commit 8693ad2)

vbotbuildovich added this to the v22.3.x-next milestone Apr 6, 2023

vbotbuildovich added the kind/backport PRs targeting a stable branch label Apr 6, 2023

github-actions bot added the area/redpanda label Apr 6, 2023

piyushredpanda modified the milestones: v22.3.x-next, v22.3.16 Apr 10, 2023

BenPope assigned jcsp Apr 10, 2023

BenPope requested review from jcsp and VladLazar April 10, 2023 20:35

jcsp marked this pull request as ready for review April 12, 2023 19:06

andrwng approved these changes Apr 13, 2023

View reviewed changes

andrwng merged commit 83e6286 into redpanda-data:v22.3.x Apr 13, 2023

BenPope mentioned this pull request Apr 13, 2023

[v22.3.x] CI Failure (assert not any([e > 0 for e in errors])) in TimeQueryTest.test_timequery_with_local_gc #9875

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[v22.3.x] cloud_storage: fix timequery edge cases #9876

[v22.3.x] cloud_storage: fix timequery edge cases #9876

vbotbuildovich commented Apr 6, 2023

BenPope commented Apr 10, 2023

piyushredpanda commented Apr 13, 2023

andrwng left a comment

[v22.3.x] cloud_storage: fix timequery edge cases #9876

[v22.3.x] cloud_storage: fix timequery edge cases #9876

Conversation

vbotbuildovich commented Apr 6, 2023

BenPope commented Apr 10, 2023

piyushredpanda commented Apr 13, 2023

andrwng left a comment

Choose a reason for hiding this comment