[v23.1.x] cloud_storage: fix HWM on read replicas #9770

andrwng · 2023-03-31T15:54:53Z

Pre-emptive backport of #9493

We were previously doing no offset translation of the HWM in read replicas, meaning the returned offset would almost always be higher than it should've been.

This PR changes delta_offset_end to hold the delta one past the end of each segment. The HWM can then be computed by translating the offset just after the segment's end with delta_offset_end, rather than translating the end offset and adding 1.
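As a rough illustration of the updated semantics, the lookup could be sketched as below. This is not the code from the patch: `segment_meta_sketch` and the free function `next_kafka_offset` are hypothetical, simplified stand-ins that use plain integers instead of Redpanda's offset types, and they assume the manifest is just an ordered list of segments.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical, simplified stand-in for per-segment metadata; the real
// manifest entries carry more fields and use strong offset types.
struct segment_meta_sketch {
    int64_t committed_offset; // last Redpanda (model) offset in the segment
    int64_t delta_offset_end; // updated semantics: delta at committed_offset + 1
};

// Kafka offset one past the last uploaded segment, or nullopt when nothing
// has been uploaded yet. A read replica could report this value as its HWM.
inline std::optional<int64_t>
next_kafka_offset(const std::vector<segment_meta_sketch>& segments) {
    if (segments.empty()) {
        return std::nullopt;
    }
    const auto& last = segments.back();
    assert(last.delta_offset_end >= 0); // a delta is a count of batches
    const int64_t next_model_offset = last.committed_offset + 1;
    // Translate the next offset directly; no "+ 1" after translation.
    return next_model_offset - last.delta_offset_end;
}
```

With a last segment ending at model offset 140 and a delta_offset_end of 4, this yields 137.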

The caveat here is reconciling the updated semantics with data written in older versions. We need to be mindful that older versions may have a delta_offset_end that is 1 lower than what it should be, meaning the reported HWM and next_kafka_offset calls may be 1 higher than they should be.

Looking at their usages, it seems like this should be a benign change, but it would be good to be wary of this moving forward.

Fixes #9513

Backports Required

Release Notes

Bug Fixes

Fixed a bug that would result in read replicas reporting a high watermark that was too high.

CONFLICT:
- ducktape test parameterization
- missing last_segments() call

We were previously doing no offset translation of the HWM in read replicas, meaning the returned offset would almost always be higher than it should've been. This commit updates to translate based on the `delta_offset_end` of the last segment.

CONFLICT:
- no last_segment() call in manifest

The field is currently not expressive enough to report the HWM, the Kafka offset just past the end of a partition. Currently each segment records delta_offset_end as the difference between Kafka and RP offset at the end of the log:

    c: config batch
    d: data batch
    delta: computed by counting the number of non-data batches prior to the given offset
    kafka = model - delta

    type:     c    c    d    d    d    c    c       c    c
           +-----------------------------------+ +----------+
    model: |  0    1    2   ...  138  139  140 | | 141  142 |
           +-----------------------------------+ +----------+
    delta:    0    1    2   ...   2    2    3       4    5     6
           +-----------------------------------+ +----------+
    kafka: |  0    0    0   1..  136  137  137 | | 137  137 |
           +-----------------------------------+ +----------+

In the above example, consider when the first segment ([0, 140]) is uploaded, and the second ([141, 142]) isn't. A read replica would previously report the HWM as 138 (140 - 3 + 1), when really it should be 137, since the highest data batch is 136. To translate the correct HWM, we would need to translate 141 and record its delta as 4, giving us enough information in metadata to tell that 140 is a config batch.

To remediate this, this commit changes the existing delta_offset_end field to translate the next offset, and expects the next Kafka offset to be translated with the next offset.

Regarding backwards compatibility, conditionally increasing the delta_offset_end value means older data will over-report by 1. This should be safe, given the value is currently only used for optimizations that skip segments with no data batches.

Fixes redpanda-data#9513
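The arithmetic in this example can be checked with a small standalone sketch. Everything here is hypothetical and for illustration only: `deltas_for`, `hwm_old`, `hwm_new`, and `example_log` are not Redpanda functions, and the log is modelled as a string with one character per offset, assuming offsets 0-1 are config batches, 2-138 are data batches, and 139-142 are config batches, as in the diagram.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Delta at each offset: the number of non-data batches strictly before it.
// Input has one char per offset ('c' = config batch, 'd' = data batch).
// Output has size() + 1 entries; the last is the delta one past the end.
inline std::vector<int64_t> deltas_for(const std::string& batch_types) {
    std::vector<int64_t> deltas;
    deltas.reserve(batch_types.size() + 1);
    int64_t non_data = 0;
    for (char t : batch_types) {
        deltas.push_back(non_data);
        if (t != 'd') {
            ++non_data;
        }
    }
    deltas.push_back(non_data);
    return deltas;
}

// Previous behaviour: translate the last offset, then add 1.
inline int64_t hwm_old(const std::vector<int64_t>& deltas, int64_t last_offset) {
    assert(last_offset >= 0);
    return last_offset - deltas.at(static_cast<std::size_t>(last_offset)) + 1;
}

// Updated behaviour: translate the next offset with its own delta.
inline int64_t hwm_new(const std::vector<int64_t>& deltas, int64_t last_offset) {
    assert(last_offset >= 0);
    const int64_t next = last_offset + 1;
    return next - deltas.at(static_cast<std::size_t>(next));
}

// Assumed batch layout of the example: 2 config, 137 data, 4 config.
inline std::string example_log() {
    return std::string(2, 'c') + std::string(137, 'd') + std::string(4, 'c');
}
```

For the uploaded segment ending at offset 140, `hwm_old` gives 138 and `hwm_new` gives 137. The two agree whenever the last batch of the segment is a data batch, which is why older metadata over-reports by at most 1.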

VladLazar · 2023-03-31T16:11:36Z

src/v/cloud_storage/partition_manifest.cc

@@ -263,12 +263,12 @@ const model::offset partition_manifest::get_last_offset() const {
 }

 const std::optional<kafka::offset>
-partition_manifest::get_last_kafka_offset() const {
+partition_manifest::get_next_kafka_offset() const {
    auto last_seg = last_segment();


partition_manifest::last_segment doesn't exist on this version. Did you mean to add it?

Nice spot, this was unintentional. I just removed it.

CONFLICT:
- ducktape headers
- fast_uploads flag not in this branch

A fix is included in this patch series that fixes read replica high watermark translation. To validate this, this commit adds a test that performs an upgrade from a previously broken version, and validates equality with respect to the source cluster.

vshtokman · 2023-04-07T18:24:01Z

v22.3.x backport: #9727

github-actions bot added the area/redpanda label Mar 31, 2023

andrwng requested review from VladLazar and Lazin March 31, 2023 15:58

andrwng added 2 commits March 31, 2023 09:09

VladLazar reviewed Mar 31, 2023


andrwng force-pushed the v23.1.x-rrr-hwm branch 3 times, most recently from bc39573 to c4bf22f Compare March 31, 2023 18:52

andrwng force-pushed the v23.1.x-rrr-hwm branch from c4bf22f to 6bc80e1 Compare March 31, 2023 18:54

VladLazar approved these changes Apr 3, 2023


andrwng merged commit 2215d7a into redpanda-data:v23.1.x Apr 3, 2023

RafalKorepta added this to the v23.1.5 milestone Apr 3, 2023
