KAFKA-13499: Avoid restoring outdated records #22115

Merged
bbejeck merged 23 commits into apache:trunk from gabriellefu:restoring_window
May 4, 2026

Contributor

@gabriellefu gabriellefu commented Apr 22, 2026

  1. Expose the retention period to storeMetadata.
  2. In prepareChangelogs(), when no checkpoint exists, seek to a specific
    timestamp instead of always seeking to the beginning, to avoid restoring
    outdated records.
  3. Change: instead of the wall clock, use the latest timestamp in the
    changelog as the current time, and seek to the timestamp
    latest_changelog_stamp_time - retention_period.
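
The seek position described in item 3 can be sketched as plain arithmetic. This is a minimal illustration under stated assumptions, not the code from the PR: the class `RestoreStartSketch`, the method `seekTimestamp`, the millisecond units, and the clamping to zero are all hypothetical. In a real consumer the resulting timestamp could be resolved to an offset with `Consumer#offsetsForTimes`, though the description does not say which call the PR uses.

```java
class RestoreStartSketch {

    // Hypothetical helper: returns the timestamp to start restoring from,
    // treating the latest changelog record timestamp (not the wall clock) as "now".
    static long seekTimestamp(final long latestChangelogTimestampMs, final long retentionPeriodMs) {
        // Clamp at zero so that a retention period longer than the observed
        // time range falls back to the earliest possible timestamp (an assumption).
        return Math.max(0L, latestChangelogTimestampMs - retentionPeriodMs);
    }

    public static void main(final String[] args) {
        final long latest = 1_000_000L;  // latest record timestamp in the changelog
        final long retention = 60_000L;  // store retention period

        System.out.println(seekTimestamp(latest, retention));   // prints 940000
        System.out.println(seekTimestamp(30_000L, retention));  // prints 0
    }
}
```

Anchoring on the changelog's own latest timestamp keeps the window stable when the application has been idle: records that were inside the retention window at the time of the last write are still restored, regardless of how much wall-clock time has passed since.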

Reviewers: TengYao Chi <frankvicky@apache.org>, Bill Bejeck <bbejeck@apache.org>

@github-actions github-actions Bot added triage PRs from the community streams clients labels Apr 22, 2026
@gabriellefu
Contributor Author

The previously failing smoke test now passes:
image

Contributor

@frankvicky frankvicky left a comment

@gabriellefu Thanks for the PR.
Please run ./gradlew clean spotlessApply to fix the CI failure.

@github-actions github-actions Bot removed the triage PRs from the community label Apr 23, 2026
Member

@bbejeck bbejeck left a comment

Thanks @gabriellefu, I made a pass.

@gabriellefu gabriellefu requested a review from bbejeck April 29, 2026 14:49

newPartitionsWithoutStartOffset.add(partition);
final long retentionPeriod = storeMetadata.retentionPeriod();
if (retentionPeriod > 0 && retentionPeriod != Long.MAX_VALUE) {
Member

@gabriellefu I was playing around some more and I think I found something else: new standby tasks won't have a valid endOffset, so they need to be filtered out. Otherwise, because the restore consumer uses auto.offset.reset=none, every batched partition falls back to seek-to-beginning.

So we can update the if block to this:
if (retentionPeriod > 0 && retentionPeriod != Long.MAX_VALUE && endOffset != null && endOffset > 0)
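
The suggested condition can be read as a standalone predicate. The following is a small sketch for illustration only: the class `TimestampSeekGuardSketch` and the method `shouldSeekByTimestamp` are hypothetical names, and the merged change may structure the check differently.

```java
class TimestampSeekGuardSketch {

    // Hypothetical predicate mirroring the suggested condition: seek by timestamp
    // only for a finite, positive retention period and a known, positive end offset.
    static boolean shouldSeekByTimestamp(final long retentionPeriod, final Long endOffset) {
        return retentionPeriod > 0
            && retentionPeriod != Long.MAX_VALUE
            && endOffset != null
            && endOffset > 0;
    }

    public static void main(final String[] args) {
        System.out.println(shouldSeekByTimestamp(60_000L, 100L));        // true
        System.out.println(shouldSeekByTimestamp(60_000L, null));        // false: end offset unknown
        System.out.println(shouldSeekByTimestamp(Long.MAX_VALUE, 100L)); // false: unbounded retention
        System.out.println(shouldSeekByTimestamp(60_000L, 0L));          // false: end offset is zero
    }
}
```

Partitions for which the predicate is false would keep the existing behavior of seeking to the beginning, so only partitions with a usable end offset take the timestamp-based path.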

Contributor Author

I now use restoreConsumer.endOffsets(), which should solve the standby task problem.

@gabriellefu gabriellefu force-pushed the restoring_window branch 2 times, most recently from 200526e to 3c1f8d1 Compare May 1, 2026 19:58
Member

bbejeck commented May 4, 2026

Member

@bbejeck bbejeck left a comment

Thanks @gabriellefu ! LGTM

@bbejeck bbejeck merged commit 94b6886 into apache:trunk May 4, 2026
22 checks passed
Member

bbejeck commented May 4, 2026

Merged #22115 into trunk
