
@tsreaper tsreaper commented Nov 9, 2022

(Cherry-picked from #354)

Consider the following scenario when snapshot committing is slow:

  • A writer produces some records at checkpoint T.
  • It produces no records at checkpoint T+1 and is closed.
  • It produces some records at checkpoint T+2. It is reopened and reads the latest sequence number from disk. However, the snapshot at checkpoint T may not have been committed yet, so the sequence number it reads might be too small.

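In this scenario, the reopened writer assigns sequence numbers that are too small to the records from checkpoint T+2. Records from checkpoint T may then overwrite records from checkpoint T+2, because the older records carry the larger sequence numbers.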
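
The effect can be reproduced in miniature with the sketch below. It is only an illustration under stated assumptions: `Rec`, `merge`, and `valueAfterReopen` are hypothetical names and not classes from this repository, and the merge rule (the largest sequence number wins per key) is a simplification of how records are deduplicated.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SequenceNumberCollision {

    /** Hypothetical record: a key, a value, and the sequence number assigned by the writer. */
    static final class Rec {
        final String key;
        final String value;
        final long sequenceNumber;

        Rec(String key, String value, long sequenceNumber) {
            this.key = key;
            this.value = value;
            this.sequenceNumber = sequenceNumber;
        }
    }

    /** Simplified merge: for each key, keep the value with the largest sequence number. */
    static Map<String, String> merge(List<Rec> records) {
        Map<String, Rec> winners = new HashMap<>();
        for (Rec r : records) {
            winners.merge(r.key, r,
                    (old, cur) -> cur.sequenceNumber > old.sequenceNumber ? cur : old);
        }
        Map<String, String> result = new HashMap<>();
        for (Rec r : winners.values()) {
            result.put(r.key, r.value);
        }
        return result;
    }

    /**
     * Checkpoint T writes sequence numbers 0..2. The writer is then reopened at
     * checkpoint T+2 and starts numbering from startSequenceAfterReopen.
     */
    static String valueAfterReopen(long startSequenceAfterReopen) {
        List<Rec> all = new ArrayList<>();
        all.add(new Rec("k", "T-first", 0));
        all.add(new Rec("other", "T-other", 1));
        all.add(new Rec("k", "T-last", 2));
        all.add(new Rec("k", "T+2", startSequenceAfterReopen));
        return merge(all).get("k");
    }
}
```

If the snapshot of checkpoint T is not visible when the writer reopens, it restarts at sequence number 0 and `valueAfterReopen(0)` yields the stale value `T-last`. Had it seen the snapshot, it would continue from 3 and `valueAfterReopen(3)` yields `T+2`.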

This PR fixes the bug by comparing a writer's last modified commit identifier with the latest committed identifier before closing the writer. The writer is closed only if the comparison shows that its last modification has already been committed; otherwise it stays open and keeps its in-memory sequence number.
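
A minimal sketch of that check follows. It is not the code from this PR: `WriterCloseCheck`, `WriterState`, and the method names are hypothetical, and it assumes commit identifiers are monotonically increasing `long` values, so that "already committed" can be expressed as `lastModified <= latestCommitted`.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

class WriterCloseCheck {

    /** Hypothetical per-writer bookkeeping: the commit identifier of its last write. */
    static final class WriterState {
        long lastModifiedCommitIdentifier;

        WriterState(long lastModifiedCommitIdentifier) {
            this.lastModifiedCommitIdentifier = lastModifiedCommitIdentifier;
        }
    }

    // Open writers, keyed by a hypothetical bucket name.
    private final Map<String, WriterState> writers = new HashMap<>();

    /** Records that the writer for this bucket produced data under the given commit identifier. */
    void write(String bucket, long commitIdentifier) {
        writers.computeIfAbsent(bucket, k -> new WriterState(commitIdentifier))
                .lastModifiedCommitIdentifier = commitIdentifier;
    }

    /**
     * Closes only writers whose last modification is already committed
     * (assumption: identifiers increase monotonically). Returns how many were closed.
     */
    int closeCommittedWriters(long latestCommittedIdentifier) {
        int closed = 0;
        Iterator<Map.Entry<String, WriterState>> it = writers.entrySet().iterator();
        while (it.hasNext()) {
            WriterState state = it.next().getValue();
            if (state.lastModifiedCommitIdentifier <= latestCommittedIdentifier) {
                it.remove();
                closed++;
            }
        }
        return closed;
    }

    boolean isOpen(String bucket) {
        return writers.containsKey(bucket);
    }
}
```

With this rule, a writer that wrote under identifier 10 stays open while the latest committed identifier is 9, and becomes eligible for closing once identifier 10 is committed.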

@tsreaper tsreaper merged commit fe0f333 into apache:release-0.2 Nov 9, 2022