BUG REPORT
Describe the bug
During checkpointing, large SST files can get corrupted. Because SST files are shared among checkpoints, later checkpoints do not repair the corruption.
To Reproduce
Add values to the state store until the SST files grow larger than 128K, then wait for checkpoints to happen. Reproducing the corruption may take multiple attempts.
Expected behavior
Checkpoints should not get corrupted.
Fix SST File corruption during checkpointing
### Motivation
Large SST files can get corrupted during checkpointing. Because SST files are shared among checkpoints, future checkpoints do not repair a corrupted file, and restoring any future checkpoint that depends on that file fails.
### Changes
The record is sent asynchronously, so the record must hold a copy of the
passed buffer. The caller retains ownership of the buffer and may modify
it after the call returns. When corruption occurred, later blocks were
overwriting earlier blocks that had not been sent yet.
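
The failure mode described above can be illustrated with a small, self-contained sketch. This is not the BookKeeper code: `BufferCopyDemo`, `DeferredWriter`, `appendByReference`, and `appendByCopy` are hypothetical names, and the asynchronous send is simulated by queueing records and flushing them later. It assumes only that the caller reuses a single buffer for every block.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BufferCopyDemo {

    /** Hypothetical stand-in for an asynchronous writer: records are queued and flushed later. */
    static final class DeferredWriter {
        private final List<byte[]> pending = new ArrayList<>();

        // Buggy variant: keeps a reference to the caller's buffer.
        void appendByReference(byte[] buf, int len) {
            pending.add(buf);
        }

        // Fixed variant: takes a private copy before returning to the caller.
        void appendByCopy(byte[] buf, int len) {
            pending.add(Arrays.copyOf(buf, len));
        }

        // Simulates the deferred send: reads each queued record only now.
        List<String> flush() {
            List<String> out = new ArrayList<>();
            for (byte[] record : pending) {
                out.add(new String(record, StandardCharsets.US_ASCII));
            }
            pending.clear();
            return out;
        }
    }

    /** Writes three blocks through one reused buffer and returns what the writer actually sent. */
    static List<String> writeBlocks(boolean copy) {
        DeferredWriter writer = new DeferredWriter();
        byte[] shared = new byte[4]; // the caller owns this buffer and reuses it for every block
        for (String block : new String[] {"AAAA", "BBBB", "CCCC"}) {
            byte[] src = block.getBytes(StandardCharsets.US_ASCII);
            System.arraycopy(src, 0, shared, 0, src.length);
            if (copy) {
                writer.appendByCopy(shared, src.length);
            } else {
                writer.appendByReference(shared, src.length);
            }
        }
        return writer.flush();
    }

    public static void main(String[] args) {
        System.out.println("by reference: " + writeBlocks(false)); // [CCCC, CCCC, CCCC]
        System.out.println("by copy:      " + writeBlocks(true));  // [AAAA, BBBB, CCCC]
    }
}
```

In the by-reference case every queued record points at the same array, so the last block overwrites the earlier ones before they are sent. Copying costs one allocation per record but decouples the record's lifetime from the caller's buffer.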
Master Issue: #2563
Reviewers: Andrey Yegorov <None>, Enrico Olivelli <eolivelli@gmail.com>, Matteo Merli <mmerli@apache.org>
This closes #2564 from sursingh/fix-sst-corruption, closes #2563