snap: delete stale snapshot immediately #2429
CI failed
any update @BusyJay
can we use failpoint for this test now?
I will add a test case once #2474 is merged, which introduces failpoint cases into integration tests.
PTAL
LGTM
PTAL @overvenus
src/raftstore/store/worker/region.rs
Outdated
@@ -109,6 +109,7 @@ impl SnapContext {
&raw_snap,
region_id
));
fail_point!("handle_gen", region_id == 1, |_| Ok(())); |
Better to add some comments here; I have trouble understanding it.
lgtm
/run-integration-tests
/run-integration-tests
When replicating a peer, if the region splits frequently, its snapshot may always overlap with other regions, so the snapshot will never be applied. Unfortunately, the GC check can do nothing about it: because the snapshot is never applied, compacted_index is always 0, which is less than any valid snapshot's index. It is also hard to detect this case in GC, because GC can't tell whether the snapshot has been verified to be stale or is just still buffered in the channel. So this PR deletes the snapshot files immediately once the snapshot fails the snapshot check.
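The reasoning in the description can be illustrated with a small, self-contained model. This is only a sketch, not the code in this PR: `SnapKey`, `SnapStore`, `gc`, and `handle_check_result` are hypothetical names, the snapshot directory is replaced by an in-memory map, and the GC rule (remove snapshots whose index is at or below compacted_index) is an assumption made for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical identifier for a snapshot file (not the real TiKV type).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct SnapKey {
    region_id: u64,
    index: u64,
}

/// Hypothetical in-memory stand-in for the snapshot directory.
#[derive(Default)]
struct SnapStore {
    files: HashMap<SnapKey, Vec<u8>>,
}

impl SnapStore {
    /// Record a received snapshot file.
    fn register(&mut self, key: SnapKey) {
        self.files.insert(key, Vec::new());
    }

    /// Assumed index-based GC: drop snapshots of `region_id` whose index is
    /// at or below `compacted_index`. Returns how many files were removed.
    fn gc(&mut self, region_id: u64, compacted_index: u64) -> usize {
        let before = self.files.len();
        self.files
            .retain(|k, _| k.region_id != region_id || k.index > compacted_index);
        before - self.files.len()
    }

    /// The idea described above: if the snapshot fails the check (here, it
    /// overlaps another region), delete its file right away instead of
    /// leaving it for GC. Returns whether the snapshot may be applied.
    fn handle_check_result(&mut self, key: SnapKey, overlaps_other_region: bool) -> bool {
        if overlaps_other_region {
            self.files.remove(&key);
            return false;
        }
        true
    }
}

/// Returns (files left after GC, files left after immediate deletion).
fn stale_snapshot_demo() -> (usize, usize) {
    let key = SnapKey { region_id: 2, index: 7 };
    let mut store = SnapStore::default();
    store.register(key);

    // The snapshot was never applied, so compacted_index is still 0 and
    // index-based GC cannot reclaim the file.
    store.gc(key.region_id, 0);
    let left_after_gc = store.files.len();

    // Deleting on a failed check reclaims it immediately.
    store.handle_check_result(key, true);
    let left_after_delete = store.files.len();

    (left_after_gc, left_after_delete)
}
```

In this model, `stale_snapshot_demo()` leaves one file behind after GC and none after the failed check is handled, which is the gap the description says GC alone cannot close.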