storage: let preemptive snapshots go through Raft #7833
Conversation
Looks reasonable to me. Acquiring exclusive access to a portion of the key space should amount to holding a read lock on an interval tree and validating that no other replica overlaps with that portion of the key space. cc @arjunravinarayan #7830.

Reviewed 1 of 1 files at r1, 1 of 1 files at r2, 1 of 1 files at r3, 2 of 2 files at r4, 4 of 4 files at r5.

storage/replica_command.go, line 2929 [r3]:
wrong place, obviously

storage/replica_command.go, line 2941 [r4]:
needs to include the replica ID itself

storage/replica_raftstorage.go, line 590 [r5]:
is that surprising? seems expected: an old snapshot is still useful if the remainder of the work can be done by replaying the log.

storage/store.go, line 2228 [r4]:
this was confusing. it's not a replica ID, it's a raft ID.

storage/store.go, line 2227 [r5]:
proceed?

storage/store.go, line 2254 [r5]:
looks like this can avoid checking the error and instead always:

storage/store.go, line 2258 [r5]:
more words?
Reviewed 1 of 1 files at r1, 1 of 2 files at r4, 4 of 4 files at r5.

storage/replica_raftstorage.go, line 576 [r5]:
I believe raft's sanity checks here are sufficient and we don't need to do our own checking here now that we're passing through raft's validation.

storage/replica_raftstorage.go, line 590 [r5]:
Review status: all files reviewed at latest revision, 17 unresolved discussions, some commit checks failed.

storage/store.go, line 2303 [r5]:

storage/replica_raftstorage.go, line 590 [r5]:
Is this documented? I think we should seriously consider adding preemptive snapshot support upstream, or we'll never stop depending on knowledge of upstream's implementation.

storage/store.go, line 2228 [r4]:

storage/replica_command.go, line 2929 [r3]:
Reviewed 7 of 7 files at r6.

storage/replica.go, line 272 [r6]:
don't think you need this raft.Storage()

storage/store.go, line 2304 [r6]:
ditto

storage/store.go, line 2330 [r6]:
what's the TODO here?
Review status: all files reviewed at latest revision, 19 unresolved discussions, some commit checks failed.

storage/client_raft_test.go, line 1361 [r5]:
This test was disabled because it was flaky. Sometimes non-preemptive snapshots occur, and I'm pretty sure it is due to the async sending of preemptive snapshots.

storage/replica_command.go, line 2943 [r5]:
See my comment on the preemptive snapshot test. I think this should really be a synchronous RPC. Even better, we should make this a streaming RPC and send the snapshot in bits, waiting for the remote to write it to disk. Both these items can wait for subsequent PRs.

storage/replica_raftstorage.go, line 631 [r6]:
If you want to be guaranteed this is the last deferred action, you should put this defer up right after the creation of the batch. Otherwise it is easy for someone else to add a defer above this one that ends up running after it. Is there a reason you prefer using a

storage/store.go, line 2327 [r6]:
Just noticed that applying a snapshot here will prevent other raft processing on the store. When we eventually make normal raft processing concurrent in
Reviewed 7 of 7 files at r6.

storage/replica_raftstorage.go, line 590 [r5]:

I started this branch as an experiment (to verify our assumption that we could avoid out-of-Raft snapshots), and that seems to have worked out well. Personally, I don't feel that we should re-enable preemptive snapshots due to the related-but-out-of-scope concurrency issues (#7830), so my suggestion is to land this but take out the temporary commit which re-enabled preemptive snapshots until we have those issues sorted out. See also #7875 about removing manual HardState updates (of which I removed one in this PR).

Review status: 1 of 7 files reviewed at latest revision, 13 unresolved discussions, some commit checks failed.

storage/client_raft_test.go, line 1361 [r5]:

Reviewed 6 of 6 files at r8.
Why is #7830 a reason to block preemptive snapshots? I don't see how this makes us any more vulnerable to overlapping ranges than we were before.

Reviewed 1 of 7 files at r9, 6 of 6 files at r10.

storage/store.go, line 2259 [r5]:

Review status: all files reviewed at latest revision, 9 unresolved discussions, some commit checks failed.

storage/store.go, line 2259 [r5]:

Couldn't the fact that a preemptive snapshot can arrive roughly along with a corresponding Raft snapshot cause issues? Perhaps not (due to them having the same range), but still. If you're more comfortable re-enabling preemptive snapshots, I'm going to defer to that judgement (in which case I should begin fixing up the corresponding tests).

Review status: 1 of 7 files reviewed at latest revision, 9 unresolved discussions, some commit checks pending.
I seem to have introduced a test failure:

fails after <1k iters. Just a reminder to myself to sort out, may not be a big deal.
@tschottdorf I just added
Hmm, the test is failing at
It doesn't repro without the PR (at least not within 20k iters on GCE when it failed in <1k with the PR).
@tschottdorf You're correct that it doesn't repro without this branch. Do you need help investigating the failure?
I'm sidetracked looking at Jepsen, so if you have spare capacity I'd

On Wed, Jul 20, 2016 at 9:47 AM Peter Mattis notifications@github.com

-- Tobias
Ok. I can reproduce the problem locally using your branch. |
The
Fixes a test failure seen in cockroachdb#7833.
While I wait for @bdarnell's feedback, I'll be fixing the preemptive snapshot tests (tomorrow) since the general consensus seems to be that we do want to enable them again.
except for the

Reviewed 6 of 6 files at r12.

storage/store.go, line 2259 [r5]:
Change the way in which preemptive snapshots are applied to use a temporary Raft group. This lets Raft perform the verification of the snapshot against its on-disk state and allows us to remove some of the ad-hoc checks we have added recently. Some fallout from this was making Raft config creation reusable, trimming down `(*Replica).applySnapshot` and minor improvements on the sending side. Re-enabled preemptive snapshots, though the corresponding test is still flaky and remains disabled. It is already tracked separately. See cockroachdb#6144.
The test as written invited flakiness and seemed to have successfully procured it with the preemptive snapshot changes when combined with a lease acquisition in the replica change trigger.
Introduce a synchronous mode for RaftTransport. Abort the ChangeReplicas operation if the snapshot cannot be sent. This required relaxing some of the sanity checks introduced in cockroachdb#7833 in order to get all the tests passing reliably. Notably, a replica with an active raft group can now accept preemptive snapshots. Fixes cockroachdb#7819 Closes cockroachdb#8604 May address cockroachdb#8007, cockroachdb#8056, and cockroachdb#8594
Change the way in which preemptive snapshots are applied to use
a temporary Raft group. This lets Raft perform the verification
of the snapshot against its on-disk state and allows us to remove
some of the ad-hoc checks we have added recently.

Some fallout from this was making Raft config creation reusable,
trimming down (*Replica).applySnapshot and minor improvements
on the sending side.
See #6144.
cc @tamird @bdarnell @petermattis @arjunravinarayan