kvserver: sep-raft-log: make log application stand-alone #75729

Open · tbg opened this issue Jan 31, 2022 · 7 comments
Labels: C-enhancement (Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)), T-kv (KV Team)

tbg (Member) commented Jan 31, 2022

Is your feature request related to a problem? Please describe.

Today, there is no "library-style" way of crafting a CockroachDB log of commands.

We can in principle craft individual commands by evaluating requests (the
batcheval package), but below that level you need a *Replica, which strings
together the raft log and the apply loop.

We have long felt that, in the abstract, we could benefit from better
modularization here, but now we are starting to see this come up in projects we
are working on, for example:

// ReplicasStorage does interpret the raft state (all the unreplicated
// range-ID local key names prefixed by Raft), and the RangeTombstoneKey. This
// is necessary for it to be able to maintain invariants spanning the raft log
// and the state machine (related to raft log truncation, replica lifetime
// etc.), including reapplying raft log entries on restart, to the state
// machine. All accesses (read or write) to the raft log and RangeTombstoneKey
// must happen via ReplicasStorage. ReplicasStorage does not by itself apply
// committed raft log entries to the state machine in a running system -- this
// is because state machine application in a running system has complex
// data-structure side-effects that are outside the scope of ReplicasStorage.

and similarly the Loss of Quorum Recovery tooling could benefit from a way to interpret the raft log at the right level of abstraction.

Describe the solution you'd like

Continue the work begun by #39254
and further extract the logic pertaining to entry application so that it can be
used to apply log entries outside of the context of a fully instantiated
Replica. I think most of the work will be to wean the apply code off the
many direct accesses to replica.mu and the associated fields.

As an acceptance criterion, we should be able to write a unit test that sets up
a test cluster, appends some commands to the log, stops the cluster and
subsequently re-applies the log entries to an empty state, arriving at an
identical state machine.
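
For concreteness, such a test might look roughly like the sketch below. Everything marked hypothetical does not exist today; the cluster and require helpers are the usual test utilities.

```go
// Sketch only: ReopenStore, ReplayLog, and Checksum are hypothetical names for
// the standalone-apply API this issue asks for.
func TestReapplyRaftLogStandalone(t *testing.T) {
	ctx := context.Background()

	// Run a cluster (it would need an on-disk store for this to work) and
	// append a few commands to the scratch range's raft log.
	tc := testcluster.StartTestCluster(t, 1, base.TestClusterArgs{})
	k := tc.ScratchRange(t)
	require.NoError(t, tc.Server(0).DB().Put(ctx, k, "some value"))
	desc, err := tc.LookupRange(k)
	require.NoError(t, err)
	tc.Stopper().Stop(ctx) // stop the cluster, keeping the on-disk store

	// Re-open the stopped store's raft log, re-apply its entries to an empty
	// state machine, and verify the result matches the original state.
	logEng := ReopenStore(t)                        // hypothetical
	emptyEng := storage.NewDefaultInMemForTesting() // fresh, empty state
	defer emptyEng.Close()
	require.NoError(t, ReplayLog(ctx, logEng, emptyEng, desc.RangeID)) // hypothetical
	require.Equal(t,
		Checksum(t, logEng, desc.RangeID),   // hypothetical
		Checksum(t, emptyEng, desc.RangeID), // hypothetical
	)
}
```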

Ideally, we also get a testing win by allowing tests to be set up with bespoke
raft logs that exercise hand-crafted sequences of commands; this is very difficult
today and essentially doesn't happen for that very reason.

Note: inclusion of splits/merges will add complexity. We may not need this in
the first pass but should think about how this could work in the future.

Epic: CRDB-220
Jira issue: CRDB-12807

@tbg tbg added the C-enhancement label Jan 31, 2022
@tbg tbg added this to Incoming in Replication via automation Jan 31, 2022
tbg (Member, Author) commented Feb 4, 2022

[Whiteboard photo: 20220204_140139]

tbg (Member, Author) commented Feb 4, 2022

I'm not sure this photo of my whiteboard is helpful for anyone but me, but I spent a little bit of time looking into what's needed here. I think there are a few "themes":

  • the "log encoding" portion on the right-hand side is done manually in multiple places today. A little abstraction over this would be useful, and it should be straightforward to write as well.
  • the apply side (left-hand side) is already nicely modular thanks to prior work by @nvanbenschoten (all the way back in '19!), but many of the pieces of code that would need to be reusable aren't quite there yet, so we need to pull out things like checkForcedError to make them re-usable. I expect most of these changes to be surgical.
  • I don't think we're too far off on the evaluate side either. Assuming everything the commands we want to manufacture programmatically need is populated in the batcheval.EvalContext, we will get a RaftCommand back, and making the lease/closed timestamp code reusable shouldn't be hard either, especially since this is all much easier when operating offline without concurrency or in-memory state (see the sketch after this list). We don't care much about the eval side outside of testing, so supporting just a subset of commands when we need them is good enough.
  • to support splits and merges, we would similarly need to make all the disk-touching components of these operations' side effects reusable, and probably restructure the code a bit to make sure there aren't two parallel control flows that can diverge easily. It's unclear whether we need to support those anytime soon; just putting it out there.
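
To make the evaluate-side bullet concrete, the offline flow would look roughly like the sketch below. Signatures are approximate, and the final encoding helpers are hypothetical; this is not working code today, just the shape of it.

```go
// manufactureEntryPayload sketches "offline" command creation: evaluate one
// request against a throwaway batch (no *Replica), wrap the result in a
// RaftCommand, and encode it the way the proposal path would.
func manufactureEntryPayload(ctx context.Context, eng storage.Engine) ([]byte, error) {
	req := &roachpb.PutRequest{
		RequestHeader: roachpb.RequestHeader{Key: roachpb.Key("a")},
		Value:         roachpb.MakeValueFromString("v"),
	}
	batch := eng.NewBatch()
	defer batch.Close()
	resp := &roachpb.PutResponse{}
	if _, err := batcheval.Put(ctx, batch, batcheval.CommandArgs{
		EvalCtx: (&batcheval.MockEvalCtx{}).EvalContext(), // would need realistic state
		Header:  roachpb.Header{Timestamp: hlc.Timestamp{WallTime: 1}},
		Args:    req,
	}, resp); err != nil {
		return nil, err
	}
	cmd := kvserverpb.RaftCommand{
		WriteBatch: &kvserverpb.WriteBatch{Data: batch.Repr()},
	}
	data, err := protoutil.Marshal(&cmd)
	if err != nil {
		return nil, err
	}
	// encodeRaftCommandPayload and makeIDKey are hypothetical: version byte +
	// command ID + marshaled RaftCommand, as done today inside (*Replica).propose.
	return encodeRaftCommandPayload(makeIDKey(), data), nil
}
```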

tbg added a commit to tbg/cockroach that referenced this issue Feb 5, 2022
I'm starting to type code for cockroachdb#75729 and it's clear immediately
that we shouldn't have these protos in the `kvserver` package
as this will make dependency cycles hard to avoid.

We've long introduced the `kvserverpb` package, but simply
didn't pull all the protos into it.

This commit moves `raft.proto` to `kvserverpb`.

There's still a bit of protobuf left in `api.proto`, but
that can be handled separately.

Release note: None
craig bot pushed a commit that referenced this issue Feb 5, 2022
76113: kvserver: move raft.proto to kvserverpb r=erikgrinaker a=tbg

kvserver: move raft.proto to kvserverpb

I'm starting to type code for #75729 and it's clear immediately
that we shouldn't have these protos in the `kvserver` package
as this will make dependency cycles hard to avoid.

We've long introduced the `kvserverpb` package, but simply
didn't pull all the protos into it.

This commit moves `raft.proto` to `kvserverpb`.

There's still a bit of protobuf left in `api.proto`, but
that can be handled separately (#76114)

Release note: None


Co-authored-by: Tobias Grieger <tobias.b.grieger@gmail.com>
tbg added a commit to tbg/cockroach that referenced this issue Feb 6, 2022
This is similar to cockroachdb#76113 in that it prevents import cycles in cockroachdb#75729.
We don't want the code moved here to necessarily stay in `kvserverbase`,
but that is a good place for now.

Release note: None
craig bot pushed a commit that referenced this issue Feb 6, 2022
76120: kvserver: move raft version encoding to `kvserverbase` r=erikgrinaker a=tbg

This is similar to #76113 in that it prevents import cycles in #75729.
We don't want the code moved here to necessarily stay in `kvserverbase`,
but that is a good place for now.

Release note: None


Co-authored-by: Tobias Grieger <tobias.b.grieger@gmail.com>
tbg added a commit to tbg/cockroach that referenced this issue Feb 6, 2022
This commit introduces a new package `raftlog`. Aspirationally, this
will at some point become the de facto way of encapsulating the raft log
encoding and may play a role in programmatically constructing and
inspecting raft logs (e.g.  for testing).

For now, we introduce two concepts:

- `raftlog.Entry`, which wraps a `raftpb.Entry` and all of the
  information derived from it, such as the command ID, the
`kvserverpb.RaftCommand`, the configuration change (if any), etc.

- `raftlog.Iterator`, a way to iterate over the raft log in terms of
  `raftlog.Entry` (as opposed to `raftpb.Entry` which requires lots of
manual processing).

Both are then applied across the codebase, concretely:

- `loqrecovery` is simplified via `raftlog.Iterator` to pull commit
  triggers out of the raft log.
- debug pretty-printing is simpler thanks to use of `raftlog.Entry`.
- `decodedRaftEntry` is now structurally a `raftpb.Entry`, and again
  lots of manual custom unmarshaling code evaporates.

It's currently difficult to create "interesting" raft log entries if
trying to stay away from manual population of large data structures
(which is prone to rotting), so there's zero unit testing of
`raftlog.Iterator`. However, the code is not new; it was deduplicated
from a few places and is now tested through all of them, so I don't
feel too bad about it. I still think it is a priority to be
able to "comfortably" create at least "simple" raft logs, meaning we
need to be able to string together `batcheval` and entry creation at
least in a rudimentary fashion. I intend to look into this next and
add comprehensive unit tests for `raftlog.{Entry,Iterator}`.

Touches cockroachdb#75729.

Release note: None
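
Roughly, the intended usage shape is the sketch below; the method names and signatures here are approximate and may differ from what ultimately landed in the `raftlog` package.

```go
// Walk the raft log of a range and print each entry's command ID and lease
// index, letting raftlog.Entry do the decoding that used to be manual.
it, err := raftlog.NewIterator(rangeID, eng) // signature approximate
if err != nil {
	return err
}
defer it.Close()
for ok, err := it.SeekGE(lo); ; ok, err = it.Next() {
	if err != nil {
		return err
	}
	if !ok {
		break
	}
	e := it.Entry() // raftpb.Entry plus the decoded kvserverpb.RaftCommand,
	// command ID, conf change (if any), etc.
	fmt.Println(e.ID, e.Cmd.MaxLeaseIndex)
}
```
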
tbg (Member, Author) commented Feb 6, 2022

The right side of the picture ("Log encoding") is librarified, at least on the read path, in #76126. We still "manually" create these proposals when we run config changes:

```go
// Coordinate proposing the command to etcd/raft.
if crt := p.command.ReplicatedEvalResult.ChangeReplicas; crt != nil {
	// Flush any previously batched (non-conf change) proposals to
	// preserve the correct ordering of proposals. Later proposals
	// will start a new batch.
	if err := proposeBatch(raftGroup, b.p.replicaID(), ents); err != nil {
		firstErr = err
		continue
	}
	ents = ents[len(ents):]
	confChangeCtx := kvserverpb.ConfChangeContext{
		CommandID: string(p.idKey),
		Payload:   p.encodedCommand,
	}
	encodedCtx, err := protoutil.Marshal(&confChangeCtx)
	if err != nil {
		firstErr = err
		continue
	}
	cc, err := crt.ConfChange(encodedCtx)
	if err != nil {
		firstErr = err
		continue
	}
	if err := raftGroup.ProposeConfChange(
		cc,
	); err != nil && !errors.Is(err, raft.ErrProposalDropped) {
		// Silently ignore dropped proposals (they were always silently
```
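
A librarified version might be as simple as pulling those few steps into a helper along the lines below. This is purely hypothetical, nothing like it exists today, but it shows how small the "log encoding" abstraction for conf changes could be.

```go
// encodeConfChange is a hypothetical helper wrapping the manual steps above:
// bundle the command ID and the already-encoded command into a
// ConfChangeContext and turn the trigger into a raft conf change.
func encodeConfChange(
	idKey kvserverbase.CmdIDKey,
	encodedCommand []byte,
	crt *roachpb.ChangeReplicasTrigger,
) (raftpb.ConfChangeI, error) {
	ccCtx := kvserverpb.ConfChangeContext{
		CommandID: string(idKey),
		Payload:   encodedCommand,
	}
	encodedCtx, err := protoutil.Marshal(&ccCtx)
	if err != nil {
		return nil, err
	}
	return crt.ConfChange(encodedCtx)
}
```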

tbg added a commit to tbg/cockroach that referenced this issue Feb 6, 2022
This commit introduces `TestCreateManualProposal` which attempts to
carry out an "offline" (i.e. `Replica`-less) version of evaluation &
entry creation (i.e. the middle and right-hand side [here]). We start
with a `*roachpb.PutRequest` and end up with a payload fit for a
`(raftpb.Entry).Data`.

This is helpful since it provides an easily digestible "life of a
proposal" read-along, but more importantly represents a step towards
establishing clearer interfaces for the various stages of a request -
evaluation, replication, and application - and in particular separation
of these stages from the surrounding `Replica`. The test helps pinpoint
pieces of code that will need to be worked on in order to reach one of
the goals of cockroachdb#75729, being able to create a meaningful raft log
programmatically in a unit test without also instantiating a `Replica`
and, later, also applying raft logs in a similar way.

As a first step, this commit extracts a method `raftCmdToPayload` from
`(*Replica).propose` which translates a `kvserverpb.RaftCommand` (which
is roughly the result of evaluating a `BatchRequest`) into a payload
that can be handed to raft for replication. This extracted method is far
from clean - it takes a few crufty parameters, lives in `kvserver`
(where it is hard to import) and also uses the logging singleton which
is undesirable for a library method. However, it's enough to keep the
ball rolling, and can be improved on later.

[here]: cockroachdb#75729 (comment)

Touches cockroachdb#75729.

Release note: None
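
For orientation, the framing `raftCmdToPayload` produces boils down to roughly the following. The exact helper and constant names are assumptions (the raft command encoding moved to `kvserverbase` in #76120); treat them as illustrative.

```go
// raftCmdToPayload (approximate shape): marshal the RaftCommand and frame it
// as a one-byte encoding version, the fixed-size command ID, and the proto
// bytes -- the format raft entries carry through replication.
func raftCmdToPayload(
	idKey kvserverbase.CmdIDKey, cmd *kvserverpb.RaftCommand,
) ([]byte, error) {
	data, err := protoutil.Marshal(cmd)
	if err != nil {
		return nil, err
	}
	// EncodeRaftCommand/RaftVersionStandard are assumed names for the
	// kvserverbase encoding helpers.
	return kvserverbase.EncodeRaftCommand(kvserverbase.RaftVersionStandard, idKey, data), nil
}
```
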
sumeerbhola (Collaborator) commented:

> I don't think we're too far off on the evaluate side either. Assuming everything the commands we want to manufacture programmatically need is populated in the batcheval.EvalContext, we will get a RaftCommand back, and making the lease/closed timestamp code reusable shouldn't be hard either, especially since this is all much easier when operating offline without concurrency or in-memory state. We don't care much about the eval side outside of testing, so supporting just a subset of commands when we need them is good enough.

We don't need this for ReplicasStorage, and I'm guessing we don't need it for loss of quorum either. Is that correct? Why do we need to write tests that need to start with a BatchRequest and don't have a Replica (just trying to understand)? Isn't populating batcheval.EvalContext a tall order given the very broad interface?

tbg (Member, Author) commented Feb 9, 2022

This isn't tracking exactly the work needed for ReplicasStorage (which will only need to apply some kinds of entries, excluding splits and merges, iirc). It's generally something we want to exist for replication in a few places, and while there isn't a super pressing need, being able to do this shows us that we're doing something right.

tbg (Member, Author) commented Feb 9, 2022

The concrete example where you might want it both for LOQ and for ReplicasStorage is that for testing you'll want to populate custom raft logs. Having to spin up servers and manufacture the logs in a live system is not how this should work. Yet, that's what we do today:

```go
// TestFindUpdateDescriptor verifies that we can detect changes to range
// descriptor in the raft log.
// To do this we split and merge the range which updates descriptor prior to
// spawning or subsuming RHS.
func TestFindUpdateDescriptor(t *testing.T) {
	defer leaktest.AfterTest(t)()
	defer log.Scope(t).Close(t)
	ctx := context.Background()
	const testNode = 0
	var testRangeID roachpb.RangeID
	var rHS roachpb.RangeDescriptor
	var lHSBefore roachpb.RangeDescriptor
	var lHSAfter roachpb.RangeDescriptor
	checkRaftLog(t, ctx, testNode,
		func(ctx context.Context, tc *testcluster.TestCluster) roachpb.RKey {
			scratchKey, err := tc.Server(0).ScratchRange()
			require.NoError(t, err, "failed to get scratch range")
			srk, err := keys.Addr(scratchKey)
			require.NoError(t, err, "failed to resolve scratch key")
			rd, err := tc.LookupRange(scratchKey)
			testRangeID = rd.RangeID
			require.NoError(t, err, "failed to get descriptor for scratch range")
			splitKey := testutils.MakeKey(scratchKey, []byte("z"))
			lHSBefore, rHS, err = tc.SplitRange(splitKey)
			require.NoError(t, err, "failed to split scratch range")
			lHSAfter, err = tc.Servers[0].MergeRanges(scratchKey)
			require.NoError(t, err, "failed to merge scratch range")
			require.NoError(t,
				tc.Server(testNode).DB().Put(ctx, testutils.MakeKey(scratchKey, []byte("|first")),
					"some data"),
				"failed to put test value in LHS")
			return srk
		},
		func(t *testing.T, ctx context.Context, reader storage.Reader) {
			seq, err := loqrecovery.GetDescriptorChangesFromRaftLog(testRangeID, 0, math.MaxInt64, reader)
			require.NoError(t, err, "failed to read raft log data")
			requireContainsDescriptor(t, loqrecoverypb.DescriptorChangeInfo{
				ChangeType: loqrecoverypb.DescriptorChangeType_Split,
				Desc:       &lHSBefore,
				OtherDesc:  &rHS,
			}, seq)
			requireContainsDescriptor(t, loqrecoverypb.DescriptorChangeInfo{
				ChangeType: loqrecoverypb.DescriptorChangeType_Merge,
				Desc:       &lHSAfter,
				OtherDesc:  &rHS,
			}, seq)
		})
}
```
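
With the kind of stand-alone log construction this issue asks for, the same coverage could come from something far smaller, along the lines of the sketch below. `logbuilder` and its `Split`/`Merge` methods are purely hypothetical.

```go
// Hypothetical: craft a raft log containing a split followed by a merge
// directly against an in-memory engine, with no servers involved.
func TestFindUpdateDescriptorStandalone(t *testing.T) {
	defer leaktest.AfterTest(t)()
	ctx := context.Background()
	eng := storage.NewDefaultInMemForTesting()
	defer eng.Close()

	const rangeID = roachpb.RangeID(99)
	b := logbuilder.New(eng, rangeID)                           // hypothetical
	require.NoError(t, b.Split(ctx, roachpb.RKey("scratch/z"))) // hypothetical
	require.NoError(t, b.Merge(ctx))                            // hypothetical

	seq, err := loqrecovery.GetDescriptorChangesFromRaftLog(rangeID, 0, math.MaxInt64, eng)
	require.NoError(t, err, "failed to read raft log data")
	require.Len(t, seq, 2) // one split, one merge
}
```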

@mwang1026 mwang1026 moved this from Incoming to Backlog in Replication Feb 10, 2022
tbg added a commit to tbg/cockroach that referenced this issue Nov 16, 2022
`(*replicaAppBatch).Stage` roughly

1. checks whether to apply the `Command` as a no-op
2. runs a bunch of triggers (touching `*Replica` a lot)
3. stages the command into a `storage.Batch`
4. runs a bunch of triggers (touching `*Replica` a lot)
5. returns the cmd as a `CheckedCommand`

Standalone log application needs to do 1, 3, 5 but it only needs to do
parts of 2 and 4.

We're trying to morph `replicaAppBatch` into something that can apply
log entries both in regular and standalone mode, where all of the
interactions with the surrounding *Replica (if any) are pushed into an
interface handed to `replicaAppBatch`. The implementation of the
interface is either a standalone-specific handler, or a
*Replica-specific one (that relies heavily on the standalone handler to
avoid logic duplication).

This will take many commits to achieve but the goal is something like
this:

```go
// appBatch implements apply.Batch and Handler (for standalone
// application).
type appBatch struct {
  // All the fields that we need during standalone application.
  state ReplicaState
  batch storage.Batch
}

// Handler is used by replicaAppBatch to execute side effects of entry
// application. The implementation depends on regular vs standalone mode.
type Handler interface { // methods are called from Stage
  PreHandler // addSSTPreApply, maybeAcquireSplitLock, etc
  PostHandler // runPreApplyTriggersAfterStagingWriteBatch, itemized
}

// replicaAppBatch implements apply.Batch. It also implements Handler,
// by first calling appBatch's Handler implementation and augmenting it
// with additional actions that don't apply in a standalone context.
type replicaAppBatch struct {
  appBatch
  r *Replica
  // More fields only required in *Replica context, like
  // `followerStoreWriteBytes` which is used for admission control
}
```

That exact plan is unlikely to entirely survive contact with reality but
something along those lines ought to work out.

Other than paving a particularly bumpy part of the road to stand-alone
log application, I expect this to also make the code more accessible, as
it is currently somewhat ad-hoc.

This first commit moves the lease applied index checks and the closed
timestamp assertions[^1] to this new scheme. Follow-up commits will do the
same for the other references to `*Replica` in `Stage` until `*Replica`
is only referenced from `*replicaAppBatchHandler`. At that point, we
can move `*Replica` from `replicaAppBatch` to `replicaAppBatchHandler`
and can declare victory.

This PR tries to minimize code movement. I left TODOs on everything that
I think should move in a future commit.

Touches cockroachdb#75729.

[^1]: these assertions are currently only run in "regular" mode but
should also be ported to standalone mode in the future. For now it's
convenient to have them out of the way; better to have a standalone mode
earlier, even if a few assertions are missing from it.

Release note: None
tbg added a commit to tbg/cockroach that referenced this issue Dec 8, 2022
For cockroachdb#75729 we will want to work on replicaAppBatch and decompose it into
a new type appBatch around which replicaAppBatch can be layered.

As a first baby step, move appbatch-related code to its own
file to physically separate it from the `apply.StateMachine`.

This is 100% mechanical (using Goland's refactoring tools).

Release note: None
@tbg tbg changed the title from "kvserver: make log application stand-alone" to "kvserver: sep-raft-log make log application stand-alone" Dec 8, 2022
@tbg tbg changed the title from "kvserver: sep-raft-log make log application stand-alone" to "kvserver: sep-raft-log: make log application stand-alone" Dec 8, 2022
craig bot pushed a commit that referenced this issue Dec 8, 2022
93237: kvserver: move replicaAppBatch to replica_app_batch.go r=pavelkalinnikov a=tbg

For #75729 we will want to work on replicaAppBatch and decompose it into
a new type appBatch around which replicaAppBatch can be layered.

As a first baby step, move appbatch-related code to its own
file to physically separate it from the `apply.StateMachine`.

This is 100% mechanical (using Goland's refactoring tools).

Epic: CRDB-220
Release note: None


Co-authored-by: Tobias Grieger <tobias.b.grieger@gmail.com>
craig bot pushed a commit that referenced this issue Dec 21, 2022
93266: kvserver: refactor replicaAppBatch for standalone log application r=pavelkalinnikov a=tbg

This long (but individually small) sequence of commits moves `(*replicaAppBatch).Stage` close to the structure that was prototyped in #93265, where it has the following steps:

- command checks (standalone)
- testing interceptors (replica)
- pre-add triggers (standalone)
- pre-add triggers (replica)
- add (to pebble batch, standalone)
- post-add triggers (standalone)
- post-add triggers (replica)

In standalone application (e.g. for #93244) we'd use an `apply.Batch` that is an `appBatch`, instead of a `replicaAppBatch`, i.e. skip all of the (replica) steps above.

This PR doesn't get us all the way there - we still need to tease apart the `post-add triggers (replica)` step, which currently contains code that should be in `post-add triggers (standalone)`; this is best tackled in a separate PR since it's going to be quite a bit of work.

Touches #75729.

Epic: CRDB-220
Release note: None

Co-authored-by: Tobias Grieger <tobias.b.grieger@gmail.com>
tbg (Member, Author) commented Feb 20, 2023

In #97347 we're seeing a tricky interaction between reproposals that we can't unit test without a significant rework. Having more stand-alone components for log application would help here.

craig bot pushed a commit that referenced this issue Jun 20, 2023
104401: kvserver: add TestCreateManyUnappliedProbes r=pavelkalinnikov a=tbg

This is the test used for #102953.

This PR also makes modest progress on #75729, not by making log application
stand-alone, but by making it somewhat less convoluted to manufacture log
entries programmatically. It would be desirable to be able to test the
replication layer in CRDB in the same way that Raft allows via the
InteractionEnv[^1].

[^1]: https://github.com/etcd-io/raft/blob/6bf4f7fe3122b064e0a0d76314298dca6f379fc7/interaction_test.go

Epic: none
Release note: none


105083: kvcoord: Release catchup reservation before re-acquire attempt r=miretskiy a=miretskiy

Release catchup scan reservation prior to attempting to re-acquire it. Failure to do so could result in a stuck mux rangefeed when enough ranges encounter an error (such as a range split) prior to receiving the first checkpoint event, which would cause additional attempts to acquire catchup scan quota.

Fixes #105058

Release note (bug fix): Fix a bug in the mux rangefeed implementation that could cause the mux rangefeed to become stuck if enough ranges encountered certain errors concurrently.

Co-authored-by: Tobias Grieger <tobias.b.grieger@gmail.com>
Co-authored-by: Yevgeniy Miretskiy <yevgeniy@cockroachlabs.com>
@exalate-issue-sync exalate-issue-sync bot added T-kv KV Team and removed T-kv-replication labels Jun 28, 2024
@blathers-crl blathers-crl bot added this to Incoming in KV Jun 28, 2024