DCP docs: Add 'design discussion' page
Add a new 'DCP Design Discussion' page, to cover additional details on
why DCP works in certain ways, and gives examples of how it can break
if certain rules are not followed.

The page initially has details on why snapshot start/end must be
passed correctly when resuming a stream.

Change-Id: Ie5a48ecd8a9d1d79444c6a02bbf61085e68a0565
Reviewed-on: https://review.couchbase.org/c/kv_engine/+/183611
Reviewed-by: Ben Huddleston <ben.huddleston@couchbase.com>
Tested-by: Dave Rigby <daver@couchbase.com>
daverigby committed Dec 15, 2022
1 parent 7d2cbad commit 61f333b
Showing 3 changed files with 98 additions and 10 deletions.
1 change: 1 addition & 0 deletions docs/dcp/README.md
@@ -35,3 +35,4 @@ The Database Change Protocol (DCP) a protocol used by Couchbase for moving large
* [Upgrade (2.x to 3.x)](documentation/upgrade.md)
* [Future Work](documentation/future-work.md)
* [Change Log](documentation/changelog.md)
* [Design Discussions](documentation/discussion.md)
30 changes: 20 additions & 10 deletions docs/dcp/documentation/building-a-simple-client.md
@@ -26,13 +26,13 @@ Once the connection is created, the client can send one or more [control](comman

Once you have a connection established with the server, the next thing to do is to open a stream to the server to stream out data for a specific VBucket. For a basic client the simplest thing to do is to always stream data starting with the first mutation that was received in the VBucket. To do this the Consumer should send [Stream Request](commands/stream-request.md) messages for each VBucket that it wants to receive data for.

* VBucket - Set this to the VBucket ID that you want your client to receive data for. This number should always be between 0 and 1023 inclusive.
* Flags - The flags field is used to define specialized behavior for a stream. Since we don't need any specialized behavior we set the flags field to 0.
* Start Seqno - Should be set to 0 since sequence numbers are assigned starting at sequence number 1. The Start sequence number is the last sequence number that the Consumer received and since for our basic streaming case we want to always start from the beginning we send 0 in this field.
* End Seqno - For our basic client we want to recieve a continuous stream and receive all data as in enters Couchbase Server. To do this the highest sequence number possible should be specified. In this case the end sequence number should be 2^64-1.
* VBucket UUID - Since we that are starting to recieve data for the first time 0 should be specified.
* Snapshot Start Seqno - Since we that are starting to recieve data for the first time 0 should be specified.
* Snapshot End Seqno - Since we that are starting to recieve data for the first time 0 should be specified.
* `VBucket` - Set this to the VBucket ID that you want your client to receive data for. This number should always be between 0 and 1023 inclusive.
* `Flags` - The flags field is used to define specialized behavior for a stream. Since we don't need any specialized behavior we set the flags field to 0.
* `Start Seqno` - Should be set to 0 since sequence numbers are assigned starting at sequence number 1. The Start sequence number is the last sequence number that the Consumer received and since for our basic streaming case we want to always start from the beginning we send 0 in this field.
* `End Seqno` - For our basic client we want to receive a continuous stream and receive all data as it enters Couchbase Server. To do this the highest sequence number possible should be specified. In this case the end sequence number should be 2^64-1.
* `VBucket UUID` - Since we are starting to receive data for the first time, 0 should be specified.
* `Snapshot Start Seqno` - Since we are starting to receive data for the first time, 0 should be specified.
* `Snapshot End Seqno` - Since we are starting to receive data for the first time, 0 should be specified.
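
The field values listed above can be sketched in Python as follows. This is only an illustration of which values go where: the `StreamRequest` class and `initial_request` helper are hypothetical names, and the sketch does not attempt to encode the actual wire format of the Stream Request message.

```python
from dataclasses import dataclass

# Highest possible sequence number, used to request a continuous stream.
MAX_SEQNO = 2**64 - 1


@dataclass(frozen=True)
class StreamRequest:
    """Hypothetical container for the Stream Request fields (not a wire format)."""
    vbucket: int
    flags: int
    start_seqno: int
    end_seqno: int
    vbucket_uuid: int
    snap_start_seqno: int
    snap_end_seqno: int


def initial_request(vbucket: int) -> StreamRequest:
    """Build the field values for streaming a VBucket from the very beginning."""
    if not 0 <= vbucket <= 1023:
        raise ValueError("VBucket ID must be between 0 and 1023 inclusive")
    return StreamRequest(
        vbucket=vbucket,
        flags=0,              # no specialized behavior
        start_seqno=0,        # nothing received yet; seqnos start at 1
        end_seqno=MAX_SEQNO,  # stream everything, continuously
        vbucket_uuid=0,       # first connection: no known UUID
        snap_start_seqno=0,
        snap_end_seqno=0,
    )


request = initial_request(vbucket=5)
```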


### Client Side State for a Stream
@@ -44,23 +44,33 @@ A DCP client has to maintain the following state variables for a stream.
* Last Snapshot Start Seqno
* Last Snapshot End Seqno

Everytime a stream start or re-starts and the server decides to continue based on the parameters passed by the client (that is it does not decide on the rollback), the server sends over the failover log to the client. The client should replace its previous failover log with the new failover log sent by the server. This is because DCP is a master-slave protocol where all the slaves (DCP clients) follow the master (active vbucket on the server) get eventually consistent data.
Every time a stream starts or re-starts and the server decides to continue based on the parameters passed by the client (that is, it does not ask the client to roll back), the server sends over the failover log to the client. The client should replace its previous failover log with the new failover log sent by the server, so it can record the current "timeline" it is on.

Last Snapshot Start Seqno, Last Snapshot End Seqno, Last Recieved Seqno can keep changing as the server keeps sending data on the stream. The client is supposed to save atleast the final copy of these 3 sequence numbers that it receives on the stream.
Last Snapshot Start Seqno, Last Snapshot End Seqno, Last Received Seqno can keep changing as the server keeps sending data on the stream. The client is supposed to save at least the final copy of these 3 sequence numbers that it receives on the stream.

Maintaining these state variables helps in restarting from the point where the client had left off.
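
A minimal sketch of how a client might track this state is shown below. The `StreamState` class and its method names are hypothetical, and the failover log is assumed to be a list of `(vb_uuid, seqno)` pairs with the newest entry first; a real client would also persist this state so that it survives a client restart.

```python
from dataclasses import dataclass, field


@dataclass
class StreamState:
    """Hypothetical per-stream client state, updated as DCP messages arrive."""
    # Assumed layout: (vb_uuid, seqno) pairs, newest entry first.
    failover_log: list = field(default_factory=list)
    last_received_seqno: int = 0
    last_snap_start_seqno: int = 0
    last_snap_end_seqno: int = 0

    def on_failover_log(self, entries):
        # Replace (never merge) the old log with the one the server sent.
        self.failover_log = list(entries)

    def on_snapshot_marker(self, start, end):
        self.last_snap_start_seqno = start
        self.last_snap_end_seqno = end

    def on_mutation(self, seqno):
        self.last_received_seqno = seqno


# Example: a stream that receives one snapshot marker and two mutations.
state = StreamState()
state.on_failover_log([(0xAAAA, 0)])
state.on_snapshot_marker(start=1, end=4)
state.on_mutation(2)
state.on_mutation(3)
```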

### Restarting from where you left off
A DCP stream can get dropped for a number of reasons, such as a drop in the connection, an error for that stream on the server side, or an error for that stream on the client. So it is quite common for the stream to re-start.

Resumability upon restart of a DCP stream is decided through the use of client side state variables. Upon every start or re-start, the client should sent latest VBucket UUID from the failover log, Last Recieved Seqno as Start Seqno, Last Snapshot Start Seqno as Snapshot Start Seqno and Last Snapshot End Seqno as Snapshot End Seqno. A correct request will have the below invariant
Resumability upon restart of a DCP stream is decided through the use of client side state variables. Upon every start or re-start, the client should issue a [Stream Request](commands/stream-request.md) with the following updates to the original parameters:

* `Start Seqno` - Set to the last sequence number that the Consumer received.
* `End Seqno` - Set to whatever seqno the client wants to receive up to - typically 2^64-1 to stream everything.
* `VBucket UUID` - Set to the latest vBucket UUID from the failover log.
* `Snapshot Start Seqno` - Set to the `start` seqno from the last received Snapshot Marker.
* `Snapshot End Seqno` - Set to the `end` seqno from the last received Snapshot Marker.

A correct request will have the below invariant:

Snap Start Seqno <= Start Seqno <= Snap End Seqno

The server decides whether to resume from that start seqno or to ask the client to rollback based on the [rollback logic](rollback.md).

If the server decides to resume the stream then the client has to maintain the state variables as explained in the [previous section](building-a-simple-client.md#client-side-state-for-a-stream). If the client is asked to rollback then it should do as explained in the [next section](building-a-simple-client.md#handling-a-rollback).

See [Discussion - Why does a client need to provide snapshot start & end seqnos](discussion.md#why-does-a-client-need-to-provide-snapshot-start--end-seqnos-to-resume-a-stream) for more details on why snapshot start and end are necessary for correct resumption.
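
As a rough sketch, the resume parameters and the invariant can be put together as below. The `resume_request` helper and the dictionary keys are hypothetical names chosen for illustration; the sketch only assembles field values and checks the invariant before anything is sent.

```python
# Highest possible sequence number, used to stream everything.
MAX_SEQNO = 2**64 - 1


def resume_request(latest_vb_uuid, last_received_seqno,
                   last_snap_start, last_snap_end, end_seqno=MAX_SEQNO):
    """Assemble the Stream Request fields for resuming a stream.

    Raises ValueError if the saved state violates
    Snap Start Seqno <= Start Seqno <= Snap End Seqno.
    """
    if not (last_snap_start <= last_received_seqno <= last_snap_end):
        raise ValueError(
            "invariant violated: need snap_start <= start <= snap_end, got "
            f"{last_snap_start} <= {last_received_seqno} <= {last_snap_end}"
        )
    return {
        "start_seqno": last_received_seqno,
        "end_seqno": end_seqno,
        "vbucket_uuid": latest_vb_uuid,
        # Taken unchanged from the last received Snapshot Marker.
        "snap_start_seqno": last_snap_start,
        "snap_end_seqno": last_snap_end,
    }


# A consumer that received seqnos 2 and 3 of a snapshot covering 1..4:
request = resume_request(0xAAAA, last_received_seqno=3,
                         last_snap_start=1, last_snap_end=4)
```

Note that the snapshot range is passed through exactly as received; narrowing it to the last received seqno would still satisfy the invariant, but misreports what the client has actually seen.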

### Handling a rollback
A rollback is necessary when a DCP client has a different history from the server for that particular vbucket. Clients can choose to handle the rollback in different ways.

77 changes: 77 additions & 0 deletions docs/dcp/documentation/discussion.md
@@ -0,0 +1,77 @@
# DCP Design Discussion

This document covers additional details on why DCP works in certain ways, and gives examples of how it can break if certain rules are not followed.

## Why does a client need to provide snapshot start & end seqnos to resume a stream?

Due to [Deduplication](concepts.md#deduplication), the server needs to know the extent of the snapshot which the consumer was last sent to ensure the consumer doesn't miss any mutations on resumption.

Without snapshot start and end seqnos, the server doesn't know if the consumer is at a consistent point or not, nor what the last consistent point for that client was.

**Example 1** - the server needs to know which "timeline" the consumer's mutations came from, so in the event of the consumer resuming after a server restart, the server can correctly determine if the consumer needs to rollback or not.

Assuming the state of the vbucket's checkpoint manager is:

```
Seqno   [1]      2        3        4
Key     SET(A)   SET(B)   SET(C)   DEL(A)
```
_`[N]` = mutation has been deduplicated_

Consider the following sequence of events:

1. Consumer establishes a stream and gets vb UUID=`AAAA`.
2. Consumer receives a Snapshot `start=1`, `end=4`.
3. Consumer receives 2 mutations - `2:SET(B)` and `3:SET(C)`
4. Consumer disconnects, server restarts and generates a new vb UUID=`BBBB` at seqno 3.
5. Consumer reconnects, _incorrectly_ sets `start=3`, `snapshot_start=3`, `snapshot_end=3`.
6. Server will perform rollback check and **incorrectly conclude that the consumer doesn't need to rollback** - as the vBucket branched _after_ the snapshot the consumer received.

This could result in lost DCP messages. The server would have sent the following messages to the consumer:

* `SnapshotMarker(start=1, end=4)`
* `2:SET(B)`
* `3:SET(C)`
* `4:DEL(A)`

However, the consumer only received the Snapshot Marker, `2:SET(B)` and `3:SET(C)`. After the consumer disconnects / server restarts, the checkpoint manager has lost `4:DEL(A)` (as it was never persisted to disk) and a vb UUID branch point is created at the last common point:

```
Seqno   [1]      2        3        4
Key     SET(A)   SET(B)   SET(C)   <<new mutations...>
                          |
                          UUID:BBBB
```

i.e. `1:SET(A)` is no longer deduplicated, as `4:DEL(A)` was lost in the restart. However, as the consumer has incorrectly told the server it had a complete snapshot from _before_ the UUID changed, then the server thinks it can send from seqno 3 onwards. *This would result in document A never being sent to the DCP consumer*.
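
The rollback decision in this example can be modelled with a short Python sketch. This is a deliberately simplified model and not the server's implementation: the `needs_rollback` function is hypothetical, it ignores the purge seqno and other checks the real [rollback logic](rollback.md) performs, and the vbucket high seqno of 10 is an arbitrary value chosen for illustration.

```python
def needs_rollback(failover_table, vb_high_seqno, vb_uuid,
                   snap_start, snap_end, start):
    """Simplified model of the producer's rollback check.

    failover_table: list of (vb_uuid, seqno) pairs, newest entry first.
    Returns (rollback_required, rollback_seqno).
    """
    if start == 0:
        return (False, None)  # streaming from the beginning never rolls back

    for i, (uuid, _seqno) in enumerate(failover_table):
        if uuid == vb_uuid:
            # The consumer's timeline is valid up to the seqno at which the
            # next (newer) UUID was created, or the high seqno if none exists.
            upper = vb_high_seqno if i == 0 else failover_table[i - 1][1]
            break
    else:
        return (True, 0)  # unknown UUID: no shared history

    if snap_end <= upper:
        return (False, None)  # whole snapshot lies on the shared timeline
    # The snapshot extends past the branch point, so the consumer may hold
    # (or be missing) items from the abandoned timeline.
    return (True, snap_start if snap_start < upper else upper)


# Failover table after the restart: UUID BBBB was created at seqno 3.
table = [(0xBBBB, 3), (0xAAAA, 0)]

# Correct resume request: snapshot {1,4}, start 3.
print(needs_rollback(table, 10, 0xAAAA, 1, 4, 3))  # (True, 1)

# Incorrect resume request: snapshot {3,3}, start 3.
print(needs_rollback(table, 10, 0xAAAA, 3, 3, 3))  # (False, None)
```

With the correct snapshot range the model asks the consumer to rollback to seqno 1, after which `1:SET(A)` is sent again; with the narrowed range it lets the consumer resume from seqno 3 and document A is never sent.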

The same scenario can be demonstrated using `humpty-dumpty` to test different failover scenarios. First, the _correct_ resume request for this scenario:

```
$ echo "1111 1 4 3" | ./humpty_dumpty failover.json 10 0
Simulating behaviour of VBucket with highSeqno: 10, purgeSeqno:0, failoverTable:
[
{"id":2222,"seq":3}
{"id":1111,"seq":0}
]
Testing UUID:1111 snapshot:{1,4} start:3
Rollback:true
Requested rollback seqno:1
Reason: consumer ahead of producer - producer upper at 3
```

Incorrectly specifying both snapshot start and end as 3 does not trigger the rollback that it should:

```
$ echo "1111 3 3 3" | ./humpty_dumpty failover.json 10 0
Simulating behaviour of VBucket with highSeqno: 10, purgeSeqno:0, failoverTable:
[
{"id":2222,"seq":3}
{"id":1111,"seq":0}
]
Testing UUID:1111 snapshot:{3,3} start:3
Rollback:false
```
