Sending Minimal Histories

Jens Alfke edited this page May 11, 2016 · 1 revision

We're adding two parameters to the REST API, to allow the server to send shorter lists of revisions in responses.

Rationale

This can save a lot of bandwidth during replication, since Sync Gateway stores a fairly long revision history (1000 revisions by default). For example, if a client pulls a revision with generation 1752, the _revisions property in the response currently contains 1000 revision IDs (at about 40 bytes each). In the common case where the client already has the parent revision, 998 of those IDs are unnecessary. Even if the client doesn't have the document at all, it will probably store only the last 20 revision IDs, so 980 of the IDs sent are unnecessary.
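As a rough check on the figures above, the short sketch below computes the bytes spent on unneeded revision IDs for a single document. The function and constant names are hypothetical, and the 40-byte size is the approximation quoted above; JSON punctuation and compression are ignored.

```python
HISTORY_LENGTH = 1000   # default revision history length cited above
BYTES_PER_REV_ID = 40   # approximate size of one revision ID, as cited above


def wasted_bytes(ids_needed: int) -> int:
    """Bytes spent on revision IDs the client will not use (hypothetical helper)."""
    return (HISTORY_LENGTH - ids_needed) * BYTES_PER_REV_ID


# Client already has the parent: only the new revision and its parent are useful.
print(wasted_bytes(2))    # 39920
# Client lacks the document but keeps only the last 20 revision IDs.
print(wasted_bytes(20))   # 39200
```

That is roughly 39 KB of overhead per pulled revision in either case.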

We need a way for the client to tell the server:

  • Where to stop the revision history (i.e. which revisions the client already has)
  • How many revisions the client cares about

This optimization already exists in the opposite direction (push): the server's response to a _revs_diff includes, for each revision, a possible_ancestors property that lets the client trim the history it sends. For some reason it was never implemented for pull replications. That's what we're adding.

Compatibility

The new parameters are implicitly backward-compatible in both directions:

  • If the client doesn't send them, the server's behavior is unchanged: the client gets the full history as expected.
  • If the server doesn't understand them, its behavior is unchanged: it sends the full history, and the client can ignore the excess just like today.

API Additions

Individual Document Read (GET /$db/$docid)

Two new URL query parameters:

revs_from: Value is a URL-encoded JSON array of revision ID strings (just like atts_since). If present, this identifies a set of revisions already known to the client. The document revision history returned in the _revisions property should be trimmed to stop at (not before) any of these revision IDs.
To avoid duplication, if this parameter is missing it defaults to the same value as atts_since.
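A minimal sketch of how a client might encode revs_from as a URL-encoded JSON array. The helper name, host, port, database, document ID, and revision IDs are all placeholders, and the revs=true flag is assumed to be the existing way of requesting the _revisions property; none of these come from the implementation under review.

```python
import json
from urllib.parse import quote, urlencode


def doc_url_with_revs_from(base_url, db, doc_id, revs_from):
    """Build a GET URL carrying revs_from (hypothetical helper)."""
    params = {
        # Assumption: the existing revs=true flag asks for _revisions.
        "revs": "true",
        # Compact JSON array; urlencode() percent-encodes it for the query string.
        "revs_from": json.dumps(revs_from, separators=(",", ":")),
    }
    return f"{base_url}/{quote(db, safe='')}/{quote(doc_id, safe='')}?{urlencode(params)}"


url = doc_url_with_revs_from("http://localhost:4984", "db", "doc1", ["2-abc", "1-def"])
print(url)
```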

revs_limit: Value is a non-negative integer. The server should limit the _revisions property to this number of revisions, unless a match was found in revs_from or atts_since. A value of zero means "no limit".
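The interaction between the two parameters can be sketched as follows. This is one reading of the rules above, not the Sync Gateway implementation: it assumes the history is a flat, newest-first list of full revision IDs (rather than the actual _revisions encoding), that revs_from falls back to atts_since only when absent, and that revs_limit is ignored once a known revision is found. The function name is hypothetical.

```python
def trim_history(history, revs_from=None, atts_since=None, revs_limit=0):
    """Trim a newest-first list of revision IDs (illustrative sketch only)."""
    # revs_from defaults to atts_since when it is not supplied.
    known = set(revs_from if revs_from is not None else (atts_since or []))

    # Stop at (not before) the first revision the client already knows,
    # so the known revision itself is kept as the last entry.
    for index, rev_id in enumerate(history):
        if rev_id in known:
            return history[: index + 1]

    # No match: apply the limit, where zero means "no limit".
    if revs_limit > 0:
        return history[:revs_limit]
    return history


history = ["5-e", "4-d", "3-c", "2-b", "1-a"]
print(trim_history(history, revs_from=["3-c"]))   # ['5-e', '4-d', '3-c']
print(trim_history(history, revs_limit=2))        # ['5-e', '4-d']
```

Keeping the matched revision in the output lets the client attach the new history to a revision it already stores.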

Bulk Get (POST /$db/_bulk_get)

One new URL query parameter:

revs_limit: Value is a non-negative integer. The server should limit the _revisions property to this number of revisions, unless a match was found in revs_from or atts_since. A value of zero means "no limit".

Two new properties in an individual revision request object:

revs_from: Value is a JSON array of revision ID strings (just like atts_since). If present, this identifies a set of revisions already known to the client. The document revision history returned in the _revisions property should be trimmed to stop at (not before) any of these revision IDs.
To avoid duplication, if this property is missing it defaults to the same value as atts_since.

revs_limit: Value is a positive integer. Overrides the value given in the URL, for this revision only.
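A sketch of what a _bulk_get request using these additions might look like. The helper name, host, port, database, document IDs, and revision IDs are placeholders; the {"docs": [...]} body shape with "id" and "rev" keys and the revs=true flag are assumed from the existing _bulk_get conventions rather than taken from this page.

```python
import json
from urllib.parse import urlencode


def build_bulk_get(base_url, db, requests, revs_limit=None):
    """Return (url, body) for a _bulk_get call (hypothetical helper)."""
    params = {"revs": "true"}  # assumption: existing flag that requests _revisions
    if revs_limit is not None:
        params["revs_limit"] = str(revs_limit)  # request-wide default limit
    url = f"{base_url}/{db}/_bulk_get?{urlencode(params)}"
    # In the body, revs_from is a plain JSON array; no URL-encoding is needed.
    body = json.dumps({"docs": requests})
    return url, body


url, body = build_bulk_get(
    "http://localhost:4984",
    "db",
    [
        # The client already has revision 3-c of doc1, so history can stop there.
        {"id": "doc1", "rev": "5-e", "revs_from": ["3-c"]},
        # For doc2 only, override the URL-level limit of 20 with 5.
        {"id": "doc2", "rev": "2-b", "revs_limit": 5},
    ],
    revs_limit=20,
)
print(url)
print(body)
```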

Status

As of 11 May 2016, this has been implemented in Couchbase Sync Gateway and is being reviewed (see #1752, #1764). It is expected to appear in the 1.3 release.
