Data model for storing revision history in FoundationDB #1957
Comments
If one were inclined to save a byte it would be easy to combine
a) I love it. b) do we really need to let revs limit go above 1000? In fact, should we allow this to be edited at all? I note that we didn't finish the RFC thread but the last comment was to add a "security considerations" section, so if you would add that it's appreciated. I suspect it will be short.
Thanks! I added the Security Considerations section, and tweaked the Advantages section to replace a redundant description of the write path behavior with a description of the read path.
Sorry about that - the RFC PR is going to be merged shortly: #1914
@davisp pointed out something quite important in a discussion on IRC that I want to capture here. The We talked about a way to address this, which I’ll leave first in the comment here. We use the fact that a new edit branch can only be created by a If a writer comes in and tries to extend a losing edit branch, it will find the A writer attempting to delete the winning branch (i.e., setting A writer extending the winning branch with an updated document (the common case) will proceed as before with no loss in efficiency. I’ve tried to think through all possible concurrency issues here but I think that FoundationDB’s transaction isolation model delivers the goods in every case. As far as the data model is concerned, the only change is that the Versionstamp is stored exclusively in the winning edit branch KV at all times, with all other branches having a null byte there instead. Also we should rename the field as it’s really
I updated the text of the proposal to incorporate the details from my last comment. Separately, @wohali pointed me to an old bug report in #1418 which has some bearing here. It seems that we are currently rather inconsistent in how we handle attempts to extend a tombstoned edit branch. If every branch in the document is tombstoned, we reject edit attempts that specify an explicit The proposed data model is most efficient if we block all explicit updates to tombstoned edit branches (i.e., updates that supply the Digging even deeper, folks should be aware that when users create a new document (with no
Some discussion of whether it makes more sense to model RFCs as PRs against the documentation repo. Merits to both options. I filed the PR version of this issue at apache/couchdb-documentation#397
general agreement on couchdb-dev (IRC) to use the PR approach as it allows clarifying commits, reviews and a trail of approval. |
Introduction
This is a proposal for the storage of document revision history metadata as a set of KVs in FoundationDB.
Abstract
This design stores each edit branch as its own KV, and all of the edit branches are stored separately from the actual document data. Document reads can avoid retrieving this information, while writes can avoid retrieving the document body.
Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
Terminology
Detailed Description
The size limits in FoundationDB preclude storing the entire revision tree as a single value; in pathological situations the tree could exceed 100KB. Rather, we propose to store each edit branch as a separate KV. Specifically, we create a "revisions" subdirectory in each database directory to store the revision trees with keys and values that look like
(“revisions”, DocID, NotDeleted, RevFormat, RevPosition, RevHash) = (Versionstamp, [ParentRev, GrandparentRev, …])
where the individual elements of the key and value are defined as follows:
- `DocID`: the document ID
- `NotDeleted`: `\x00` if the leaf of the edit branch is deleted, `\x01` otherwise
- `RevFormat`: enum for the revision encoding being used, start at `\x01` with this proposal
- `RevPosition`: positive integer encoded using standard tuple layer encoding (signed, variable-length, order-preserving)
- `RevHash`: 16 bytes uniquely identifying this revision
- `Versionstamp`: the FoundationDB versionstamp associated with the last transaction that modified the document (NB: not necessarily the last edit to this branch).
- `[ParentRev, GrandparentRev, ...]`: 16 byte identifiers of ancestors, up to 1000 by default

Limits
In order to stay compatible with FoundationDB size limits we need to prevent administrators from increasing `_revs_limit` beyond what we can fit into a single value. Suggest 4000 as a max.

Update Path
Multiple edit branches on a document are largely independent of one another in this design, but some coordination is required around the `Versionstamp`. Recall that the `_changes` feed includes each document exactly once, so we do not want to be able to extend different edit branches in parallel and end up adding both stamps to the feed. We address this by storing the `Versionstamp` only on the so-called "winning" branch. Other branches set this to null.

If a writer comes in and tries to extend a losing edit branch, it will find the `Versionstamp` to be null and will do an additional edit branch read to retrieve the winning branch. It can then compare both branches to see which one will be the winner following that edit, and can assign the new `Versionstamp` to that branch accordingly.

A writer attempting to delete the winning branch (i.e., setting `NotDeleted` to 0) will need to read two contiguous KVs, the one for the winner and the one right before it. If the branch before it will be the winner following the deletion then we move the storage of the new `Versionstamp` to it accordingly. If the tombstoned branch remains the winner for this document then we only update that branch.

A writer extending the winning branch with an updated document (the common case) will proceed reading just the one branch.
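The extension cases above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an in-memory `dict` stands in for the "revisions" subspace of a single document, Python tuple comparison stands in for tuple layer key ordering, versionstamps are modelled as plain integers, and `extend_branch` is a hypothetical helper name, not part of CouchDB or FoundationDB. A real implementation would run inside a FoundationDB transaction. Deleting a branch is not covered.

```python
from typing import Dict, List, Optional, Tuple

# Key: (NotDeleted, RevFormat, RevPosition, RevHash)
# Value: (Versionstamp or None, [ParentRev, GrandparentRev, ...])
BranchKey = Tuple[int, int, int, bytes]
BranchVal = Tuple[Optional[int], List[bytes]]


def extend_branch(branches: Dict[BranchKey, BranchVal], parent: BranchKey,
                  new_hash: bytes, new_stamp: int, revs_limit: int = 1000) -> int:
    """Extend `parent` with a non-deleted edit; return how many branch KVs were read."""
    if parent not in branches or parent[0] == 0:
        # Unknown or tombstoned leaf: reject, as the proposal intends.
        raise KeyError("conflict: parent is not a live leaf")
    reads = 1
    stamp, ancestors = branches.pop(parent)
    new_key = (1, parent[1], parent[2] + 1, new_hash)
    new_ancestors = ([parent[3]] + ancestors)[:revs_limit]
    if stamp is not None:
        # Common case: the parent was the winner, so its child is the winner too.
        branches[new_key] = (new_stamp, new_ancestors)
        return reads
    # Losing branch: one additional read to fetch the current winner (the last key).
    reads += 1
    winner = max(branches)
    if new_key > winner:
        # The extended branch overtakes the old winner and takes the stamp.
        branches[winner] = (None, branches[winner][1])
        branches[new_key] = (new_stamp, new_ancestors)
    else:
        # The old winner keeps winning but records the new stamp.
        branches[winner] = (new_stamp, branches[winner][1])
        branches[new_key] = (None, new_ancestors)
    return reads


# Two live branches at position 2; the one ending in "c" currently wins.
branches: Dict[BranchKey, BranchVal] = {
    (1, 1, 2, b"b" * 16): (None, [b"a" * 16]),
    (1, 1, 2, b"c" * 16): (100, [b"a" * 16]),
}
reads_winner = extend_branch(branches, (1, 1, 2, b"c" * 16), b"d" * 16, 101)
reads_loser = extend_branch(branches, (1, 1, 2, b"b" * 16), b"e" * 16, 102)
```

In this model, extending the winner touches one branch, while extending the losing branch needs the second read, and exactly one branch holds a non-null stamp after each call.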
Summarizing the performance profile:
- `new_edits=false` update: `<N>` KVs, 1 roundtrip

Advantages
We can read a document revision without retrieving the revision tree, which in the case of frequently-edited documents may be larger than the doc itself.
We ensure that an interactive document update against the winning branch only needs to read the edit branch KV against which the update is being applied, and it can read that branch immediately knowing only the content of the edit that is being attempted (i.e., it does not need to read the current version of the document itself). The less common scenario of updating a losing branch is only slightly less efficient, requiring two roundtrips.
Interactively updating a document with a large number of edit branches is therefore dramatically cheaper, as no more than two edit branches are read or modified regardless of the number of branches that exist, and no tree merge logic is required.
Including `NotDeleted` in the key ensures that we can efficiently accept the case where we upload a new document with the same ID where all previous edit branches have been deleted; i.e. we can construct a key selector which automatically tells us there are no `deleted=false` edit branches.

The `RevFormat` enum gives us the ability to evolve revision history storage over time, and to support alternative conflict resolution policies like Last Writer Wins.

Access to `Versionstamp` ensures we can clear the old entry in the `by_seq` space during an edit. The `set_versionstamped_value` API is used to store this value automatically.

The key structure above naturally sorts so that the "winning" revision is the last one in the list, which we leverage when deleting the winning edit branch (and thus promoting the one next in line), and extending a conflict branch (to coordinate the update to the `Versionstamp`). This is also a small optimization for reads with `?revs=true` or `?revs_info=true`, where we want the details of the winning edit branch but don't actually know the `RevPosition` and `RevHash` of that branch.

Disadvantages
Historical revision identifiers shared by multiple edit branches are duplicated.
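To put rough numbers on that duplication, and to illustrate the key ordering the proposal relies on, here is a small Python sketch. It is an approximation under stated assumptions: Python tuple comparison stands in for tuple layer ordering, the byte count covers only the raw 16-byte identifiers (ignoring the `Versionstamp` and encoding overhead), and the helper names `winner`, `has_live_branch` and `ancestor_bytes` are hypothetical. The 100,000-byte figure is FoundationDB's documented maximum value size.

```python
from typing import List, Tuple

# (NotDeleted, RevFormat, RevPosition, RevHash)
BranchKey = Tuple[int, int, int, bytes]

REV_HASH_BYTES = 16
FDB_VALUE_LIMIT = 100_000  # FoundationDB's maximum value size, in bytes


def winner(keys: List[BranchKey]) -> BranchKey:
    """The winning branch is the one whose key sorts last."""
    return sorted(keys)[-1]


def has_live_branch(keys: List[BranchKey]) -> bool:
    """True if any edit branch has NotDeleted set to 1."""
    return any(k[0] == 1 for k in keys)


def ancestor_bytes(revs_limit: int) -> int:
    """Raw bytes of ancestor identifiers stored per edit branch at the limit."""
    return revs_limit * REV_HASH_BYTES


keys: List[BranchKey] = [
    (0, 1, 5, b"a" * 16),  # tombstoned branch at position 5
    (1, 1, 2, b"b" * 16),  # live branch at position 2
    (1, 1, 3, b"c" * 16),  # live branch at position 3
]
current_winner = winner(keys)
per_branch_at_max = ancestor_bytes(4000)
```

Here the live branch at position 3 sorts last even though the tombstoned branch has a higher position, because `NotDeleted` leads the key. At the proposed maximum of 4000 revisions each branch carries 64,000 bytes of raw ancestor identifiers, which fits under the value limit but is paid again for every additional branch that shares the same history.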
Key Changes
Administrators cannot set `_revs_limit` larger than 4,000 (previously unlimited?). Default stays the same at 1,000.

The intention with this data model is that an interactive edit that supplies a revision identifier of a deleted leaf will always fail with a conflict. This is a subtle departure from CouchDB 2.3 behavior, where an attempt to extend a deleted edit branch can succeed if some other `deleted=false` edit branch exists. This is an undocumented and seemingly unintentional behavior. If we need to match that behavior it will require reading 3 KVs in 2 roundtrips for every edit that we reject with a conflict.

Modules affected
TBD depending on exact code layout going forward, but the `couch_key_tree` module contains the current revision tree implementation.

HTTP API additions
None.
HTTP API deprecations
None.
Security Considerations
None have been identified.
References
Original mailing list discussion
Acknowledgements
Thanks to @iilyak, @davisp, @janl, @garrensmith and @rnewson for comments on the mailing list discussion.