MongoDB database adapter for ShareDB

sharedb-mongo

MongoDB database adapter for sharedb. This driver can be used both as a snapshot store and as an oplog.

Snapshots are stored where you'd expect (the named collection with _id=id). In addition, operations are stored in o_COLLECTION. For example, if you have a users collection, the operations are stored in o_users.

JSON document snapshots in sharedb-mongo are unwrapped, so you can run mongo queries directly against JSON documents. (They just have some extra fields, such as _v and _type.) It is safe to query documents directly with the MongoDB driver or command line. Any read-only mongo features, including find, aggregate, and map reduce, are safe to perform concurrently with ShareDB.
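
As a minimal sketch of the storage layout described above, the helpers below derive the op collection name and separate the bookkeeping fields from the document data. The function names and the sample document are hypothetical and not part of sharedb-mongo's API; only _id, _v and _type are handled because those are the fields named here.

```javascript
// Hypothetical helpers; these names are not part of sharedb-mongo's API.

// Name of the collection that holds the ops for a snapshot collection.
function opCollectionName(collectionName) {
  return 'o_' + collectionName;
}

// Split a snapshot document, as read straight from Mongo, into the
// bookkeeping fields and the plain JSON data.
function splitSnapshot(doc) {
  const {_id, _v, _type, ...data} = doc;
  return {id: _id, version: _v, type: _type, data};
}

// Illustrative document; the field values are made up for this example.
const sampleDoc = {_id: 'fred', _v: 3, _type: 'json0', name: 'Fred', age: 40};

console.log(opCollectionName('users'));
console.log(splitSnapshot(sampleDoc).data);
```

Both helpers only read data, which keeps them within the read-only access that is safe alongside ShareDB.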

However, you must always use ShareDB to edit documents. Never use the MongoDB driver or command line to directly modify any documents that ShareDB might create or edit. ShareDB must be used to properly persist operations together with snapshots.

Usage

sharedb-mongo wraps native mongodb, and it supports the same configuration options.

There are two ways to instantiate a sharedb-mongo wrapper:

  1. The simplest way is to invoke the module and pass your MongoDB connection arguments directly to the module function. For example:
const ShareDB = require('sharedb');
const db = require('sharedb-mongo')('mongodb://localhost:27017/test');
const backend = new ShareDB({db});
  2. If you'd like to reuse a MongoDB connection or handle mongo driver instantiation yourself, you can pass in a function that calls back with a mongo instance.
const ShareDB = require('sharedb');
const mongodb = require('mongodb');
const db = require('sharedb-mongo')({mongo: function(callback) {
  mongodb.connect('mongodb://localhost:27017/test', callback);
}});
const backend = new ShareDB({db});

Queries

In ShareDB, queries are represented as single JavaScript objects. But Mongo exposes methods on collections and cursors such as mapReduce, sort or count. These are encoded into ShareDBMongo's query object format through special $-prefixed keys that are interpreted and stripped out of the query before being passed into Mongo's find method.

Here are some examples:

| MongoDB query code | ShareDBMongo query object |
| --- | --- |
| coll.find({x: 1, y: {$ne: 2}}) | {x: 1, y: {$ne: 2}} |
| coll.find({$or: [{x: 1}, {y: 1}]}) | {$or: [{x: 1}, {y: 1}]} |
| coll.mapReduce({map: ..., reduce: ...}) | {$mapReduce: {map: ..., reduce: ...}} |
| coll.find({x: 1}).sort({y: -1}) | {x: 1, $sort: {y: -1}} |
| coll.find({x: 1}).limit(5).count({applySkipLimit: true}) | {x: 1, $limit: 5, $count: {applySkipLimit: true}} |

Most of Mongo 3.2's collection and cursor methods are supported. Method calls map to query properties whose key is the method name prefixed by $ and whose value is the argument passed to the method. $readPref is an exception: it takes an object with mode and tagSet fields, which map to the two arguments passed into the readPref method.

For a full list of supported collection and cursor methods, see collectionOperationsMap, cursorTransformsMap and cursorOperationsMap in index.js.
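
To make the encoding concrete, here is a rough sketch of how a query object could be separated into a find filter and a set of method calls. It is not the code in index.js: the splitQuery function and the METHOD_KEYS list are assumptions made for illustration, and the list is limited to the methods that appear in the table above.

```javascript
// Keys treated as collection or cursor methods in this sketch. The list is
// an assumption limited to the methods shown in the table above.
const METHOD_KEYS = new Set(['$mapReduce', '$sort', '$limit', '$count']);

// Separate a ShareDBMongo-style query object into the filter that would be
// passed to find() and the method calls encoded as $-prefixed keys.
function splitQuery(query) {
  const filter = {};
  const methods = {};
  for (const [key, value] of Object.entries(query)) {
    if (METHOD_KEYS.has(key)) {
      methods[key.slice(1)] = value; // '$sort' becomes 'sort'
    } else {
      filter[key] = value; // ordinary operators such as $or stay in the filter
    }
  }
  return {filter, methods};
}

const parsed = splitQuery({x: 1, $or: [{y: 1}, {z: 1}], $sort: {y: -1}, $limit: 5});
console.log(parsed.filter);
console.log(parsed.methods);
```

Note that not every $-prefixed key is a method: operators like $or and $ne are part of the filter itself, which is why the sketch checks against an explicit list rather than the prefix alone.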

getOps without strict linking

There is a getOpsWithoutStrictLinking flag, which can be set to true to speed up getOps under certain circumstances, but with potential risks to the integrity of the results. Read below for more detail.

Introduction

ShareDB has to deal with concurrency issues. In particular, here we discuss the issue of submitting multiple competing ops against a version of a document.

For example, if a document is at v1 and two ops are submitted simultaneously against this snapshot (from different servers, say), then only one of these ops can be accepted as canonical and applied to the snapshot.

This issue is dealt with through optimistic locking. As a consequence of how this works, under the default behaviour getOps will fetch all the ops up to the current version, even if you are only asking for a subset of the ops.

Optimistic locking and linked ops

sharedb-mongo handles concurrent op submissions with optimistic locking. Here's an example of its behaviour:

  • my doc exists at v1
  • two simultaneous v1 ops are submitted to ShareDB
  • both ops are committed to the database
  • one op is applied to the snapshot, and the updated snapshot is written to the database
  • the second op finds that its updated snapshot conflicts with the committed snapshot, and the snapshot is rejected, but the committed op remains in the database

In reality, sharedb-mongo attempts to clean up this failed op, but there's still the small chance that the server crashes before it can do so, meaning that we may have multiple ops lingering in the database with the same version.
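
The scenario above can be imitated in memory. The following is only a simulation under simplifying assumptions: the submit function, the op shape and the in-memory "collections" are hypothetical, the two submissions run one after the other rather than truly concurrently, and the clean-up step is deliberately left out to mimic a crash.

```javascript
// In-memory stand-ins for the ops collection and the snapshot document.
const ops = [];
let snapshot = {v: 1, data: 'a'};

// Submit an op built against `baseVersion`. The op is committed first; the
// snapshot write then succeeds only if the version has not moved on.
function submit(opId, baseVersion, newData) {
  ops.push({id: opId, v: baseVersion});
  if (snapshot.v !== baseVersion) {
    return false; // snapshot rejected, but the op stays in `ops`
  }
  snapshot = {v: baseVersion + 1, data: newData};
  return true;
}

const first = submit('op-A', 1, 'ab'); // applied: snapshot moves to v2
const second = submit('op-B', 1, 'ac'); // rejected: snapshot is no longer v1

console.log(first, second, snapshot.v);
console.log(ops.map((op) => op.id + '@v' + op.v));
```

After both calls, the ops array holds two ops with the same version, which is exactly the lingering state described above.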

Because some non-canonical ops may exist in the database, we cannot just perform a naive fetch of all the ops associated with a document, because it may return multiple ops with the same version (where one was successfully applied, and one was not).

In order to return a valid set of canonical ops, the optimistic locking has a notion of linked ops. That is, each op will point back to the op that it built on top of, and ultimately the current snapshot points to the op that committed it to the database.

Because of this, we can work backwards from the current snapshot, following the trail of op links all the way back to get a chain of canonical, valid, linked ops. This way, even if a spurious op exists in the database, no other op will point to it, and it will be correctly ignored.
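
A minimal sketch of that backwards walk, assuming a hypothetical op shape in which id identifies the op, v is its version and prev holds the id of the op it built on. These field names are invented for the example and may not match what sharedb-mongo stores.

```javascript
// Follow the `prev` links from the op the snapshot points at, then return
// the chain in ascending version order.
function getLinkedOps(allOps, headId) {
  const byId = new Map(allOps.map((op) => [op.id, op]));
  const chain = [];
  let current = byId.get(headId);
  while (current) {
    chain.push(current);
    current = byId.get(current.prev); // undefined once prev is null
  }
  return chain.reverse();
}

const storedOps = [
  {id: 'a', v: 1, prev: null},
  {id: 'b', v: 2, prev: 'a'},
  {id: 'x', v: 2, prev: 'a'}, // spurious: no other op links to it
  {id: 'c', v: 3, prev: 'b'},
];

// Assume the current snapshot points at op 'c'.
const canonicalIds = getLinkedOps(storedOps, 'c').map((op) => op.id);
console.log(canonicalIds);
```

The spurious op 'x' is never visited because the walk only follows links, so it cannot end up in the result.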

This approach has a big downside: it forces us to fetch all the ops up to the current version. This might be fine if you want all ops, or are fetching very recent ops, but it can have a large impact on performance if you only want ops 1-10 of a 10,000-op document, because you still have to fetch all 10,000 ops.

Dropping strict linking

In order to speed up the performance of getOps, you can set getOpsWithoutStrictLinking: true. This will attempt to fetch the bare minimum ops, whilst still trying to maintain op integrity.
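
As a sketch of where the flag might go, the options object below sets it. The commented-out call assumes the options are accepted as a second argument to the module function; that placement is an assumption of this example, so check it against the version you are using.

```javascript
// Options for sharedb-mongo; only the flag itself comes from this README.
const options = {getOpsWithoutStrictLinking: true};

// Assumed usage (not executed here): pass the options alongside the URL.
// const db = require('sharedb-mongo')('mongodb://localhost:27017/test', options);
```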

The assumption that underpins this approach is that any op that exists with a unique combination of d (document ID) and v (version), is a valid op. In other words, it had no conflicts and can be considered canonical.

Consider a document with some ops, including some spurious, failed ops:

  • v1: unique
  • v2: unique
  • v3: collision 3
  • v3: collision 3
  • v4: collision 4
  • v4: collision 4
  • v5: unique
  • v6: unique
  • ...
  • v1000: unique

If we want to fetch ops v1-v3, then we:

  • look up v4
  • find that v4 is not unique
  • look up v5
  • see that v5 is unique and therefore assumed valid
  • look backwards from v5 for a chain of valid ops, avoiding the spurious commits for v4 and v3

This way we don't need to fetch all the ops from v5 to the current version.

In the case where a valid op cannot be determined, we still fall back to fetching all ops and working backwards from the current version.
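
The lookup steps above can be sketched in memory as follows. This is an illustration of the idea rather than sharedb-mongo's implementation: the function name getOpsViaUniqueAnchor and the op fields id, v and prev are hypothetical, and returning null stands in for the fallback to the full chain.

```javascript
// Return the canonical ops with from <= v < to, or null if no unique op can
// be found at or after `to`.
function getOpsViaUniqueAnchor(allOps, from, to) {
  // Count how many stored ops claim each version.
  const counts = new Map();
  for (const op of allOps) counts.set(op.v, (counts.get(op.v) || 0) + 1);

  // Starting at `to`, find the first version held by exactly one op.
  const maxVersion = Math.max(...allOps.map((op) => op.v));
  let anchorVersion = to;
  while (anchorVersion <= maxVersion && counts.get(anchorVersion) !== 1) {
    anchorVersion++;
  }
  if (anchorVersion > maxVersion) return null; // caller falls back

  // Walk the links backwards from the unique op, keeping versions in range.
  const byId = new Map(allOps.map((op) => [op.id, op]));
  const result = [];
  let current = allOps.find((op) => op.v === anchorVersion);
  while (current && current.v >= from) {
    if (current.v < to) result.push(current);
    current = byId.get(current.prev);
  }
  return result.reverse();
}

const docOps = [
  {id: 'a1', v: 1, prev: null},
  {id: 'a2', v: 2, prev: 'a1'},
  {id: 'a3', v: 3, prev: 'a2'},
  {id: 'x3', v: 3, prev: 'a2'}, // spurious
  {id: 'a4', v: 4, prev: 'a3'},
  {id: 'x4', v: 4, prev: 'a3'}, // spurious
  {id: 'a5', v: 5, prev: 'a4'},
  {id: 'a6', v: 6, prev: 'a5'},
];

// Fetch ops v1-v3: v4 is not unique, v5 is, so the walk starts at a5.
const fetchedIds = getOpsViaUniqueAnchor(docOps, 1, 4).map((op) => op.id);
console.log(fetchedIds);
```

In this sketch the op at v6 is never followed, mirroring the saving described above; a real implementation would also limit what it reads from the database, which an in-memory array cannot show.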

Limitations

Integrity

Attempting to infer a canonical op can be dangerous compared to simply following the valid op chain from the snapshot, which is - by definition - canonical.

This alternative behaviour should be safe, but should be used with caution, because we are attempting to infer a canonical op, which may have unforeseen corner cases that return an invalid set of ops.

This may be especially true if the ops are modified outside of sharedb-mongo (e.g. by setting a TTL, or manually updating them).

Recent ops

There are cases where this flag may make getOps slower. When fetching very recent ops, setting this flag may cause extra database round-trips where following the op chain back from the snapshot would have been faster.

getOpsBulk and getOpsToSnapshot

This flag only applies to getOps, and not to the similar getOpsBulk and getOpsToSnapshot methods, whose performance will remain unchanged.

Error codes

Mongo errors are passed back directly. Additional error codes:

4100 -- Bad request - DB

  • 4101 -- Invalid op version
  • 4102 -- Invalid collection name
  • 4103 -- $where queries disabled
  • 4104 -- $mapReduce queries disabled
  • 4105 -- $aggregate queries disabled
  • 4106 -- $query property deprecated in queries
  • 4107 -- Malformed query operator
  • 4108 -- Only one collection operation allowed
  • 4109 -- Only one cursor operation allowed
  • 4110 -- Cursor methods can't run after collection method

5100 -- Internal error - DB

  • 5101 -- Already closed
  • 5102 -- Snapshot missing last operation field
  • 5103 -- Missing ops from requested version
  • 5104 -- Failed to parse query
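
One way to act on these codes is to group them by range. The sketch below assumes the numeric code is exposed on the error's code property, which is an assumption of this example; classifyDbError is a hypothetical helper, not part of sharedb-mongo.

```javascript
// Classify an error using the numeric ranges listed above.
function classifyDbError(err) {
  const code = err ? err.code : undefined;
  if (typeof code !== 'number') return 'other';
  if (code >= 4100 && code < 4200) return 'bad-request';
  if (code >= 5100 && code < 5200) return 'internal';
  return 'other'; // for example, an error passed straight through from Mongo
}

console.log(classifyDbError({code: 4103, message: '$where queries disabled'}));
console.log(classifyDbError({code: 5101, message: 'Already closed'}));
console.log(classifyDbError(new Error('connection refused')));
```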

MIT License

Copyright (c) 2015 by Joseph Gentle and Nate Smith

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.