Usage with MongoDB

The Basics

Mongo Connector can replicate from one MongoDB replica set or sharded cluster to another using the Mongo DocManager. The most basic usage looks like this:

mongo-connector -m localhost:27017 -t localhost:37017 -d mongo_doc_manager

Old usage (before the 2.0 release):

mongo-connector -m localhost:27017 -t localhost:37017 -d <your-doc-manager-folder>/mongo_doc_manager.py

This assumes you are running a replica set or sharded cluster on ports 27017 and 37017 of the local machine.

Either of these URLs (in the arguments to -m and -t) can point to a sharded cluster or a replica set. The usage for a sharded cluster is exactly the same as for a replica set; just provide a connection string pointing to a mongos instead of a replica set member.
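For example, if the source is a sharded cluster whose mongos listens on a (hypothetical) host named mongos.example.com, the invocation is otherwise identical to the basic one above:

mongo-connector -m mongos.example.com:27017 -t localhost:37017 -d mongo_doc_manager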

Comparison to Other Tools

MongoDB comes with several other tools that can be helpful in certain situations where Mongo Connector may also apply. These tools include:

  • mongodump and mongorestore
  • mongooplog
  • filesystem snapshots

For backup purposes, these tools work fine and are probably a lot faster than Mongo Connector. Furthermore, MongoDB Inc. officially supports their use (mongo-connector is not "officially supported"), and they may have fewer bugs. It's even possible to back up or move data from one MongoDB cluster to another without downtime using filesystem snapshots and mongooplog. However, there are certain situations where Mongo Connector really excels. Some of these are:

  • Needing to replicate to a system other than MongoDB
  • Needing to back up or move data from a MongoDB cluster without downtime when filesystem snapshots aren't an option
  • Targeting specific namespaces for live replication
  • Replicating to multiple targets with one tool
  • Migrating databases or collections to have different names without downtime (see the example after this list)
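For example, namespace targeting and renaming can both be expressed on the command line. A rough sketch, assuming the -n/--namespace-set and -g/--dest-namespace-set options and using illustrative namespace names:

mongo-connector -m localhost:27017 -t localhost:37017 -d mongo_doc_manager -n olddb.coll -g newdb.coll

This would replicate only olddb.coll from the source and write it to the target as newdb.coll.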

The take-away: Consider your options first before committing to a solution for just moving data around.

Using mongodump and mongo-connector together

It's possible to speed up the initial import by using mongodump/mongorestore (see above) to seed the target with the documents already in the source system, then letting mongo-connector continue tailing the oplog from there. Here's how to do this:

  1. Generate a mongo-connector timestamp file:
    • Run mongo-connector --no-dump.
    • Stop mongo-connector right after it starts up. Now you have an oplog.timestamp file pointing to the latest entry on the oplog.
  2. Run mongodump on the primary. Because the dump is taken after the timestamp in (1) was recorded, it already reflects everything up to that point in the oplog, so resuming from the saved timestamp will not miss any changes.
  3. Run mongorestore with the dump from (2) on the target MongoDB.
  4. Restart mongo-connector. Pass in the file generated in (1) to the --oplog-ts option.
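Putting the four steps together, a complete run might look like the following; the hosts and ports follow the earlier examples, and the dump directory /tmp/dump is only illustrative:

# 1. Record the current oplog position, then stop mongo-connector (e.g. with Ctrl-C)
mongo-connector -m localhost:27017 -t localhost:37017 -d mongo_doc_manager --no-dump

# 2. Dump the source cluster
mongodump --host localhost:27017 --out /tmp/dump

# 3. Restore the dump into the target cluster
mongorestore --host localhost:37017 /tmp/dump

# 4. Resume replication from the recorded timestamp
mongo-connector -m localhost:27017 -t localhost:37017 -d mongo_doc_manager --oplog-ts oplog.timestamp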