Resyncing the Connector

Luke Lovett edited this page Jun 7, 2016 · 3 revisions

This page describes when and how to re-sync mongo-connector. You need to do this if you see the error message "Last entry no longer in oplog cannot recover!". The most common cause is that mongo-connector could not replicate operations from the oplog fast enough. This can happen during heavy write activity in MongoDB, such as a large mongoimport. Because the oplog is a capped collection, its oldest entries are overwritten once the collection is full, so any entries mongo-connector had not yet read are lost.

Avoiding Oplog Rollover

You can make mongo-connector more tolerant of short bursts of high write activity by increasing the oplog size in MongoDB. A larger oplog covers a longer span of time, which gives mongo-connector room to "catch up" once write activity drops.
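As a rough sketch of how to check and enlarge the oplog from the command line: the helper names below are hypothetical, and the resize helper assumes MongoDB 3.6 or later with the WiredTiger storage engine, where the replSetResizeOplog command is available. Older versions need the manual resize procedure from the MongoDB documentation instead.

```shell
# Print the configured oplog size and the time span it currently covers.
# The default host "localhost:27017" is only a placeholder.
show_oplog_window() {
    mongo --quiet --host "${1:-localhost:27017}" --eval 'rs.printReplicationInfo()'
}

# Resize the oplog of one replica set member to the given size in megabytes.
# Assumption: MongoDB 3.6+ with WiredTiger (replSetResizeOplog).
resize_oplog() {
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "usage: resize_oplog <host:port> <size in MB>" >&2
        return 1
    fi
    mongo --quiet --host "$1" --eval "db.adminCommand({replSetResizeOplog: 1, size: $2})"
}

# Example (not executed here):
#   show_oplog_window db1.example.net:27017
#   resize_oplog db1.example.net:27017 16000
```

The resize applies only to the member it is run against, so repeat it for each member of the replica set.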

How to Perform a Re-Sync

The only way to ensure that the data in your external system is consistent with what is in MongoDB is to delete and re-index all documents in the target. Once all data has been removed from the target, delete the oplog progress file (usually called "oplog.timestamp") and restart mongo-connector. Mongo-connector will then perform a collection dump, re-indexing all your data. Be careful: double-check that you are deleting exactly what you mean to delete, and nothing else.
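The progress-file step can be sketched as a small guard around rm. The function name is hypothetical, and the default path simply mirrors the usual file name; point it at wherever your mongo-connector instance actually writes its progress file.

```shell
# Delete the oplog progress file so that the next start triggers a collection dump.
# Refuses to act if the path is not an existing regular file.
remove_progress_file() {
    progress_file="${1:-oplog.timestamp}"
    if [ ! -f "$progress_file" ]; then
        echo "no progress file at $progress_file, nothing deleted" >&2
        return 1
    fi
    rm -- "$progress_file" && echo "deleted $progress_file"
}

# Example (not executed here), after stopping mongo-connector and clearing the target:
#   remove_progress_file /path/to/oplog.timestamp
```

Stop mongo-connector before running this; otherwise the running process may write the file again.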

MongoDB

The simplest and fastest way to remove data from a MongoDB target is to drop the database. Run this against the target deployment, not the source replica set:

mongo
> db.getSisterDB("<database name>").dropDatabase()
{ "dropped" : "<database name>", "ok" : 1 }

To drop only a single collection:

> db.getSisterDB("<database name>").<collection name>.drop()
true

Solr

You can remove all documents from a core by sending a GET request to the following URL:

http://<hostname>:<port>/solr/<core name>/update?commit=true&stream.body=<delete><query>*:*</query></delete>
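The URL above contains characters (angle brackets, the asterisk, the colon) that need URL-encoding when sent from the command line. One way to do that, sketched here with a hypothetical helper name, is to let curl build the query string. This assumes your Solr instance accepts the stream.body parameter; some Solr versions disable it by default.

```shell
# Send the delete-all request as a GET, letting curl URL-encode each parameter.
# $1 is "<hostname>:<port>", $2 is the core name.
solr_delete_all() {
    curl --get "http://$1/solr/$2/update" \
        --data-urlencode "commit=true" \
        --data-urlencode "stream.body=<delete><query>*:*</query></delete>"
}

# Example (not executed here):
#   solr_delete_all localhost:8983 mycore
```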

Elasticsearch

You can remove all data quickly and efficiently by deleting the index and re-creating it:

curl -XDELETE http://<hostname>:<port>/<index name>
curl -XPUT http://<hostname>:<port>/<index name>

After this, you should refresh the index to make these changes visible:

curl -XPOST http://<hostname>:<port>/<index name>/_refresh
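If you script these steps, it may help to chain them so that a failed request stops the sequence. The following is a minimal sketch with a hypothetical function name; it uses curl's --fail option, which makes curl exit with a non-zero status on an HTTP error response.

```shell
# Delete, re-create, and refresh an index, stopping at the first failed request.
# $1 is "<hostname>:<port>", $2 is the index name.
es_reset_index() {
    base_url="http://$1/$2"
    curl --fail -XDELETE "$base_url" &&
    curl --fail -XPUT "$base_url" &&
    curl --fail -XPOST "$base_url/_refresh"
}

# Example (not executed here):
#   es_reset_index localhost:9200 myindex
```

With this chaining, an index that does not exist yet makes the delete fail and the sequence stop; in that case run the create and refresh requests on their own.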

Alternatives

There is no other way to restore a state that is consistent with the source MongoDB replica set or cluster. However, you can get mongo-connector running again simply by deleting the oplog progress file and restarting it, without clearing the target first. This causes mongo-connector to perform a collection dump, re-saving the latest versions of all documents, and then to start tailing the oplog. This does not bring your target to a consistent state, but it may be suitable for pure insert/update use cases. If any delete operations were lost when the oplog rolled over, the corresponding documents stay in the target, and mongo-connector cannot catch them without a proper re-sync (described above).