
Mongo-ES

A MongoDB to Elasticsearch connector

Installation

npm i -g mongo-es

Usage

Command Line

# normal mode
mongo-es ./config.json

# debug mode, with debug info printed
NODE_ENV=dev mongo-es ./config.json

Programmatically

const fs = require('fs')
const Redis = require('ioredis')
const { Config, Task, run } = require('mongo-es')

const redis = new Redis('localhost')

Task.onSaveCheckpoint((name, checkpoint) => {
  return redis.set(`mongo-es:${name}`, JSON.stringify(checkpoint))
})

// this will overwrite task.from in config file
Task.onLoadCheckpoint(name => {
  return redis.get(`mongo-es:${name}`).then(JSON.parse)
})

run(new Config(fs.readFileSync('config.json', 'utf8')))
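
The checkpoint callbacks do not have to use Redis. As a minimal sketch, the pair below stores each checkpoint as a JSON file instead. The `checkpointDir` location and the `fileFor` helper are hypothetical, and the sketch assumes checkpoints are JSON-serializable, as the Redis example above implies.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')

// Hypothetical location; replace with a durable directory in real use.
const checkpointDir = path.join(os.tmpdir(), 'mongo-es-checkpoints')
fs.mkdirSync(checkpointDir, { recursive: true })

// One file per task name; the name is encoded so it is safe as a file name.
const fileFor = name => path.join(checkpointDir, `${encodeURIComponent(name)}.json`)

// Same shape as the callback given to Task.onSaveCheckpoint above.
const saveCheckpoint = async (name, checkpoint) => {
  await fs.promises.writeFile(fileFor(name), JSON.stringify(checkpoint))
}

// Same shape as the callback given to Task.onLoadCheckpoint above.
// Resolves to null when nothing has been saved yet, like redis.get does.
const loadCheckpoint = async name => {
  try {
    return JSON.parse(await fs.promises.readFile(fileFor(name), 'utf8'))
  } catch (err) {
    if (err.code === 'ENOENT') return null
    throw err
  }
}
```

These functions could then be registered with `Task.onSaveCheckpoint(saveCheckpoint)` and `Task.onLoadCheckpoint(loadCheckpoint)` before calling `run`.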

Concepts

Scan phase

Scan the entire database for existing documents.

Tail phase

Tail the oplog for document creations, updates, and deletions.

Configuration

Structure:

{
  "controls": {},
  "mongodb": {},
  "elasticsearch": {},
  "tasks": [
    {
      "extract": {},
      "transform": {},
      "load": {}
    }
  ]
}
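
To illustrate the structure above, the following sketch checks that a parsed config object has the expected sections before it is handed to mongo-es. `checkConfigShape` is a hypothetical helper, not part of mongo-es, and treating `mongodb`, `elasticsearch`, and `tasks` as required (with `controls` optional) is an assumption.

```javascript
// Hypothetical helper, not part of mongo-es: lists the expected sections
// that are missing from a parsed config object.
function checkConfigShape(config) {
  const problems = []
  // Assumption: these two sections are required; controls is optional.
  for (const key of ['mongodb', 'elasticsearch']) {
    if (typeof config[key] !== 'object' || config[key] === null) {
      problems.push(`missing section: ${key}`)
    }
  }
  if (!Array.isArray(config.tasks) || config.tasks.length === 0) {
    problems.push('tasks must be a non-empty array')
  } else {
    config.tasks.forEach((task, i) => {
      for (const key of ['extract', 'transform', 'load']) {
        if (typeof task[key] !== 'object' || task[key] === null) {
          problems.push(`tasks[${i}] is missing section: ${key}`)
        }
      }
    })
  }
  return problems
}

const config = {
  controls: {},
  mongodb: {},
  elasticsearch: {},
  tasks: [{ extract: {}, transform: {}, load: {} }]
}
console.log(checkConfigShape(config)) // an empty array means nothing is missing
```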

Detailed example

controls

  • mongodbReadCapacity - Max docs read per second (default: 10000). (optional)
  • elasticsearchBulkInterval - Max bulk interval per request (default: 5000). (optional)
  • elasticsearchBulkSize - Max bulk size per request (default: 5000). (optional)
  • indexNameSuffix - Index name suffix, for index version control. (optional)
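
One way to picture how a bulk size and a bulk interval can work together is a buffer that flushes when either limit is reached. The sketch below is illustrative only and is not mongo-es's actual code: the `BulkBuffer` class is hypothetical, and it assumes the interval is measured in milliseconds.

```javascript
// Illustrative only: a buffer that releases a batch when it holds
// bulkSize documents or when bulkInterval has passed since the last flush.
class BulkBuffer {
  constructor(bulkSize, bulkInterval, now = Date.now()) {
    this.bulkSize = bulkSize
    this.bulkInterval = bulkInterval // assumed to be milliseconds
    this.docs = []
    this.lastFlush = now
  }

  // Adds a document; returns the batch to send when a limit is hit, else null.
  add(doc, now = Date.now()) {
    this.docs.push(doc)
    const full = this.docs.length >= this.bulkSize
    const stale = now - this.lastFlush >= this.bulkInterval
    if (!full && !stale) return null
    const batch = this.docs
    this.docs = []
    this.lastFlush = now
    return batch
  }
}

// Timestamps are passed explicitly so the behavior is easy to follow.
const buffer = new BulkBuffer(2, 5000, 0)
const first = buffer.add({ _id: 1 }, 100) // below both limits
const second = buffer.add({ _id: 2 }, 200) // size limit reached
const third = buffer.add({ _id: 3 }, 6000) // interval limit reached
```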

mongodb

  • url - The connection URI string, e.g. mongodb://user:password@localhost:27017/db?replicaSet=rs0. Note: an admin user is required to access the oplog.
  • options - Connection settings, see: MongoClient. (optional)

elasticsearch

  • options - Elasticsearch Config Options, see: Configuration.
  • indices - If set, indices are created automatically when the program starts, see: Indices Create. (optional)

task.from

  • phase - scan or tail
  • time - In the tail phase, tail the oplog with the query: { ts: { $gte: new Timestamp(0, new Date(time).getTime() / 1000) } }
  • id - In the scan phase, scan the collection with the query: { _id: { $gte: id } }
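
The two queries above can be sketched as a small function. `describeFrom` and `oplogSeconds` are hypothetical helpers, not part of mongo-es. To keep the sketch runnable without the MongoDB driver, the Timestamp is shown as a plain `{ low, high }` object, and returning an empty query when no `id` is given is an assumption.

```javascript
// Seconds since the Unix epoch for a task.from.time value, as used for
// the second Timestamp argument in the tail query above.
function oplogSeconds(time) {
  return new Date(time).getTime() / 1000
}

// Hypothetical helper: builds the query implied by a task.from value.
function describeFrom(from) {
  if (from.phase === 'tail') {
    // Stand-in for new Timestamp(0, seconds) from the MongoDB driver.
    return { ts: { $gte: { low: 0, high: oplogSeconds(from.time) } } }
  }
  // Assumption: a scan without an id starts from the beginning.
  return from.id === undefined ? {} : { _id: { $gte: from.id } }
}

console.log(describeFrom({ phase: 'tail', time: '2019-01-16T00:00:00Z' }))
console.log(describeFrom({ phase: 'scan', id: 'abc' }))
```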

task.extract

  • db - Database name.
  • collection - Collection name in database.
  • projection - Projection selector, see Projection.

task.transform

  • mapping - The field mapping from mongodb's collection to elasticsearch's index.
  • parent - The field in mongodb's collection to use as the _parent in elasticsearch's index. (optional)
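
As a rough picture of what a field mapping does, the sketch below copies only the mapped fields from a MongoDB document into the document sent to Elasticsearch. It assumes the mapping is a flat object of `{ mongodbField: elasticsearchField }`; the format mongo-es actually accepts may differ, and `applyMapping` is a hypothetical helper.

```javascript
// Illustrative only: assumes a flat { mongodbField: elasticsearchField } mapping.
// Fields that are not listed in the mapping are left out of the result.
function applyMapping(doc, mapping) {
  const out = {}
  for (const [from, to] of Object.entries(mapping)) {
    if (doc[from] !== undefined) out[to] = doc[from]
  }
  return out
}

const doc = { _id: 'u1', name: 'Ada', password: 'secret' }
const mapped = applyMapping(doc, { name: 'displayName', email: 'email' })
console.log(mapped)
```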

task.load

  • index - The name of the index.
  • type - The name of the document type.
  • body - The request body, see Put Mapping.

License

Mozilla Public License Version 2.0