A small script written in Python used to index a CouchDB 1.x database with Elasticsearch - blog announcement.
The idea for this script was inspired by CouchDB rivers.
The Python program requires an ini file, structured like this:
#
# Config file for indexing CouchDB with Elasticsearch
#
[ES]
index_name = db1
type_name = test1
[CouchDB]
dbname = db1
dbindex = index
index_doc_seq = db1
bulk_size = 1000
[app]
#seconds
delay = 1
verbose = True
Section [ES] holds the Elasticsearch settings: the index name and the type name.
Section [CouchDB]:
dbname = the database to index
dbindex = the database in which the last processed sequence is stored
index_doc_seq = name of the document that keeps the last processed sequence
bulk_size = batch size used to read from CouchDB and to write to ES
Section [app] keeps some generic settings. After the last sequence is processed, the program waits for a while (delay, in seconds) before checking for changes again.
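As an illustration of how such a file can be read, here is a minimal sketch using Python's standard configparser module. It is not the script's actual code: the load_settings helper and the SAMPLE_INI string are hypothetical names introduced for this example, and the sample text simply repeats the configuration shown above.

```python
import configparser

# Same content as the sample ini file above, inlined so the sketch is self-contained.
SAMPLE_INI = """
[ES]
index_name = db1
type_name = test1
[CouchDB]
dbname = db1
dbindex = index
index_doc_seq = db1
bulk_size = 1000
[app]
#seconds
delay = 1
verbose = True
"""


def load_settings(ini_text):
    """Parse the ini text and return the settings as a flat dict (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return {
        "index_name": parser.get("ES", "index_name"),
        "type_name": parser.get("ES", "type_name"),
        "dbname": parser.get("CouchDB", "dbname"),
        "dbindex": parser.get("CouchDB", "dbindex"),
        "index_doc_seq": parser.get("CouchDB", "index_doc_seq"),
        # Typed getters convert the raw strings to int, float and bool.
        "bulk_size": parser.getint("CouchDB", "bulk_size"),
        "delay": parser.getfloat("app", "delay"),
        "verbose": parser.getboolean("app", "verbose"),
    }


settings = load_settings(SAMPLE_INI)
```

To read a real file instead of the inlined string, parser.read("path/to/file.ini") can be used in place of read_string.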
I used this Python script to index a CouchDB database with 700 million records.
In case of errors (communication, LAN, memory and so on) the script restarts and continues indexing.
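The resume behaviour can be pictured with the sketch below. It is only an assumption about how such a loop might look, not the script's real implementation: run_indexer and its four callback parameters (fetch_changes, index_batch, load_seq, save_seq) are hypothetical names, and the CouchDB and Elasticsearch calls are left to the caller. The key point is that the last sequence is saved only after a batch has been indexed, so a failed batch is simply read again on the next pass.

```python
import time


def run_indexer(fetch_changes, index_batch, load_seq, save_seq,
                bulk_size=1000, delay=1.0, max_idle_polls=None,
                sleep=time.sleep):
    """Poll for changes, index them in batches, resume after errors.

    fetch_changes(since, limit) -> (docs, last_seq)   # hypothetical callback
    index_batch(docs)                                 # hypothetical callback
    load_seq() / save_seq(seq) read and store the last processed sequence.
    max_idle_polls=None means run forever; a number stops the loop after
    that many consecutive polls without changes (useful for testing).
    """
    idle_polls = 0
    while max_idle_polls is None or idle_polls < max_idle_polls:
        try:
            since = load_seq()
            docs, last_seq = fetch_changes(since, bulk_size)
            if docs:
                index_batch(docs)
                # Checkpoint only after the batch was indexed successfully.
                save_seq(last_seq)
                idle_polls = 0
                continue
            idle_polls += 1
        except (OSError, MemoryError):
            # Nothing was checkpointed, so the same batch is retried
            # from the stored sequence on the next pass.
            pass
        # Wait before polling again (no changes, or after an error).
        sleep(delay)
```

With this shape, bulk_size and delay map directly to the settings of the same name in the ini file, and the stored sequence plays the role of the document named by index_doc_seq.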
Feel free to use this software for both personal and commercial usage.