How bulk indexing is used by the JDBC river

Jörg Prante edited this page Feb 25, 2014 · 1 revision

The JDBC river uses bulk indexing automatically to speed up indexing. Elasticsearch provides a BulkProcessor that handles bulk requests asynchronously.

The following river parameters control bulk indexing:

bulk_size - the maximum number of documents in a single bulk request (default: 100)

max_bulk_requests - the maximum number of concurrent bulk requests (default: 30)

bulk_flush_interval - the interval after which open bulk requests are flushed (default: 5s)
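
As a rough worked example of how bulk_size and max_bulk_requests interact: if every concurrent request is filled to bulk_size, their product is an upper bound on the number of documents held in outstanding bulk requests. The helper name below is hypothetical and not part of the river. It is only a sketch of the arithmetic under that assumption.

```shell
# Hypothetical helper (not part of the JDBC river).
# Assumption: every concurrent bulk request is filled to bulk_size, so the
# product is an upper bound on documents held in outstanding bulk requests.
# Usage: max_in_flight_docs <bulk_size> <max_bulk_requests>
max_in_flight_docs() {
    echo $(( $1 * $2 ))
}

max_in_flight_docs 100 30     # river defaults: prints 3000
max_in_flight_docs 1000 50    # larger settings: prints 50000
```

Larger values can raise throughput but also increase the memory held by pending requests on both the river node and the cluster, so it may be worth raising them gradually.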

Example:

curl -XPUT 'localhost:9200/_river/my_jdbc_river/_meta' -d '{
    "type" : "jdbc",
    "jdbc" : {
        "driver" : "com.mysql.jdbc.Driver",
        "url" : "jdbc:mysql://localhost:3306/test",
        "user" : "",
        "password" : "",
        "sql" : "select * from orders",
        "index" : "jdbc_index",
        "type" : "jdbc_type",
        "bulk_size" : 1000,
        "max_bulk_requests" : 50,
        "bulk_flush_interval" : "10s"
    }
}'