ElasticSearch Configuration

grigorescu edited this page Jun 27, 2012 · 2 revisions

Resource Limits

ElasticSearch performs best when it has plenty of resources available, in particular open file descriptors and RAM that it can lock. Raise both limits for the elasticsearch user:

#/etc/security/limits.conf

elasticsearch - nofile 65535
elasticsearch - memlock unlimited

Verification:

sudo -u elasticsearch sh -c "ulimit -Sn"
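The same check can be scripted. The following is a minimal sketch, assuming it is run as the elasticsearch user on Linux; the function names `current_limits` and `limit_problems` are hypothetical and only illustrate comparing the soft limits against the values from limits.conf.

```python
import resource

MIN_NOFILE = 65535  # value taken from the limits.conf entry above


def current_limits():
    """Return the soft nofile and memlock limits of the calling process."""
    nofile_soft, _ = resource.getrlimit(resource.RLIMIT_NOFILE)
    memlock_soft, _ = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    return nofile_soft, memlock_soft


def limit_problems(nofile_soft, memlock_soft, min_nofile=MIN_NOFILE):
    """List every limit that is lower than what limits.conf asks for."""
    problems = []
    if nofile_soft != resource.RLIM_INFINITY and nofile_soft < min_nofile:
        problems.append(f"nofile is {nofile_soft}, expected at least {min_nofile}")
    if memlock_soft != resource.RLIM_INFINITY:
        problems.append(f"memlock is {memlock_soft} bytes, expected unlimited")
    return problems


if __name__ == "__main__":
    # Prints one line per problem, or a single OK line.
    for line in limit_problems(*current_limits()) or ["limits look OK"]:
        print(line)
```

Whether the limits apply to a given session depends on how that session was started, so a result here may differ from what the running ElasticSearch process actually sees.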

Configuration

cluster.name: myFirstElasticsearchCluster
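node.name: $hostname  # Placeholder: use this node's hostname

index.number_of_shards: 4 x $numberOfNodes  # Placeholder: write the resulting integer, e.g. 8 for a 2-node cluster
index.number_of_replicas: 0  # See explanation below
bootstrap.mlockall: true  # Lock the process memory in RAM; relies on the memlock limit above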

path.data: /data/elasticsearch # Additional directories can be added here, comma-delimited.
                               # Data will be striped across the directories

http.max_content_length: 256mb
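The $hostname and $numberOfNodes placeholders have to be filled in before the file is usable. As a rough sketch, the substitution could be scripted as follows; `render_config` and `SHARDS_PER_NODE` are hypothetical names, and the settings are copied from the configuration above.

```python
import socket

SHARDS_PER_NODE = 4  # the "4 x $numberOfNodes" rule of thumb from the configuration


def render_config(node_count, hostname, cluster_name="myFirstElasticsearchCluster"):
    """Return the configuration text with the placeholders filled in."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    lines = [
        f"cluster.name: {cluster_name}",
        f"node.name: {hostname}",
        f"index.number_of_shards: {SHARDS_PER_NODE * node_count}",
        "index.number_of_replicas: 0",
        "bootstrap.mlockall: true",
        "path.data: /data/elasticsearch",
        "http.max_content_length: 256mb",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example: a three-node cluster, rendered for the local machine.
    print(render_config(node_count=3, hostname=socket.gethostname()), end="")
```

For a three-node cluster this writes index.number_of_shards: 12.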

Number of Replicas

A replica is a copy of the data. If a node goes down, having replicas allows you to continue running, and the cluster will rebalance itself automatically. Similarly, if your cluster is slower than you would like, you can add an additional node, and the load will be rebalanced.

More replicas mean slower indexing, faster searching, and increased reliability. Fewer replicas mean faster indexing, slower searching, and decreased reliability.

Generally, it's a good idea to have as many replicas as you can afford. Disk usage grows with every replica, though: 2 replicas use 3x the disk (the original + 2 copies).
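The disk cost follows directly from the replica count: each replica is one more full copy of the data. A small sketch of the arithmetic, where `total_disk_gb` is a hypothetical helper and the 100 GB figure is only an example:

```python
def total_disk_gb(primary_gb, replicas):
    """Disk needed for an index: the original data plus one full copy per replica."""
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    return primary_gb * (1 + replicas)


if __name__ == "__main__":
    # Example: 100 GB of original data with 0, 1, and 2 replicas.
    for replicas in range(3):
        print(f"{replicas} replicas -> {total_disk_gb(100, replicas)} GB")
```

With 100 GB of original data, 2 replicas come to 300 GB in total.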

NOTE: You can modify the number of replicas on the fly. You can also set it per index, so the index currently being written can be tuned for fast indexing (at the cost of slower searching), while older indexes can be tuned for faster searching. The index.number_of_replicas setting in the configuration only sets the default used when a new index is created.
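As an illustration of changing the replica count on a live index, the sketch below builds a request for the index settings update endpoint (PUT /<index>/_settings). It assumes ElasticSearch is listening on its default HTTP port on localhost; the helper name `replica_update_request` and the index name `logs-2012-06` are hypothetical.

```python
import json
import urllib.request


def replica_update_request(index, replicas, base_url="http://localhost:9200"):
    """Build (but do not send) a request that sets number_of_replicas for one index."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    req = replica_update_request("logs-2012-06", replicas=2)
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
    # To apply the change against a running cluster, uncomment:
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.read().decode("utf-8"))
```

Building the request separately from sending it makes it easy to inspect what would be changed before touching the cluster.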