Elasticsearch - change number of shards for index template
==========================================================

Intro
-----

These instructions are written primarily for OpenShift logging, but they should
apply to any Elasticsearch installation if you remove the OpenShift-specific
bits. They target Elasticsearch 2.x as shipped with OpenShift 3.4 through 3.10,
so they may require some tweaking to work with ES 5.x. The instructions assume
your logging namespace is `logging` - use `openshift-logging` with OpenShift
3.10 and later.

The default number of shards per index for OpenShift logging is 1. This is
deliberate: it avoids breaking very large deployments with a large number of
indices, where the problem is having too many shards. However, for deployments
with a small number of very large indices, a single shard per index can be
problematic. Elasticsearch recommends keeping shard size under 50GB, so
increasing the number of shards per index helps stay under that limit.
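
If you are not sure whether this applies to your cluster, one way to check is
to list current shard sizes. This is a sketch that assumes `$espod` holds the
name of one of your Elasticsearch pods (see the Steps section below for how to
find it) and that the admin cert paths match the OpenShift logging defaults
used throughout this document:

    # Sketch: list index, shard number, primary/replica flag, and on-disk size,
    # so you can see whether any primary shards approach the ~50GB guideline.
    oc exec -n logging -c elasticsearch $espod -- \
        curl -s -k --cert /etc/elasticsearch/secret/admin-cert \
             --key /etc/elasticsearch/secret/admin-key \
             'https://localhost:9200/_cat/shards?v&h=index,shard,prirep,store'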

Steps
-----

Identify the index pattern you want to increase sharding for. For
OpenShift logging this will be `.operations.*` or `project.*`. If there are
specific projects that typically generate much more data than others, and you
need to keep the overall number of shards down, you can target very specific
patterns, e.g. `project.this-project-generates-too-many-logs.*`. If you don't
anticipate having many namespaces/projects/indices, you can just use `project.*`.

Create a JSON file for each index pattern, like this:

    {
        "order": 20,
        "settings": {
            "index": {
                "number_of_shards": 3
            }
        },
        "template": ".operations.*"
    }

Call this one `more-shards-for-operations-indices.json`.

    {
        "order": 20,
        "settings": {
            "index": {
                "number_of_shards": 3
            }
        },
        "template": "project.*"
    }

Call this one `more-shards-for-project-indices.json`.

Load these into Elasticsearch. You'll need the name of one of the Elasticsearch
pods:

    oc get -n logging pods -l component=es

Pick one and call it `$espod`. If you have a separate OPS cluster, you'll need
to identify one of the es-ops Elasticsearch pods too, for the `.operations.*`
indices:

    oc get -n logging pods -l component=es-ops

Pick one and call it `$esopspod`.
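
If you prefer to capture the pod names in shell variables rather than copying
them by hand, something like this should work (a sketch using `jsonpath`
output; adjust the label selectors if your deployment uses different ones):

    # Grab the name of the first pod matching each label selector.
    espod=$(oc get -n logging pods -l component=es \
        -o jsonpath='{.items[0].metadata.name}')
    esopspod=$(oc get -n logging pods -l component=es-ops \
        -o jsonpath='{.items[0].metadata.name}')
    echo "espod=$espod esopspod=$esopspod"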

Load the file `more-shards-for-project-indices.json` into `$espod`:

    file=more-shards-for-project-indices.json
    cat $file | \
      oc exec -n logging -i -c elasticsearch $espod -- \
        curl -s -k --cert /etc/elasticsearch/secret/admin-cert \
             --key /etc/elasticsearch/secret/admin-key \
             https://localhost:9200/_template/$file -XPUT -d@- | \
      python -mjson.tool

Load the file `more-shards-for-operations-indices.json` into `$esopspod`, or
`$espod` if you do not have a separate OPS cluster:

    file=more-shards-for-operations-indices.json
    cat $file | \
      oc exec -n logging -i -c elasticsearch $esopspod -- \
        curl -s -k --cert /etc/elasticsearch/secret/admin-cert \
             --key /etc/elasticsearch/secret/admin-key \
             https://localhost:9200/_template/$file -XPUT -d@- | \
      python -mjson.tool
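
To confirm that a template was stored, you can fetch it back by name. Note
that the template name matches the file name used in the PUT above, so it
includes the `.json` suffix. A sketch:

    # Fetch the stored template back to verify the number_of_shards setting.
    oc exec -n logging -c elasticsearch $espod -- \
        curl -s -k --cert /etc/elasticsearch/secret/admin-cert \
             --key /etc/elasticsearch/secret/admin-key \
             https://localhost:9200/_template/more-shards-for-project-indices.json?pretty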

**NOTE** The settings will not apply to existing indices. You would need to
reindex existing data for that. However, this is usually not a problem, as the
settings will apply to new indices, and curator will eventually delete the old
ones.
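
To see this for yourself, you can compare the settings of existing indices
against a newly created one. A sketch (the `project.*` pattern here is just an
example; any index name or wildcard works):

    # Show only the number_of_shards setting for matching indices; indices created
    # before the template was loaded will still report the old value (1 by default).
    oc exec -n logging -c elasticsearch $espod -- \
        curl -s -k --cert /etc/elasticsearch/secret/admin-cert \
             --key /etc/elasticsearch/secret/admin-key \
             'https://localhost:9200/project.*/_settings/index.number_of_shards?pretty'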

Results
-------

To see if this is working, wait until new indices are created, and use the
`_cat` endpoints to view the new indices/shards:

    oc exec -n logging -c elasticsearch $espod -- \
        curl -s -k --cert /etc/elasticsearch/secret/admin-cert \
             --key /etc/elasticsearch/secret/admin-key \
             https://localhost:9200/_cat/indices?v
    health status index                                                                        pri rep docs.count docs.deleted store.size pri.store.size
    green  open   project.kube-service-catalog.d5dbe052-903c-11e8-8c22-fa163e6e12b8.2018.07.26   3   0       1395            0      2.2mb          2.2mb

The `pri` value is now `3` instead of the default `1`. This means there are 3
shards for this index. You can also check the `shards` endpoint:

    oc exec -n logging -c elasticsearch $espod -- \
        curl -s -k --cert /etc/elasticsearch/secret/admin-cert \
             --key /etc/elasticsearch/secret/admin-key \
             https://localhost:9200/_cat/shards?v
    index                                                                        shard prirep state   docs   store ip         node
    project.kube-service-catalog.d5dbe052-903c-11e8-8c22-fa163e6e12b8.2018.07.26 1     p      STARTED  596 683.3kb 10.131.0.8 logging-es-data-master-vksc2fwe
    project.kube-service-catalog.d5dbe052-903c-11e8-8c22-fa163e6e12b8.2018.07.26 2     p      STARTED  590 652.6kb 10.131.0.8 logging-es-data-master-vksc2fwe
    project.kube-service-catalog.d5dbe052-903c-11e8-8c22-fa163e6e12b8.2018.07.26 0     p      STARTED  602 628.1kb 10.131.0.8 logging-es-data-master-vksc2fwe

This lists the 3 shards for the index. If you have multiple Elasticsearch
nodes, you should see more than one node listed in the `node` column of the
`_cat/shards` output.
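
If you want a per-node summary instead of a per-shard listing, the
`_cat/allocation` endpoint shows how many shards each node holds and how much
disk they use. A sketch using the same pod and certs as above:

    # Per-node shard counts and disk usage, useful for checking that the new
    # shards are spread across your Elasticsearch data nodes.
    oc exec -n logging -c elasticsearch $espod -- \
        curl -s -k --cert /etc/elasticsearch/secret/admin-cert \
             --key /etc/elasticsearch/secret/admin-key \
             https://localhost:9200/_cat/allocation?v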
