Elasticsearch - change number of shards for index template
==========================================================

Intro
-----

These instructions are written primarily for OpenShift logging, but they should
apply to any Elasticsearch installation if you remove the OpenShift-specific
bits. They target Elasticsearch 2.x as shipped with OpenShift 3.4 through 3.10,
so they may require some tweaking to work with ES 5.x. The instructions assume
your logging namespace is `logging` - use `openshift-logging` with OpenShift
3.10 and later.

The default number of shards per index for OpenShift logging is 1. This is
deliberate: it avoids breaking very large deployments with a large number of
indices, where the problem is having too many shards. However, for deployments
with a small number of very large indices, a single shard per index can be
problematic. Elasticsearch recommends keeping shard size under 50GB, so
increasing the number of shards per index helps stay under that limit.
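
If you are not sure whether this applies to your cluster, one way to check is
to list current shard sizes. This is a sketch that assumes `$espod` holds the
name of one of your Elasticsearch pods (see the Steps section below for how to
find it) and that the admin cert paths match the OpenShift logging defaults
used throughout this document:

    # Sketch: list index, shard number, primary/replica flag, and on-disk size,
    # so you can see whether any primary shards approach the ~50GB guideline.
    oc exec -n logging -c elasticsearch $espod -- \
        curl -s -k --cert /etc/elasticsearch/secret/admin-cert \
             --key /etc/elasticsearch/secret/admin-key \
             'https://localhost:9200/_cat/shards?v&h=index,shard,prirep,store'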

Steps
-----

Identify the index pattern you want to increase sharding for. For
OpenShift logging this will be `.operations.*` or `project.*`. If there are
specific projects that typically generate much more data than others, and you
need to keep the overall number of shards down, you can target very specific
patterns, e.g. `project.this-project-generates-too-many-logs.*`. If you don't
anticipate having many namespaces/projects/indices, you can just use `project.*`.

Create a JSON file for each index pattern, like this:

    {
        "order": 20,
        "settings": {
            "index": {
                "number_of_shards": 3
            }
        },
        "template": ".operations.*"
    }

Call this one `more-shards-for-operations-indices.json`.

    {
        "order": 20,
        "settings": {
            "index": {
                "number_of_shards": 3
            }
        },
        "template": "project.*"
    }

Call this one `more-shards-for-project-indices.json`.

Load these into Elasticsearch. You'll need the name of one of the Elasticsearch
pods:

    oc get -n logging pods -l component=es

Pick one and call it `$espod`. If you have a separate OPS cluster, you'll need
to identify one of the es-ops Elasticsearch pods too, for the `.operations.*`
indices:

    oc get -n logging pods -l component=es-ops

Pick one and call it `$esopspod`.
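
If you prefer to capture the pod names in shell variables rather than copying
them by hand, something like this should work (a sketch using `jsonpath`
output; adjust the label selectors if your deployment uses different ones):

    # Grab the name of the first pod matching each label selector.
    espod=$(oc get -n logging pods -l component=es \
        -o jsonpath='{.items[0].metadata.name}')
    esopspod=$(oc get -n logging pods -l component=es-ops \
        -o jsonpath='{.items[0].metadata.name}')
    echo "espod=$espod esopspod=$esopspod"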

Load the file `more-shards-for-project-indices.json` into `$espod`:

    file=more-shards-for-project-indices.json
    cat $file | \
      oc exec -n logging -i -c elasticsearch $espod -- \
        curl -s -k --cert /etc/elasticsearch/secret/admin-cert \
             --key /etc/elasticsearch/secret/admin-key \
             https://localhost:9200/_template/$file -XPUT -d@- | \
      python -mjson.tool

Load the file `more-shards-for-operations-indices.json` into `$esopspod`, or
`$espod` if you do not have a separate OPS cluster:

    file=more-shards-for-operations-indices.json
    cat $file | \
      oc exec -n logging -i -c elasticsearch $esopspod -- \
        curl -s -k --cert /etc/elasticsearch/secret/admin-cert \
             --key /etc/elasticsearch/secret/admin-key \
             https://localhost:9200/_template/$file -XPUT -d@- | \
      python -mjson.tool
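
To confirm that a template was stored, you can fetch it back by name. Note
that the template name matches the file name used in the PUT above, so it
includes the `.json` suffix. A sketch:

    # Fetch the stored template back to verify the number_of_shards setting.
    oc exec -n logging -c elasticsearch $espod -- \
        curl -s -k --cert /etc/elasticsearch/secret/admin-cert \
             --key /etc/elasticsearch/secret/admin-key \
             https://localhost:9200/_template/more-shards-for-project-indices.json?pretty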

**NOTE** The settings will not apply to existing indices. You would need to
reindex existing data for that. However, this is usually not a problem, as the
settings will apply to new indices, and curator will eventually delete the old
ones.
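
To see this for yourself, you can compare the settings of existing indices
against a newly created one. A sketch (the `project.*` pattern here is just an
example; any index name or wildcard works):

    # Show only the number_of_shards setting for matching indices; indices created
    # before the template was loaded will still report the old value (1 by default).
    oc exec -n logging -c elasticsearch $espod -- \
        curl -s -k --cert /etc/elasticsearch/secret/admin-cert \
             --key /etc/elasticsearch/secret/admin-key \
             'https://localhost:9200/project.*/_settings/index.number_of_shards?pretty'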

Results
-------

To see if this is working, wait until new indices are created, and use the
`_cat` endpoints to view the new indices/shards:

    oc exec -n logging -c elasticsearch $espod -- \
        curl -s -k --cert /etc/elasticsearch/secret/admin-cert \
             --key /etc/elasticsearch/secret/admin-key \
             https://localhost:9200/_cat/indices?v
    health status index                                                                        pri rep docs.count docs.deleted store.size pri.store.size
    green  open   project.kube-service-catalog.d5dbe052-903c-11e8-8c22-fa163e6e12b8.2018.07.26   3   0       1395            0      2.2mb          2.2mb

The `pri` value is now `3` instead of the default `1`. This means there are 3
shards for this index. You can also check the `shards` endpoint:

    oc exec -n logging -c elasticsearch $espod -- \
        curl -s -k --cert /etc/elasticsearch/secret/admin-cert \
             --key /etc/elasticsearch/secret/admin-key \
             https://localhost:9200/_cat/shards?v
    index                                                                        shard prirep state   docs   store ip         node
    project.kube-service-catalog.d5dbe052-903c-11e8-8c22-fa163e6e12b8.2018.07.26 1     p      STARTED  596 683.3kb 10.131.0.8 logging-es-data-master-vksc2fwe
    project.kube-service-catalog.d5dbe052-903c-11e8-8c22-fa163e6e12b8.2018.07.26 2     p      STARTED  590 652.6kb 10.131.0.8 logging-es-data-master-vksc2fwe
    project.kube-service-catalog.d5dbe052-903c-11e8-8c22-fa163e6e12b8.2018.07.26 0     p      STARTED  602 628.1kb 10.131.0.8 logging-es-data-master-vksc2fwe

This lists the 3 shards for the index. If you have multiple Elasticsearch
nodes, you should see more than one node listed in the `node` column of the
`_cat/shards` output.
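
If you want a per-node summary instead of a per-shard listing, the
`_cat/allocation` endpoint shows how many shards each node holds and how much
disk they use. A sketch using the same pod and certs as above:

    # Per-node shard counts and disk usage, useful for checking that the new
    # shards are spread across your Elasticsearch data nodes.
    oc exec -n logging -c elasticsearch $espod -- \
        curl -s -k --cert /etc/elasticsearch/secret/admin-cert \
             --key /etc/elasticsearch/secret/admin-key \
             https://localhost:9200/_cat/allocation?v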
