Throttling incoming indexing when Lucene merges fall behind #6066

Closed
mikemccand opened this Issue May 6, 2014 · 15 comments

@mikemccand
Contributor

mikemccand commented May 6, 2014

Lucene has low-level protection that blocks incoming segment-producing threads (indexing threads, NRT reopen threads, commit, etc.) when there are too many merges running.

But this is too harsh for Elasticsearch, so it's entirely disabled. This means merges can fall far behind under heavy indexing, which results in too many segments in the index and causes all sorts of problems (slow version lookups, too much RAM, etc.).

So we need to do something "softer"; Simon has a good starting patch, which I tested and confirmed (after https://issues.apache.org/jira/browse/LUCENE-5644 is fixed) prevents too many segments in the index, at least in one use case:

Before Simon's + Lucene's fix: http://people.apache.org/~mikemccand/lucenebench/base.html

Same test with the fix: http://people.apache.org/~mikemccand/lucenebench/throttled.html

Segment counts stay essentially flat.

Here's Simon's prototype patch: s1monw@2de96f9
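
To make the "blocks incoming segment-producing threads" behavior concrete, here is a minimal standalone sketch of that hard stall. It is illustrative only, not Lucene's actual code; the maxMergeCount name is borrowed from ConcurrentMergeScheduler:

```java
// Illustrative sketch of the hard stall described above, not Lucene's implementation:
// segment-producing threads block outright while too many merges are in flight.
public class HardMergeStall {
    private final int maxMergeCount;
    private int mergesInFlight;

    public HardMergeStall(int maxMergeCount) {
        this.maxMergeCount = maxMergeCount;
    }

    /** Called by indexing / NRT reopen / commit threads before producing a segment. */
    public synchronized void maybeStall() throws InterruptedException {
        while (mergesInFlight >= maxMergeCount) {
            wait(); // incoming thread is parked until merges catch up
        }
    }

    public synchronized void mergeStarted() {
        mergesInFlight++;
    }

    public synchronized void mergeFinished() {
        mergesInFlight--;
        notifyAll(); // wake any stalled segment-producing threads
    }
}
```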

@nik9000

Contributor

nik9000 commented May 6, 2014

It looks like Simon's prototype pauses the indexing thread if too many merges are in flight. I'm not 100% clear on the code path that gets here. Will that pause indexing or pause refreshing or both? It'd be neat to slow down just the refreshing and let indexing be slowed down by the refresh backlog logic. Or am I crazy?

@mikemccand mikemccand removed the non-issue label May 6, 2014

@s1monw

Contributor

s1monw commented May 6, 2014

@nik9000 internally the IndexWriter has several thread states (8 by default) that we index into. If we limit to a single thread we only use one of the states, make sure we max out the RAM buffer, and write the smallest number of segments. This means we 1. reduce the number of segments to merge and 2. make sure flushes are only done if really needed. I think we can't slow down refreshes, otherwise folks will see odd results since they don't get new documents. You also want refresh to publish merged segments, to further reduce the number of segments. We will do the right thing and provide backpressure on indexing, not on refresh. Hope that makes sense?
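
A rough sketch of the softer approach Simon describes (illustrative only; the class and method names are made up, this is not the actual patch): while merges are behind, indexing operations are funneled through a single lock so effectively only one IndexWriter thread state is used, and refresh is left alone.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: serialize indexing (one thread state) while merges
// are behind, without touching refresh.
public class SoftIndexThrottle {
    private final int maxNumMerges;
    private final Lock throttleLock = new ReentrantLock();
    private volatile boolean throttling;
    private int numMergesInFlight;

    public SoftIndexThrottle(int maxNumMerges) {
        this.maxNumMerges = maxNumMerges;
    }

    public synchronized void beforeMerge() {
        if (++numMergesInFlight > maxNumMerges) {
            throttling = true; // "now throttling indexing"
        }
    }

    public synchronized void afterMerge() {
        if (--numMergesInFlight <= maxNumMerges) {
            throttling = false; // "stop throttling indexing"
        }
    }

    /** Wrap each index/bulk operation in this; it is a no-op unless throttling is active. */
    public void runThrottled(Runnable indexOp) {
        if (throttling) {
            throttleLock.lock(); // only one indexing thread proceeds at a time
            try {
                indexOp.run();
            } finally {
                throttleLock.unlock();
            }
        } else {
            indexOp.run(); // full concurrency across IndexWriter thread states
        }
    }
}
```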

@nik9000

Contributor

nik9000 commented May 6, 2014

We will do the right thing and provide backpressure on indexing, not on refresh. Hope that makes sense?

I'd honestly forgotten about flushes. It's what I get for only playing on the other side. Anyway, I'm happy so long as backpressure is provided on indexing.

@mikemccand

Contributor

mikemccand commented May 9, 2014

I tested the current throttling branch with the refresh=-1 case, and we have problems because the "abandoned" thread states will never flush until a full flush ... the workaround is that you must issue a refresh to get them flushed.

@mikemccand

Contributor

mikemccand commented May 9, 2014

I'm inclined to simply document that index throttling won't kick in if you use SerialMergeScheduler.

SMS only allows one merge to run at a time, so apps that are doing heavy bulk indexing really should not be using it.
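
For reference, this is roughly what opting into SerialMergeScheduler looks like against the Lucene 4.x API (a sketch only; per the note above, index throttling won't kick in with it, and heavy bulk indexing should stay on the default ConcurrentMergeScheduler):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.SerialMergeScheduler;
import org.apache.lucene.util.Version;

public class SerialMergeSchedulerExample {
    // SerialMergeScheduler runs merges one at a time, so it is a poor fit for
    // heavy bulk indexing; shown here only to make the comment above concrete.
    static IndexWriterConfig serialMergeConfig() {
        IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_48,
                new StandardAnalyzer(Version.LUCENE_48));
        iwc.setMergeScheduler(new SerialMergeScheduler());
        return iwc;
    }
}
```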

@mikemccand

Contributor

mikemccand commented May 13, 2014

OK I reviewed these changes with Simon. We decided we don't need to add a separate "kill switch" for this because you can just set max_merge_count higher to avoid throttling. But we also decided not to document this new setting on the index-modules-merges docs: it's a very advanced setting, and playing with it could easily mess up merges.
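
For anyone who does need the escape hatch mentioned here, raising the merge-count threshold in the index settings is how you effectively disable the throttle. The setting name below is assumed from the 1.x merge scheduler module; since it was deliberately left undocumented, treat it as an advanced, use-at-your-own-risk knob:

```yaml
# Assumed 1.x setting name; intentionally undocumented and easy to misuse.
# A high value means the index throttle effectively never triggers.
index.merge.scheduler.max_merge_count: 100
```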

s1monw added a commit that referenced this issue May 17, 2014

Upgrade to Lucene 4.8.1
This commit upgrades to the latest Lucene 4.8.1 release including the
following bugfixes:

 * An IndexThrottle now kicks in when merges start falling behind
   limiting index threads to 1 until merges caught up. Closes #6066
 * RateLimiter now kicks in at the configured rate where previously
   the limiter was limiting at ~8MB/sec almost all the time. Closes #6018

@s1monw s1monw added the enhancement label May 18, 2014

@s1monw s1monw closed this in 85a0b76 May 19, 2014

s1monw added a commit that referenced this issue May 19, 2014

Upgrade to Lucene 4.8.1

s1monw added a commit that referenced this issue May 19, 2014

Upgrade to Lucene 4.8.1

@l15k4

l15k4 commented Oct 21, 2015

Guys, I don't think this works as expected, I'm getting:

now throttling indexing: numMergesInFlight=4, maxNumMerges=3
stop throttling indexing: numMergesInFlight=2, maxNumMerges=3

5 times a second, right at the beginning of bulk indexing. I'm disabling throttling and the refresh interval, I start with an optimized index and wait until segment merging finishes, but segment merging still falls behind... Why is the indexing throttling starting and stopping so frequently?

@l15k4

l15k4 commented Oct 23, 2015

@clintongormley but they are not keeping up right at the moment I start indexing into a small (1M records) optimized index... increasing the merge thread pool doesn't help...

I have 4 ec2.xlarge instances clustered with 1B records in 30 indices (5 shards each). If I create a new index and start bulk indexing into it, throttling happens right away. All fields are doc_values, and I think it started right after I reindexed everything to doc_values around 1.6.0... I haven't been able to shake it off since then... I've tried everything...

IMHO I need to scale up just because of segment merging, but there will be plenty of unused resources...

I've been trying to solve this issue for months now...

@mikemccand

Contributor

mikemccand commented Oct 24, 2015

@l15k4 did you disable store IO throttling? It defaults to 20 MB/sec, which is too low for heavy indexing cases.

Where are you storing the shards (what IO devices), EBS or local instance storage?

Also try the ideas here: https://www.elastic.co/blog/performance-considerations-elasticsearch-indexing
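
For reference, the store-level IO throttle mentioned here is controlled by node-level settings along these lines in the 1.x series (values are illustrative):

```yaml
# Raise the merge IO limit well above the 20mb/sec default ...
indices.store.throttle.type: merge
indices.store.throttle.max_bytes_per_sec: 100mb
# ... or disable store-level IO throttling entirely:
# indices.store.throttle.type: none
```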

@l15k4

l15k4 commented Oct 25, 2015

@mikemccand I set it to 30, 40, 80, 100 MB/s... it had no effect. I also tried setting index.merge.scheduler.max_thread_count: 6 but that led to throttling with now throttling indexing: numMergesInFlight=9, so it didn't help either...

We use EBS (General Purpose (SSD)) on c4.xlarge instances, 2 volumes, one for the system and one dedicated to ES...

It seems that if you are doing bulk indexing and have all fields as doc_values, then you need either a quad-core machine or a physically attached SSD, or both, otherwise segment merging will always fall behind no matter what optimizations one does...

It always looks this way: it throttles for some period of time, like 15-20 minutes, and then it stops: http://i.imgur.com/UyDTlHi.png

I also tried shrinking the index and bulk threadpools so that segment merging could keep up with bulk indexing, but it didn't help either... it falls behind as soon as the first few bulk index requests come in...

@l15k4

l15k4 commented Oct 25, 2015

The best bulk indexing performance I can get on a machine with 4 hyperthreads and EBS (750 Mbps), with all fields being doc_values, is by increasing index.merge.scheduler.max_thread_count to 6 and decreasing threadpool.bulk.size to 2. That way it logs now throttling indexing only about every 6 seconds, but it is still throttling, so the throughput is now http://i.imgur.com/wXCNZh7.png

I think that after doc_values people don't have much of a choice, they'll need physically attached SSDs...

@mikemccand

Contributor

mikemccand commented Oct 26, 2015

Hmm, enabling doc values is typically a minor indexing performance hit in my experience; e.g. see the nightly benchmarks at https://benchmarks.elastic.co (annotation R on the first chart).

Do you have provisioned IOPS for your EBS mounts? Are you sure you're not running into that limit?

Can you try the local instance SSD, just for comparison? Your EBS is backed by SSD as well, so this would let us take EBS out of the equation. (You'd need to switch to an i2.4xlarge instance for this test.)

@l15k4

l15k4 commented Oct 26, 2015

General Purpose, unfortunately; the price of Provisioned IOPS SSDs surprised us. If you want to go beyond 160 MiB/s to 320 MiB/s, it costs twice as much as the volume itself.

I guess it wouldn't throttle with a Provisioned IOPS SSD at 9000 IOPS to reach those 320 MiB/s... but these machines cost a fortune :-)

@mikemccand

Contributor

mikemccand commented Oct 26, 2015

I guess it wouldn't throttle with a Provisioned IOPS SSD at 9000 IOPS to reach those 320 MiB/s... but these machines cost a fortune :-)

Or just use the local instance attached SSDs on the i2.* instance types ...
