
Let Lucene kick off merges normally #8643

Closed · wants to merge 2 commits
@mikemccand (Contributor) commented Nov 25, 2014

With the last Lucene 5.0.0 snapshot upgrade, I turned off CMS's hard
stalling of incoming threads when merging can't keep up, since ES does
its own index throttling to deal with this.

This means we can simplify a lot of ES's custom wrapper code for
handling segment merges: remove EnableMergeScheduler, MERGE thread
pool, InternalIndexShard.EngineMerger, and the settings
index.merge.force_async_merge and index.merge.async_interval.

I.e., we should now just let Lucene kick off merges as soon as they are
needed, instead of polling every 1.0 second and doing it ourselves.
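A rough Java sketch of the before/after behavior. The class and method names below are illustrative only (they are not the ES classes this PR removes); the Lucene calls shown (IndexWriter.maybeMerge, ConcurrentMergeScheduler.setMaxMergesAndThreads, IndexWriterConfig.setMergeScheduler) are the standard Lucene APIs.

import java.io.IOException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.apache.lucene.index.ConcurrentMergeScheduler;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;

class MergeKickoffSketch {

    // Old 1.x-style behavior: a background task polls on a fixed interval and asks
    // Lucene to check whether any merges are now needed.
    static void startMergePolling(final IndexWriter writer) {
        ScheduledExecutorService poller = Executors.newSingleThreadScheduledExecutor();
        poller.scheduleAtFixedRate(new Runnable() {
            @Override
            public void run() {
                try {
                    writer.maybeMerge(); // ask IndexWriter to kick off any pending merges
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            }
        }, 1, 1, TimeUnit.SECONDS); // index.merge.async_interval defaulted to 1s
    }

    // New behavior (this PR): configure ConcurrentMergeScheduler up front and let
    // IndexWriter launch merges itself as soon as flushes/commits produce new segments.
    static void configureMergeScheduler(IndexWriterConfig iwc) {
        ConcurrentMergeScheduler cms = new ConcurrentMergeScheduler();
        cms.setMaxMergesAndThreads(4, 1); // example limits; ES derives these from settings
        iwc.setMergeScheduler(cms);
    }
}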

@@ -73,17 +72,19 @@
"g",
"i",
"ma",
"m",

@jpountz (Contributor) commented Nov 25, 2014:
Can you also remove it from the docs (docs/reference/cat/thread_pool.asciidoc)?

"su",
"w"
};

static {
assert SUPPORTED_ALIASES.length == SUPPORTED_NAMES.length: "SUPPORTED_NAMES/ALIASES mismatch";

@jpountz (Contributor) commented Nov 25, 2014:
wow, good catch. Would be good to backport this part of the patch to 1.x

@mikemccand (Author, Contributor) commented Nov 25, 2014:
OK will do.

@@ -35,8 +35,8 @@

- match:
$body: |
/^ id \s+ ba \s+ fa \s+ gea \s+ ga \s+ ia \s+ maa \s+ ma \s+ oa \s+ pa \s+ \n
(\S+ \s+ \d+ \s+ \d+ \s+ \d+ \s+ \d+ \s+ \d+ \s+ \d+ \s+ \d+ \s+ \d+ \s+ \d+ \s+ \n)+ $/
/^ id \s+ ba \s+ fa \s+ gea \s+ ga \s+ ia \s+ maa \s+ oa \s+ pa \s+ \n

@jpountz (Contributor) commented Nov 25, 2014:
Why did you need to remove the management thread pool from this test?

@mikemccand (Author, Contributor) commented Nov 25, 2014:
Whoa, that's wrong: I was confused (merge pool is "m" not "ma")! Yet, ma is gone from the response in this test ... I'll dig.

@mikemccand (Author, Contributor) commented Nov 25, 2014:
OK, as confusing as it is, the "ma" in this context I believe is the alias for "merge.active" (and "maa" is the alias for management.active), so I think it's correct to remove "ma" since merge thread pool is now gone. These aliases are confusing, especially when they collide :)
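For illustration, the collision comes from how the active-count column name is derived from the pool alias. This is a hypothetical sketch of the naming arithmetic only, not the actual header-building code in RestThreadPoolAction:

public class ThreadPoolAliasSketch {

    // Column alias for a pool's "active" count: pool alias + "a" suffix.
    static String activeColumn(String poolAlias) {
        return poolAlias + "a";
    }

    public static void main(String[] args) {
        System.out.println(activeColumn("m"));  // "ma"  -> merge.active, removed by this PR
        System.out.println(activeColumn("ma")); // "maa" -> management.active, stays
    }
}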

@jpountz (Contributor) commented Nov 25, 2014:
oh I see, thanks for the explanation

@jpountz (Contributor) commented Nov 25, 2014

+1 I just left minor comments/questions.

@jpountz (Contributor) commented Nov 25, 2014

LGTM

@mikemccand (Author, Contributor) commented Nov 25, 2014

Thanks @jpountz, I'll push to master and backport the "sd" alias fix down to 1.x and 1.4.

mikemccand added a commit that referenced this pull request Nov 25, 2014

Rest: remove old SNAPSHOT_DATA alias (sd)
SNAPSHOT_DATA was removed in #7551 but its alias (sd) was not.

This was fixed in master with #8643.

mikemccand added a commit that referenced this pull request Nov 25, 2014

Rest: remove old SNAPSHOT_DATA alias (sd)
SNAPSHOT_DATA was removed in #7551 but its alias (sd) was not.

This was fixed in master with #8643.

clintongormley removed the review label Mar 19, 2015

kimchy added the >enhancement label Apr 7, 2015

mikemccand added a commit that referenced this pull request Apr 7, 2015

Core: Lucene merges must run on target shard during recovery
This does not affect 2.0, where we let Lucene launch merges normally
(#8643).

In 1.x, every 1 sec (default), we ask Lucene to kick off any new
merges, but we unfortunately don't turn that logic on in the target
shard until after recovery has finished.

This means if you have a large translog, and/or a smallish index
buffer, way too many segments can accumulate in the target shard
during recovery, making version lookups slower and slower (O(N^2))
and possibly causing slow recovery issues like #9226.

This fix changes IndexShard to launch merges as soon as the shard is
created, so merging runs during recovery.

Closes #10463
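A hypothetical sketch of what the 1.x fix amounts to; the real IndexShard wiring differs and all names here are made up. The point is simply that the periodic maybeMerge() task starts when the shard object is created, not only after recovery completes.

import java.io.IOException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.apache.lucene.index.IndexWriter;

class RecoveringShardSketch {
    private final ScheduledExecutorService mergePoller = Executors.newSingleThreadScheduledExecutor();

    RecoveringShardSketch(final IndexWriter writer) {
        // Before the fix this scheduling only happened once recovery finished, so the many
        // small segments created while replaying a large translog were never merged and
        // per-document version lookups had to visit more and more segments (O(N^2) overall).
        mergePoller.scheduleAtFixedRate(new Runnable() {
            @Override
            public void run() {
                try {
                    writer.maybeMerge();
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            }
        }, 1, 1, TimeUnit.SECONDS);
    }
}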

mikemccand added a commit that referenced this pull request Apr 7, 2015

Core: Lucene merges must run on target shard during recovery
This does not affect 2.0, where we let Lucene launch merges normally
(#8643).

In 1.x, every 1 sec (default), we ask Lucene to kick off any new
merges, but we unfortunately don't turn that logic on in the target
shard until after recovery has finished.

This means if you have a large translog, and/or a smallish index
buffer, way too many segments can accumulate in the target shard
during recovery, making version lookups slower and slower (O(N^2))
and possibly causing slow recovery issues like #9226.

This fix changes IndexShard to launch merges as soon as the shard is
created, so merging runs during recovery.

Closes #10463

mikemccand added a commit to mikemccand/elasticsearch that referenced this pull request Apr 11, 2015

Core: Lucene merges must run on target shard during recovery
This does not affect 2.0, where we let Lucene launch merges normally
(elastic#8643).

In 1.x, every 1 sec (default), we ask Lucene to kick off any new
merges, but we unfortunately don't turn that logic on in the target
shard until after recovery has finished.

This means if you have a large translog, and/or a smallish index
buffer, way too many segments can accumulate in the target shard
during recovery, making version lookups slower and slower (O(N^2))
and possibly causing slow recovery issues like elastic#9226.

This fix changes IndexShard to launch merges as soon as the shard is
created, so merging runs during recovery.

Closes elastic#10463

mikemccand added a commit that referenced this pull request Apr 11, 2015

Core: Lucene merges must run on target shard during recovery
This does not affect 2.0, where we let Lucene launch merges normally
(#8643).

In 1.x, every 1 sec (default), we ask Lucene to kick off any new
merges, but we unfortunately don't turn that logic on in the target
shard until after recovery has finished.

This means if you have a large translog, and/or a smallish index
buffer, way too many segments can accumulate in the target shard
during recovery, making version lookups slower and slower (O(N^2))
and possibly causing slow recovery issues like #9226.

This fix changes IndexShard to launch merges as soon as the shard is
created, so merging runs during recovery.

Closes #10463

clintongormley changed the title from "Core: let Lucene kick off merges normally" to "Let Lucene kick off merges normally" Jun 7, 2015

mute pushed a commit to mute/elasticsearch that referenced this pull request Jul 29, 2015

Core: Lucene merges must run on target shard during recovery
This does not affect 2.0, where we let Lucene launch merges normally
(elastic#8643).

In 1.x, every 1 sec (default), we ask Lucene to kick off any new
merges, but we unfortunately don't turn that logic on in the target
shard until after recovery has finished.

This means if you have a large translog, and/or a smallish index
buffer, way too many segments can accumulate in the target shard
during recovery, making version lookups slower and slower (O(N^2))
and possibly causing slow recovery issues like elastic#9226.

This fix changes IndexShard to launch merges as soon as the shard is
created, so merging runs during recovery.

Closes elastic#10463

mute pushed a commit to mute/elasticsearch that referenced this pull request Jul 29, 2015

Rest: remove old SNAPSHOT_DATA alias (sd)
SNAPSHOT_DATA was removed in elastic#7551 but its alias (sd) was not.

This was fixed in master with elastic#8643.

mute pushed a commit to mute/elasticsearch that referenced this pull request Jul 29, 2015

Core: Lucene merges must run on target shard during recovery
This does not affect 2.0, where we let Lucene launch merges normally
(elastic#8643).

In 1.x, every 1 sec (default), we ask Lucene to kick off any new
merges, but we unfortunately don't turn that logic on in the target
shard until after recovery has finished.

This means if you have a large translog, and/or a smallish index
buffer, way too many segments can accumulate in the target shard
during recovery, making version lookups slower and slower (O(N^2))
and possibly causing slow recovery issues like elastic#9226.

This fix changes IndexShard to launch merges as soon as the shard is
created, so merging runs during recovery.

Closes elastic#10463