
Remove indicesLifecycle.Listener from IndexingMemoryController #6892

Closed

Conversation

@bleskes (Contributor) commented Jul 16, 2014

The IndexingMemoryController determines the indexing buffer size and translog buffer size each shard should have. It takes memory from shards that are inactive (indexing-wise) and assigns it to other shards. To do so it needs to know about the addition and closing of shards. The current implementation hooks into the indicesService.indicesLifecycle() mechanism to receive callbacks, such as a shard entering the POST_RECOVERY state. Those callbacks typically run on the thread that actually made the change. A mutex was used to synchronize those callbacks with IndexingMemoryController's background thread, which updates the memory usage of the internal engines at a regular interval. This introduced a dependency between those threads and the locks of the internal engines hosted on the node. In a very rare situation (seen in two local test runs) this can cause recovery timeouts when two nodes are recovering replicas from each other.

This commit introduces a lock-free approach that updates the internal data structures during the background thread's iterations.
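To illustrate the shape of the change, here is a minimal Java sketch of the lock-free pattern described above. It is not the actual Elasticsearch implementation: the String shard identifiers, the `currentShards()` accessor, and the bookkeeping map are simplified stand-ins. The key idea is that the per-shard state is owned exclusively by the single background thread, which reconciles it against the live shard set on every iteration instead of reacting to lifecycle callbacks under a mutex.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Minimal sketch of the lock-free reconciliation pattern; not the actual
// Elasticsearch code. All per-shard state is confined to the single
// background thread, so no mutex is needed.
public class IndexingMemorySketch {

    // Per-shard bookkeeping touched only by the background thread.
    private final Map<String, Long> lastActiveTime = new HashMap<>();

    // Hypothetical stand-in for asking IndicesService which shards the node hosts.
    private Set<String> currentShards() {
        return Set.of("index1[0]", "index1[1]");
    }

    // Invoked on a fixed schedule (e.g. via a ScheduledExecutorService).
    void runIteration() {
        Set<String> live = currentShards();

        // Shards that appeared since the last iteration: this replaces the
        // POST_RECOVERY lifecycle callback.
        for (String shard : live) {
            lastActiveTime.putIfAbsent(shard, System.currentTimeMillis());
        }

        // Shards that went away since the last iteration: this replaces the
        // shard-closed lifecycle callback. retainAll on the key-set view
        // removes the corresponding map entries.
        lastActiveTime.keySet().retainAll(live);

        // ... recompute indexing and translog buffer budgets over
        // lastActiveTime.keySet() ...
    }

    public static void main(String[] args) {
        new IndexingMemorySketch().runIteration();
    }
}
```

Because the bookkeeping is confined to one thread, shard additions and removals can never contend with the engines' internal locks, which removes the cross-thread dependency blamed for the recovery timeouts above.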

@bleskes added the v2.0.0 label and removed the v1.4.0 label Jul 16, 2014
@bleskes (Contributor, Author) commented Jul 17, 2014

Tagged as 1.3

@s1monw (Contributor) commented Jul 17, 2014

@bleskes I think this looks good. I left some comments, though on the commit rather than the PR :/ I like the removal of the listeners and the sync stuff.

@s1monw removed the review label Jul 17, 2014
@bleskes (Contributor, Author) commented Jul 17, 2014

@s1monw I pushed another update

@bleskes added the review label Jul 17, 2014
@s1monw (Contributor) commented Jul 17, 2014

NICE LGTM

@s1monw removed the review label Jul 17, 2014
@bleskes closed this in 38d8e3c Jul 17, 2014
bleskes added a commit that referenced this pull request Jul 17, 2014 (Closes #6892)

bleskes added a commit that referenced this pull request Jul 17, 2014 (Closes #6892)
@bleskes deleted the indexing_memory_controller_decoupling branch Jul 17, 2014 12:45
@clintongormley changed the title from "[Infra] remove indicesLifecycle.Listener from IndexingMemoryController" to "Remove indicesLifecycle.Listener from IndexingMemoryController" Jun 7, 2015
mute pushed a commit to mute/elasticsearch that referenced this pull request Jul 29, 2015 (Closes elastic#6892)