Indexing: Clear versionMap on refresh not flush #6363

Closed
mikemccand opened this issue May 31, 2014 · 1 comment

Comments

@mikemccand
Contributor

Today we clear the versionMap only on flush (Lucene commit), but this is dangerous because the in-memory versionMap can grow very large, e.g. if the translog flush size/ops thresholds were increased from the defaults.
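
A minimal sketch of the idea in plain Java, not the actual engine code (class and method names here are illustrative, and the real fix also has to handle operations that race with an in-flight refresh, e.g. by swapping maps, which this sketch ignores):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative only: per-ID version bookkeeping kept in memory between visibility points.
// The point of this issue is that the right moment to drop it is after a refresh
// (documents become visible in Lucene), not only after a flush (Lucene commit),
// which can be far less frequent.
class VersionMapSketch {

    // uid -> version of documents indexed since the last refresh
    private final Map<String, Long> current = new ConcurrentHashMap<>();

    void putUnderLock(String uid, long version) {
        // recorded at index time so real-time gets and version checks can be answered
        // without waiting for the document to become searchable
        current.put(uid, version);
    }

    Long getUnderLock(String uid) {
        return current.get(uid);
    }

    // Clearing here, on refresh, keeps the map bounded by the refresh interval
    // instead of letting it grow until the next flush.
    void afterRefresh() {
        current.clear();
    }
}
```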

@mikemccand
Contributor Author

Fixed with #6379

@mikemccand mikemccand added bug and removed enhancement labels Jun 4, 2014
@clintongormley clintongormley changed the title Clear versionMap on refresh not flush Indexing: Clear versionMap on refresh not flush Jul 16, 2014
bleskes added a commit to bleskes/elasticsearch that referenced this issue Mar 10, 2015
…g version map size.

To support real time gets, the engine keeps an in-memory map of recently indexed docs and their location in the translog. This is needed until documents become visible in Lucene. With 1.3.0, we improved this map and integrated it tightly with refresh cycles in Lucene in order to keep the memory footprint to a bare minimum. On top of that, if the version map grows above 25% of the index buffer size, we proactively refresh in order to be able to trim the version map back to 0 (see elastic#6363). In the same release, we fixed an issue where an update to the indexing buffer could result in an unwanted exception during recovery (elastic#6667). We solved this by waiting to update the size of the index buffer until the shard was fully recovered. Sadly, these two changes together can have a negative impact on the speed of translog recovery.

During the second phase of recovery we replay all operations that happened on the shard during the first phase of copying files. In parallel, we start indexing new documents into the newly created shard. At some point (phase 3 in the recovery), the translog replay starts to send operations which have already been indexed into the shard. The version map is crucial in being able to quickly detect this and skip the replayed operations without hitting Lucene. Sadly, elastic#6667 (only updating the index memory buffer once the shard is started) means that a recovering shard uses the default 64MB for its index buffer, and thus only 16MB (25%) for the version map. This is much less than the default index buffer size of 10% of machine memory (shared between shards).

Since we no longer flush when updating the memory buffer, we can remove elastic#6667 and update recovering shards as well. Also, we make the version map max size configurable, with the same default of 25% of the current index buffer.
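
A rough sketch of the threshold logic described in the commit message above, again in plain Java with illustrative names (the real engine measures the map's RAM usage and drives its own refresh machinery; the 64MB/16MB numbers match the example given above):

```java
// Illustrative sketch only; names and the way the fraction is supplied are assumptions,
// not the actual Elasticsearch API.
class VersionMapPressureSketch {

    private final long indexBufferBytes;
    private final double versionMapFraction; // default 25% of the index buffer, now configurable

    VersionMapPressureSketch(long indexBufferBytes, double versionMapFraction) {
        this.indexBufferBytes = indexBufferBytes;
        this.versionMapFraction = versionMapFraction;
    }

    // Checked after indexing operations: once the version map outgrows its share of the
    // index buffer, force a refresh so the map can be trimmed back to 0.
    boolean shouldTriggerRefresh(long versionMapBytesUsed) {
        return versionMapBytesUsed > (long) (indexBufferBytes * versionMapFraction);
    }

    public static void main(String[] args) {
        // With the 64MB default index buffer mentioned above, 25% leaves ~16MB
        // for the version map before a refresh is forced.
        VersionMapPressureSketch p = new VersionMapPressureSketch(64L * 1024 * 1024, 0.25);
        System.out.println(p.shouldTriggerRefresh(20L * 1024 * 1024)); // true: 20MB > 16MB
    }
}
```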
bleskes added a commit to bleskes/elasticsearch that referenced this issue Mar 15, 2015
…g version map size.

Closes elastic#10046
bleskes added a commit to bleskes/elasticsearch that referenced this issue Mar 15, 2015
…g version map size.

Closes elastic#10046