…indexing
The TermsEnums used for lookup have highish cost to init, so if we
reuse them we may be able to stop using bloom filters. I ran some bulk
update performance tests, showing that turning off blooms and reusing
the enums gets close to the same performance as master (using blooms
and not reusing the enums).
Closes elastic#6212
Reusing Lucene's TermsEnum for _uid/version lookups gives a small
indexing (updates) speedup and brings us closer to not having
to spend RAM on bloom filters.
Closes #6212
jpountz changed the title from "Versions.loadDocIdAndVersion should reuse TermsEnums" to "Versioning: Versions.loadDocIdAndVersion should reuse TermsEnums" on Jun 19, 2014
clintongormley changed the title from "Versioning: Versions.loadDocIdAndVersion should reuse TermsEnums" to "Indexing: Versions.loadDocIdAndVersion should reuse TermsEnums" on Jul 16, 2014
Today, every version lookup creates a new TermsEnum for each segment in the index. This is quite costly: on NIOFSDir, for example, it must clone the IO buffer, and BlockTreeTermsReader carries a lot of internal state.
We'd need a ThreadLocal somewhere/somehow... I have a start at a utility class here: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene5675/lucene/test-framework/src/java/org/apache/lucene/index/PerThreadPKLookup.java. Maybe we can adapt/use this.
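To make the ThreadLocal idea concrete, here is a rough sketch of the reuse pattern. It does not use the Lucene API: `Segment`, `SegmentEnum` and `PerThreadVersionLookup` are hypothetical stand-ins invented for illustration (`SegmentEnum` plays the role of a costly-to-create TermsEnum), and checking later segments first is an assumption of the sketch, not something taken from `PerThreadPKLookup`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for one index segment: maps a uid to its version. */
final class Segment {
    final Map<String, Long> versions = new TreeMap<>();
}

/** Hypothetical stand-in for a TermsEnum that is expensive to create. */
final class SegmentEnum {
    /** Counts constructions so the reuse can be observed. */
    static final AtomicInteger CREATED = new AtomicInteger();
    private final Segment segment;

    SegmentEnum(Segment segment) {
        this.segment = segment;
        CREATED.incrementAndGet();
    }

    /** Returns the version stored for uid in this segment, or null if absent. */
    Long seekExact(String uid) {
        return segment.versions.get(uid);
    }
}

/** Creates one enum per segment per thread, then reuses them on every lookup. */
final class PerThreadVersionLookup {
    private final ThreadLocal<List<SegmentEnum>> enums;

    PerThreadVersionLookup(List<Segment> segments) {
        // Lazily builds the per-thread enums the first time a thread looks something up.
        this.enums = ThreadLocal.withInitial(() -> {
            List<SegmentEnum> list = new ArrayList<>();
            for (Segment s : segments) {
                list.add(new SegmentEnum(s));
            }
            return list;
        });
    }

    /** Returns the version for uid, or null if no segment contains it. */
    Long lookup(String uid) {
        List<SegmentEnum> perThread = enums.get();
        // Assumption of this sketch: later segments hold newer versions.
        for (int i = perThread.size() - 1; i >= 0; i--) {
            Long version = perThread.get(i).seekExact(uid);
            if (version != null) {
                return version;
            }
        }
        return null;
    }
}
```

Each indexing thread pays the creation cost once and then only seeks. A real implementation would presumably also need to discard the cached enums whenever the underlying reader is reopened or closed, which this sketch leaves out.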