Description
In order to offer users near-realtime search, without incurring
an indexing performance penalty, we can implement search on
IndexWriter's RAM buffer. This is the buffer that is filled in
RAM as documents are indexed. Currently the RAM buffer is
flushed to the underlying directory (usually disk) before being
made searchable.
Today's Lucene-based NRT systems must incur the cost of merging
segments, which can slow indexing.
Michael Busch has good suggestions regarding how to handle deletes using max doc ids.
https://issues.apache.org/jira/browse/LUCENE-2293?focusedCommentId=12841923&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12841923
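To make the max doc id idea concrete, here is a minimal sketch of point-in-time visibility over an append-only RAM buffer. It is only one possible reading of the approach and is not taken from the linked comment: the class MaxDocSnapshotSketch, its Snapshot type, and all method names are hypothetical and are not Lucene APIs. A snapshot records the buffer's max doc id when it is opened, so documents added later stay invisible to it. The sketch also copies the deleted-docs bit set per snapshot for simplicity, which a real implementation would probably want to avoid.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical stand-in for an in-RAM indexing buffer; not a Lucene class.
public class MaxDocSnapshotSketch {
    // Append-only document store: a doc id is the position in this list.
    private final List<String> docs = new ArrayList<>();
    // Bit i is set once document i has been deleted.
    private final BitSet deleted = new BitSet();

    /** Appends a document and returns its doc id. */
    public synchronized int addDocument(String doc) {
        docs.add(doc);
        return docs.size() - 1;
    }

    /** Marks an existing document as deleted. */
    public synchronized void delete(int docId) {
        if (docId < 0 || docId >= docs.size()) {
            throw new IllegalArgumentException("unknown doc id: " + docId);
        }
        deleted.set(docId);
    }

    /** Opens a point-in-time view bounded by the current max doc id. */
    public synchronized Snapshot openSnapshot() {
        // Assumption: copying the bit set is acceptable for a sketch.
        return new Snapshot(docs.size(), (BitSet) deleted.clone());
    }

    /** Immutable view: sees only doc ids below maxDoc that were live at open time. */
    public static final class Snapshot {
        private final int maxDoc;
        private final BitSet deletedAtOpen;

        private Snapshot(int maxDoc, BitSet deletedAtOpen) {
            this.maxDoc = maxDoc;
            this.deletedAtOpen = deletedAtOpen;
        }

        public boolean isVisible(int docId) {
            return docId >= 0 && docId < maxDoc && !deletedAtOpen.get(docId);
        }

        public int numDocs() {
            return maxDoc - deletedAtOpen.cardinality();
        }
    }
}
```

A snapshot opened before a later add or delete keeps reporting the older state, while a snapshot opened afterwards sees both changes.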
The area that isn't fully fleshed out is the terms dictionary,
which needs to be sorted before queries execute. Currently
IndexWriter stores terms in a specialized hash table, which is
unsorted. Michael B has a suggestion here:
https://issues.apache.org/jira/browse/LUCENE-2293?focusedCommentId=12841915&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12841915
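The tension between a hash table for indexing and a sorted dictionary for querying can be illustrated with a small sketch. This is not the suggestion from the linked comment and does not use IndexWriter's actual data structures: the class SortedTermsSketch and its methods are hypothetical. It assumes the simplest possible strategy, which is to keep a plain hash map for writes and lazily build a sorted copy of the term list the first time a query needs one after a new term arrives.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not how IndexWriter stores terms.
public class SortedTermsSketch {
    // Unsorted term -> doc ids map, cheap to update while indexing.
    private final Map<String, List<Integer>> postings = new HashMap<>();
    // Cached sorted view of the terms; null whenever a new term made it stale.
    private List<String> sortedTerms;

    /** Records that a term occurs in a document. */
    public synchronized void add(String term, int docId) {
        List<Integer> docIds = postings.get(term);
        if (docIds == null) {
            docIds = new ArrayList<>();
            postings.put(term, docIds);
            sortedTerms = null; // a new term invalidates the sorted view
        }
        docIds.add(docId);
    }

    /** Returns an immutable sorted copy of the terms, rebuilt only when stale. */
    public synchronized List<String> sortedTerms() {
        if (sortedTerms == null) {
            List<String> copy = new ArrayList<>(postings.keySet());
            Collections.sort(copy);
            sortedTerms = Collections.unmodifiableList(copy);
        }
        return sortedTerms;
    }

    /** Prefix lookup: binary search for the start, then scan while the prefix matches. */
    public List<String> termsWithPrefix(String prefix) {
        List<String> terms = sortedTerms();
        int pos = Collections.binarySearch(terms, prefix);
        int start = pos >= 0 ? pos : -(pos + 1);
        List<String> matches = new ArrayList<>();
        for (int i = start; i < terms.size() && terms.get(i).startsWith(prefix); i++) {
            matches.add(terms.get(i));
        }
        return matches;
    }
}
```

Sorting a copy leaves the hash table untouched, so indexing can continue, but the cost is a full re-sort after every batch of new terms. That cost is the part a real design would need to reduce.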
Migrated from LUCENE-2312 by Jason Rutherglen, updated Sep 09 2011
Attachments: LUCENE-2312.patch (versions: 3), LUCENE-2312-FC.patch
Linked issues:
- Per thread DocumentsWriters that write their own private segments [LUCENE-2324] #3400
- Concurrent byte and int block implementations [LUCENE-2575] #3649
- Add non-desctructive sort to BytesRefHash [LUCENE-3199] #4272
- Enable replace-able field caches [LUCENE-3399] #4472
- Explore other in-memory postinglist formats for realtime search [LUCENE-2346] #3422
- CASSANDRA-2915