split

Name

index-split - Splits an index into multiple equal parts.

Synopsis

lucene index split <OUTPUT_DIRECTORY> <INPUT_DIRECTORY>[ <INPUT_DIRECTORY_2>...] [-n|--number-of-parts <NUMBER>] [-s|--sequential] [?|-h|--help]

Description

Splits the input index into multiple parts of roughly equal size. The tool uses IndexWriter.AddIndexes(IndexReader[]), reading from the input index with deletes artificially applied to every document ID that falls outside the partition currently being written.

The deletes are applied only to a buffered list of deleted documents and do not affect the source index. This tool also works with read-only indexes.

The disadvantage of this tool is that the source index must be read once for every part created, and these multiple passes may be slow.
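The multi-pass approach described above can be modeled in a few lines of Python. This is only a conceptual sketch, not the tool's implementation: the index is stood in for by a plain list, and `split_by_masking` and the `keep` rule are hypothetical names introduced for illustration.

```python
def split_by_masking(docs, num_parts, keep):
    """Build each part in its own pass over `docs`, masking out other parts' documents."""
    parts = []
    for part in range(num_parts):  # one full pass over the source per part
        # The "deletes" live only in this in-memory set; `docs` is never modified.
        deleted = {doc_id for doc_id in range(len(docs)) if not keep(doc_id, part)}
        parts.append([doc for doc_id, doc in enumerate(docs) if doc_id not in deleted])
    return parts


source = ["a", "b", "c", "d", "e"]
# Hypothetical selection rule: document ID modulo the number of parts.
parts = split_by_masking(source, 2, lambda doc_id, part: doc_id % 2 == part)
print(parts)   # [['a', 'c', 'e'], ['b', 'd']]
print(source)  # ['a', 'b', 'c', 'd', 'e'] (unchanged)
```

The outer loop is where the cost comes from: every part requires another complete scan of the source.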

Note

This tool is unaware of documents added atomically via IndexWriter.AddDocuments(IEnumerable<IEnumerable<IIndexableField>>, Analyzer) or IndexWriter.UpdateDocuments(Term, IEnumerable<IEnumerable<IIndexableField>>, Analyzer), which means it can easily break up such document groups.
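To see why such groups break, consider a block of documents that was added together and therefore holds adjacent document IDs. The sketch below assumes the round-robin mode assigns a document to part `doc_id % num_parts`; that rule and the helper `parts_touched` are illustrative assumptions, not the tool's API.

```python
def parts_touched(block_doc_ids, num_parts):
    """Return the set of round-robin parts that a block of doc IDs lands in."""
    return {doc_id % num_parts for doc_id in block_doc_ids}


# A hypothetical block of three documents added together at IDs 4, 5 and 6.
block = [4, 5, 6]
print(parts_touched(block, 2))  # {0, 1}: the block is spread over both parts
```

A sequential split keeps most blocks intact, but a block that straddles a range boundary would still be divided.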

Arguments

OUTPUT_DIRECTORY

Path to the output directory that will contain the partial indexes.

INPUT_DIRECTORY, INPUT_DIRECTORY_2

The path of the source index, which can have deletions and can have multiple segments (or multiple readers). Multiple values can be supplied, separated by spaces.

Options

?|-h|--help

Prints a short help message for the command.

-n|--number-of-parts <NUMBER>

The number of parts (output indices) to produce. If omitted, defaults to 2.

-s|--sequential

Sequential doc-id range split (default is round-robin).
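The difference between the two modes can be sketched as follows. This is an illustration under stated assumptions rather than the tool's code: round-robin is modeled as `doc_id % num_parts`, the sequential mode is modeled as contiguous ranges of `max_doc // num_parts` documents with the last part absorbing any remainder, and all function names are hypothetical.

```python
def round_robin_part(doc_id, num_parts):
    """Round-robin: consecutive doc IDs cycle through the parts."""
    return doc_id % num_parts


def sequential_part(doc_id, max_doc, num_parts):
    """Sequential: contiguous ranges; the last part takes the remainder (assumed)."""
    part_len = max_doc // num_parts
    if part_len == 0:  # fewer documents than parts: everything goes to the last part
        return num_parts - 1
    return min(doc_id // part_len, num_parts - 1)


def assign(max_doc, num_parts, sequential=False):
    """Group the doc IDs 0..max_doc-1 by the part they would be written to."""
    parts = [[] for _ in range(num_parts)]
    for doc_id in range(max_doc):
        if sequential:
            part = sequential_part(doc_id, max_doc, num_parts)
        else:
            part = round_robin_part(doc_id, num_parts)
        parts[part].append(doc_id)
    return parts


print(assign(10, 3))                   # [[0, 3, 6, 9], [1, 4, 7], [2, 5, 8]]
print(assign(10, 3, sequential=True))  # [[0, 1, 2], [3, 4, 5], [6, 7, 8, 9]]
```

In both modes the parts differ in size by a few documents when the document count is not a multiple of the number of parts.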

Examples

Split the index located at X:\old-index\ sequentially, placing the resulting 2 indices into the X:\new-index\ directory:

lucene index split X:\new-index X:\old-index --sequential

Split the index located at T:\in\ into 4 parts and place them into the T:\out\ directory:

lucene index split T:\out T:\in -n 4