Zulia 5.2.0

Latest

github-actions released this 23 Jun 16:50

4ace5e9

New Features

Doc-values skip index, enabled by default for new fields and new indexes. Zulia now writes a Lucene doc-values skip index for new fields, which lets range queries, doc-values sorts, and field-exists checks skip whole blocks of non-matching documents instead of scanning every value. This speeds up these operations on larger indexes. A field can opt out by setting the new docValueSkipIndex field configuration option to false. The built-in document-id sort field also gains the skip index in newly created indexes. Existing indexes and fields keep working unchanged but do not gain the optimization: Lucene treats the skip index as part of the immutable field schema, so it cannot be toggled on data that is already written. To get the benefit on existing data, reindex it into a new index.
Two new embedding models. Added BAAI/bge-large-en-v1.5 and Alibaba-NLP/gte-large-en-v1.5 to the set of known embedding models available for vector creation.
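The block-skipping idea behind the doc-values skip index can be illustrated with a small toy model. This is a sketch only, not Lucene's or Zulia's implementation: the `Block` class, `build_blocks`, `range_query`, and the block size of 100 are all hypothetical names and values chosen for illustration. Each block stores the minimum and maximum of its values, so a range query can discard a whole block without reading the values inside it.

```python
from dataclasses import dataclass


@dataclass
class Block:
    start: int        # doc id of the first value in this block
    values: list      # per-document values stored in the block
    min_value: int    # smallest value in the block (the "skip" metadata)
    max_value: int    # largest value in the block


def build_blocks(values, block_size):
    """Split per-document values into fixed-size blocks with min/max metadata."""
    blocks = []
    for start in range(0, len(values), block_size):
        chunk = values[start:start + block_size]
        blocks.append(Block(start, chunk, min(chunk), max(chunk)))
    return blocks


def range_query(blocks, lo, hi):
    """Return (matching doc ids, number of values actually inspected)."""
    matches = []
    inspected = 0
    for block in blocks:
        # If the block's [min, max] range cannot overlap [lo, hi], skip it whole.
        if block.max_value < lo or block.min_value > hi:
            continue
        for offset, value in enumerate(block.values):
            inspected += 1
            if lo <= value <= hi:
                matches.append(block.start + offset)
    return matches, inspected


# Hypothetical data: 1000 documents whose value equals their doc id.
values = list(range(1000))
blocks = build_blocks(values, block_size=100)
matches, inspected = range_query(blocks, 250, 260)
print(len(matches), inspected)  # 11 matches found after inspecting 100 of 1000 values
```

In this toy the data is perfectly ordered, so only one block overlaps the query range. With values spread randomly across documents, more blocks would overlap the range and fewer could be skipped.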

Bug Fixes

Concurrent indexing of high-cardinality facets could assign duplicate ordinals to a facet value. The taxonomy writer cache batches the reader refresh that lets evicted categories be rediscovered instead of re-added. During the initial cache fill no refresh had happened yet, so a category evicted before the first batched refresh was missing from both the cache and the reader and received a second ordinal, splitting that value's facet counts. The cache now forces a refresh on the first eviction in each dimension, closing this cold-start gap.
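The cold-start gap described above can be reproduced in a simplified, single-threaded toy model. Everything here is an assumption made for illustration: `ToyTaxonomy`, its fields, the cache size of 2, and the refresh batch of 100 evictions are hypothetical and do not reflect Zulia's or Lucene's actual classes. The model covers a single dimension and only the cold-start sequence; how the real cache handles evictions between later batched refreshes is not described in these notes and is not modelled.

```python
from collections import OrderedDict


class ToyTaxonomy:
    """Toy LRU category cache backed by a reader snapshot that refreshes in batches."""

    def __init__(self, cache_size, refresh_every, refresh_on_first_eviction):
        self.cache = OrderedDict()  # category -> ordinal, in LRU order
        self.written = {}           # category -> every ordinal the writer assigned
        self.reader = {}            # reader snapshot: category -> ordinal
        self.cache_size = cache_size
        self.refresh_every = refresh_every
        self.refresh_on_first_eviction = refresh_on_first_eviction
        self.evictions = 0
        self.next_ordinal = 0

    def _refresh_reader(self):
        # Make everything written so far visible to the reader.
        self.reader = {cat: ords[0] for cat, ords in self.written.items()}

    def _put(self, cat, ordinal):
        self.cache[cat] = ordinal
        self.cache.move_to_end(cat)
        if len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)  # evict the least recently used entry
            self.evictions += 1
            first = self.evictions == 1 and self.refresh_on_first_eviction
            if first or self.evictions % self.refresh_every == 0:
                self._refresh_reader()

    def add_category(self, cat):
        if cat in self.cache:
            self.cache.move_to_end(cat)
            return self.cache[cat]
        if cat in self.reader:
            ordinal = self.reader[cat]      # rediscovered through the reader
        else:
            ordinal = self.next_ordinal     # looks new, so it gets a fresh ordinal
            self.next_ordinal += 1
            self.written.setdefault(cat, []).append(ordinal)
        self._put(cat, ordinal)
        return ordinal

    def duplicates(self):
        return {cat: ords for cat, ords in self.written.items() if len(ords) > 1}


def run(refresh_on_first_eviction):
    taxonomy = ToyTaxonomy(cache_size=2, refresh_every=100,
                           refresh_on_first_eviction=refresh_on_first_eviction)
    for cat in ["a", "b", "c", "a"]:  # "a" is evicted before any refresh has happened
        taxonomy.add_category(cat)
    return taxonomy.duplicates()


buggy = run(refresh_on_first_eviction=False)
fixed = run(refresh_on_first_eviction=True)
print(buggy)  # {'a': [0, 3]}
print(fixed)  # {}
```

Without the forced refresh, "a" is absent from both the cache and the reader snapshot when it is seen again, so it receives a second ordinal. With the refresh on the first eviction, the reader already knows "a" and the original ordinal is reused.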
