Skip to content

Consolidate IVF and HNSW vector indexes into segment columns.psf (opt-in via storeInSegmentFile)#18852

Open
raghavyadav01 wants to merge 17 commits into
apache:masterfrom
raghavyadav01:vector-readers-pinot-buffer
Open

Consolidate IVF and HNSW vector indexes into segment columns.psf (opt-in via storeInSegmentFile)#18852
raghavyadav01 wants to merge 17 commits into
apache:masterfrom
raghavyadav01:vector-readers-pinot-buffer

Conversation

@raghavyadav01

@raghavyadav01 raghavyadav01 commented Jun 25, 2026

Copy link
Copy Markdown
Collaborator

Motivation

Today a vector index is written as a per-column sidecar in the segment directory: a flat
.vector.ivfflat.index / .vector.ivfpq.index file for IVF, or a Lucene HNSW directory for
HNSW (plus the lazily-built .vector.hnsw.mapping file). Every vector column adds extra
inodes per segment, and on large clusters with many vector columns the file count balloons.

The text index already solved this by packing its Lucene directory into columns.psf (the
segment's single-file index buffer) via the V2→V3 converter. This PR adds the same pattern
to IVF and HNSW vector indexes — opt-in via a new storeInSegmentFile flag on
VectorIndexConfig, default false so behaviour is unchanged.

End-to-end verification (against MinIO via an S3-tier offline build, same vector data
indexed twice — once with the flag, once without):

Form Files per segment on S3 Cold-mmap query HNSW prune (numDocsScanned/total)
Sidecar (storeInSegmentFile=false) 9 ~22 ms 10 / 400
Consolidated (storeInSegmentFile=true) 4 ~6 ms 10 / 400

The 5 Lucene HNSW files (_0.cfe, _0.cfs, _0.si, segments_1, write.lock) and the
mapping file are gone in the consolidated form — those bytes are a typed entry inside
columns.psf (verified via index_map: embedding.vector_index.startOffset=48, size=12034).

What changes

IVF (flat file)

  • Readers consume PinotDataBuffer. IvfFlatVectorIndexReader, IvfOnDiskVectorIndexReader,
    and IvfPqVectorIndexReader now read from a PinotDataBuffer slice. New IvfCombinedBuffers
    helper opens the buffer for both the consolidated path (slice owned by the segment directory)
    and the legacy path (mmap of the sidecar). Readers honour an ownsBuffer flag so borrowed
    slices (columns.psf views) are not closed by the reader's close().
  • Build path. IvfFlatVectorIndexCreator, IvfPqVectorIndexCreator write a transient
    .combined.index file when the flag is on. The V2→V3 converter picks it up via the standard
    copyIndexIfExists loop and packs it into columns.psf.

HNSW (Lucene directory)

  • Build-time consolidation via new HnswVectorIndexCombined: packs the Lucene HNSW
    directory's files plus the docId mapping (when present) into a single
    .vector.hnsw.combined.index file using the same LUCENE_V2 layout the text index uses.
    Reuses LuceneCombinedTextIndexConstants — no duplicate format definition.
  • Read path via new HnswVectorIndexBufferReader: parses the combined buffer and returns
    a Lucene Directory built on PinotBufferLuceneDirectory + PinotBufferIndexInput (reused
    from the text-index package — KnnVectorsReader uses the same Directory interface, so no
    HNSW-specific Directory adapter is needed).
  • HnswVectorIndexReader gains a PinotDataBuffer-based constructor. The DocIdTranslator
    reads its mapping from inside the combined buffer when available, or rebuilds it in heap
    when not.

Shared

  • Bidirectional migration in VectorIndexHandler: forward (sidecar → columns.psf) and
    reverse (columns.psf → sidecar) on a flag toggle, without rebuilding the index. For HNSW
    the absorb step first packs the Lucene directory into a transient combined file; the extract
    step unpacks the combined entry back into a Lucene directory.
  • Crash recovery. absorbCombinedIntoColumnsPsf detects a previously-crashed absorb via
    hasIndexFor + size check on the typed entry — no exception-message matching. Size mismatch
    refuses with IOException rather than guessing which copy is authoritative.
  • Orphan cleanup. A crash with storeInSegmentFile=true followed by a retry with the
    flag off leaves a stale combined file + .inprogress marker; the next build sweeps both
    before starting. Symmetric in the other direction. Works for both IVF (flat orphan) and HNSW
    (legacy Lucene directory orphan, recursive delete).
  • HNSW extract ordering. Unpack into Lucene directory runs BEFORE removeIndex. A crash
    mid-unpack leaves the typed entry in columns.psf and the next handler run retries — no
    permanent loss.
  • Mixed-state convert. If a V1/V2 segment somehow has both a legacy sidecar and a
    combined file for the same column, the V2→V3 converter packs the combined file and drops
    the legacy sibling — combined wins.

SPI / utility plumbing

  • V1Constants.Indexes gains VECTOR_IVF_FLAT_COMBINED_INDEX_FILE_EXTENSION,
    VECTOR_IVF_PQ_COMBINED_INDEX_FILE_EXTENSION, VECTOR_HNSW_COMBINED_INDEX_FILE_EXTENSION.
  • SegmentDirectoryPaths.findVectorIndexIndexFile probes both legacy and combined
    extensions for IVF and HNSW. Legacy wins on tie (stable behaviour for unchanged segments).
  • VectorIndexUtils.hasCombinedFormVectorIndex is a new predicate the converter uses;
    hasVectorIndex deliberately excludes the combined form so the converter knows to pack it
    rather than preserve it as a sibling.
  • SingleFileIndexDirectory symmetry fixes in hasIndexFor, getColumnsWithIndex,
    removeIndex — vector entries are now treated uniformly whether they live as a sidecar or
    as a _columnEntries slot.

Backward compatibility

  • Default behaviour unchanged. storeInSegmentFile=false preserves the legacy sidecar /
    Lucene-directory layout exactly. Existing segments load and serve identically.
  • No segment-format or wire-protocol break. The .combined.index extensions are build-time
    transient and never appear in a V3 segment after the converter runs.
  • Mixed-version safe: a server running this code reads a segment built without the flag
    exactly as before; the new code only fires when the flag is set in the table config.

Tests

  • 132 tests green (130 unit + 2 integration). New classes:
    • IvfFlatConsolidatedVectorTest — end-to-end integration: build with flag=true, load,
      run vectorSimilarity query, including a MAP column to exercise the
      S3A read path.
    • IvfDiskFormatInspectionTest — byte-level header dumps validate creator output
      matches the documented IVF format unchanged in both extensions.
    • IvfReaderFailurePathTest — mmap-count tracking proves borrowed buffers are not
      closed by the reader (ownsBuffer=false) and that constructor failures still release
      owned buffers.
    • HnswVectorIndexCombinedTest — pack/unpack round-trip plus an end-to-end HNSW build
      read via the buffer-backed Directory.
    • VectorIndexHandlerTest — bidirectional migration for both IVF and HNSW,
      crash-recovery, size-mismatch refusal, extract-refusal-when-sidecar-exists, orphan
      sweep in both directions.
  • Spotless / checkstyle / license clean.

Related

The buffer-backed Lucene Directory plumbing (PinotBufferLuceneDirectory,
PinotBufferIndexInput, LuceneCombinedTextIndexConstants) is reused as-is from the
text-index implementation — this PR adds zero new code in those generic helpers.

Why this matters for tiered storage

End-to-end verification on an S3-tier setup (MinIO; default preload-enabled mode) showed
that the EXPLAIN PLAN for an HNSW vector query differs between the two forms:

Form EXPLAIN PLAN filter operator HNSW used?
storeInSegmentFile=true (this PR, consolidated) VECTOR_SIMILARITY_INDEX (ANN_TOP_K) Yes
storeInSegmentFile=false (legacy sidecar) VECTOR_SIMILARITY_EXACT_SCAN (fallbackReason:vector_index_missing) No — exact-scan fallback

The sidecar Lucene HNSW directory is uploaded to S3 by the tier writer
(recursive segment-directory copy), but in tiered mode the legacy
HnswVectorIndexReader(File segmentDir, ...) cannot locate it: its segmentDir derives from
the segment directory's getPath(), which on S3-backed segment directories returns the S3
URI path component rather than a local filesystem path. The reader gracefully degrades to
exact scan and the system keeps returning correct results, just without the HNSW index
acceleration. With this PR's consolidated form, the bytes live as a typed entry inside
columns.psf and are reachable via segmentReader.getIndexFor(...), which works in tiered
mode without special directory materialization.

@codecov-commenter

codecov-commenter commented Jun 25, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 68.07229% with 212 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.82%. Comparing base (19897c5) to head (df17108).

Files with missing lines Patch % Lines
...index/loader/invertedindex/VectorIndexHandler.java 60.41% 57 Missing and 19 partials ⚠️
...nt/index/readers/vector/HnswVectorIndexReader.java 8.57% 30 Missing and 2 partials ⚠️
.../impl/vector/lucene99/HnswVectorIndexCombined.java 86.95% 9 Missing and 12 partials ⚠️
...nt/local/segment/index/vector/VectorIndexType.java 45.71% 16 Missing and 3 partials ⚠️
...local/segment/index/vector/IvfCombinedBuffers.java 25.00% 8 Missing and 10 partials ⚠️
...ndex/converter/SegmentV1V2ToV3FormatConverter.java 50.00% 5 Missing and 5 partials ⚠️
.../segment/local/segment/store/VectorIndexUtils.java 83.33% 4 Missing and 4 partials ⚠️
...dex/readers/vector/IvfOnDiskVectorIndexReader.java 84.61% 3 Missing and 3 partials ⚠️
...t/index/readers/vector/IvfPqVectorIndexReader.java 71.42% 1 Missing and 3 partials ⚠️
...t/local/segment/index/vector/IvfPqIndexFormat.java 77.77% 3 Missing and 1 partial ⚠️
... and 5 more
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #18852      +/-   ##
============================================
+ Coverage     64.80%   64.82%   +0.02%     
  Complexity     1347     1347              
============================================
  Files          3392     3395       +3     
  Lines        211769   212286     +517     
  Branches      33345    33453     +108     
============================================
+ Hits         137232   137610     +378     
- Misses        63451    63534      +83     
- Partials      11086    11142      +56     
Flag Coverage Δ
custom-integration1 100.00% <ø> (ø)
integration 100.00% <ø> (ø)
integration1 100.00% <ø> (ø)
integration2 0.00% <ø> (ø)
java-21 64.82% <68.07%> (+0.02%) ⬆️
temurin 64.82% <68.07%> (+0.02%) ⬆️
unittests 64.82% <68.07%> (+0.02%) ⬆️
unittests1 56.77% <4.66%> (-0.18%) ⬇️
unittests2 37.26% <66.71%> (+0.08%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@xiangfu0 xiangfu0 added vector Related to vector similarity search index Related to indexing (general) labels Jun 25, 2026
@xiangfu0 xiangfu0 requested review from Copilot and xiangfu0 June 25, 2026 23:56

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds an opt-in (VectorIndexConfig.storeInSegmentFile, default false) path to consolidate IVF and HNSW vector index artifacts into the V3 segment combined index file (columns.psf) as typed entries, reducing per-segment file count and improving tiered-storage readability.

Changes:

  • Introduces combined-form vector index extensions and probing logic for IVF/HNSW, plus converter/cleanup support for transient combined artifacts.
  • Refactors IVF readers to consume PinotDataBuffer (owned vs borrowed) and adds shared buffer-mapping utilities; adds HNSW pack/unpack + buffer-backed Lucene Directory reader.
  • Extends VectorIndexHandler and SingleFileIndexDirectory behaviors to support bidirectional migration and unified cleanup for vector index forms.

Reviewed changes

Copilot reviewed 34 out of 34 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/V1Constants.java Adds combined-form vector index extensions.
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/store/SegmentDirectoryPaths.java Finds legacy vs combined IVF/HNSW artifacts.
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/creator/VectorIndexConfig.java Adds storeInSegmentFile config plumbing.
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/store/VectorIndexUtils.java Combined-form detection + cleanup + extension selection.
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/store/SingleFileIndexDirectory.java Treats vector cleanup/visibility more uniformly.
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/vector/VectorIndexType.java Chooses reader path based on consolidation flag.
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/vector/IvfFlatVectorIndexCreator.java Writes IVF_FLAT to legacy vs combined extension.
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/vector/IvfPqVectorIndexCreator.java Writes IVF_PQ to legacy vs combined extension.
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/vector/IvfPqIndexFormat.java Adds buffer-backed read path for IVF_PQ.
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/vector/IvfCombinedBuffers.java Centralizes IVF sidecar mapping + close helper.
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/readers/vector/IvfFlatVectorIndexReader.java Buffer-backed IVF_FLAT reader + ownership handling.
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/readers/vector/IvfPqVectorIndexReader.java Buffer-backed IVF_PQ reader + ownership handling.
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/readers/vector/IvfOnDiskVectorIndexReader.java Switches IVF_ON_DISK to positional buffer reads.
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/readers/vector/HnswVectorIndexReader.java Adds buffer-backed HNSW reader constructor + mapping modes.
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/readers/vector/HnswVectorIndexBufferReader.java Builds Lucene Directory from combined buffer.
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/loader/invertedindex/VectorIndexHandler.java Bidirectional migration + crash/orphan handling.
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/converter/SegmentV1V2ToV3FormatConverter.java Packs combined-form vector artifacts during conversion.
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/vector/HnswVectorIndexCreator.java Packs Lucene dir into combined file on close.
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/vector/lucene99/HnswVectorIndexCombined.java Implements HNSW combine/extract (LUCENE_V2).
pinot-segment-local/src/test/java/org/apache/pinot/segment/local/segment/store/VectorIndexUtilsTest.java Tests combined-form detection + extension selection.
pinot-segment-local/src/test/java/org/apache/pinot/segment/local/segment/store/SegmentDirectoryPathsTest.java Tests lookup behavior for combined/legacy.
pinot-segment-local/src/test/java/org/apache/pinot/segment/local/segment/index/vector/VectorSearchBenchmark.java Updates benchmark to use buffer-mapped IVF reader.
pinot-segment-local/src/test/java/org/apache/pinot/segment/local/segment/index/vector/IvfFlatVectorIndexTest.java Updates tests to use buffer-backed IVF reader.
pinot-segment-local/src/test/java/org/apache/pinot/segment/local/segment/index/vector/IvfPqVectorIndexTest.java Updates tests to use buffer-backed IVF_PQ reader.
pinot-segment-local/src/test/java/org/apache/pinot/segment/local/segment/index/readers/vector/IvfFlatFilterAwareTest.java Updates filter-aware tests to buffer-backed reader.
pinot-segment-local/src/test/java/org/apache/pinot/segment/local/segment/index/readers/vector/IvfOnDiskFilterAwareTest.java Updates filter-aware tests to buffer-backed reader.
pinot-segment-local/src/test/java/org/apache/pinot/segment/local/segment/index/loader/invertedindex/VectorIndexHandlerTest.java Adds migration/crash/orphan coverage for handler.
pinot-segment-local/src/test/java/org/apache/pinot/segment/local/segment/index/vector/IvfReaderFailurePathTest.java New tests for ownership/leak behavior on failure/borrow.
pinot-segment-local/src/test/java/org/apache/pinot/segment/local/segment/index/vector/IvfDiskFormatInspectionTest.java New byte-level format inspection tests.
pinot-segment-local/src/test/java/org/apache/pinot/segment/local/segment/creator/impl/vector/lucene99/HnswVectorIndexCombinedTest.java New HNSW pack/unpack + buffer-backed Directory tests.
pinot-perf/src/main/java/org/apache/pinot/perf/BenchmarkVectorIndex.java Uses IvfCombinedBuffers for IVF reader opens.
pinot-perf/src/main/java/org/apache/pinot/perf/BenchmarkVectorFilterWorkloads.java Uses buffer-mapped IVF readers in perf workloads.
pinot-perf/src/main/java/org/apache/pinot/perf/BenchmarkVectorFeatureWorkloads.java Uses buffer-mapped IVF readers in perf workloads.
pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/custom/IvfFlatConsolidatedVectorTest.java New e2e test for consolidated IVF_FLAT path.

@xiangfu0 xiangfu0 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found one high-signal issue; see inline comment.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 35 out of 35 changed files in this pull request and generated 7 comments.

@xiangfu0 xiangfu0 force-pushed the vector-readers-pinot-buffer branch from e80f4bb to d7b7bf1 Compare June 29, 2026 23:04

@xiangfu0 xiangfu0 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found a few high-signal issues; see inline comments.

raghavyadav01 and others added 13 commits July 4, 2026 14:36
IVF readers previously took a File indexDir and resolved the sidecar
themselves. This change converts all three readers (IVF_FLAT, IVF_PQ,
IVF_ON_DISK) to accept a PinotDataBuffer at construction time so the
storage tier (local mmap, future tiered storage, future external-table
sidecar slice) lives entirely behind the buffer.

Highlights:
- Reader signature: (column, File dir, VectorIndexConfig) ->
  (column, PinotDataBuffer buffer, VectorIndexConfig).
- IvfPqIndexFormat.read() gains a PinotDataBuffer overload that shares
  parsing with the legacy file path via a private readFromStream.
- IvfOnDiskVectorIndexReader drops RandomAccessFile + FileChannel in
  favour of positional PinotDataBuffer.getInt/getFloat/getLong/copyTo.
- New IvfSidecarBuffers helper centralises sidecar resolution,
  BIG_ENDIAN mmap, and a closeQuietly used by all three readers for
  constructor-failure cleanup.
- VectorIndexType.ReaderFactory routes through IvfSidecarBuffers; HNSW
  unchanged (continues using its Lucene sidecar directory).

On-disk format unchanged. Verified byte-by-byte by a new
IvfDiskFormatInspectionTest that builds segments via the unchanged
creators, dumps the raw bytes, and asserts the layout matches the
documented headers (magic 0x49564646 BE for IVF_FLAT/ON_DISK,
'IVPQ' + LE version 2 for IVF_PQ).

Resource safety: reader constructors take buffer ownership and close
on failure via IvfSidecarBuffers.closeQuietly so the mmap doesn't leak
when the caller never sees a reader. New IvfReaderFailurePathTest
proves this by tracking PinotDataBuffer.getMmapBufferCount() across
success and failure paths for all three readers.

Tests: 75 pre-existing IVF tests still pass, plus 3 new disk-format
inspection tests and 6 new failure-path tests. Broader
pinot-segment-local suite (4778 tests) also passes.
Builds on the previous reader-on-PinotBuffer commit. Adds end-to-end
consolidation: when a column declares storeInSegmentFile=true in its
VectorIndexConfig.properties, the IVF payload is absorbed into the
segment's combined index file (columns.psf) as a typed entry instead
of remaining as a sibling sidecar file. Mirrors the text-index
storeInSegmentFile pattern.

Flow:

- Offline segment build still writes the IVF sidecar via the existing
  creator path (no change to the creator).
- On first server load, VectorIndexHandler detects sidecar + flag-on
  + V3 segment, calls LoaderUtils.writeIndexToV3Format to copy the
  bytes into columns.psf, and deletes the sibling.
- Subsequent reads go through VectorIndexType.ReaderFactory, which
  branches on the flag: ON reads via segmentReader.getIndexFor(col,
  StandardIndexes.vector()); OFF reads the legacy sidecar mmap.
- Flag defaults to false; existing segments keep their sidecar
  layout. Active migration of segments built before the flag was
  enabled is deferred to a follow-up.

Changes:

- VectorIndexConfig: STORE_IN_SEGMENT_FILE property key + @JsonIgnore
  isStoreInSegmentFile() accessor. The flag is carried inside the
  properties map; the accessor is intentionally not a JSON property
  so the on-wire shape of VectorIndexConfig is unchanged.
- VectorIndexType.ReaderFactory: flag-aware dispatch; HNSW unchanged.
- IvfFlat/IvfPq/IvfOnDiskVectorIndexReader: new (col, buffer, config,
  boolean ownsBuffer) constructor overload. close() releases the
  buffer only when the reader owns it; borrowed buffers (segment
  directory owned in the consolidated path) outlive the reader.
  Constructor-failure cleanup symmetric.
- VectorIndexHandler: needUpdateIndices detects sidecar+flag mismatch
  on V3 segments. updateIndices absorbs sidecar -> columns.psf
  without rebuilding the index when only the layout changed. The
  freshly-built-index path consolidates when the flag is on. Shared
  absorb helper includes crash-recovery: if a previous absorb run
  committed bytes into columns.psf but died before deleting the
  sidecar, the duplicate-key error is caught and the orphan sidecar
  is cleaned up instead of failing the segment load.
- SingleFileIndexDirectory: fix three symmetry bugs in the vector
  handling — hasIndexFor, getColumnsWithIndex, and removeIndex now
  treat vector as 'sidecar OR _columnEntries', not just sidecar.

Tests:

- VectorIndexHandlerTest: 4 new tests covering sidecar+flag-on
  trigger, sidecar+flag-off no-op, the absorb path not calling
  removeIndex, and the crash-recovery path that cleans the orphan
  sidecar without re-creating the consolidated entry.
- IvfReaderFailurePathTest: 4 new tests proving borrowed buffers
  (ownsBuffer=false) are not closed by the reader on success or on
  constructor failure, asserted via
  PinotDataBuffer.getMmapBufferCount().
- IvfFlatConsolidatedVectorTest: new cluster integration test that
  declares storeInSegmentFile=true, builds a real segment, and runs
  a vectorSimilarity query asserting ordered ANN results — proves
  end-to-end (write -> consolidate -> read -> query) on a live
  cluster.
Phase 2 already handled the sidecar -> columns.psf direction (any flag
OFF -> ON transition triggers absorb at the next load). This commit adds
the reverse: columns.psf -> sidecar when the flag flips from ON to OFF.

Without this, flipping the flag back to OFF leaves the segment with the
consolidated entry only. The reader factory (with flag off) tries to
find a sidecar that doesn't exist, returns null, and the vector index
becomes silently invisible to queries — a foot-gun for operators rolling
back the flag.

Detection (VectorIndexHandler.needUpdateIndices):

- Sidecar -> columns.psf: existing sidecar + flag on (already in place;
  comment updated to reflect that it covers both new-build and flag-flip
  cases).
- columns.psf -> sidecar: existingBackend == null (sidecar absent) AND
  hasIvfSidecar == false AND column reported by getColumnsWithIndex (so
  bytes live inside _columnEntries) AND flag off.

Extraction (VectorIndexHandler.extractConsolidatedIntoSidecar):

1. Read the consolidated buffer via segmentWriter.getIndexFor.
2. Stream bytes (1 MiB chunks) to a temp file with a non-recognised
   extension (.vector.extract-tmp) so that the subsequent removeIndex's
   sidecar cleanup does not accidentally delete the bytes we just wrote.
3. Call segmentWriter.removeIndex(col, vector()) to drop the columns.psf
   entry.
4. Rename the temp file to the final IVF sidecar path. Falls back to
   copy+delete if rename fails (cross-FS in tests).

Tests:

- testNeedUpdateIndicesReturnsTrueWhenConsolidatedExistsButFlagWantsSidecar:
  asserts the handler detects the consolidated+flag-off mismatch.
- testUpdateIndicesExtractsConsolidatedBytesIntoSidecar: end-to-end
  execution of the extract path. Asserts the sidecar exists, holds the
  exact bytes from the mocked consolidated buffer, removeIndex was called,
  and the temp file was cleaned up.
Builds on the prior commits. Adds a parallel storage path that mirrors
how text-index lands inside columns.psf at offline-build time, so
segments built with storeInSegmentFile=true arrive on servers already
consolidated and don't have to be migrated at first load.

Pattern (same as text):

- IVF creator decides which on-disk file format to write based on the
  flag — legacy single-file sidecar (.vector.ivfflat.index /
  .vector.ivfpq.index) when off, combined-form single file
  (.vector.ivfflat.combined.index / .vector.ivfpq.combined.index) when
  on. Bytes are byte-identical in both forms; only the file name
  differs, and the name acts as the signal for the converter.
- SegmentV1V2ToV3FormatConverter mirrors its existing text branch for
  vector: if VectorIndexUtils.hasVectorIndex returns true (legacy
  sidecar / HNSW directory present) it skips the standard copy and
  leaves the file as a v3 sibling (copyVectorIndexIfExists handles it).
  If hasVectorIndex returns false but a combined-form file exists,
  v2DataReader.getIndexFor finds it via VectorIndexType.getFileExtensions
  and the standard copyIndexIfExists loop packs the bytes into
  columns.psf as a typed entry.
- VectorIndexHandler's reload-time absorb/extract paths from the
  previous commit remain in place as a fallback for flag toggles
  without rebuild and for segments built before the flag was added.

Files:

- V1Constants.Indexes: add VECTOR_IVF_FLAT_COMBINED_INDEX_FILE_EXTENSION
  and VECTOR_IVF_PQ_COMBINED_INDEX_FILE_EXTENSION constants.
- VectorIndexType.getFileExtensions: include the new combined
  extensions so FilePerIndexDirectory.getFileFor can mmap them.
- VectorIndexUtils: getIndexFileExtension(backend, combined) overload;
  cleanupVectorIndex also removes combined-form transient files;
  hasVectorIndex stays legacy-only (mirrors text's hasTextIndex which
  only detects the legacy directory form) and is now public so the
  converter can call it.
- IvfFlatVectorIndexCreator / IvfPqVectorIndexCreator: thread
  storeInSegmentFile from VectorIndexConfig and pick the output
  extension at seal() time.
- SegmentV1V2ToV3FormatConverter: replace the vector-exclusion at line
  156 with the text-pattern branch (skip standard copy when legacy
  present; pack via standard copy when only combined present).
- VectorIndexHandler: createVectorIndexForColumn picks the right
  extension via VectorIndexUtils.getIndexFileExtension(backend,
  writeCombined); absorbSidecarIntoColumnsPsf prefers the combined
  file (just-written by the creator) and falls back to legacy;
  hasIvfSidecar detects either form so flag-on migration triggers
  whichever the writer left behind.

Tests:

- IvfDiskFormatInspectionTest: 2 new tests
  (testIvfFlatCreatorWritesCombinedExtensionWhenFlagOn and the IVF_PQ
  analog) lock in that flag-on creators emit the combined extension
  and that the header bytes are byte-identical to the legacy form.
- 111 pre-existing IVF + handler + SingleFileIndexDirectory tests
  still pass — no behavior regression for the flag-off path.

Total: 113/113 targeted tests green. HNSW consolidation still
deferred (its consolidation path is a separate refactor that mirrors
LuceneTextIndexCreator.combineAndCleanupTextIndexFiles).
User-requested terminology change. The helper class that opens a
single-file IVF index from disk as a PinotDataBuffer is now
IvfCombinedBuffers (was IvfSidecarBuffers). Methods renamed to match:

- mapSidecar / mapSidecarFile → mapCombined / mapCombinedFile
- hasIvfSidecar → hasIvfCombinedFile
- absorbSidecarIntoColumnsPsf → absorbCombinedIntoColumnsPsf
- extractConsolidatedIntoSidecar → extractConsolidatedToLegacyFile
  (renamed for accuracy — the method writes back to the legacy
  on-disk extension when the flag flips from on to off)

Comments and string literals were updated case-by-case: identifiers
referring to the combined-form file format use "combined"; prose
that previously read "sidecar file" is preserved where it described
the concept of a single-file index on disk alongside columns.psf.

No functional change. 113/113 targeted tests still pass. Spotless,
checkstyle, and license check all clean.
…e convert, crash-recovery

Cross-cutting fixes after audit of the storeInSegmentFile path. All five address edges
that only surface on crash, manual edit, or a flag toggle; the happy path is unchanged.

- SegmentDirectoryPaths.findVectorIndexIndexFile now also probes the .combined.index
  extension. Previously a segment built with flag=true that had not yet been absorbed
  into columns.psf was invisible to the reader factory on a flag rollback, so ANN
  queries silently degraded to brute-force scan.
- V2->V3 converter resolves mixed state (both legacy and combined files present)
  deterministically: combined wins, legacy sibling is dropped. Previously the combined
  bytes were silently lost.
- VectorIndexHandler.createVectorIndexForColumn now sweeps orphans from the OTHER
  extension before starting. A crash with flag=true followed by a retry with flag=off
  used to leak a partial .combined.index + .inprogress marker that a later flag flip
  could pick up as if it were a complete index. Extracted to a package-private static
  helper for direct unit testing.
- absorbCombinedIntoColumnsPsf now detects the previously-crashed-absorb state via
  hasIndexFor + size check on the typed entry, instead of string-matching the
  duplicate-key RuntimeException message. Size mismatch now refuses with IOException
  instead of guessing which copy is authoritative.
- extractConsolidatedToLegacyFile now refuses (IOException) if a sidecar already
  exists for either extension. removeIndex would otherwise destroy it via
  cleanupVectorIndex's side effect.

New tests (12): VectorIndexHandlerTest covers the new crash-recovery hasIndexFor path,
the size-mismatch refusal, the extract-refusal-when-sidecar-exists path, and the
orphan sweep in both directions plus an HNSW no-op assertion. VectorIndexUtilsTest
covers hasVectorIndex excluding the combined form and the new hasCombinedFormVectorIndex
predicate. SegmentDirectoryPathsTest covers combined-form lookup when only combined
exists and the legacy-wins rule in mixed state.

118 unit tests + 2 integration tests green. Spotless / checkstyle / license clean.
Mirrors the text-index pattern: pack the Lucene HNSW directory into a single
.vector.hnsw.combined.index file using the same LUCENE_V2 layout, then absorb
into columns.psf via the V2->V3 converter. At read time the bytes are exposed
as a Lucene Directory backed by a PinotDataBuffer slice, so KnnVectorsReader
works against the consolidated entry without re-extracting to disk.

- HnswVectorIndexCombined: packs and (for the extract path) unpacks the Lucene
  HNSW directory using LuceneCombinedTextIndexConstants. Includes the docId
  mapping file when present so the consolidated form is self-contained.
- HnswVectorIndexBufferReader: parses the combined buffer and returns a
  Directory built on PinotBufferLuceneDirectory + PinotBufferIndexInput
  (reused from the text-index package). Also exposes the mapping bytes via
  extractDocIdMappingBuffer.
- HnswVectorIndexCreator gains a storeInSegmentFile flag; close() packs the
  Lucene directory into the combined file and deletes the original directory
  when the flag is on.
- HnswVectorIndexReader gains a PinotDataBuffer-based constructor that opens
  the buffer-backed Directory and builds the DocIdTranslator from the mapping
  bytes embedded in the combined buffer (or rebuilds in heap if absent). The
  buffer-backed reader honours the borrowed-buffer contract IVF readers use
  (close() does not close the buffer).
- VectorIndexType.ReaderFactory branches on isStoreInSegmentFile for HNSW
  too, mirroring the IVF branch already in place.
- VectorIndexHandler removes the HNSW short-circuits in needUpdateIndices,
  updateIndices, absorb, extract, and cleanOrphansFromOtherExtension. Absorb
  for HNSW first packs the legacy Lucene directory into a transient combined
  file, then absorbs and deletes it (and the directory). Extract for HNSW
  unpacks the combined entry into a Lucene directory before removeIndex runs.

Tests: HnswVectorIndexCombinedTest covers pack/unpack round-trip plus a real
HNSW build read via the buffer-backed Directory. VectorIndexHandlerTest adds
HNSW-side migration and orphan-cleanup tests. VectorIndexUtilsTest and
SegmentDirectoryPathsTest cover the HNSW combined extension lookup.

125 unit tests green, IvfFlatConsolidatedVectorTest (2 tests) still green.
Spotless / checkstyle / license clean.
Two pre-PR hardening fixes:

extractHnswConsolidatedToDirectory now unpacks the combined buffer back into a
Lucene directory BEFORE removing the consolidated entry. The previous order
left a crash window in which removeIndex succeeded but the unpack threw — the
typed entry was gone and the Lucene directory did not yet exist, so the
segment permanently lost its HNSW index until a full rebuild. With the new
order a crash mid-unpack leaves the typed entry in columns.psf, the next load
retries the extract, and no data is lost. The new code also best-effort
cleans up a half-unpacked directory on failure so the retry's defensive
pre-flight check still fires.

If removeIndex throws AFTER unpack succeeded, the segment ends up with both
forms on disk. The reader factory's file-backed path (flag=false) uses the
Lucene directory and ignores the orphan typed entry until a subsequent
handler run cleans it up — no data loss, no behaviour regression.

Also drops the unused createdTransient local in absorbHnswIntoColumnsPsf
(set but never read; cleanup of hnswDir is already conditioned on
hnswDir.exists()).

130 unit tests still green, spotless/checkstyle/license clean.
…// docs

Resolve the SingleFileIndexDirectory.hasIndexFor contract concern from review
without breaking the generic load path:

- hasIndexFor(col, vector()) stays "legacy sidecar OR typed entry": the generic
  PhysicalColumnIndexContainer load gate relies on it, so a legacy sidecar must
  report true or it would never build a reader. Callers that specifically need
  to know whether the payload is already packed into columns.psf now probe the
  typed entry directly via VectorIndexUtils.getConsolidatedVectorEntry():
  - VectorIndexHandler absorb crash-recovery (IVF + HNSW): a normal first absorb
    of an existing legacy segment no longer mis-detects "already absorbed" and
    throws on getIndexFor().size(). Regression test added.
  - VectorIndexType reader factory: a not-yet-migrated HNSW segment falls back
    to the legacy Lucene directory instead of failing the whole segment load.
- SegmentV1V2ToV3FormatConverter: pack the combined form into columns.psf only
  when the resolved on-disk artifact is a regular file. A legacy HNSW Lucene
  *directory* (which getIndexFor resolves before the combined file) is deferred
  to the sibling copy to avoid a directory-as-buffer mapping failure, while a
  legacy IVF *file* still packs so the index is never lost in a mixed state.
- HnswVectorIndexCombined: open the combined output with TRUNCATE_EXISTING.
- VectorIndexConfig: document that storeInSegmentFile covers IVF and HNSW.

Convert the touched/new vector-index Javadoc to /// markdown doc comments.
extractDocIdMappingBuffer returned a view that inherited the big-endian order of
the enclosing columns.psf buffer, but the docId mapping is packed verbatim from
the little-endian file-backed sidecar that DocIdTranslator writes/reads. getInt()
therefore byte-swapped every Lucene->Pinot doc id for consolidated HNSW segments
whenever the mapping was packed, corrupting ANN result translation and pre-filter
matching. Force a LITTLE_ENDIAN view. Adds a regression test that packs a known
mapping and reads it back through a big-endian columns.psf-style buffer.
- VectorIndexType: drop {@code} tags from a // line comment (they only render in Javadoc)
- HnswVectorIndexReader: drop redundant self-qualified DocIdTranslator reference
- VectorIndexHandler: remove no-op deleteQuietly(combinedFile) in the HNSW absorb path
  (writeIndexToV3Format already force-deletes the source) and note the IVF/HNSW asymmetry
Sibling of IvfFlatConsolidatedVectorTest for the HNSW backend, which had no
coverage on the consolidated read path. Exercises the buffer-backed
HnswVectorIndexReader (Lucene directory rebuilt from the packed columns.psf
entry) and the little-endian docId mapping consumed by DocIdTranslator, asserting
a vectorSimilarity ANN query returns L2-ascending top-K candidates.
xiangfu0 added 4 commits July 4, 2026 14:36
- HNSW extract (columns.psf -> Lucene directory): unpack into a temp
  directory and move it into place AFTER removeIndex, so the
  cleanupVectorIndex sibling sweep run by removeIndex can no longer delete
  the freshly-unpacked legacy .vector.v912.hnsw.index directory.
- IVF reader factory: fall back to the on-disk sidecar when
  storeInSegmentFile=true but no consolidated columns.psf entry exists yet
  (mirrors the HNSW fallback) instead of returning null and forcing an
  exact scan; derive buffer ownership from the buffer source rather than
  the config flag, fixing a latent sidecar mmap leak.
- HnswVectorIndexCombined.extractHnswIndexFiles: throw on a zero-progress
  transferTo instead of spinning forever on a corrupt/truncated file.
- getConsolidatedVectorEntry: map only the shared "index not found" signal
  to null and rethrow any other RuntimeException so corruption is not
  masked as "absent". Hoist the message prefix into a single shared
  constant in SingleFileIndexDirectory, referenced by FilePerIndexDirectory
  and VectorIndexUtils so the produce/match sides cannot drift.
- HnswVectorIndexReader buffer-backed ctor: log the throwable, not just
  its message.
- Tests: HNSW extract directory-preservation, truncated-combined-file
  guard, IVF sidecar fallback, getConsolidatedVectorEntry null/rethrow,
  and the cleanupVectorIndex temp-name invariant.
…ardening

- sniffVectorPayloadBackend / detectConsolidatedVectorBackend classify a
  consolidated columns.psf payload by its leading magic bytes (LUCENE_V2 /
  IVFF / IVPQ), so backend drift can be detected for columns whose payload
  no longer has an on-disk sidecar.
- storageFormatOf normalizes IVF_ON_DISK to IVF_FLAT: the two backends
  share one file format (only the reader differs), so drift comparisons
  must compare storage formats or a healthy IVF_ON_DISK column would
  report perpetual drift and be destroyed and rebuilt on every load.
- getConsolidatedVectorEntry now maps FilePerIndexDirectory's
  "not a regular file" IllegalArgumentException (a still-present legacy
  Lucene DIRECTORY resolved as the vector artifact on V1/V2 segments) to
  "no consolidated entry" instead of failing the caller's segment load;
  the message suffix is a shared constant so producer and matcher cannot
  drift. Contract Javadoc rewritten to state the per-store behavior
  honestly (V1/V2 may return an IVF sidecar's bytes; callers gate on
  version or on-disk artifacts).
A column whose vector payload lives solely in columns.psf has no sidecar
for detectVectorIndexBackend to see, so existingBackend == null and a
config backend flip slipped past the drift check: the extract path would
stream the OLD backend's bytes out under the NEW backend's extension
(e.g. HNSW pack bytes as .vector.ivfpq.index) and the next load parsed
them with the wrong reader.

needUpdateIndices and updateIndices now share getDriftedConsolidatedBackend,
which sniffs the typed entry's magic (V3 only) and reports a rebuild when
the payload's storage format differs from the configured backend's. All
four drift comparisons (sidecar and consolidated, check and act) compare
via storageFormatOf so a valid IVF_ON_DISK column (whose payload sniffs
as the shared IVF_FLAT format) is not flagged as drifted and rebuilt on
every segment load.
With storeInSegmentFile=true, the HNSW reader factory probed the
consolidated columns.psf entry first. On a V1/V2 segment backed by
FilePerIndexDirectory, getIndexFor resolves the still-present legacy
Lucene DIRECTORY and mapForReads rejects it (not a regular file), killing
the segment load before the legacy fallback could run — making the flag
non-rolling-upgrade-safe for existing HNSW segments.

The factory now checks the on-disk artifact first: if it is a Lucene
directory (unmigrated V3, or any V1/V2 segment), read it directly and
skip the probe entirely. The fallback warn now describes the actual
state (neither form found) instead of promising a directory read that
cannot succeed. Tests cover both the V1 root layout and the v3/
subdirectory layout, asserting the probe never runs.
@xiangfu0 xiangfu0 force-pushed the vector-readers-pinot-buffer branch from 4cef636 to df17108 Compare July 4, 2026 21:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

index Related to indexing (general) vector Related to vector similarity search

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants