Eliminate float[] and ByteBuffer allocations in compaction inline-record path by eolivelli · Pull Request #16 · eolivelli/jvector

eolivelli · 2026-05-22T11:42:10Z

Summary

Profiling a HerdDB indexing workload showed CompactWriter.writeInlineNodeRecord accounting for 24% of alloc-event bytes:

| Site | % of total alloc | Cause |
| --- | --- | --- |
| `writeInlineNodeRecord:222` | 21.4% | `pq.encodeTo` → `VectorUtil.sub(vector, globalCentroid)` → fresh `ArrayVectorFloat` (`float[]`) per neighbor |
| `writeInlineNodeRecord:248` | 2.4% | `bwriter.cloneBuffer()` → `ByteBuffer.allocate(recordSize)` per record |

ArrayVectorFloat.<init> was the single largest allocator in the whole profile (20.6% self) and fully attributable to the PQ encode call inside compaction.

This PR removes both allocations on this hot path.

Changes

- `ProductQuantization.encodeTo(vector, scratch, dest)`: new 3-arg overload that, when a global centroid is configured, uses `VectorUtil.subInto` to write the centered vector into a caller-supplied scratch buffer instead of allocating a fresh one. The existing 2-arg overload keeps its allocating behavior (it is the `VectorCompressor.encodeTo` implementation, so all other callers stay unchanged).
- `OnDiskGraphIndexCompactor.processBaseNode`: passes `scratch.tmpVec` as the scratch buffer. `tmpVec` is per-worker, dimension-sized, and already dead by the time `retainDiverse` returns, so no extra allocation is needed.
- `CompactWriter.setInlineChannel(FileChannel)` plus direct positional writes: the level-0 `FileChannel` is plumbed into `CompactWriter`, and `writeInlineNodeRecord` writes the per-thread record buffer directly to disk via `FileChannel.write(ByteBuffer, long)`. This positional API is thread-safe and lets us drop `ByteBufferIndexWriter.cloneBuffer` and the `WriteResult.data` field. The level-0 consumer in `compactLevels` is now a no-op, since records hit disk inside the worker.
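The scratch-buffer idea behind the new overload can be sketched roughly as follows. This is a minimal illustration on plain `float[]` arrays, not the jvector code: `ScratchSubtractSketch` and its two methods are hypothetical stand-ins that only borrow the names `sub` and `subInto` from `VectorUtil`, and the actual signatures in the library may differ.

```java
import java.util.Arrays;

// Hypothetical sketch: contrasts an allocating subtract with a scratch-buffer subtract.
final class ScratchSubtractSketch {

    /** Allocating form: returns a fresh float[] on every call. */
    static float[] sub(float[] a, float[] b) {
        float[] out = new float[a.length];
        subInto(a, b, out);
        return out;
    }

    /** Scratch form: writes a[i] - b[i] into dest and allocates nothing. */
    static void subInto(float[] a, float[] b, float[] dest) {
        if (a.length != b.length || dest.length < a.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        for (int i = 0; i < a.length; i++) {
            dest[i] = a[i] - b[i];
        }
    }

    public static void main(String[] args) {
        float[] centroid = {1f, 2f, 3f};
        float[][] neighbors = {{2f, 4f, 6f}, {1f, 2f, 3f}};

        // Allocated once (per worker, in the PR's terms) and reused for every neighbor.
        float[] scratch = new float[centroid.length];
        for (float[] v : neighbors) {
            subInto(v, centroid, scratch);
            System.out.println(Arrays.toString(scratch));
        }
        // prints [1.0, 2.0, 3.0] then [0.0, 0.0, 0.0]
    }
}
```

Reuse like this is only safe when nothing else still reads the scratch buffer, which is why the description stresses that `tmpVec` is already dead once `retainDiverse` returns.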

The on-disk byte layout at level 0 is unchanged, so existing compacted indexes remain readable.
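For readers unfamiliar with positional writes, the sketch below shows one way a reused record buffer can be written straight to a `FileChannel` at an explicit offset. It is an illustration under assumptions, not the `CompactWriter` code: the class and method names, the fixed 8-byte record, and the `ordinal * recordSize` offset are all hypothetical. Only `FileChannel.write(ByteBuffer, long)` is the API named in the description.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch: one reusable buffer, records written at explicit file offsets.
final class PositionalWriteSketch {

    /** Writes all remaining bytes of record at offset; the channel's own position is not touched. */
    static void writeRecord(FileChannel ch, ByteBuffer record, long offset) throws IOException {
        long pos = offset;
        while (record.hasRemaining()) {
            pos += ch.write(record, pos); // a single call may write fewer bytes than remain
        }
    }

    /** Writes two fixed-size records out of order and returns the resulting file bytes. */
    static byte[] demo() {
        try {
            Path tmp = Files.createTempFile("inline-records", ".bin");
            int recordSize = 8; // assumed layout: two ints per record
            try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.WRITE)) {
                ByteBuffer buf = ByteBuffer.allocate(recordSize); // allocated once, reused per record
                for (int ordinal : new int[] {1, 0}) {
                    buf.clear();
                    buf.putInt(ordinal).putInt(ordinal * 10);
                    buf.flip();
                    writeRecord(ch, buf, (long) ordinal * recordSize);
                }
            }
            byte[] bytes = Files.readAllBytes(tmp);
            Files.delete(tmp);
            return bytes;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ByteBuffer view = ByteBuffer.wrap(demo());
        System.out.println(view.capacity() + " bytes, record 1 = " + view.getInt(8) + ", " + view.getInt(12));
    }
}
```

Because each record's offset is derived from its position in the file rather than from write order, workers can write in any order and still produce the same bytes, which fits the statement that the level-0 layout is unchanged.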

Test plan

- `mvn -pl jvector-tests -am test`: 275 tests pass, 0 failures / 0 errors (2 pre-existing skipped).
- `mvn -pl jvector-tests -am -Dtest='io.github.jbellis.jvector.graph.disk.*Test*,TestProductQuantization,TestPQRetrainer*' test`: focused on the touched code paths (42 tests pass).
- Re-profile the HerdDB indexing workload and confirm `ArrayVectorFloat.<init>` drops to near zero in the compactor stack and `HeapByteBuffer.<init>` drops the 2.3% attributable to `cloneBuffer`.

Generated with Claude Code

…ord path

Allocation profiling of a HerdDB indexing run showed CompactWriter.writeInlineNodeRecord accounting for 24% of alloc-event bytes: a per-neighbor float[] inside ProductQuantization.encodeTo (21%) and a per-record ByteBuffer.allocate via ByteBufferIndexWriter.cloneBuffer (2.4%).

This change removes both:

- Add ProductQuantization.encodeTo(vector, scratch, dest) that uses VectorUtil.subInto into a caller-provided buffer when a global centroid is configured. The existing 2-arg overload keeps its allocating behavior. The compactor passes Scratch.tmpVec (per-worker, dimension sized, already dead after retainDiverse returns) as the scratch.
- Plumb the level-0 FileChannel into CompactWriter via setInlineChannel and write the per-thread record buffer directly to disk from the worker. Drops ByteBufferIndexWriter.cloneBuffer and the WriteResult.data field; the consumer in compactLevels is now a no-op since records hit disk inside the worker. FileChannel.write(ByteBuffer, long) is positional and thread-safe.

275 jvector-tests pass; the level-0 byte layout is unchanged so existing compacted indexes remain readable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

eolivelli merged commit f9b0085 into main May 22, 2026
3 of 10 checks passed

eolivelli merged 1 commit into main from zero-alloc-write-inline-node-record
