GH-3516: Optimize DeltaByteArrayWriter and DeltaLengthByteArrayValuesWriter (+33-55% encodeDeltaByteArray) #3517
Open
iemejia wants to merge 1 commit into apache:master from
…luesWriter

Two related changes in the DELTA_BYTE_ARRAY write path:

1. DeltaLengthByteArrayValuesWriter: drop the unused LittleEndianDataOutputStream wrapper. Binary.writeTo(arrayOut) works directly with the underlying CapacityByteArrayOutputStream; the LE wrapper added an extra layer of dispatch on every value but never used any LE functionality (writeInt/writeLong/etc.). Add a new writeBytes(byte[], int, int) overload so callers that already have the raw bytes can avoid allocating a Binary wrapper.

2. DeltaByteArrayWriter: tighten suffixWriter field type to DeltaLengthByteArrayValuesWriter (it's always constructed as one) so the new writeBytes(byte[], int, int) overload is callable. Replace the suffix call with the raw-bytes overload, eliminating the per-value Binary.slice() allocation.

Benchmark results (BinaryEncodingBenchmark.encodeDeltaByteArray and encodeDeltaLengthByteArray, added in apache#3512):
- encodeDeltaByteArray (LOW cardinality, len=10): +33% to +55%
- encodeDeltaLengthByteArray (LOW card, len=10): +18% to +21%
- long-string cases: flat (per-value alloc amortized away)

No public API change. No file format change.

Validation: parquet-column 573 tests pass. Built with -Dspotless.check.skip=true -Drat.skip=true -Djapicmp.skip=true.
This was referenced Apr 21, 2026
Summary
Resolves #3516.
Two related changes in the DELTA_BYTE_ARRAY write path:

1. DeltaLengthByteArrayValuesWriter: drop the unused LittleEndianDataOutputStream wrapper

The class wrapped its CapacityByteArrayOutputStream with a LittleEndianDataOutputStream that was only used by Binary.writeTo(). That added an extra layer of dispatch on every value without using any LE-specific functionality (writeInt/writeLong/etc.). Binary.writeTo(arrayOut) works directly with the underlying stream.

Also adds a new writeBytes(byte[], int, int) overload so callers that already have the raw bytes can avoid allocating a Binary wrapper.

2. DeltaByteArrayWriter: eliminate per-value Binary.slice() allocation in the suffix path

Tightens the suffixWriter field type from ValuesWriter to DeltaLengthByteArrayValuesWriter (it's always constructed as one) so the new raw-bytes overload is callable. The suffix call now uses that overload instead of suffixWriter.writeBytes(v.slice(i, vb.length - i)), eliminating the ByteArraySliceBackedBinary allocation per value plus a layer of virtual dispatch.

Benchmark results
From BinaryEncodingBenchmark.encodeDeltaByteArray / encodeDeltaLengthByteArray (added in #3512):

- encodeDeltaByteArray (LOW cardinality, len=10): +33% to +55%
- encodeDeltaLengthByteArray (LOW cardinality, len=10): +18% to +21%

Long-string cases are flat or trivial: the per-value allocation is amortized away when each value is hundreds of bytes.
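For readers who want to see where the per-value allocation comes from, here is a minimal, self-contained sketch of the suffix path using the raw-bytes style of call. It is not the parquet-java code: SuffixSink and SuffixWriteSketch are hypothetical stand-ins, the length bookkeeping is simplified to a plain list, and the exact form of the new call in this PR is assumed to be writeBytes(vb, i, vb.length - i). The point is only that the suffix can be copied from the caller's array by offset and length, with no intermediate slice object.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the suffix writer; NOT parquet-java's actual class.
final class SuffixSink {
    final List<Integer> lengths = new ArrayList<>();
    final ByteArrayOutputStream bytes = new ByteArrayOutputStream();

    // Raw-bytes overload: record the length and copy straight from the
    // caller's array, so no wrapper object is created per value.
    void writeBytes(byte[] src, int offset, int length) {
        lengths.add(length);
        bytes.write(src, offset, length);
    }
}

// Hypothetical stand-in for a prefix/suffix (delta byte array) writer.
final class SuffixWriteSketch {
    private byte[] previous = new byte[0];
    final List<Integer> prefixLengths = new ArrayList<>();
    final SuffixSink suffixes = new SuffixSink();

    void write(byte[] vb) {
        // Length of the prefix shared with the previously written value.
        int max = Math.min(vb.length, previous.length);
        int i = 0;
        while (i < max && vb[i] == previous[i]) {
            i++;
        }
        prefixLengths.add(i);
        // Assumed shape of the new call: pass offset and length instead of a slice.
        suffixes.writeBytes(vb, i, vb.length - i);
        // Keeps a reference for simplicity; a real writer may need to copy.
        previous = vb;
    }

    public static void main(String[] args) {
        SuffixWriteSketch w = new SuffixWriteSketch();
        for (String s : new String[] {"apple", "applet", "apply", "banana"}) {
            w.write(s.getBytes(StandardCharsets.UTF_8));
        }
        System.out.println(w.prefixLengths);    // [0, 5, 4, 0]
        System.out.println(w.suffixes.lengths); // [5, 1, 1, 6]
        System.out.println(
            new String(w.suffixes.bytes.toByteArray(), StandardCharsets.UTF_8)); // appletybanana
    }
}
```

With short values such as these, the slice object that the old call created would be comparable in size to the suffix itself, which is consistent with the gains showing up in the len=10 cases and not in the long-string ones.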
How to reproduce
The JMH benchmarks cited above are being added to parquet-benchmarks in #3512. Once that lands, run BinaryEncodingBenchmark against master (baseline) and this branch (optimized) and compare the results.

Validation
- parquet-column: 573 tests pass
- Built with -Dspotless.check.skip=true -Drat.skip=true -Djapicmp.skip=true

User-facing changes
None. No public API change. No file format change.
The new DeltaLengthByteArrayValuesWriter.writeBytes(byte[], int, int) overload is added on top of the existing public API.

Closes #3516
Part of a small series of focused performance PRs from work in parquet-perf. Previous: #3494, #3496, #3500, #3504, #3506, #3510, #3514. Companion benchmarks contribution: #3512.