Background
LittleEndianDataOutputStream.writeInt(int) and writeShort(int) decompose the value byte-by-byte and call out.write(int) for each byte:
public final void writeInt(int v) throws IOException {
  out.write((v >>> 0) & 0xFF);
  out.write((v >>> 8) & 0xFF);
  out.write((v >>> 16) & 0xFF);
  out.write((v >>> 24) & 0xFF);
}
When the underlying stream is CapacityByteArrayOutputStream (the typical case in Parquet writers), each out.write(int) performs a hasRemaining check, a Math.addExact for the new size, possibly a slab-grow check, and a single-byte store. For writeInt, that's 4 separate trips through the bookkeeping.
The class already has the right pattern in writeLong: build a writeBuffer[] and emit a single out.write(writeBuffer, 0, 8). The buffer is even pre-allocated for that purpose. writeInt and writeShort just don't use it.
There's a TODO comment in writeInt (lines 147–149 in current master) acknowledging this:
// TODO: see note in LittleEndianDataInputStream: maybe faster
// to use Integer.reverseBytes() and then writeInt, or a ByteBuffer
// approach
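For reference, the Integer.reverseBytes() alternative mentioned in the TODO can be sketched with plain JDK classes. This is only an illustration of that idea, not code from the Parquet source; ReverseBytesSketch and encodeInt are hypothetical names, and whether this route is actually faster depends on how the JDK's DataOutputStream.writeInt is implemented.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration only; not part of Parquet.
final class ReverseBytesSketch {

  /** Encodes v as 4 little-endian bytes by byte-swapping it and writing it big-endian. */
  static byte[] encodeInt(int v) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream data = new DataOutputStream(bytes)) {
      // DataOutputStream.writeInt is big-endian, so swapping the bytes first
      // produces little-endian output.
      data.writeInt(Integer.reverseBytes(v));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  public static void main(String[] args) {
    byte[] b = encodeInt(0x12345678);
    System.out.printf("%02x %02x %02x %02x%n", b[0], b[1], b[2], b[3]); // prints "78 56 34 12"
  }
}
```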
Proposal
Extend the existing writeBuffer[] pattern to writeInt and writeShort:
public final void writeInt(int v) throws IOException {
  writeBuffer[0] = (byte) (v >>> 0);
  writeBuffer[1] = (byte) (v >>> 8);
  writeBuffer[2] = (byte) (v >>> 16);
  writeBuffer[3] = (byte) (v >>> 24);
  out.write(writeBuffer, 0, 4);
}
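This collapses four write(int) calls into one write(byte[], int, int) call, so the per-call bookkeeping runs once per int instead of four times. writeShort gets the same treatment with two bytes, going from two calls to one. Both follow the existing writeLong pattern in the same file.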
Resolves the existing TODO in the source.
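The effect on call counts can be checked with a small standalone sketch. LittleEndianWriteSketch and CountingStream below are hypothetical stand-ins written for this description, not the real Parquet classes; the sketch only assumes that the buffered variants behave as shown in the proposal, and the writeShort body is an assumed two-byte analogue of writeInt.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical stand-in for illustration; not the real LittleEndianDataOutputStream.
final class LittleEndianWriteSketch {

  /** Collects bytes and counts how many write calls reach the stream. */
  static final class CountingStream extends OutputStream {
    final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    int calls;

    @Override
    public void write(int b) {
      calls++;
      bytes.write(b);
    }

    @Override
    public void write(byte[] b, int off, int len) {
      calls++;
      bytes.write(b, off, len);
    }
  }

  private final OutputStream out;
  private final byte[] writeBuffer = new byte[8]; // reused scratch buffer, as in writeLong

  LittleEndianWriteSketch(OutputStream out) {
    this.out = out;
  }

  /** Current behaviour: one write(int) call per byte. */
  void writeIntPerByte(int v) throws IOException {
    out.write((v >>> 0) & 0xFF);
    out.write((v >>> 8) & 0xFF);
    out.write((v >>> 16) & 0xFF);
    out.write((v >>> 24) & 0xFF);
  }

  /** Proposed behaviour: fill the scratch buffer, then one bulk write. */
  void writeInt(int v) throws IOException {
    writeBuffer[0] = (byte) (v >>> 0);
    writeBuffer[1] = (byte) (v >>> 8);
    writeBuffer[2] = (byte) (v >>> 16);
    writeBuffer[3] = (byte) (v >>> 24);
    out.write(writeBuffer, 0, 4);
  }

  /** Assumed two-byte analogue of writeInt. */
  void writeShort(int v) throws IOException {
    writeBuffer[0] = (byte) (v >>> 0);
    writeBuffer[1] = (byte) (v >>> 8);
    out.write(writeBuffer, 0, 2);
  }

  public static void main(String[] args) throws IOException {
    CountingStream perByte = new CountingStream();
    new LittleEndianWriteSketch(perByte).writeIntPerByte(0x12345678);

    CountingStream buffered = new CountingStream();
    new LittleEndianWriteSketch(buffered).writeInt(0x12345678);

    System.out.println(perByte.calls + " vs " + buffered.calls); // prints "4 vs 1"
  }
}
```

Both variants emit the same four bytes (78 56 34 12 for 0x12345678); only the number of calls into the underlying stream differs.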
Expected impact
Standalone JMH benchmark of the class:
IntEncodingBenchmark.encodePlain (when routed through LittleEndianDataOutputStream): about +35% throughput (~20.9M → ~28.2M ops/s)
Note on context
PR #3496 deprecates LittleEndianDataOutputStream because Parquet's own writers no longer use it (they write directly into ByteBuffer-backed slabs, which compiles to a single intrinsic store on little-endian architectures and is strictly faster than any wrapper).
This PR therefore benefits external callers only: any third-party Parquet-format producer still using the class gets the speedup until it migrates. The change is small (~10 lines), easy to verify because it follows the existing writeLong pattern, and resolves a long-standing TODO in the source.
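For callers considering that migration, the direct-to-ByteBuffer style can be sketched with the plain JDK API. This is not Parquet's slab code; ByteBufferLittleEndianSketch and encodeInt are hypothetical names, and the sketch only shows that a little-endian ByteBuffer produces the same byte layout as writeInt.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper for illustration only; not Parquet's writer code.
final class ByteBufferLittleEndianSketch {

  /** Encodes v as 4 little-endian bytes using a heap ByteBuffer. */
  static byte[] encodeInt(int v) {
    ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES).order(ByteOrder.LITTLE_ENDIAN);
    buf.putInt(v); // one call writes all four bytes in little-endian order
    return buf.array();
  }

  public static void main(String[] args) {
    byte[] b = encodeInt(0x12345678);
    System.out.printf("%02x %02x %02x %02x%n", b[0], b[1], b[2], b[3]); // prints "78 56 34 12"
  }
}
```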
Files affected
parquet-common/src/main/java/org/apache/parquet/bytes/LittleEndianDataOutputStream.java
No public API change.