Optimize LittleEndianDataOutputStream.writeInt/writeShort with bulk write (resolves existing TODO) #3518

## Background

`LittleEndianDataOutputStream.writeInt(int)` and `writeShort(int)` decompose the value byte-by-byte and call `out.write(int)` for each byte:

```java
public final void writeInt(int v) throws IOException {
  out.write((v >>> 0) & 0xFF);
  out.write((v >>> 8) & 0xFF);
  out.write((v >>> 16) & 0xFF);
  out.write((v >>> 24) & 0xFF);
}
```

When the underlying stream is `CapacityByteArrayOutputStream` (the typical case in Parquet writers), each `out.write(int)` performs a `hasRemaining` check, a `Math.addExact` for the new size, possibly a slab-grow check, and a single-byte store. For `writeInt`, that's **4 separate trips** through the bookkeeping.
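To make the "four trips" concrete, here is a minimal sketch that counts calls into the underlying stream. `PerByteCallCount` and `CountingStream` are hypothetical names used only for illustration; the counting stream is a stand-in for any stream with per-call bookkeeping, not a model of `CapacityByteArrayOutputStream` itself.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration; not part of parquet-java.
final class PerByteCallCount {

  // Discards the data and only records how often each write overload is entered.
  static final class CountingStream extends OutputStream {
    int singleByteCalls;
    int bulkCalls;

    @Override
    public void write(int b) {
      singleByteCalls++;
    }

    @Override
    public void write(byte[] b, int off, int len) {
      bulkCalls++;
    }
  }

  // Same shape as the current writeInt: one out.write(int) per byte.
  static void writeIntPerByte(OutputStream out, int v) {
    try {
      out.write((v >>> 0) & 0xFF);
      out.write((v >>> 8) & 0xFF);
      out.write((v >>> 16) & 0xFF);
      out.write((v >>> 24) & 0xFF);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    CountingStream s = new CountingStream();
    writeIntPerByte(s, 42);
    System.out.println(s.singleByteCalls + " single-byte calls, " + s.bulkCalls + " bulk calls");
  }
}
```

Every one of those single-byte calls pays whatever fixed cost the real stream attaches to a write.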

The class already has the right pattern in `writeLong`: fill the pre-allocated `writeBuffer` array and emit it with a single `out.write(writeBuffer, 0, 8)`. `writeInt` and `writeShort` simply don't use that buffer.

There's a `TODO` comment in `writeInt` (lines 147–149 in current master) acknowledging this:

```java
// TODO: see note in LittleEndianDataInputStream: maybe faster
// to use Integer.reverseBytes() and then writeInt, or a ByteBuffer
// approach
```
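For reference, the `Integer.reverseBytes()` idea from the TODO can be sketched as below. This is only an illustration of why that alternative produces the same bytes, not code from the Parquet source: `ReverseBytesSketch` and its method names are hypothetical, and it relies only on the JDK's `DataOutputStream.writeInt`, which writes big-endian.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical illustration; not part of parquet-java.
final class ReverseBytesSketch {

  // Little-endian encoding: reverse the value, then let a big-endian writer emit it.
  static byte[] viaReverseBytes(int v) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      out.writeInt(Integer.reverseBytes(v));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  // Little-endian encoding by explicit shifts, least significant byte first.
  static byte[] viaShifts(int v) {
    return new byte[] {(byte) v, (byte) (v >>> 8), (byte) (v >>> 16), (byte) (v >>> 24)};
  }

  public static void main(String[] args) {
    int v = 0x11223344;
    System.out.println(Arrays.equals(viaReverseBytes(v), viaShifts(v)));
  }
}
```

The sketch says nothing about speed; it only shows that the two encodings agree byte for byte.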

## Proposal

Extend the existing `writeBuffer[]` pattern to `writeInt` and `writeShort`:

```java
public final void writeInt(int v) throws IOException {
  writeBuffer[0] = (byte) (v >>> 0);
  writeBuffer[1] = (byte) (v >>> 8);
  writeBuffer[2] = (byte) (v >>> 16);
  writeBuffer[3] = (byte) (v >>> 24);
  out.write(writeBuffer, 0, 4);
}
```

This collapses 4 `write(int)` calls into 1 `write(byte[], int, int)` call, so the per-call bookkeeping runs once per int instead of 4 times. `writeShort` gets the same treatment with 2 bytes (2 calls become 1). The change matches the existing `writeLong` pattern in the same file.
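Only `writeInt` is spelled out above; `writeShort` would presumably take the same shape. The sketch below is an assumption about what that looks like, written as a standalone class so it can be run in isolation. `BulkLittleEndianSketch` and its `emit` helper are hypothetical, and the 8-byte `writeBuffer` mirrors the size `writeLong` needs.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical stand-in for the real class; only the two methods under discussion.
final class BulkLittleEndianSketch {

  private final OutputStream out;
  private final byte[] writeBuffer = new byte[8]; // reused across calls

  BulkLittleEndianSketch(OutputStream out) {
    this.out = out;
  }

  // Two bytes, low byte first, emitted with one bulk write.
  void writeShort(int v) {
    writeBuffer[0] = (byte) (v >>> 0);
    writeBuffer[1] = (byte) (v >>> 8);
    emit(2);
  }

  // Four bytes, low byte first, emitted with one bulk write.
  void writeInt(int v) {
    writeBuffer[0] = (byte) (v >>> 0);
    writeBuffer[1] = (byte) (v >>> 8);
    writeBuffer[2] = (byte) (v >>> 16);
    writeBuffer[3] = (byte) (v >>> 24);
    emit(4);
  }

  // Wraps the checked exception so the sketch is easy to call from a demo.
  private void emit(int len) {
    try {
      out.write(writeBuffer, 0, len);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    BulkLittleEndianSketch w = new BulkLittleEndianSketch(bytes);
    w.writeInt(0x11223344);
    w.writeShort(0xABCD);
    for (byte b : bytes.toByteArray()) {
      System.out.printf("%02X ", b);
    }
    System.out.println();
  }
}
```

Because the shared buffer is mutated on every call, this shape is no more thread-safe than the existing `writeLong`; a real patch would keep the `throws IOException` signatures rather than wrapping the exception.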

Resolves the existing `TODO` in the source.

## Expected impact

Measured with a standalone JMH benchmark of the class:
- `IntEncodingBenchmark.encodePlain` (when routed through `LittleEndianDataOutputStream`): throughput up by about **35%** (~20.9M → ~28.2M ops/s)

## Note on context

PR #3496 deprecates `LittleEndianDataOutputStream` because Parquet's own writers no longer use it (they write directly into `ByteBuffer`-backed slabs, which compiles to a single intrinsic store on little-endian architectures and is strictly faster than any wrapper).
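As a rough picture of the direct `ByteBuffer` style mentioned above, the following sketch writes an int into a little-endian buffer with one `putInt` call. It is not Parquet's slab code: `ByteBufferSketch` is a hypothetical name, and a small heap buffer stands in for a slab.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration; not part of parquet-java.
final class ByteBufferSketch {

  // Encodes one int in little-endian order using the buffer's own byte order.
  static byte[] encode(int v) {
    ByteBuffer buf = ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN);
    buf.putInt(v); // no manual shifting or per-byte calls
    return buf.array();
  }

  public static void main(String[] args) {
    for (byte b : encode(0x11223344)) {
      System.out.printf("%02X ", b);
    }
    System.out.println();
  }
}
```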

This PR is therefore a **purely external-caller benefit**: any third-party Parquet-format producer still using the class will get the speedup until they migrate. The change is minimal (~10 lines), obviously correct (matches the existing `writeLong` pattern), and resolves a long-standing `TODO` in the source.

## Files affected

- `parquet-common/src/main/java/org/apache/parquet/bytes/LittleEndianDataOutputStream.java`

No public API change.
