Performance optimizations: Merged all LittleEndianDataInputStream functionality into ByteBufferInputStream #953

theosib-amazon · 2022-03-29T14:18:31Z

This PR is all performance optimization. In benchmarking with Trino, we see query performance improve by 5% to 15%, depending on the query, and that measurement includes all the I/O time from S3.

The main modification is to merge all of LittleEndianDataInputStream functionality into ByteBufferInputStream, which yields the following benefits:

Eliminates extra layers of abstraction and method call overhead
Enables the use of intrinsics for readInt, readLong, etc.
Makes faster access methods like readFully and skipFully available without the need for helper functions
Reduces some object creation in the performance-critical path
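To illustrate the readInt point, here is a minimal sketch contrasting byte-at-a-time assembly of a little-endian int with a single bulk read from a ByteBuffer. The class and method names are hypothetical and are not taken from the PR; only the java.nio calls are real APIs. Whether the JVM actually applies an intrinsic to the bulk read is the claim made in the description above, not something this sketch demonstrates.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration class; not part of the Parquet code base.
final class LittleEndianReadSketch {
  // Assembles the int one byte at a time, the general shape of a stream-based reader.
  static int readIntByteWise(ByteBuffer buf) {
    int b0 = buf.get() & 0xFF;
    int b1 = buf.get() & 0xFF;
    int b2 = buf.get() & 0xFF;
    int b3 = buf.get() & 0xFF;
    return (b3 << 24) | (b2 << 16) | (b1 << 8) | b0;
  }

  // Reads all four bytes in one call; the buffer must already be in little-endian order.
  static int readIntDirect(ByteBuffer buf) {
    return buf.getInt();
  }

  // Eight sample bytes: 0x04030201 followed by -1, both little-endian.
  static ByteBuffer sample() {
    byte[] bytes = {0x01, 0x02, 0x03, 0x04, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF};
    return ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN);
  }

  public static void main(String[] args) {
    ByteBuffer a = sample();
    ByteBuffer b = sample();
    // Both approaches should decode the same values from the same bytes.
    System.out.println(readIntByteWise(a) == readIntDirect(b));
    System.out.println(readIntByteWise(a) == readIntDirect(b));
  }
}
```

Both methods decode the same value; the second simply leaves the byte shuffling to ByteBuffer.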

This also includes and enables performance optimizations to:

ByteBitPackingValuesReader
PlainValuesReader
RunLengthBitPackingHybridDecoder

Context:
I've been working on improving Parquet reading performance in Trino, mostly by profiling while running performance benchmarks and TPCDS queries. This PR is a subset of the changes I made that have more than doubled the performance of a lot of TPCDS queries (wall clock time, including the S3 access time). If you are kind enough to accept these changes, I have more I would like to contribute.

…nputStream Deprecated LittleEndianDataInputStream Optimized performance of: - ByteBitPackingValuesReader - PlainValuesReader - RunLengthBitPackingHybridDecoder - Optimized performance of readInt, readLong, and related methods

theosib-amazon · 2022-03-29T14:32:37Z

I forgot to add this to a comment in the code:
PlainValuesReader still includes an unused LittleEndianDataInputStream member because, if I remove it, the build fails and reports an incompatible API change.

...mn/src/main/java/org/apache/parquet/column/values/bitpacking/ByteBitPackingValuesReader.java

Undid whitespace changes

Fixed whitespace change

Undid whitespace changes

Undid whitespace change

Undid whitespace changes

parquet-common/src/main/java/org/apache/parquet/bytes/SingleBufferInputStream.java

rdblue · 2022-04-24T20:33:07Z

+  @Override
+  public void skipFully(long n) throws IOException {
+    try {
+      buffer.position(buffer.position() + (int)n);


Looks like this is just trying to avoid the checks that are being done in skip. I don't think that's a good idea. This should delegate to skip instead.

I did this specifically because it's performance-critical. I did a bunch of profiling, and skips are among the operations that have to have minimal overhead. Delegating to skip() would introduce a bunch of checks that the JIT isn't going to be smart enough to remove.
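The two approaches under discussion can be sketched roughly as follows. This is a minimal illustration on a hypothetical stand-in class, not the actual SingleBufferInputStream code: the body of skip(), the exception messages, and the catch clause in the direct variant are all assumptions, since the quoted hunk ends before its handler.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;

// Hypothetical stand-in for a single-buffer stream; not the actual Parquet class.
final class SkipSketch {
  final ByteBuffer buffer;

  SkipSketch(ByteBuffer buffer) {
    this.buffer = buffer;
  }

  // Checked skip: ignores non-positive counts and clamps to the bytes remaining.
  long skip(long n) {
    if (n <= 0) {
      return 0;
    }
    int toSkip = (int) Math.min(n, buffer.remaining());
    buffer.position(buffer.position() + toSkip);
    return toSkip;
  }

  // Variant A: delegate to skip() and fail if fewer than n bytes were skipped.
  void skipFullyViaSkip(long n) throws IOException {
    long skipped = skip(n);
    if (skipped < n) {
      throw new EOFException("Skipped " + skipped + " of " + n + " bytes");
    }
  }

  // Variant B: advance the position directly and rely on ByteBuffer's own range check.
  // The catch clause is an assumption about how the failure would be reported.
  void skipFullyDirect(long n) throws IOException {
    try {
      buffer.position(buffer.position() + (int) n);
    } catch (IllegalArgumentException e) {
      throw new EOFException(e.getMessage());
    }
  }

  public static void main(String[] args) throws IOException {
    SkipSketch s = new SkipSketch(ByteBuffer.allocate(16));
    s.skipFullyDirect(4);
    s.skipFullyViaSkip(4);
    System.out.println(s.buffer.position());
  }
}
```

In this sketch, Variant B has fewer branches, but a count above Integer.MAX_VALUE is truncated by the (int) cast before the range check runs. For example, skipFullyDirect(1L << 32) returns without moving the position, whereas Variant A reports EOF for the same argument.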

rdblue · 2022-04-24T20:34:25Z

parquet-common/src/main/java/org/apache/parquet/bytes/SingleBufferInputStream.java

@@ -174,4 +248,63 @@ public boolean markSupported() {
  public int available() {
    return buffer.remaining();
  }
+
+  @Override
+  public byte readByte() throws IOException {


The changes from here on out look like what you're really trying to do because we want to read directly from the stream. Can you remove all the other changes that aren't needed?

I'm not sure what you're referring to. All of the methods beyond this point are absolutely necessary. We need to be able to read ints, longs, and similar values from the ByteBuffer, and this is the only way to get them.

parquet-common/src/main/java/org/apache/parquet/bytes/MultiBufferInputStream.java

rdblue

Overall I think most of these changes are good, but there are a few things that should be done before committing this:

Revert any unnecessary changes, like new constructors and style changes that are non-functional (e.g. using x++ instead of x += 1)
Separate the ByteBufferInputStream additions into a dedicated PR with tests
Make real changes to PlainValuesReader rather than keeping both input streams and changing the reference to in2
Update for project style

Made ByteBuffer exceptions more specific

Changed 255 to 0xFF

Reverted whitespace change

Changed ++ to += 1

Added blank line after control flow blocks (except in a few places where it would add a non-functional change to code I didn't edit).

theosib-amazon · 2022-04-25T18:18:17Z

Thanks for reviewing my PR. I made all the cosmetic changes you asked for.

I'm not sure why you're asking to separate the ByteBufferInputStream additions into their own PR, since this PR is all about improving performance by moving functionality from LittleEndianDataInputStream into ByteBufferInputStream. The changes to PlainValuesReader rely on all of those additions.

The only reason I kept the reference to LittleEndianDataInputStream in PlainValuesReader is because otherwise the build fails with a compatibility break against 1.12.0. I'm going to go ahead with the change in the hopes that that doesn't cause a check failure.

Got rid of reference to LittleEndianByteBufferInputStream

theosib-amazon · 2022-04-25T19:34:09Z

Also, you mentioned tests. Since I'm not making any functional changes, I'm not sure what to test for. The new code should behave exactly as the old version, just a bit faster.

Modified skip() and skipFully() to handle negative and out-of-range arguments. Made EOF exceptions preserve any error message.
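One way such argument handling could look is sketched below. This is not the code from the commit: the helper class, the choice of IllegalArgumentException for negative counts, and the message wording are all assumptions made for illustration.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;

// Hypothetical helper; names and exception choices are assumptions for illustration.
final class CheckedSkipSketch {
  static void skipFully(ByteBuffer buffer, long n) throws IOException {
    if (n < 0) {
      throw new IllegalArgumentException("Cannot skip a negative number of bytes: " + n);
    }
    // Compare as long so that counts above Integer.MAX_VALUE are rejected before the cast.
    if (n > buffer.remaining()) {
      throw new EOFException(
          "Cannot skip " + n + " bytes, only " + buffer.remaining() + " remaining");
    }
    buffer.position(buffer.position() + (int) n);
  }

  // Converts a buffer-level failure into an EOFException without dropping its message.
  static EOFException toEof(RuntimeException cause) {
    EOFException eof = new EOFException(cause.getMessage());
    eof.initCause(cause);
    return eof;
  }
}
```

Checking the count while it is still a long means the cast to int only happens once the value is known to fit within the buffer.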

theosib-amazon added 2 commits March 28, 2022 19:36

Fixed test failures

4420803

rdblue reviewed Apr 18, 2022

...mn/src/main/java/org/apache/parquet/column/values/bitpacking/ByteBitPackingValuesReader.java

theosib-amazon added 7 commits April 18, 2022 20:49

Update ByteBitPackingValuesReader.java

d0a9fef

Undid whitespace changes

Update ByteBitPackingValuesReader.java

ab0876a

Fixed whitespace change

Update PlainValuesReader.java

8105024

Undid whitespace changes

Update RunLengthBitPackingHybridDecoder.java

ac496e7

Undid whitespace changes

Update TestRunLengthBitPackingHybridEncoder.java

adcf21d

Undid whitespace changes

Update SingleBufferInputStream.java

be18e6d

Undid whitespace change

Update LittleEndianDataInputStream.java

e38a5f2

Undid whitespace changes