New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve decoding of short decimals in parquet reader #14260
Conversation
2fff93c
to
8ec2b41
Compare
8ec2b41
to
e3e9223
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question as for the other one? Do we need to introduce feature flag for optimized reader?
Since this is a targeted improvement within existing code and not a big rewrite, it doesn't need to be behind a new flag. |
lib/trino-parquet/src/main/java/io/trino/parquet/reader/ShortDecimalValueDecoder.java
Outdated
Show resolved
Hide resolved
lib/trino-parquet/src/main/java/io/trino/parquet/reader/ShortDecimalColumnReader.java
Outdated
Show resolved
Hide resolved
lib/trino-parquet/src/main/java/io/trino/parquet/reader/ShortDecimalValueDecoder.java
Outdated
Show resolved
Hide resolved
14b0638
to
d082aa2
Compare
d082aa2
to
3c8bff1
Compare
lib/trino-parquet/src/test/java/io/trino/parquet/reader/BenchmarkShortDecimalColumnReader.java
Outdated
Show resolved
Hide resolved
lib/trino-parquet/src/test/java/io/trino/parquet/reader/BenchmarkShortDecimalColumnReader.java
Outdated
Show resolved
Hide resolved
lib/trino-parquet/src/test/java/io/trino/parquet/reader/BenchmarkShortDecimalColumnReader.java
Outdated
Show resolved
Hide resolved
lib/trino-parquet/src/test/java/io/trino/parquet/reader/BenchmarkShortDecimalColumnReader.java
Outdated
Show resolved
Hide resolved
lib/trino-parquet/src/main/java/io/trino/parquet/reader/ShortDecimalColumnReader.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not convinced this is a good change. It might be better than what's there today, but I think there other less complex ways to gain performance in that area.
To illustrate, here are some benchmarks using bulk warmup mode:
These are the results for the implementation in trunk:
Benchmark (byteArrayLength) Mode Cnt Score Error Units
BenchmarkShortDecimalColumnReader.read 1 thrpt 5 7.306 ± 0.138 ops/s
BenchmarkShortDecimalColumnReader.read 2 thrpt 5 7.291 ± 0.044 ops/s
BenchmarkShortDecimalColumnReader.read 3 thrpt 5 7.076 ± 0.102 ops/s
BenchmarkShortDecimalColumnReader.read 4 thrpt 5 6.881 ± 0.029 ops/s
BenchmarkShortDecimalColumnReader.read 5 thrpt 5 6.891 ± 0.084 ops/s
BenchmarkShortDecimalColumnReader.read 6 thrpt 5 6.850 ± 0.091 ops/s
BenchmarkShortDecimalColumnReader.read 7 thrpt 5 6.681 ± 0.100 ops/s
BenchmarkShortDecimalColumnReader.read 8 thrpt 5 6.706 ± 0.073 ops/s
These are the results for this PR:
Benchmark (byteArrayLength) Mode Cnt Score Error Units
BenchmarkShortDecimalColumnReader.read 1 thrpt 5 8.631 ± 0.237 ops/s
BenchmarkShortDecimalColumnReader.read 2 thrpt 5 8.681 ± 0.169 ops/s
BenchmarkShortDecimalColumnReader.read 3 thrpt 5 8.491 ± 0.164 ops/s
BenchmarkShortDecimalColumnReader.read 4 thrpt 5 8.536 ± 0.089 ops/s
BenchmarkShortDecimalColumnReader.read 5 thrpt 5 8.428 ± 0.171 ops/s
BenchmarkShortDecimalColumnReader.read 6 thrpt 5 8.465 ± 0.125 ops/s
BenchmarkShortDecimalColumnReader.read 7 thrpt 5 8.386 ± 0.473 ops/s
BenchmarkShortDecimalColumnReader.read 8 thrpt 5 8.796 ± 0.129 ops/s
These are the results for a much simpler change: fc84465
Benchmark (byteArrayLength) Mode Cnt Score Error Units
BenchmarkShortDecimalColumnReader.read 1 thrpt 5 9.259 ± 0.116 ops/s
BenchmarkShortDecimalColumnReader.read 2 thrpt 5 9.363 ± 0.130 ops/s
BenchmarkShortDecimalColumnReader.read 3 thrpt 5 9.040 ± 0.083 ops/s
BenchmarkShortDecimalColumnReader.read 4 thrpt 5 9.092 ± 0.091 ops/s
BenchmarkShortDecimalColumnReader.read 5 thrpt 5 8.971 ± 0.118 ops/s
BenchmarkShortDecimalColumnReader.read 6 thrpt 5 8.739 ± 0.210 ops/s
BenchmarkShortDecimalColumnReader.read 7 thrpt 5 8.711 ± 0.183 ops/s
BenchmarkShortDecimalColumnReader.read 8 thrpt 5 8.602 ± 0.133 ops/s
3c8bff1
to
b3997d7
Compare
b3997d7
to
dcd58ef
Compare
Benchmark (byteArrayLength) Mode Cnt Score Before Score After Units BenchmarkShortDecimalColumnReader.read 1 thrpt 45 4.979 ± 0.097 6.026 ± 0.084 ops/s BenchmarkShortDecimalColumnReader.read 2 thrpt 45 4.828 ± 0.187 5.868 ± 0.068 ops/s BenchmarkShortDecimalColumnReader.read 3 thrpt 45 4.033 ± 0.247 5.700 ± 0.132 ops/s BenchmarkShortDecimalColumnReader.read 4 thrpt 45 4.324 ± 0.147 5.735 ± 0.088 ops/s BenchmarkShortDecimalColumnReader.read 5 thrpt 45 4.402 ± 0.156 5.598 ± 0.089 ops/s BenchmarkShortDecimalColumnReader.read 6 thrpt 45 4.226 ± 0.178 5.033 ± 0.231 ops/s BenchmarkShortDecimalColumnReader.read 7 thrpt 45 4.070 ± 0.148 5.055 ± 0.300 ops/s BenchmarkShortDecimalColumnReader.read 8 thrpt 45 3.960 ± 0.198 5.226 ± 0.197 ops/s Co-authored-by: Martin Traverso <mtraverso@gmail.com>
dcd58ef
to
a90599a
Compare
Description
Improves performance by using simpler dedicated decoding logic for each possible byte array length
Non-technical explanation
Improves performance of reading short decimal columns from parquet files
Release notes
( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
(x) Release notes are required, with the following suggested text: