
Optimize decoders for DELTA_BINARY_PACKED parquet encoding #15850

Merged
raunaqmorarka merged 6 commits into trinodb:master from pqr-v2-int on Jan 30, 2023

Conversation

raunaqmorarka (Member) commented on Jan 25, 2023

Description

Optimize decoders for DELTA_BINARY_PACKED parquet encoding
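
For readers unfamiliar with the encoding, here is a minimal sketch of the value reconstruction step that a DELTA_BINARY_PACKED decoder performs. It is not the code from this PR: the class and method names are hypothetical, and it assumes the header and the bit-packed deltas have already been read. Each stored delta is relative to the block's minimum delta, so a value is the previous value plus `minDelta` plus the unpacked delta.

```java
import java.util.Arrays;

// Hypothetical illustration only; not the decoder added in this PR.
final class DeltaDecodeSketch
{
    private DeltaDecodeSketch() {}

    // Zigzag decoding maps 0, 1, 2, 3, ... back to 0, -1, 1, -2, ...
    // The format stores the first value and each block's min delta this way.
    static long zigZagDecode(long encoded)
    {
        return (encoded >>> 1) ^ -(encoded & 1);
    }

    // Rebuilds the original values from the first value, the block's
    // minimum delta, and the already unpacked non-negative deltas.
    static long[] reconstruct(long firstValue, long minDelta, long[] unpackedDeltas)
    {
        long[] values = new long[unpackedDeltas.length + 1];
        values[0] = firstValue;
        for (int i = 0; i < unpackedDeltas.length; i++) {
            // Running sum: previous value + (stored delta + minDelta)
            values[i + 1] = values[i] + minDelta + unpackedDeltas[i];
        }
        return values;
    }

    public static void main(String[] args)
    {
        // Sequence 7, 5, 3, 1, 2, 3, 4, 5 has deltas -2, -2, -2, 1, 1, 1, 1.
        // With minDelta = -2 the stored deltas are 0, 0, 0, 3, 3, 3, 3.
        long[] values = reconstruct(7, -2, new long[] {0, 0, 0, 3, 3, 3, 3});
        System.out.println(Arrays.toString(values)); // prints [7, 5, 3, 1, 2, 3, 4, 5]
    }
}
```

The loop carries a dependency from one value to the next, which is one reason decoding this encoding tends to be slower than reading PLAIN values.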

Additional context and related issues

Release notes

( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
(x) Release notes are required, with the following suggested text:

# Hive, Hudi, Iceberg, Delta
* Improve performance of reading Parquet files for numeric types. ({issue}`15850`)

raunaqmorarka and others added 6 commits January 27, 2023 02:42
Added optimized delta decoders for INTEGER and BIGINT trino types

Benchmark                      (bitWidth)   Mode  Cnt  Score before      Score after       Units
BenchmarkIntColumnReader.read           0  thrpt   20  117.071 ±  7.703  865.445 ± 20.106  ops/s
BenchmarkIntColumnReader.read           3  thrpt   20  104.166 ±  4.900  268.090 ± 42.779  ops/s
BenchmarkIntColumnReader.read           4  thrpt   20   98.350 ±  4.357  320.060 ± 45.229  ops/s
BenchmarkIntColumnReader.read           6  thrpt   20   64.612 ± 21.224  302.133 ± 43.548  ops/s
BenchmarkIntColumnReader.read           7  thrpt   20   99.094 ±  9.955  281.936 ± 43.971  ops/s
BenchmarkIntColumnReader.read           8  thrpt   20   95.858 ±  5.498  285.217 ± 41.485  ops/s
BenchmarkIntColumnReader.read          11  thrpt   20   88.263 ±  9.428  270.472 ± 44.151  ops/s
BenchmarkIntColumnReader.read          15  thrpt   20   81.247 ±  8.551  267.604 ± 44.837  ops/s
BenchmarkIntColumnReader.read          20  thrpt   20   66.282 ±  9.662  207.038 ± 35.746  ops/s
BenchmarkIntColumnReader.read          25  thrpt   20   77.153 ±  7.748  209.671 ± 36.137  ops/s
BenchmarkIntColumnReader.read          32  thrpt   20   92.607 ±  3.672  395.909 ± 47.766  ops/s

Benchmark                       (bitWidth)   Mode  Cnt  Score before      Score after       Units
BenchmarkLongColumnReader.read           0  thrpt   20  199.566 ± 15.208  806.087 ± 51.283  ops/s
BenchmarkLongColumnReader.read           4  thrpt   20  163.637 ± 10.026  579.107 ± 28.904  ops/s
BenchmarkLongColumnReader.read           8  thrpt   20  154.460 ±  4.185  538.973 ± 21.534  ops/s
BenchmarkLongColumnReader.read          10  thrpt   20  148.364 ±  2.676  513.435 ± 14.800  ops/s
BenchmarkLongColumnReader.read          15  thrpt   20  146.103 ±  6.923  514.479 ± 15.324  ops/s
BenchmarkLongColumnReader.read          20  thrpt   20  132.407 ±  6.520  442.656 ± 13.898  ops/s
BenchmarkLongColumnReader.read          25  thrpt   20  118.700 ±  7.232  421.344 ± 35.498  ops/s
BenchmarkLongColumnReader.read          30  thrpt   20  117.756 ±  1.767  404.390 ± 34.178  ops/s
BenchmarkLongColumnReader.read          35  thrpt   20  106.358 ±  1.739  318.364 ± 41.330  ops/s
BenchmarkLongColumnReader.read          40  thrpt   20   91.588 ±  4.196  346.890 ± 14.496  ops/s
BenchmarkLongColumnReader.read          45  thrpt   20   86.491 ±  3.393  322.405 ± 15.961  ops/s
BenchmarkLongColumnReader.read          50  thrpt   20   79.182 ±  1.353  308.200 ±  7.586  ops/s
BenchmarkLongColumnReader.read          55  thrpt   20   85.522 ±  1.183  296.408 ±  8.309  ops/s
BenchmarkLongColumnReader.read          60  thrpt   20   70.403 ±  1.877  276.162 ±  6.587  ops/s
BenchmarkLongColumnReader.read          64  thrpt   20   74.248 ±  3.132  524.803 ± 64.244  ops/s
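
The `bitWidth` parameter in the tables above presumably controls how many bits each generated value needs, which in turn drives how many bits each packed delta occupies. As a rough illustration of what unpacking at a given bit width involves, the sketch below reads fixed-width values packed from the least significant bit of each byte, which is the bit order Parquet uses for bit-packed data. The class and method names are hypothetical, and this bit-at-a-time loop is far slower than an optimized decoder; it only shows the layout.

```java
// Hypothetical illustration only; not the unpacking code from this PR.
final class BitUnpackSketch
{
    private BitUnpackSketch() {}

    // Unpacks `count` values of `bitWidth` bits each, stored back to back
    // starting at the least significant bit of packed[0].
    static long[] unpack(byte[] packed, int bitWidth, int count)
    {
        long[] values = new long[count];
        if (bitWidth == 0) {
            // Width 0 stores no bits at all: every value is zero.
            return values;
        }
        long bitPosition = 0;
        for (int i = 0; i < count; i++) {
            long value = 0;
            for (int bit = 0; bit < bitWidth; bit++, bitPosition++) {
                int byteIndex = (int) (bitPosition >>> 3);
                int bitInByte = (int) (bitPosition & 7);
                long bitValue = (packed[byteIndex] >>> bitInByte) & 1;
                value |= bitValue << bit;
            }
            values[i] = value;
        }
        return values;
    }

    public static void main(String[] args)
    {
        // The values 0..7 at width 3 occupy exactly three bytes.
        byte[] packed = {(byte) 0x88, (byte) 0xC6, (byte) 0xFA};
        System.out.println(java.util.Arrays.toString(unpack(packed, 3, 8)));
    }
}
```

A width of 0 needs no unpacking work, which is consistent with the `bitWidth` 0 rows showing the highest throughput.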

Benchmark                                     (size)   Mode  Cnt   Score   Error   Units
BenchmarkReadUleb128Long.readUleb128Long        1000  thrpt   30  56.499 ± 0.621  ops/ms
BenchmarkReadUleb128Long.readUleb128Long       10000  thrpt   30   4.820 ± 0.260  ops/ms
BenchmarkReadUleb128Long.readUleb128LongLoop    1000  thrpt   30  38.991 ± 2.311  ops/ms
BenchmarkReadUleb128Long.readUleb128LongLoop   10000  thrpt   30   3.380 ± 0.278  ops/ms
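
For context on what this benchmark measures: DELTA_BINARY_PACKED headers store their integers as ULEB128 varints, 7 payload bits per byte with the high bit marking continuation. The sketch below is a plain loop-based reader of the kind the `readUleb128LongLoop` name suggests. It is an assumption about the general shape, not the benchmarked code, and the class name and byte-array signature are hypothetical.

```java
// Hypothetical illustration only; not the reader benchmarked above.
final class Uleb128Sketch
{
    private Uleb128Sketch() {}

    // Reads one unsigned LEB128 value starting at `offset`.
    static long readUleb128Long(byte[] input, int offset)
    {
        long result = 0;
        int shift = 0;
        while (true) {
            byte b = input[offset++];
            // The low 7 bits carry payload, least significant group first.
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                // A clear high bit marks the final byte.
                return result;
            }
            shift += 7;
            if (shift > 63) {
                throw new IllegalArgumentException("ULEB128 value is longer than 10 bytes");
            }
        }
    }

    public static void main(String[] args)
    {
        // 300 encodes as 0xAC 0x02: 44 + (2 << 7).
        System.out.println(readUleb128Long(new byte[] {(byte) 0xAC, 0x02}, 0)); // prints 300
    }
}
```

An unrolled variant would replace the loop with a fixed chain of per-byte checks; the numbers above show the non-loop method running faster at both sizes.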

Co-authored-by: Raunaq Morarka <raunaqmorarka@gmail.com>

Benchmark                       (bitWidth)   Mode  Cnt  Before            After             Units
BenchmarkByteColumnReader.read           0  thrpt   30  171.680 ± 12.559  748.504 ±  6.711  ops/s
BenchmarkByteColumnReader.read           1  thrpt   30  193.714 ±  0.804  593.429 ± 28.570  ops/s
BenchmarkByteColumnReader.read           2  thrpt   30  182.886 ±  3.220  624.125 ± 56.425  ops/s
BenchmarkByteColumnReader.read           3  thrpt   30  185.313 ±  2.019  570.076 ± 41.381  ops/s
BenchmarkByteColumnReader.read           4  thrpt   30  175.380 ±  2.142  581.016 ± 28.967  ops/s
BenchmarkByteColumnReader.read           5  thrpt   30  172.296 ±  2.929  572.173 ± 31.615  ops/s
BenchmarkByteColumnReader.read           6  thrpt   30  168.721 ±  0.843  551.679 ± 35.310  ops/s
BenchmarkByteColumnReader.read           7  thrpt   30  180.839 ±  3.180  823.503 ± 15.124  ops/s
BenchmarkByteColumnReader.read           8  thrpt   30  162.636 ±  2.574  523.664 ± 31.919  ops/s

Co-authored-by: Raunaq Morarka <raunaqmorarka@gmail.com>

Benchmark                        (bitWidth)   Mode  Cnt  Before            After             Units
BenchmarkShortColumnReader.read           0  thrpt   30  201.502 ± 14.020  976.715 ± 28.892  ops/s
BenchmarkShortColumnReader.read           1  thrpt   30  173.992 ± 10.272  624.690 ± 37.268  ops/s
BenchmarkShortColumnReader.read           2  thrpt   30  168.042 ±  5.116  556.310 ± 52.423  ops/s
BenchmarkShortColumnReader.read           3  thrpt   30  174.832 ±  4.403  577.811 ± 27.119  ops/s
BenchmarkShortColumnReader.read           4  thrpt   30  172.531 ±  3.270  582.771 ± 38.262  ops/s
BenchmarkShortColumnReader.read           8  thrpt   30  145.744 ± 12.112  490.298 ± 43.535  ops/s
BenchmarkShortColumnReader.read          10  thrpt   30  152.312 ±  3.486  506.218 ±  9.371  ops/s
BenchmarkShortColumnReader.read          11  thrpt   30  153.093 ±  5.410  503.974 ± 12.769  ops/s
BenchmarkShortColumnReader.read          14  thrpt   30  138.288 ±  5.873  438.434 ± 27.987  ops/s
BenchmarkShortColumnReader.read          16  thrpt   30  147.998 ±  1.930  410.992 ± 31.457  ops/s

Co-authored-by: Raunaq Morarka <raunaqmorarka@gmail.com>

Benchmark                                       (encoding)   Mode  Cnt  Before            After            Units
BenchmarkInt32ToLongColumnReader.read                PLAIN  thrpt   20  399.059 ± 34.474  504.970 ± 8.555  ops/s
BenchmarkInt32ToLongColumnReader.read  DELTA_BINARY_PACKED  thrpt   20  113.250 ±  4.856  339.272 ± 3.497  ops/s

raunaqmorarka changed the title from "Optimize decoders for integers in DELTA_BINARY_PACKED parquet encoding" to "Optimize decoders for numeric types in DELTA_BINARY_PACKED parquet encoding" on Jan 30, 2023
raunaqmorarka changed the title from "Optimize decoders for numeric types in DELTA_BINARY_PACKED parquet encoding" to "Optimize decoders for DELTA_BINARY_PACKED parquet encoding" on Jan 30, 2023
raunaqmorarka merged commit 0dffb4d into trinodb:master on Jan 30, 2023
raunaqmorarka deleted the pqr-v2-int branch on January 30, 2023 11:31
github-actions bot added this to the 407 milestone on Jan 30, 2023
3 participants