Fix predicate pushdown for Parquet decimal columns #9338
Can we separate that last commit from the previous ones? It seems like this is an independent issue and we should be able to fix it without waiting for the other commits.
16af5e6 to 5826ea5
5826ea5 to 578b680
lib/trino-parquet/src/test/java/io/trino/parquet/TestTupleDomainParquetPredicate.java
578b680 to ca19425
plugin/trino-hive/src/test/java/io/trino/plugin/hive/TestHiveConnectorTest.java
ca19425 to b4620bf
Slice zero = encodeScaledValue(new BigDecimal("0"), type.getScale());
Slice hundred = encodeScaledValue(new BigDecimal("100"), type.getScale());
Slice zero = unscaledDecimal(BigInteger.valueOf(0L));
Slice hundred = unscaledDecimal(BigInteger.valueOf(100L));
The expectations here didn't match the code. The encoding is done using unscaledDecimal here:
https://github.com/trinodb/trino/blob/master/lib/trino-parquet/src/main/java/io/trino/parquet/ParquetTypeUtils.java#L264
b4620bf to 8e5ef18
Updated to include tests for long decimals as well.
This I do not understand. I know the code was not correct, but I don't yet see why it was incorrect in every case. Other than that, LGTM.
"Every case" was a bit of an overstatement. I updated the PR description to be clearer.
First four commits are from #9326.
The overflow detection constructs a BigDecimal, but the decimal scale argument was missing from that constructor call. As a result, the statistics incorrectly detected an overflow in many cases.
Example:
Decimal(5, 3) has a maximum value of 99.999. Before the fix, any value over 0.099 would result in an overflow. 0.099 has the unscaled value 99, which is still lower than the max value, but 0.100 has the unscaled value 100, which was compared against 99.999 without the scale applied and so triggered the overflow logic.
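To make the example concrete, here is a minimal standalone sketch of the comparison using only java.math. It is not the Trino code: the class name DecimalOverflowSketch and its helper methods are hypothetical, and the overflow check is assumed to be a simple comparison against the largest value of the type.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical illustration only; names and structure are not taken from Trino.
public class DecimalOverflowSketch {
    // Largest value of decimal(precision, scale), e.g. 99.999 for decimal(5, 3).
    static BigDecimal maxValue(int precision, int scale) {
        BigInteger unscaledMax = BigInteger.TEN.pow(precision).subtract(BigInteger.ONE);
        return new BigDecimal(unscaledMax, scale);
    }

    // Scale omitted: the unscaled value 100 is read as 100 instead of 0.100.
    static boolean overflowsWithoutScale(long unscaled, int precision, int scale) {
        BigDecimal value = new BigDecimal(BigInteger.valueOf(unscaled));
        return value.abs().compareTo(maxValue(precision, scale)) > 0;
    }

    // Scale passed: the unscaled value 100 is read as 0.100.
    static boolean overflowsWithScale(long unscaled, int precision, int scale) {
        BigDecimal value = new BigDecimal(BigInteger.valueOf(unscaled), scale);
        return value.abs().compareTo(maxValue(precision, scale)) > 0;
    }

    public static void main(String[] args) {
        System.out.println(overflowsWithoutScale(99, 5, 3));  // false: 99 <= 99.999
        System.out.println(overflowsWithoutScale(100, 5, 3)); // true: 100 > 99.999
        System.out.println(overflowsWithScale(100, 5, 3));    // false: 0.100 <= 99.999
    }
}
```

With the scale passed, only unscaled values beyond 99999 (that is, above 99.999) are reported as overflowing for Decimal(5, 3).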