Java: Fix issue where the genericreader does not propagate case sensitivity #8177
Fixes issue #8178
I think it would be better to add a unit test for this.
nastra
left a comment
Thanks for fixing this @mderoy. The changes LGTM, but could you please add some unit tests that make sure this gets the expected results for ORC and Parquet?
It looks like there are no tests around GenericReader at the moment (unless you can point me to some). The closest thing I can find is ./data/src/test/java/org/apache/iceberg/data/orc/TestGenericData.java, which uses the GenericOrcReader. I'll look into adding a new file for testing the GenericReader when I have some time :)
I think a good place to add some tests would be
@nastra I added a new case-insensitive test that covers both projection and filtering, but let me know if you'd like me to just put a filter scenario in those tests instead :)
There are formatting failures, and after some thought, I think I do prefer moving the case-insensitive scenarios into the existing test cases for filtering and projection. I'll fix those now.
… settings add case insensitive unittest
e8e5aef to adafd72
Ready for re-review :)
nastra
left a comment
LGTM, thanks @mderoy
Problem:
Our application is using the following to read from the table
However, when I apply a case-insensitive filter like
I get the error below, despite setting the scan to caseInsensitive
Solution:
The GenericReader initializes a ParquetReader but does not propagate the caseSensitivity argument down to that reader. The same is true for the ORC reader (and for Avro, although I do not see that the Avro reader has such an argument). I've tested the fix with a Parquet file using our application. I don't have a good way to test with ORC.
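To make the gap concrete, here is a minimal, self-contained sketch of the pattern. It does not use the Iceberg classes: `ScanReader`, `FormatReader`, `openWithoutPropagation`, `openWithPropagation`, and the error message are all hypothetical stand-ins, chosen only to show what happens when a scan-level case-sensitivity flag is dropped before the format-level reader is built.

```java
import java.util.List;

public class CaseSensitivitySketch {

  // Hypothetical stand-in for a format-level reader (e.g. Parquet or ORC)
  // that has to resolve the column names used by a filter.
  static final class FormatReader {
    private final List<String> columns;
    private final boolean caseSensitive;

    FormatReader(List<String> columns, boolean caseSensitive) {
      this.columns = columns;
      this.caseSensitive = caseSensitive;
    }

    // Returns the position of the named column, honoring the flag.
    int resolve(String name) {
      for (int i = 0; i < columns.size(); i++) {
        String column = columns.get(i);
        boolean matches = caseSensitive ? column.equals(name) : column.equalsIgnoreCase(name);
        if (matches) {
          return i;
        }
      }
      throw new IllegalArgumentException("No column named " + name);
    }
  }

  // Hypothetical stand-in for the scan-level reader that owns the user's setting.
  static final class ScanReader {
    private final boolean caseSensitive;

    ScanReader(boolean caseSensitive) {
      this.caseSensitive = caseSensitive;
    }

    // Models the reported behavior: the flag is not passed down,
    // so the format reader falls back to case-sensitive matching.
    FormatReader openWithoutPropagation(List<String> columns) {
      return new FormatReader(columns, true);
    }

    // Models the fix: the scan's setting is handed to the format reader.
    FormatReader openWithPropagation(List<String> columns) {
      return new FormatReader(columns, caseSensitive);
    }
  }

  public static void main(String[] args) {
    List<String> columns = List.of("id", "data");
    ScanReader scan = new ScanReader(false); // the user asked for a case-insensitive scan

    System.out.println(scan.openWithPropagation(columns).resolve("DATA")); // prints 1

    try {
      scan.openWithoutPropagation(columns).resolve("DATA");
    } catch (IllegalArgumentException e) {
      System.out.println("lookup failed: " + e.getMessage()); // prints "lookup failed: No column named DATA"
    }
  }
}
```

In this model the scan is configured as case-insensitive in both calls; the only difference is whether that setting reaches the reader that actually resolves the column name.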
Testing:
Let me know what else is needed to land this. I'm assuming that the CI will run all the standard tests. Maybe I should add a test for this somewhere?