[SPARK-6647][SQL] Make trait StringComparison as BinaryPredicate and fix unit tests of string data source Filter #5309
Conversation
…redicate can't translate to data source Filter.
Test build #29540 has finished for PR 5309 at commit
Test build #29543 has finished for PR 5309 at commit
I don't think it's right to throw an exception here, in case people implement their own expressions by inheriting from Predicate.
I think I agree. It is reasonable to have predicates that can't be pushed down for various reasons. Another issue is that this check would often surface as a runtime error rather than a static compile-time check.
OK. I am also not very sure about this modification, but there seems to be no other proper way to check for possible errors here.
I will revert this part first.
I don't feel strongly about mixing in the type … For the unit test part of the change, can we just keep adding more unit tests instead of replacing the existing ones? Particularly for those …
I think this dataType is technically redundant now.
I don't feel too strongly about …
@chenghao-intel as the comment of …
Because it compares two strings and returns a boolean value, it is more like a … For the unit tests, the original unit tests are wrong. E.g., the test … The original test data for column …
Test build #29678 has finished for PR 5309 at commit
Thanks! Merged to master.
Currently the trait `StringComparison` is a `BinaryExpression`. In fact, it should be a `BinaryPredicate`.

By making `StringComparison` a `BinaryPredicate`, we can throw an error when an `expressions.Predicate` can't be translated to a data source `Filter` in the function `selectFilters`.

Without this modification, because we wrap a `Filter` around the scanned results in `pruneFilterProjectRaw`, we can't detect when something goes wrong in translating predicates to filters in `selectFilters`.

The unit test of #5285 demonstrates this problem. In that PR, even though `expressions.Contains` is not properly translated to `sources.StringContains`, the filtering is still performed by the `Filter`, so the test passes.

Of course, with this modification, every `expressions.Predicate` class needs a corresponding data source `Filter`.

There is a small bug in `FilteredScanSuite` in the test for the `StringEndsWith` filter. This PR also fixes it.
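
To make the masking problem in the description concrete, here is a minimal, self-contained sketch. It does not use Spark's real classes: `Expr`, `SourceFilter`, `translate`, `scan`, and `query` are hypothetical stand-ins for `expressions.Predicate`, the data source `Filter`, the translation done in `selectFilters`, a filtered scan, and the `Filter` wrapped around the scanned results. The sketch assumes a single string column and leaves `Contains` deliberately untranslated.

```scala
// Hypothetical stand-ins for illustration only; none of these are Spark's classes.
object PushdownSketch {
  // Catalyst-side predicates over a single string column (simplified).
  sealed trait Expr { def eval(value: String): Boolean }
  case class StartsWith(prefix: String) extends Expr {
    def eval(value: String): Boolean = value.startsWith(prefix)
  }
  case class Contains(sub: String) extends Expr {
    def eval(value: String): Boolean = value.contains(sub)
  }

  // Data-source-side filters that a relation knows how to apply.
  sealed trait SourceFilter { def matches(value: String): Boolean }
  case class StringStartsWith(prefix: String) extends SourceFilter {
    def matches(value: String): Boolean = value.startsWith(prefix)
  }
  // Defined but never produced by translate below: this is the deliberate gap.
  case class StringContains(sub: String) extends SourceFilter {
    def matches(value: String): Boolean = value.contains(sub)
  }

  // Translation with a missing case: Contains falls through to None.
  def translate(e: Expr): Option[SourceFilter] = e match {
    case StartsWith(p) => Some(StringStartsWith(p))
    case _             => None
  }

  // The data source applies only the filters that were pushed down.
  def scan(rows: Seq[String], pushed: Seq[SourceFilter]): Seq[String] =
    rows.filter(r => pushed.forall(_.matches(r)))

  // Push down what can be translated, then re-evaluate every predicate
  // on the scanned rows, like a Filter wrapped around the scan.
  def query(rows: Seq[String], predicates: Seq[Expr]): Seq[String] = {
    val pushed = predicates.flatMap(p => translate(p).toList)
    scan(rows, pushed).filter(r => predicates.forall(_.eval(r)))
  }

  def main(args: Array[String]): Unit = {
    val rows  = Seq("abc", "bcd", "cde")
    val preds = Seq(Contains("bc"))
    println(preds.flatMap(p => translate(p).toList)) // filters actually pushed down
    println(scan(rows, Nil))                         // what the source returns with no filters
    println(query(rows, preds))                      // final result after the outer filter
  }
}
```

In this sketch nothing is pushed down for `Contains("bc")`, so the scan returns all three rows, yet `query` still returns only the two matching rows. A test that checks only the final result would pass even though the translation is missing, which is the situation described for #5285.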