[SPARK-10195] [SQL] Data sources Filter should not expose internal types #8403
Conversation
Test build #41482 has finished for PR 8403 at commit

Test build #41486 has finished for PR 8403 at commit

Test build #41490 has finished for PR 8403 at commit
There may be some cases where users need the internal types in Filter to avoid conversions and speed up operations. I think we need to improve our data source API to make this more flexible.
This happens once per query, doesn't it? It'd make sense to specialize the input, but I don't think it's worth it for filter pushdowns.
This PR LGTM. @cloud-fan Same opinion as @rxin. Filter push-down itself isn't a critical path.
I've merged this. |
Spark SQL's data sources API exposes Catalyst's internal types through its Filter interfaces. This is a problem because types like UTF8String are not stable developer APIs and should not be exposed to third parties. This issue caused incompatibilities when upgrading our `spark-redshift` library to work against Spark 1.5.0. To avoid these issues in the future we should only expose public types through these Filter objects. This patch accomplishes this by using CatalystTypeConverters to add the appropriate conversions.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #8403 from JoshRosen/datasources-internal-vs-external-types.

(cherry picked from commit 7bc9a8c)

Signed-off-by: Reynold Xin <rxin@databricks.com>
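To illustrate the fix described above, here is a minimal, self-contained Scala sketch of the conversion idea. The names `UTF8String`, `EqualTo`, `toExternal`, and `makeEqualTo` are stand-ins for this example and are not the actual Spark APIs; the real patch uses Catalyst's `CatalystTypeConverters` to perform the internal-to-external conversion before values reach the public `Filter` objects.

```scala
// Stand-in for Catalyst's internal string representation (hypothetical,
// not Spark's real org.apache.spark.unsafe.types.UTF8String).
final case class UTF8String(bytes: Array[Byte]) {
  override def toString: String = new String(bytes, "UTF-8")
}

// A public data source filter: its `value` should only ever hold
// external types (String, Int, ...), never Catalyst-internal ones.
final case class EqualTo(attribute: String, value: Any)

object FilterConversion {
  // Convert an internal value to its public counterpart before it
  // escapes into a Filter. This is the role CatalystTypeConverters
  // plays in the actual patch.
  def toExternal(value: Any): Any = value match {
    case s: UTF8String => s.toString // internal -> java.lang.String
    case other         => other      // external types pass through
  }

  // Build a filter from an internal value, converting it first so the
  // third-party data source never sees the internal type.
  def makeEqualTo(attribute: String, internalValue: Any): EqualTo =
    EqualTo(attribute, toExternal(internalValue))
}
```

With this shape, `FilterConversion.makeEqualTo("name", UTF8String("bob".getBytes("UTF-8")))` produces a filter whose `value` is the plain `String` `"bob"`, so a connector such as `spark-redshift` only ever pattern-matches on stable public types.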