Properly support Spark Connect filter pushdown #1186

Merged
digadeesh merged 1 commit into trunk from phillip/240423-spark-push-down
Apr 24, 2024

Conversation

phillipleblanc
Contributor

Fixes a bug in the Databricks Data Connector where filters were not being pushed down properly. The connector effectively ran SELECT * FROM big_table and then filtered the results in memory, which made queries inefficient and caused some timeouts.

I attempted to use the built-in DataFusion unparser but ran into an issue with column quoting, so I raised a PR to fix it upstream: apache/datafusion#10198

Once that PR lands and a new version is released, we can start removing our own expr::to_sql method.
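
For readers unfamiliar with filter pushdown, here is a minimal, self-contained sketch of the general idea: translate each filter expression into a SQL predicate and send it to the remote source in a WHERE clause, rather than fetching every row and filtering locally. Everything in it is hypothetical and simplified. The `Expr` and `Op` types, `quote_ident`, `to_sql`, and `build_query` are illustrative names, not the actual DataFusion types or the `expr::to_sql` method in this repository. Backtick quoting is assumed because Spark SQL quotes identifiers that way.

```rust
// Hypothetical, simplified stand-in for a query engine's filter expression.
// This is NOT DataFusion's `Expr`; it only illustrates the idea.
#[allow(dead_code)]
#[derive(Debug, Clone)]
enum Expr {
    Column(String),
    Int(i64),
    Binary { left: Box<Expr>, op: Op, right: Box<Expr> },
    // Anything this sketch cannot translate to SQL.
    Unsupported,
}

#[allow(dead_code)]
#[derive(Debug, Clone, Copy)]
enum Op {
    Eq,
    Gt,
    Lt,
    And,
}

impl Op {
    fn as_sql(self) -> &'static str {
        match self {
            Op::Eq => "=",
            Op::Gt => ">",
            Op::Lt => "<",
            Op::And => "AND",
        }
    }
}

// Quote an identifier with backticks (assumed Spark SQL style),
// doubling any embedded backtick.
fn quote_ident(name: &str) -> String {
    format!("`{}`", name.replace('`', "``"))
}

// Translate an expression to SQL, or return None if any part of it
// cannot be translated (such a filter would stay in memory).
fn to_sql(expr: &Expr) -> Option<String> {
    match expr {
        Expr::Column(name) => Some(quote_ident(name)),
        Expr::Int(value) => Some(value.to_string()),
        Expr::Binary { left, op, right } => {
            let l = to_sql(left)?;
            let r = to_sql(right)?;
            Some(format!("({} {} {})", l, op.as_sql(), r))
        }
        Expr::Unsupported => None,
    }
}

// Build the query sent to the remote source. Translatable filters go
// into the WHERE clause; untranslatable ones are skipped here.
fn build_query(table: &str, filters: &[Expr]) -> String {
    let predicates: Vec<String> = filters.iter().filter_map(to_sql).collect();
    let mut sql = format!("SELECT * FROM {}", quote_ident(table));
    if !predicates.is_empty() {
        sql.push_str(" WHERE ");
        sql.push_str(&predicates.join(" AND "));
    }
    sql
}
```

With a single filter comparing column `id` to 10 using `Op::Gt`, `build_query("big_table", ...)` in this sketch yields a query with a WHERE clause. With no translatable filters it falls back to a plain `SELECT *`, which is the behavior the bug effectively produced in every case.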

@phillipleblanc phillipleblanc added the kind/bug Something isn't working label Apr 23, 2024
@phillipleblanc phillipleblanc added this to the v0.12-alpha milestone Apr 23, 2024
@phillipleblanc phillipleblanc self-assigned this Apr 23, 2024
@phillipleblanc phillipleblanc requested a review from a team as a code owner April 23, 2024 14:57
@ablyler

ablyler commented Apr 23, 2024

I've confirmed that this fixes the issue I've seen. Thanks for the quick fix!

@digadeesh digadeesh merged commit 3792065 into trunk Apr 24, 2024
16 checks passed
@digadeesh digadeesh deleted the phillip/240423-spark-push-down branch April 24, 2024 00:20
@digadeesh digadeesh modified the milestones: v0.12-alpha, v0.11.2-alpha Apr 24, 2024