
[SPARK-12409][SPARK-12387][SPARK-12391][SQL] Support AND/OR/IN/LIKE push-down filters for JDBC #10468

Closed
maropu wants to merge 3 commits from the SupportMorePushdownInJdbc branch

Conversation

maropu (Member) commented Dec 24, 2015

This is a rework of #10386, adding more tests and LIKE push-down support.

@@ -186,13 +187,19 @@ private[sql] object JDBCRDD extends Logging {
*/
private def compileFilter(f: Filter): String = f match {
case EqualTo(attr, value) => s"$attr = ${compileValue(value)}"
case Not(EqualTo(attr, value)) => s"$attr != ${compileValue(value)}"
case Not(f) => s"NOT (${compileFilter(f)})"
Contributor commented:
for clarity, it might be better to wrap it in parentheses

Member Author commented:
Sorry, I'm not exactly sure what you're pointing out.
You mean case Not(f) => s"(NOT (${compileFilter(f)}))"?

Contributor commented:
Yes

Member Author commented:
Fixed
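For reference, here is a minimal, standalone sketch of how the filters added in this PR can be compiled into a WHERE-clause fragment, with the NOT case parenthesized as agreed above. It is illustrative only and not the PR's exact code: it uses the public org.apache.spark.sql.sources filter classes, and value quoting/escaping is simplified.

import org.apache.spark.sql.sources._

object FilterCompilationSketch {
  // Naive value quoting, for illustration only (no escaping of quotes inside strings).
  private def compileValue(value: Any): Any = value match {
    case stringValue: String => s"'$stringValue'"
    case other => other
  }

  // Turns a data source Filter into a SQL predicate string, or null if unsupported.
  def compileFilter(f: Filter): String = f match {
    case EqualTo(attr, value) => s"$attr = ${compileValue(value)}"
    case In(attr, values) => s"$attr IN (${values.map(compileValue).mkString(", ")})"
    case StringStartsWith(attr, prefix) => s"$attr LIKE '$prefix%'"
    case Not(child) => s"(NOT (${compileFilter(child)}))"  // parenthesized, per the review above
    case Or(left, right) => s"(${compileFilter(left)}) OR (${compileFilter(right)})"
    case And(left, right) => s"(${compileFilter(left)}) AND (${compileFilter(right)})"
    case _ => null
  }
}

// Example:
//   compileFilter(Not(And(EqualTo("THEID", 2), EqualTo("NAME", "mary"))))
//   returns "(NOT ((THEID = 2) AND (NAME = 'mary')))"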

SparkQA commented Dec 24, 2015

Test build #48297 has finished for PR 10468 at commit 7f0a2e6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -186,8 +187,26 @@ class JDBCSuite extends SparkFunSuite
assert(stripSparkFilter(sql("SELECT * FROM foobar WHERE NAME = 'fred'")).collect().size == 1)
assert(stripSparkFilter(sql("SELECT * FROM foobar WHERE NAME > 'fred'")).collect().size == 2)
assert(stripSparkFilter(sql("SELECT * FROM foobar WHERE NAME != 'fred'")).collect().size == 2)
assert(stripSparkFilter(sql("SELECT * FROM foobar WHERE NAME IN ('mary', 'fred')"))
.collect().size == 2)
Member commented:
Should we compare using === instead of == here and elsewhere?

Member Author commented:
I have no strong opinion on this... is it better to fix them? ISTM collection types (e.g., sets) need === comparisons.

Contributor commented:
Does the newer version of the scalatest library we use already use a macro, so that == and === behave the same? Can you confirm? Anyway, it's not that big of a deal here.

Member commented:
Ah yes, the Assertions trait now provides an assert macro, and that trait mixes in the TripleEquals trait, so we don't need to change == to ===.
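As a small illustration of that point (a sketch only, assuming a ScalaTest 2.x-style FunSuite like the ones used in Spark's tests): with the assert macro, a failing plain == check already reports the evaluated operands, so === is not needed just to get informative failure messages.

import org.scalatest.FunSuite

// Hypothetical suite, for illustration only.
class AssertMacroExampleSuite extends FunSuite {
  test("assert reports both sides of == on failure") {
    val names = Seq("fred", "mary")
    // If this assertion fails, the macro-based assert prints the evaluated values
    // (e.g. "2 did not equal 3") even though plain == is used instead of ===.
    assert(names.size == 2)
  }
}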

rxin (Contributor) commented Dec 30, 2015

LGTM - maybe we can merge this one once tests pass and then have @viirya rebase his new PR based on this?

SparkQA commented Dec 30, 2015

Test build #48447 has finished for PR 10468 at commit e8dcf0e.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

rxin (Contributor) commented Dec 30, 2015

Looks like you also need to update the test case.

viirya (Member) commented Dec 30, 2015

@rxin Good for me.

maropu (Member, Author) commented Dec 30, 2015

Okay, I'll fix it in a minute.

maropu (Member, Author) commented Dec 30, 2015

@rxin Okay, we'll wait for the tests to finish.

SparkQA commented Dec 30, 2015

Test build #48473 has finished for PR 10468 at commit 6e585d6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

rxin (Contributor) commented Dec 30, 2015

I've merged this. Thanks.

asfgit closed this in 5c2682b on Dec 30, 2015
asfgit pushed a commit that referenced this pull request Dec 31, 2015
asfgit pushed a commit that referenced this pull request Jan 1, 2016
… for JDBCRDD and add few filters

This patch refactors the filter pushdown for JDBCRDD and also adds a few filters.

Added filters are basically from #10468 with some refactoring. Test cases are from #10468.

Author: Liang-Chi Hsieh <viirya@gmail.com>

Closes #10470 from viirya/refactor-jdbc-filter.
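As a rough sketch of the shape that refactor aims at (illustrative only, not the actual code of #10470): having compileFilter return Option[String] lets callers skip filters that cannot be compiled, and lets AND/OR require that both sides compile instead of splicing a bogus null fragment into the generated SQL.

import org.apache.spark.sql.sources._

object RefactoredCompileSketch {
  private def compileValue(value: Any): Any = value match {
    case stringValue: String => s"'$stringValue'"   // simplified quoting
    case other => other
  }

  // None means "cannot push this filter down"; callers keep such filters in Spark.
  def compileFilter(f: Filter): Option[String] = Option(f match {
    case EqualTo(attr, value) => s"$attr = ${compileValue(value)}"
    case Not(child) => compileFilter(child).map(c => s"(NOT ($c))").orNull
    case And(left, right) =>
      // Emit AND only if BOTH sides compile; pushing just one side would be unsound
      // once the conjunction sits under NOT or OR (the SPARK-12218 concern).
      val parts = Seq(left, right).flatMap(compileFilter)
      if (parts.length == 2) parts.map(p => s"($p)").mkString(" AND ") else null
    case Or(left, right) =>
      val parts = Seq(left, right).flatMap(compileFilter)
      if (parts.length == 2) parts.map(p => s"($p)").mkString(" OR ") else null
    case _ => null
  })
}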
zzcclp added a commit to zzcclp/spark that referenced this pull request Jul 27, 2016
maropu deleted the SupportMorePushdownInJdbc branch on July 5, 2017 at 11:48

// This is a test to reflect discussion in SPARK-12218.
// Older versions of Spark have this kind of bug in the Parquet data source.
val df1 = sql("SELECT * FROM foobar WHERE NOT (THEID != 2 AND NAME != 'mary')")
Member commented:
The two sub-conditions are both ok to be pushed down. So this doesn't actually test against the nested AND issue in SPARK-12218. See #19776

Btw, the two sub-conditions filter out the same rows, so this doesn't reflect the issue either.

Member Author commented:
Yea, I think it's ok to drop this test.
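For context, a hypothetical query shape (not from this PR; the real regression test came later, see #19776) that would actually exercise the nested-AND concern: one sub-condition has to be something the JDBC source cannot compile, and the two sub-conditions should not filter out the same rows.

// Hypothetical sketch only. LENGTH(NAME) stands in for any predicate that cannot be
// translated into a data source Filter, so at most part of the nested AND could ever
// be pushed to JDBC. An unsound pushdown that splits the AND and pushes only
// "THEID != 2" under the NOT would effectively evaluate NOT (THEID != 2) at the
// source and silently drop rows that the full predicate should return.
val df = sql("SELECT * FROM foobar WHERE NOT (THEID != 2 AND LENGTH(NAME) != 4)")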
