
[SPARK-15916][SQL] JDBC filter push down should respect operator precedence #13743

Closed

Conversation

clockfly
Contributor

@clockfly clockfly commented Jun 17, 2016

What changes were proposed in this pull request?

This PR fixes a problem where operator precedence is lost when pushing a where-clause expression down to the JDBC layer.

Case 1:

For the SQL `select * from table where (a or b) and c`, the where-clause is wrongly converted to the JDBC where-clause `a or (b and c)` after filter push down. As a consequence, JDBC may return fewer or more rows than expected.
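The fix amounts to parenthesizing each compiled sub-expression so the generated SQL keeps the original precedence. A minimal sketch (a toy filter compiler written for illustration, not Spark's actual `JDBCRDD` code):

```scala
// Toy model of pushed-down filters (illustrative only, not Spark's
// org.apache.spark.sql.sources.Filter hierarchy).
sealed trait Filter
case class Col(name: String) extends Filter
case class And(l: Filter, r: Filter) extends Filter
case class Or(l: Filter, r: Filter) extends Filter

// Wrapping both sides of AND/OR in parentheses preserves precedence
// regardless of how the tree nests.
def compile(f: Filter): String = f match {
  case Col(n)    => n
  case And(l, r) => s"(${compile(l)}) AND (${compile(r)})"
  case Or(l, r)  => s"(${compile(l)}) OR (${compile(r)})"
}

// (a OR b) AND c keeps its grouping:
println(compile(And(Or(Col("a"), Col("b")), Col("c"))))
// ((a) OR (b)) AND (c)
```

Emitting the clause without the parentheses would yield `a OR b AND c`, which SQL parses as `a OR (b AND c)`, the bug described above.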

Case 2:

For the SQL `select * from table where always_false_condition`, the result table may not be empty if the JDBC RDD is partitioned using per-partition where-clauses:

spark.read.jdbc(url, table, predicates = Array("partition 1 where clause", "partition 2 where clause", ...))
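A hypothetical sketch of why parentheses matter here as well (the helper name and predicates are illustrative, not Spark's actual internals). The partition predicate and the pushed-down filter are ANDed together into one JDBC where-clause; without parentheses, an always-false filter may fail to empty the partition:

```scala
// Hypothetical helper: combine a partition predicate with a pushed filter.
// Parenthesizing both sides guarantees the AND applies to the whole
// partition predicate. Without parentheses,
//   "id < 10 OR id > 90 AND 1 = 0"
// parses as  id < 10 OR (id > 90 AND 1 = 0),
// which still matches rows with id < 10.
def whereClause(partitionPred: String, pushedFilter: String): String =
  s"WHERE ($partitionPred) AND ($pushedFilter)"

println(whereClause("id < 10 OR id > 90", "1 = 0"))
// WHERE (id < 10 OR id > 90) AND (1 = 0)
```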

How was this patch tested?

Unit test.

This PR also closes #13640.

@SparkQA

SparkQA commented Jun 17, 2016

Test build #60729 has finished for PR 13743 at commit 2f1ada3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@liancheng
Contributor

LGTM

@liancheng
Contributor

Merging to master and branch-2.0.

@asfgit asfgit closed this in ebb9a3b Jun 18, 2016
asfgit pushed a commit that referenced this pull request Jun 18, 2016

Author: hyukjinkwon <gurwls223@gmail.com>
Author: Sean Zhong <seanzhong@databricks.com>

Closes #13743 from clockfly/SPARK-15916.

(cherry picked from commit ebb9a3b)
Signed-off-by: Cheng Lian <lian@databricks.com>
zzcclp added a commit to zzcclp/spark that referenced this pull request Jul 27, 2016