New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-36706][SQL][3.1] OverwriteByExpression conversion in DataSourceV2Strategy use wrong param in translateFilter #33997
Conversation
…rategy use wrong deleteExpr translation
@@ -220,6 +220,33 @@ class DataFrameWriterV2Suite extends QueryTest with SharedSparkSession with Befo | |||
Seq(Row(1L, "a"), Row(2L, "b"), Row(4L, "d"), Row(5L, "e"), Row(6L, "f"))) | |||
} | |||
|
|||
test("Overwrite: overwrite by expression: more than one filters") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add JIRA number?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added Jira number and also updated PR description
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fix looks correct.
Kubernetes integration test starting |
Kubernetes integration test status failure |
Test build #143280 has finished for PR 33997 at commit
|
Kubernetes integration test starting |
Kubernetes integration test status failure |
Test build #143284 has finished for PR 33997 at commit
|
@huaxingao Could you also mention the reason why we only need fix it for 3.1 too in the description? Thanks. |
Kubernetes integration test starting |
Kubernetes integration test status failure |
Test build #143287 has finished for PR 33997 at commit
|
@@ -220,6 +220,33 @@ class DataFrameWriterV2Suite extends QueryTest with SharedSparkSession with Befo | |||
Seq(Row(1L, "a"), Row(2L, "b"), Row(4L, "d"), Row(5L, "e"), Row(6L, "f"))) | |||
} | |||
|
|||
test("SPARK-36706 Overwrite: overwrite by expression: more than one filters") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we test that the filters are changed? it seems we only check query answers but I think they should pass no matter with or without the fix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the test would pass without the fix. but couldn't figure out an easy way to get the OverwriteByExpressionExec
node and check the filters on it.
Hi, @huaxingao . |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, LGTM. Thank you, @huaxingao , @viirya , @sunchao .
Merged to branch-3.1.
…eV2Strategy use wrong param in translateFilter ### What changes were proposed in this pull request? The wrong parameter is used in `translateFilter` in the following code ``` val filters = splitConjunctivePredicates(deleteExpr).map { filter => DataSourceStrategy.translateFilter(deleteExpr, supportNestedPredicatePushdown = true).getOrElse( throw new AnalysisException(s"Cannot translate expression to source filter: $filter")) }.toArray ``` Using this as an example ``` spark.table("source2_t").writeTo("testcat.table_name").overwrite($"id1" === 3 && $"id2" === 3) ``` The above code will generate these filters: ``` And(EqualTo(id1, 3),EqualTo(id2, 3)) And(EqualTo(id1, 3),EqualTo(id2, 3)) ``` we want to fix the code so it will generate the filters like these: ``` EqualTo(id1, 3) EqualTo(id2, 3) ``` This problem only exists in 3.1. In 3.2 and 3.3, we have ``` val filters = splitConjunctivePredicates(deleteExpr).flatMap { pred => val filter = DataSourceStrategy.translateFilter(pred, supportNestedPredicatePushdown = true) if (filter.isEmpty) { throw QueryCompilationErrors.cannotTranslateExpressionToSourceFilterError(pred) } filter }.toArray ``` ### Why are the changes needed? fix a bug in the code ### Does this PR introduce _any_ user-facing change? no ### How was this patch tested? existing tests Closes #33997 from huaxingao/spark-36706. Authored-by: Huaxin Gao <huaxin_gao@apple.com> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
Thanks a lot! @dongjoon-hyun @sunchao @viirya |
Kubernetes integration test starting |
Kubernetes integration test status failure |
…eV2Strategy use wrong param in translateFilter ### What changes were proposed in this pull request? The wrong parameter is used in `translateFilter` in the following code ``` val filters = splitConjunctivePredicates(deleteExpr).map { filter => DataSourceStrategy.translateFilter(deleteExpr, supportNestedPredicatePushdown = true).getOrElse( throw new AnalysisException(s"Cannot translate expression to source filter: $filter")) }.toArray ``` Using this as an example ``` spark.table("source2_t").writeTo("testcat.table_name").overwrite($"id1" === 3 && $"id2" === 3) ``` The above code will generate these filters: ``` And(EqualTo(id1, 3),EqualTo(id2, 3)) And(EqualTo(id1, 3),EqualTo(id2, 3)) ``` we want to fix the code so it will generate the filters like these: ``` EqualTo(id1, 3) EqualTo(id2, 3) ``` This problem only exists in 3.1. In 3.2 and 3.3, we have ``` val filters = splitConjunctivePredicates(deleteExpr).flatMap { pred => val filter = DataSourceStrategy.translateFilter(pred, supportNestedPredicatePushdown = true) if (filter.isEmpty) { throw QueryCompilationErrors.cannotTranslateExpressionToSourceFilterError(pred) } filter }.toArray ``` ### Why are the changes needed? fix a bug in the code ### Does this PR introduce _any_ user-facing change? no ### How was this patch tested? existing tests Closes apache#33997 from huaxingao/spark-36706. Authored-by: Huaxin Gao <huaxin_gao@apple.com> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
What changes were proposed in this pull request?
The wrong parameter is used in
translateFilter
in the following codeUsing this as an example
The above code will generate these filters:
we want to fix the code so it will generate the filters like these:
This problem only exists in 3.1. In 3.2 and 3.3, we have
Why are the changes needed?
fix a bug in the code
Does this PR introduce any user-facing change?
no
How was this patch tested?
existing tests