New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FLINK-12936][table-planner-blink] Support intersect all / minus all to blink planner #8898
Conversation
Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community Review Progress
Please see the Pull Request Review Guide for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commandsThe @flinkbot bot supports the following commands:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for this PR @JingsongLi . please add rule test for RewriteMinusAll
and RewriteIntersectAll
, just as ReplaceIntersectWithSemiJoinRuleTest
and ReplaceMinusWithAntiJoinRule
do.
...anner-blink/src/test/scala/org/apache/flink/table/runtime/batch/sql/SetOperatorsITCase.scala
Show resolved
Hide resolved
.../src/main/scala/org/apache/flink/table/plan/rules/logical/ReplaceSetOpWithJoinRuleBase.scala
Outdated
Show resolved
Hide resolved
.../src/main/scala/org/apache/flink/table/plan/rules/logical/ReplaceSetOpWithJoinRuleBase.scala
Outdated
Show resolved
Hide resolved
.../src/main/scala/org/apache/flink/table/plan/rules/logical/ReplaceSetOpWithJoinRuleBase.scala
Outdated
Show resolved
Hide resolved
.../src/main/scala/org/apache/flink/table/plan/rules/logical/ReplaceSetOpWithJoinRuleBase.scala
Outdated
Show resolved
Hide resolved
...ner-blink/src/main/scala/org/apache/flink/table/plan/rules/logical/RewriteIntersectAll.scala
Outdated
Show resolved
Hide resolved
...planner-blink/src/main/scala/org/apache/flink/table/plan/rules/logical/RewriteMinusAll.scala
Outdated
Show resolved
Hide resolved
...planner-blink/src/main/scala/org/apache/flink/table/plan/rules/logical/RewriteMinusAll.scala
Outdated
Show resolved
Hide resolved
...ner-blink/src/main/scala/org/apache/flink/table/plan/rules/logical/RewriteIntersectAll.scala
Outdated
Show resolved
Hide resolved
]]> | ||
</Resource> | ||
</TestCase> | ||
<TestCase name="testIntersectAll"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this case should be in second commit ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I should just squash second and third to a one commit.
update commit message as |
OK, and I will change the names to |
625cba0
to
b186176
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i left some minor comments
"ReplaceIntersectWithSemiJoinRule") { | ||
|
||
override def matches(call: RelOptRuleCall): Boolean = { | ||
val intersect: Intersect = call.rel(0) | ||
// not support intersect all now. | ||
intersect.isDistinct | ||
intersect.isDistinct && intersect.getInputs.size() == 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
intersect.isDistinct => !intersect.all
and add a TODO remove "intersect.getInputs.size() == 2" limit
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
intersect.isDistinct => !intersect.all
Why?
and add a TODO remove "intersect.getInputs.size() == 2" limit ?
I think this rule should never deal with size bigger than 2, we should introduce a new rule to split it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, Use all SetOp.all
and I will add comment to explain these rules just handle the case of input size 2
"ReplaceMinusWithAntiJoinRule") { | ||
|
||
override def matches(call: RelOptRuleCall): Boolean = { | ||
val minus: Minus = call.rel(0) | ||
// not support minus all now. | ||
minus.isDistinct | ||
minus.isDistinct && minus.getInputs.size() == 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minus.isDistinct => !minus.all
add a TODO
|
||
override def matches(call: RelOptRuleCall): Boolean = { | ||
val intersect: Intersect = call.rel(0) | ||
intersect.all && intersect.getInputs.size() == 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a TODO
val leftWithAddedVirtualCols = leftBuilder | ||
.push(left) | ||
.project(leftBuilder.fields(fields) ++ | ||
Seq(leftBuilder.alias(leftBuilder.cast(leftBuilder.literal(1L), BIGINT), "vcol"))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please give a more meaningful name instead of vcol
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
name it vcol_marker
, there is no appropriate name to describe it.
LGTM, +1 to merge |
…l to blink planner
Travis passed(blink planner) in https://travis-ci.org/JingsongLi/flink/builds/554510279 |
What is the purpose of the change
Now, we just support intersect and minus, See ReplaceIntersectWithSemiJoinRule and ReplaceMinusWithAntiJoinRule, replace intersect with null aware semi-join and distinct aggregate.
We need support intersect all and minus all too.
Presto and Spark already support them:
prestodb/presto#4918
https://issues.apache.org/jira/browse/SPARK-21274
I think them have a good rewrite design and we can follow them:
1.For intersect all
Input Query
Rewritten Query
2.For minus all:
Input Query
Rewritten Query
Verifying this change
ut
Does this pull request potentially affect one of the following parts:
@Public(Evolving)
: noDocumentation