[SPARK-13237] [SQL] generated broadcast outer join #11130
|
Test build #50961 has finished for PR 11130 at commit
|
|
Test build #2527 has finished for PR 11130 at commit
|
|
For query: will generate: |
|
Test build #50982 has finished for PR 11130 at commit
|
|
|
// find the matches from HashedRelation
val matched = ctx.freshName("matched")
val valid = ctx.freshName("invalid")
this is confusing.
Perhaps rename it to passesFilters or something like that.
|
I still see initRange() being generated as an empty function. Do we still need this? |
|
@nongli I removed that manually |
|
Test build #2528 has finished for PR 11130 at commit
|
|
Test build #2531 has finished for PR 11130 at commit
|
|
@nongli Could you take another look? |
Conflicts:
sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegen.scala
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastHashJoin.scala
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastHashOuterJoin.scala
sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala
|
Test build #51194 has finished for PR 11130 at commit
|
|
cc @rxin |
| hashJoin(streamedIter, hashTable, numOutputRows) | ||
|
|
case LeftOuter =>
  streamedIter.flatMap(currentRow => {
nitpick
streamedIter.flatMap { currentRow =>
// ...
}
These lines were copied and pasted here; I usually don't modify such code, to keep the review burden down.
Sure. Thanks for being considerate. Can you just fix them while you are at it now?
Sure, I will do it when I update this PR.
| $outputCode
| }
""".stripMargin
|// generate join key
generate join key -> generate join key for stream side
|
Test build #51329 has finished for PR 11130 at commit
|
|
Test build #51333 has finished for PR 11130 at commit
|
|
ping @rxin |
This PR adds codegen support for broadcast outer join.
To reduce duplicated code, this PR merges HashJoin and HashOuterJoin (and likewise BroadcastHashJoin and BroadcastHashOuterJoin).
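
For readers unfamiliar with the operator, here is a minimal sketch of what a broadcast left outer hash join computes, written in the `flatMap { currentRow => ... }` style suggested in the review. It is not the code from this PR and not Spark's API: `OuterJoinSketch`, `Row`, `buildHashTable`, `leftOuterJoin`, `buildWidth`, and `condition` are hypothetical names chosen for illustration, and the sketch ignores SQL null-key semantics and code generation entirely.

```scala
// Hypothetical, simplified model of a broadcast left outer hash join.
// None of these names come from Spark; rows are plain sequences of values.
object OuterJoinSketch {
  type Row = Seq[Any]

  // Build side: group the small (broadcast) relation by its join key.
  def buildHashTable[K](buildRows: Seq[Row], key: Row => K): Map[K, Seq[Row]] =
    buildRows.groupBy(key)

  // Stream side: for each streamed row, emit one joined row per build-side
  // match that passes `condition`, or a single null-padded row if none does.
  // Simplification: a null key is treated like any other key here.
  def leftOuterJoin[K](
      streamedIter: Iterator[Row],
      hashTable: Map[K, Seq[Row]],
      streamKey: Row => K,
      buildWidth: Int,
      condition: (Row, Row) => Boolean): Iterator[Row] = {
    val nullRow: Row = Seq.fill(buildWidth)(null)
    streamedIter.flatMap { currentRow =>
      val matches = hashTable
        .getOrElse(streamKey(currentRow), Seq.empty)
        .filter(buildRow => condition(currentRow, buildRow))
      if (matches.isEmpty) Seq(currentRow ++ nullRow)
      else matches.map(buildRow => currentRow ++ buildRow)
    }
  }
}
```

An inner join would differ only in the `matches.isEmpty` branch (emit nothing instead of a null-padded row), which is one way to see why the inner and outer variants can share most of their code.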