Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GLUTEN-3432][VL] Add support for BroadcastNestedLoopJoinExec #4565

Merged
merged 19 commits into from
Feb 22, 2024

Conversation

Surbhi-Vijay
Copy link
Contributor

@Surbhi-Vijay Surbhi-Vijay commented Jan 29, 2024

What changes were proposed in this pull request?

Added support for BroadcastNestedLoopJoinTransformer

Supported join types => Inner, Cross, LeftOuter with BuildRight, RightOuter with BuildLeft

(Fixes: #3432)

How was this patch tested?

Tests in GlutenInnerJoin, GlutenOuterJoin and Various other tests already covers it.

Follow Up:

  1. Update documentation

Copy link

#3432

Copy link

Run Gluten Clickhouse CI

@Surbhi-Vijay Surbhi-Vijay changed the title [GLUTEN-3432][VL]Add support for BroadcastNestedLoopJoinTransformer [GLUTEN-3432][VL]Add support for BroadcastNestedLoopJoinExec Jan 29, 2024
Copy link

Run Gluten Clickhouse CI

Copy link
Member

@zhztheplayer zhztheplayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Surbhi-Vijay for working on this.

And broadcast's validation is now tricky. I'd suggest doing the work based on this ongoing patch.

@zhztheplayer zhztheplayer changed the title [GLUTEN-3432][VL]Add support for BroadcastNestedLoopJoinExec [GLUTEN-3432][VL] Add support for BroadcastNestedLoopJoinExec Jan 30, 2024
@Surbhi-Vijay
Copy link
Contributor Author

Surbhi-Vijay commented Jan 30, 2024

And broadcast's validation is now tricky. I'd suggest doing the work based on this ongoing patch.

Thanks @zhztheplayer for mentioning this. I will review this patch and change the current PR accordingly.

@zhztheplayer
Copy link
Member

@Surbhi-Vijay #4544 was merged. Do you have a plan to continue this work? Thanks.

@Surbhi-Vijay
Copy link
Contributor Author

@Surbhi-Vijay #4544 was merged. Do you have a plan to continue this work? Thanks.

Yes @zhztheplayer, I am working on it and will publish it soon in the current week.

Copy link

github-actions bot commented Feb 6, 2024

Run Gluten Clickhouse CI

@Surbhi-Vijay Surbhi-Vijay marked this pull request as ready for review February 6, 2024 15:30
*/
while (rowIterator.hasNext) {
val unsafeRow = rowIterator.next().asInstanceOf[UnsafeRow]
rowArray.append(unsafeRow.copy())
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zhztheplayer can you please review this change and suggest if there is any better way to resolve it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Array[InternalRow] requires all rows to be materialized so I think it's fine to use .copy() in the initial version of this feature. We can still find a way to optimize the code later.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And thank you for adding the detailed explanation in comment :)

Copy link

github-actions bot commented Feb 6, 2024

Run Gluten Clickhouse CI

2 similar comments
Copy link

github-actions bot commented Feb 6, 2024

Run Gluten Clickhouse CI

Copy link

github-actions bot commented Feb 6, 2024

Run Gluten Clickhouse CI

Copy link

github-actions bot commented Feb 7, 2024

Run Gluten Clickhouse CI

2 similar comments
Copy link

github-actions bot commented Feb 7, 2024

Run Gluten Clickhouse CI

Copy link

github-actions bot commented Feb 7, 2024

Run Gluten Clickhouse CI

@Surbhi-Vijay Surbhi-Vijay marked this pull request as draft February 8, 2024 13:48
@Surbhi-Vijay
Copy link
Contributor Author

I will fix the testcases and publish again.

Copy link

github-actions bot commented Feb 9, 2024

Run Gluten Clickhouse CI

1 similar comment
Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

2 similar comments
Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

@Surbhi-Vijay Surbhi-Vijay marked this pull request as ready for review February 14, 2024 05:10
Copy link

Run Gluten Clickhouse CI

Comment on lines +148 to +162
assertJoinBuildSide("SELECT /*+ MAPJOIN(t1, t2) */ * FROM t1 JOIN t2", blt, BuildLeft)
// FULL JOIN && t1Size < t2Size => BuildLeft
assertJoinBuildSide("SELECT /*+ MAPJOIN(t1, t2) */ * FROM t1 FULL JOIN t2", bl, BuildLeft)
// FULL OUTER && t1Size < t2Size => BuildLeft
assertJoinBuildSide("SELECT * FROM t1 FULL OUTER JOIN t2", bl, BuildLeft)
// LEFT JOIN => BuildRight
assertJoinBuildSide("SELECT /*+ MAPJOIN(t1, t2) */ * FROM t1 LEFT JOIN t2", bl, BuildRight)
assertJoinBuildSide("SELECT /*+ MAPJOIN(t1, t2) */ * FROM t1 LEFT JOIN t2", blt, BuildRight)
// RIGHT JOIN => BuildLeft
assertJoinBuildSide("SELECT /*+ MAPJOIN(t1, t2) */ * FROM t1 RIGHT JOIN t2", bl, BuildLeft)
assertJoinBuildSide("SELECT /*+ MAPJOIN(t1, t2) */ * FROM t1 RIGHT JOIN t2", blt, BuildLeft)

/* #### test with broadcast hint #### */
// INNER JOIN && broadcast(t1) => BuildLeft
assertJoinBuildSide("SELECT /*+ MAPJOIN(t1) */ * FROM t1 JOIN t2", bl, BuildLeft)
assertJoinBuildSide("SELECT /*+ MAPJOIN(t1) */ * FROM t1 JOIN t2", blt, BuildLeft)
// INNER JOIN && broadcast(t2) => BuildRight
assertJoinBuildSide("SELECT /*+ MAPJOIN(t2) */ * FROM t1 JOIN t2", bl, BuildRight)
assertJoinBuildSide("SELECT /*+ MAPJOIN(t2) */ * FROM t1 JOIN t2", blt, BuildRight)
Copy link
Member

@zhztheplayer zhztheplayer Feb 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do the statements check on physical plan only? Do we need test with result-checking to cover the added code?

I see we had enabled Gluten...JoinSuites but I am not sure if the test cases work for Gluten's BNLJ either. Would you like to help check that? Thanks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GlutenInnerJoinSuite and GlutenOuterJoinSuite are enabled in the Gluten. Both of these suites check the answer of test cases. In these test cases, explicit joins are selected and since now we have added BNLJTransformer, BNLJ is getting transformed into Gluten for inner, leftOuter with BuildRight and RightOuter with BuildLeft joins.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here in GlutenBroadcastJoinSuite, it is not checking the answer and only checking physical plan. This test case is a rewrite of spark test case in Gluten and I had to make changes in to get it passed in pipeline.

please let me know if I should add more test cases.

The few testcases I can think of is to add fallback test cases for other remaining join types i.e. FullOuter, LeftOuter with BuildLeft & RightOuter with BuildRight.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your explanation. If GlutenBroadcastJoinSuite mainly checks against plan then we don't have to add answer-checking code in this suite. I was just worried whether we actually have at least some code to verify the correctness of the patch's code. If GlutenInnerJoinSuite GlutenOuterJoinSuite do work for BNLJ transformer, then it's fine enough to me.

The few testcases I can think of is to add fallback test cases for other remaining join types i.e. FullOuter, LeftOuter with BuildLeft & RightOuter with BuildRight.

That's great idea. You can decide whether to put code in this PR or in a new one. Thanks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zhztheplayer I will create a follow up PR covering below:

  • Updating documentation
  • Adding testcases for fallback scenarios
  • Renaming broadcast API methods to generic names for readability.

val broadcastRDD = {
val executionId = sparkContext.getLocalProperty(SQLExecution.EXECUTION_ID_KEY)
BackendsApiManager.getBroadcastApiInstance
.collectExecutionBroadcastHashTableId(executionId, buildTableId)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zhztheplayer can I rename methods under BroadcastApi to make it more generic? Since now this is being used for both the modes i.e. hashedBroadcastMode & IdentityBroadcastMode. These method names containing Hashtable might confuse other developers who may read this file later.

Rename collectExecutionBroadcastHashTableId => collectExecutionBroadcastTableId
Rename cleanExecutionBroadcastHashtable=> cleanExecutionBroadcastTable

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename collectExecutionBroadcastHashTableId => collectExecutionBroadcastTableId
Rename cleanExecutionBroadcastHashtable=> cleanExecutionBroadcastTable

Sounds fair to me. Thanks.

Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

@Surbhi-Vijay
Copy link
Contributor Author

@zhztheplayer @zhouyuan I have rebased this PR again, can you please review and provide feedback if there are any?

I will create follow up PR for remaining small task items like documentation, renaming broadcast api common methods for clarity etc.

Copy link
Member

@zhztheplayer zhztheplayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Surbhi-Vijay
Copy link
Contributor Author

@zhouyuan @zhztheplayer Can we please merge this PR?

@zhli1142015 zhli1142015 merged commit a1f83cd into apache:main Feb 22, 2024
19 checks passed
@GlutenPerfBot
Copy link
Contributor

===== Performance report for TPCH SF2000 with Velox backend, for reference only ====

query log/native_4565_time.csv log/native_master_02_22_2024_996ff4c89_time.csv difference percentage
q1 31.95 32.51 0.564 101.77%
q2 26.39 24.28 -2.118 91.97%
q3 37.71 38.12 0.413 101.10%
q4 39.04 38.44 -0.606 98.45%
q5 71.21 71.06 -0.153 99.79%
q6 5.50 5.36 -0.145 97.37%
q7 83.91 85.47 1.557 101.86%
q8 87.45 88.36 0.913 101.04%
q9 124.37 123.63 -0.741 99.40%
q10 43.09 44.24 1.156 102.68%
q11 20.63 21.09 0.465 102.26%
q12 27.53 26.06 -1.467 94.67%
q13 45.27 46.65 1.379 103.05%
q14 20.99 15.31 -5.688 72.90%
q15 29.96 31.22 1.259 104.20%
q16 14.51 13.80 -0.706 95.14%
q17 99.79 104.44 4.643 104.65%
q18 147.94 149.48 1.549 101.05%
q19 12.64 13.72 1.085 108.59%
q20 27.85 28.41 0.552 101.98%
q21 222.96 226.51 3.553 101.59%
q22 13.81 13.62 -0.191 98.61%
total 1234.51 1241.78 7.274 100.59%

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[VL] Support BroadcastNestedLoopJoinExec and CartesianProductExec
7 participants