[SPARK-45882][SQL][3.4] BroadcastHashJoinExec propagate partitioning should respect CoalescedHashPartitioning #43793

Closed
wants to merge 2 commits into from

ulysses-you

This PR backports #43753 to branch-3.4.

What changes were proposed in this pull request?

Add a HashPartitioningLike trait and make HashPartitioning and CoalescedHashPartitioning extend it. When propagating output partitioning, we should match on HashPartitioningLike instead of HashPartitioning. This PR also changes BroadcastHashJoinExec to use HashPartitioningLike to avoid a regression.
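The idea can be illustrated with a small, self-contained sketch. This is a toy model and not Spark's actual code: expressions are plain strings, the constructor signatures are simplified, and `UnknownPartitioning`, `PropagationSketch`, `keysOld`, and `keysNew` are hypothetical names used only for illustration. It shows why matching on the shared trait lets a coalesced hash partitioning be recognised where matching on HashPartitioning alone would not.

```scala
// Toy model only: these are NOT Spark's real classes or signatures.
sealed trait Partitioning { def numPartitions: Int }

// Shared supertype for anything that is hash-partitioned on `expressions`.
trait HashPartitioningLike extends Partitioning {
  def expressions: Seq[String]
}

final case class HashPartitioning(expressions: Seq[String], numPartitions: Int)
  extends HashPartitioningLike

// Simplified stand-in: hash partitions of `from` merged into fewer partitions.
final case class CoalescedHashPartitioning(from: HashPartitioning, numPartitions: Int)
  extends HashPartitioningLike {
  def expressions: Seq[String] = from.expressions
}

// Hypothetical placeholder for "no useful partitioning information".
final case class UnknownPartitioning(numPartitions: Int) extends Partitioning

object PropagationSketch {
  // Matching on the concrete class misses the coalesced variant.
  def keysOld(p: Partitioning): Option[Seq[String]] = p match {
    case h: HashPartitioning => Some(h.expressions)
    case _                   => None
  }

  // Matching on the trait covers both the plain and the coalesced variant.
  def keysNew(p: Partitioning): Option[Seq[String]] = p match {
    case h: HashPartitioningLike => Some(h.expressions)
    case _                       => None
  }
}
```

In this sketch, `keysOld` returns `None` for a `CoalescedHashPartitioning`, which is the analogue of losing the partitioning information and planning an extra shuffle, while `keysNew` still returns the hash keys.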

Why are the changes needed?

Avoid unnecessary shuffle exchange.

Does this PR introduce any user-facing change?

Yes, it avoids a regression.

How was this patch tested?

Added a test.

Was this patch authored or co-authored using generative AI tooling?

no

Closes apache#43753 from ulysses-you/partitioning.

Authored-by: ulysses-you <ulyssesyou18@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>

ulysses-you commented Nov 14, 2023

thanks for review, merging to branch-3.4

ulysses-you added a commit that referenced this pull request Nov 14, 2023
Closes #43793 from ulysses-you/partitioning-3.4.

Authored-by: ulysses-you <ulyssesyou18@gmail.com>
Signed-off-by: youxiduo <youxiduo@corp.netease.com>
ulysses-you deleted the partitioning-3.4 branch November 14, 2023 11:57
szehon-ho pushed a commit to szehon-ho/spark that referenced this pull request Feb 7, 2024
2 participants