[SPARK-38013][SQL][TEST] AQE can change bhj to smj if no extra shuffle introduce #35353

Closed
wants to merge 2 commits into from

@ulysses-you commented Jan 28, 2022

What changes were proposed in this pull request?

Add a test case in AdaptiveQueryExecSuite.

Why are the changes needed?

AQE can change a broadcast hash join (BHJ) into a sort-merge join (SMJ), provided two conditions hold:

  • No extra shuffle is introduced; otherwise the built-in cost evaluator rejects the new plan.
  • AQE does not consider the join plannable as a broadcast join, which implies that the cost statistics used by the normal planner were inaccurate.

This is counterintuitive, but it is expected behavior given how AQE is designed.
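
The two conditions can be sketched as a toy model. The snippet below is not Spark code: `CandidatePlan`, `replan`, and `choose` are hypothetical names, and the acceptance rule (adopt the re-planned join when it needs no more shuffles than the current plan) is an assumption about how a shuffle-counting cost evaluator behaves, not a copy of the built-in one.

```scala
// Toy model only: none of these types are Spark classes.
sealed trait JoinStrategy
case object BroadcastHashJoin extends JoinStrategy
case object SortMergeJoin extends JoinStrategy

// A plan is reduced to its join strategy and the number of shuffles it contains.
final case class CandidatePlan(strategy: JoinStrategy, numShuffles: Int)

object AqeReplanSketch {
  // Condition 2: AQE only moves away from BHJ when, using runtime statistics,
  // it does not consider the join broadcastable.
  def replan(
      current: CandidatePlan,
      aqeCanBroadcast: Boolean,
      extraShufflesForSmj: Int): CandidatePlan =
    if (aqeCanBroadcast) current
    else CandidatePlan(SortMergeJoin, current.numShuffles + extraShufflesForSmj)

  // Condition 1 (assumed rule): keep the re-planned join only if it does not
  // need more shuffles than the current plan.
  def choose(current: CandidatePlan, replanned: CandidatePlan): CandidatePlan =
    if (replanned.numShuffles <= current.numShuffles) replanned else current

  def main(args: Array[String]): Unit = {
    // Both join inputs are assumed to be shuffled already.
    val bhj = CandidatePlan(BroadcastHashJoin, numShuffles = 2)

    // No extra shuffle and AQE cannot broadcast: the join becomes SMJ.
    val noExtra = replan(bhj, aqeCanBroadcast = false, extraShufflesForSmj = 0)
    assert(choose(bhj, noExtra).strategy == SortMergeJoin)

    // SMJ would need one more shuffle: the cost check rejects it.
    val oneExtra = replan(bhj, aqeCanBroadcast = false, extraShufflesForSmj = 1)
    assert(choose(bhj, oneExtra).strategy == BroadcastHashJoin)

    // AQE still considers the join broadcastable: nothing changes.
    val stillBhj = replan(bhj, aqeCanBroadcast = true, extraShufflesForSmj = 0)
    assert(choose(bhj, stillBhj).strategy == BroadcastHashJoin)

    println("all scenarios behave as described")
  }
}
```

The first scenario is the one the new test targets: the join inputs presumably carry their shuffles already, so switching to SMJ adds none.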

Does this PR introduce any user-facing change?

no

How was this patch tested?

Pass CI

@ulysses-you
Contributor Author

cc @yaooqinn @cloud-fan @HyukjinKwon if you have time to take a look

github-actions bot added the SQL label Jan 28, 2022
"""
|SELECT * FROM (
| SELECT distinct c1 from t1
| )t1 JOIN (
Member

nit. Could you add a space? )t1 -> ) t1?

| SELECT distinct c1 from t1
| )t1 JOIN (
| SELECT distinct c1 from t2
| )t2 ON t1.c1 = t2.c1
Member

ditto.

Member

@dongjoon-hyun left a comment

BTW, do you intentionally reuse t1 for the inner table and outer table? If not, shall we use distinct names?

@ulysses-you
Contributor Author

@dongjoon-hyun thank you, I have addressed the comments
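
With both review suggestions applied (a space after each closing parenthesis, and subquery aliases that differ from the table names), the query might look like the sketch below. The aliases `tmp1` and `tmp2` are hypothetical; this thread does not show which names were finally chosen.

```scala
object QuerySketch {
  // Hypothetical aliases tmp1/tmp2 replace the reused names t1/t2 from the
  // reviewed snippet, so inner tables and outer aliases no longer collide.
  val query: String =
    """
      |SELECT * FROM (
      |  SELECT DISTINCT c1 FROM t1
      |) tmp1 JOIN (
      |  SELECT DISTINCT c1 FROM t2
      |) tmp2 ON tmp1.c1 = tmp2.c1
      |""".stripMargin

  def main(args: Array[String]): Unit = println(query)
}
```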

dongjoon-hyun pushed a commit that referenced this pull request Feb 2, 2022
…e introduce

Closes #35353 from ulysses-you/bhj-smj.

Authored-by: ulysses-you <ulyssesyou18@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit dc2fd57)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
@dongjoon-hyun
Member

Merged to master/3.2

@ulysses-you deleted the bhj-smj branch February 8, 2022 00:41
kazuyukitanimura pushed a commit to kazuyukitanimura/spark that referenced this pull request Aug 10, 2022
…e introduce

Closes apache#35353 from ulysses-you/bhj-smj.

Authored-by: ulysses-you <ulyssesyou18@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit dc2fd57)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 6fc9793)