Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: duplicate output for HashJoinExec in CollectLeft mode #9757

Merged
merged 3 commits into from
Apr 21, 2024

Conversation

korowa
Copy link
Contributor

@korowa korowa commented Mar 23, 2024

Which issue does this PR close?

Closes #.

Rationale for this change

While #9676, found that at this moment HashJoinExec works incorrect in case of LEFT / FULL types. Currently, construction of these joins is prevented by physical optimizer join_selection rule. This PR adds support for all join types in CollectLeft mode.

What changes are included in this PR?

  • JoinLeftData now contains bitmap with visited left indices and atomic counter of total right-side threads (streams), updating this particular JoinLeftData object (initial value will always be 1 for partitioned joins, and "number of right-side partitions" for CollectLeft)
  • In the beginning of process_unmatched_build_batch, when it's guaranteed that there won't be any further updates of visited indices bitmap, each HashJoinStream decrements the counter of running partitions by calling report_probe_completed, and only the last caller-thread (determined by counter value) will return unmatched left-side data
  • Conditional logic for join types removed from try_collect_left in join_selection rule

Are these changes tested?

Yes, by adding test case for parallel (multiple right-side partitions) hash join execution with CollectLeft partition mode

Are there any user-facing changes?

No

@github-actions github-actions bot added the core Core datafusion crate label Mar 23, 2024
@korowa
Copy link
Contributor Author

korowa commented Mar 23, 2024

/benchmark

Copy link

Benchmark results

Benchmarks comparing 01ff537 (main) and 5f368c6 (PR)
Comparing 01ff537 and 5f368c6
--------------------
Benchmark tpch.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃  01ff537 ┃  5f368c6 ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │ 438.56ms │ 447.33ms │     no change │
│ QQuery 2     │  58.57ms │  59.45ms │     no change │
│ QQuery 3     │ 145.04ms │ 146.76ms │     no change │
│ QQuery 4     │  87.58ms │  85.47ms │     no change │
│ QQuery 5     │ 198.28ms │ 199.41ms │     no change │
│ QQuery 6     │ 105.35ms │ 110.59ms │     no change │
│ QQuery 7     │ 272.45ms │ 275.56ms │     no change │
│ QQuery 8     │ 197.56ms │ 202.01ms │     no change │
│ QQuery 9     │ 288.12ms │ 309.26ms │  1.07x slower │
│ QQuery 10    │ 236.52ms │ 332.32ms │  1.41x slower │
│ QQuery 11    │  62.39ms │  62.91ms │     no change │
│ QQuery 12    │ 125.00ms │ 126.04ms │     no change │
│ QQuery 13    │ 185.85ms │ 178.47ms │     no change │
│ QQuery 14    │ 128.83ms │ 132.23ms │     no change │
│ QQuery 15    │ 187.23ms │ 197.10ms │  1.05x slower │
│ QQuery 16    │  50.04ms │  51.26ms │     no change │
│ QQuery 17    │ 303.46ms │ 303.09ms │     no change │
│ QQuery 18    │ 439.64ms │ 446.52ms │     no change │
│ QQuery 19    │ 230.24ms │ 233.00ms │     no change │
│ QQuery 20    │ 191.89ms │ 188.31ms │     no change │
│ QQuery 21    │ 435.72ms │ 320.12ms │ +1.36x faster │
│ QQuery 22    │  56.25ms │  63.15ms │  1.12x slower │
└──────────────┴──────────┴──────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary      ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (01ff537)   │ 4424.56ms │
│ Total Time (5f368c6)   │ 4470.36ms │
│ Average Time (01ff537) │  201.12ms │
│ Average Time (5f368c6) │  203.20ms │
│ Queries Faster         │         1 │
│ Queries Slower         │         4 │
│ Queries with No Change │        17 │
└────────────────────────┴───────────┘

@alamb
Copy link
Contributor

alamb commented Mar 24, 2024

FYI @gruuya -- it is neat to see the benchmark results appear on PRs ❤️

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @korowa . This looks like a very nice PR

@Dandandan or @metesynnada do you have time to review this PR?

Copy link
Member

@Ted-Jiang Ted-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM @korowa 👍 with this pr ballista will avoid shuffle data to disk in some case.

@alamb
Copy link
Contributor

alamb commented Apr 21, 2024

Thanks @korowa and @Ted-Jiang -- I also gave this PR a brief skim and it looks really nice to me.

@alamb alamb merged commit 70db5ea into apache:main Apr 21, 2024
24 checks passed
ccciudatu pushed a commit to hstack/arrow-datafusion that referenced this pull request Apr 26, 2024
* fix: duplicate output for HashJoinExec in CollectLeft mode

* address review comments

* test fix after merging main
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core datafusion crate sqllogictest
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants