Refine Index Join #8470

zz-jason · 2018-11-27T09:50:18Z

Feature Request

At present, the Index Join implementation is inefficient in some scenarios:

It may cause TiDB to OOM, because it builds the hash table on the inner table.
It cannot respond to its parent quickly, because it has to wait until all the inner rows matching the outer join keys have been fetched from TiKV and the hash table has been built on them, and only then does the join operation on the main thread.
The execution is not efficient, because all the join work is performed in the main thread; the outer and inner workers are only responsible for fetching data from TiKV.
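To make the third point concrete, here is a minimal sketch of the alternative, in which each worker joins its own task and streams result chunks to the consumer, so the main goroutine only receives. It is not TiDB's executor code: `innerIndex`, `runWorkers`, and the string-formatted rows are hypothetical stand-ins chosen for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// innerIndex is a hypothetical in-memory stand-in for the inner table's
// index in TiKV: join key -> matching inner values.
var innerIndex = map[int][]string{1: {"a"}, 2: {"b", "c"}}

// runWorkers hands each task (a batch of outer join keys) to a worker.
// The worker looks up the inner rows AND performs the join itself, then
// sends the joined chunk downstream. Chunks arrive in arbitrary order.
func runWorkers(tasks [][]int, workers int) <-chan []string {
	taskCh := make(chan []int)
	resultCh := make(chan []string)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for outerKeys := range taskCh {
				var chunk []string
				for _, k := range outerKeys {
					for _, v := range innerIndex[k] {
						chunk = append(chunk, fmt.Sprintf("%d-%s", k, v))
					}
				}
				resultCh <- chunk // return a chunk as soon as the task is joined
			}
		}()
	}
	go func() {
		for _, t := range tasks {
			taskCh <- t
		}
		close(taskCh)
		wg.Wait()
		close(resultCh)
	}()
	return resultCh
}

func main() {
	// The consumer only receives finished chunks; it does no join work.
	for chunk := range runWorkers([][]int{{1, 2}, {3}, {2}}, 2) {
		fmt.Println(chunk)
	}
}
```

Because the workers finish independently, the consumer sees the first chunk as soon as any one task is done, at the cost of losing the outer order.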

Describe the feature you'd like:

Split Index Join into two operators:

One that keeps order. In this operator, the output of the Index Join should be ordered by the outer join key. We can do a Merge Join within each task.
One that does not need to keep order. In this operator, the output of the Index Join can be in arbitrary order. To limit memory consumption, we can use the outer rows inside a task to build the hash table, do a hash join on the fetched inner rows, and return a Chunk as soon as possible.
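The second operator can be sketched roughly as follows. This is an illustration only, assuming an inner join on a single integer key; `Row`, `Joined`, and `joinTask` are hypothetical names, not TiDB types. The point is that the hash table is built on the bounded batch of outer rows in one task, so its size does not depend on how many inner rows match.

```go
package main

import "fmt"

// Row is a hypothetical stand-in for a table row: a join key plus a payload.
type Row struct {
	Key int
	Val string
}

// Joined pairs one outer row with one matching inner row.
type Joined struct {
	Outer, Inner Row
}

// joinTask builds the hash table on the outer rows of one task, then
// streams the fetched inner rows through it. Output follows the order of
// the inner rows, not the outer rows.
func joinTask(outerBatch, innerRows []Row) []Joined {
	// Map each join key to the positions of the outer rows carrying it.
	table := make(map[int][]int, len(outerBatch))
	for i, r := range outerBatch {
		table[r.Key] = append(table[r.Key], i)
	}
	var out []Joined
	for _, in := range innerRows {
		for _, i := range table[in.Key] {
			out = append(out, Joined{Outer: outerBatch[i], Inner: in})
		}
	}
	return out
}

func main() {
	outer := []Row{{1, "o1"}, {2, "o2"}, {1, "o3"}}
	inner := []Row{{2, "i1"}, {1, "i2"}, {3, "i3"}}
	for _, j := range joinTask(outer, inner) {
		fmt.Println(j.Outer.Val, j.Inner.Val)
	}
}
```

An outer join would additionally need to track which outer rows were never matched, which this sketch leaves out.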

Describe alternatives you've considered:

No

Teachability, Documentation, Adoption, Migration Strategy:

After discussing offline, @yu34po will work on this issue.

yu34po · 2018-12-17T12:55:44Z

Will fix the index join feature in 3 PRs:
1. Add indexhashjoin for queries without ORDER BY; keep the old indexlookupjoin for queries with ORDER BY (#8661)
2. Add a new joiner to fix the column order problems with tryToMatch/onMissMatch in indexhashjoin
3. Add indexmergejoin to solve the ORDER BY situation

XuHuaiyu · 2019-09-03T02:59:25Z

index hash join:

basic support for index hash join in the execution engine (executor: support index nested loop hash join #8661)
add the interface tryToMatchOuters in joiner.go (executor, expression: add a tryToMatchOuters for joiner #11922)
support individual physical plan for index hash join
support new cost model for index hash join

index merge join:

support a new cost model and an individual physical plan for index merge join (planner: support index_lookup_merge_join in physical plan. #11338)
support index merge join in the execution engine (executor: support index lookup merge join in executor. #12024)
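For the order-keeping case, the per-task merge join proposed in the feature request can be sketched as below. This is a simplified illustration, assuming both the outer rows of a task and the fetched inner rows are already sorted ascending by the join key; `Row`, `Joined`, and `mergeJoinTask` are hypothetical names, not the identifiers used in the PRs above.

```go
package main

import "fmt"

// Row is a hypothetical stand-in for a table row: a join key plus a payload.
type Row struct {
	Key int
	Val string
}

// Joined pairs one outer row with one matching inner row.
type Joined struct {
	Outer, Inner Row
}

// mergeJoinTask joins two inputs that are both sorted ascending by Key.
// The output is ordered by the outer rows, so the outer order is kept.
func mergeJoinTask(outer, inner []Row) []Joined {
	var out []Joined
	j := 0
	for _, o := range outer {
		// Skip inner rows whose key is smaller than the current outer key.
		for j < len(inner) && inner[j].Key < o.Key {
			j++
		}
		// Emit every inner row sharing the key. j is left at the start of
		// the group so a following outer row with the same key sees it too.
		for k := j; k < len(inner) && inner[k].Key == o.Key; k++ {
			out = append(out, Joined{Outer: o, Inner: inner[k]})
		}
	}
	return out
}

func main() {
	outer := []Row{{1, "a"}, {2, "b"}, {2, "c"}, {4, "d"}}
	inner := []Row{{2, "x"}, {2, "y"}, {3, "z"}, {4, "w"}}
	for _, r := range mergeJoinTask(outer, inner) {
		fmt.Println(r.Outer.Val, r.Inner.Val)
	}
}
```

No hash table is needed here, so memory use per task stays proportional to the rows held in the task.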

zz-jason added the type/enhancement and sig/execution labels Nov 27, 2018

morgo mentioned this issue Dec 4, 2018

Optimizer selects merge join over nested loop join incorrectly #8563

Closed

yu34po mentioned this issue Dec 12, 2018

executor: support index nested loop hash join #8661

Merged

yu34po mentioned this issue Feb 12, 2019

executor: add tryToMatchOuters to the joiner interface #9286

Closed

lzmhhh123 mentioned this issue Mar 6, 2019

executor: support index_look_up_merge_join to speed up index_look_up_join #9571

Closed

XuHuaiyu assigned XuHuaiyu and lzmhhh123 Sep 3, 2019

SunRunAway closed this as completed Nov 20, 2019

scsldb added this to the v5.0.0-rc milestone Dec 14, 2020

XuHuaiyu's comment of Sep 3, 2019 was edited by SunRunAway.

Refine Index Join #8470

Refine Index Join #8470

Comments

zz-jason commented Nov 27, 2018

Feature Request

yu34po commented Dec 17, 2018

XuHuaiyu commented Sep 3, 2019 • edited by SunRunAway

XuHuaiyu commented Sep 3, 2019 •

edited by SunRunAway