Refine Index Join #8470

Closed
zz-jason opened this issue Nov 27, 2018 · 2 comments

@zz-jason
Member

Feature Request

At present, the Index Join implementation is inefficient in some scenarios:

  1. It may cause TiDB to OOM, because it uses the inner table to construct the hash table.
  2. It cannot respond to its parent quickly, because it has to wait until all the inner rows matching the outer join keys have been fetched from TiKV and a hash table has been built on them, and only then does the join on the main thread.
  3. Execution is inefficient, because all the join work is performed on the main thread; the outer and inner workers are only responsible for fetching data from TiKV.

Describe the feature you'd like:

Split Index Join into two operators:

  1. One that keeps order. In this operator, the output of the Index Join should be ordered by the outer join key. We can do a Merge Join on a task.
  2. One that does not need to keep order. In this operator, the output of the Index Join can be in arbitrary order. To limit memory consumption, we can use the outer rows inside a task to build the hash table, do a hash join against the fetched inner rows, and return a Chunk as soon as possible.
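To illustrate the second operator, here is a minimal sketch in Go of building the hash table from the outer rows of one task and probing it with inner rows as they arrive. It is not TiDB's executor code: `Row`, `Joined`, `buildOuterTable`, and `probeInner` are hypothetical names, and a channel stands in for inner rows being fetched from TiKV.

```go
package main

import "fmt"

// Row is a hypothetical stand-in for one row of a Chunk: a join key plus a payload.
type Row struct {
	Key   int
	Value string
}

// Joined pairs one outer row with one matching inner row.
type Joined struct {
	Outer, Inner Row
}

// buildOuterTable indexes the outer rows of a single task by join key.
// Its size is bounded by the number of outer rows in the task,
// not by the number of matching inner rows.
func buildOuterTable(outer []Row) map[int][]Row {
	table := make(map[int][]Row, len(outer))
	for _, r := range outer {
		table[r.Key] = append(table[r.Key], r)
	}
	return table
}

// probeInner reads inner rows as they arrive and emits every match at once,
// so results can be handed to the parent before all inner rows are fetched.
func probeInner(table map[int][]Row, inner <-chan Row, emit func(Joined)) {
	for in := range inner {
		for _, out := range table[in.Key] {
			emit(Joined{Outer: out, Inner: in})
		}
	}
}

func main() {
	outer := []Row{{1, "o1"}, {2, "o2"}, {2, "o2b"}}

	// Simulate inner rows arriving in arbitrary order.
	inner := make(chan Row, 3)
	for _, r := range []Row{{2, "i2"}, {3, "i3"}, {1, "i1"}} {
		inner <- r
	}
	close(inner)

	probeInner(buildOuterTable(outer), inner, func(j Joined) {
		fmt.Println(j.Outer.Value, j.Inner.Value)
	})
}
```

The output follows the arrival order of the inner rows rather than the outer order, which is why this variant only fits queries that do not need ordered output.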

Describe alternatives you've considered:

No

Teachability, Documentation, Adoption, Migration Strategy:

After discussing offline, @yu34po will work on this issue.

@yu34po
Contributor

yu34po commented Dec 17, 2018

I will fix the Index Join feature in 3 PRs:
1. Add IndexHashJoin for queries without ORDER BY, and keep the old IndexLookUpJoin for queries with ORDER BY:
#8661
2. Add a new joiner to fix the column order problems with tryToMatch/onMissMatch in IndexHashJoin.
3. Add IndexMergeJoin to handle the ORDER BY case.
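
As a rough illustration of the keep-order case, the following Go sketch merge-joins the outer rows of one task with their fetched inner rows. It assumes both inputs are already sorted ascending by the join key; `Row`, `Joined`, and `mergeJoin` are hypothetical names, not TiDB's actual IndexMergeJoin code.

```go
package main

import "fmt"

// Row is a hypothetical stand-in for one row of a Chunk: a join key plus a payload.
type Row struct {
	Key   int
	Value string
}

// Joined pairs one outer row with one matching inner row.
type Joined struct {
	Outer, Inner Row
}

// mergeJoin performs an inner join of two slices that are both sorted
// ascending by Key. The result preserves the order of the outer rows.
func mergeJoin(outer, inner []Row) []Joined {
	var out []Joined
	i := 0 // start of the inner group that may match the current outer key
	for _, o := range outer {
		// Skip inner rows whose key is smaller than the current outer key.
		for i < len(inner) && inner[i].Key < o.Key {
			i++
		}
		// Emit every inner row with an equal key. i itself is not advanced,
		// so a following outer row with the same key rescans the same group.
		for j := i; j < len(inner) && inner[j].Key == o.Key; j++ {
			out = append(out, Joined{Outer: o, Inner: inner[j]})
		}
	}
	return out
}

func main() {
	outer := []Row{{1, "o1"}, {2, "o2"}, {2, "o2b"}, {4, "o4"}}
	inner := []Row{{2, "i2a"}, {2, "i2b"}, {3, "i3"}, {4, "i4"}}
	for _, j := range mergeJoin(outer, inner) {
		fmt.Println(j.Outer.Value, j.Inner.Value)
	}
}
```

No hash table is built here, and the result comes out in outer key order, at the cost of requiring sorted input on both sides.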

@XuHuaiyu
Contributor

XuHuaiyu commented Sep 3, 2019

index hash join:

index merge join:

@scsldb scsldb added this to the v5.0.0-rc milestone Dec 14, 2020

6 participants