At present, the Index Join implementation is inefficient in some scenarios:
It may cause TiDB to OOM, because it uses the inner table to construct the hash table.
It cannot respond to its parent quickly, because it has to wait until all the inner rows matching the outer join keys have been fetched from TiKV and a hash table has been built on them, and only then does it perform the join on the main thread.
Execution is inefficient, because all the join work is performed in the main thread; the outer and inner workers are only responsible for fetching data from TiKV.
Describe the feature you'd like:
Split Index Join into two operators:
One that keeps order. In this operator, the output of the Index Join is ordered by the outer join key. We can do a Merge Join within each task.
One that does not need to keep order. In this operator, the output of the Index Join can be in arbitrary order. To limit memory consumption, we can use the outer rows inside a task to build the hash table, do a hash join against the fetched inner rows, and return a Chunk as soon as possible.
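The unordered variant can be sketched roughly as follows. This is only a minimal illustration of the idea, not TiDB's executor code: `Row`, `Joined`, and `hashJoinTask` are hypothetical names, rows are reduced to an integer key plus a payload, and only inner-join semantics are shown. The point is that the hash table is sized by the outer rows of one task rather than by the inner table.

```go
package main

import "fmt"

// Row is a hypothetical stand-in for a table row: a join key plus a payload.
type Row struct {
	Key int
	Val string
}

// Joined pairs one outer row with one matching inner row.
type Joined struct {
	Outer, Inner Row
}

// hashJoinTask builds the hash table from the outer rows of a single task,
// so its memory use is bounded by the task size, and then probes it with
// the inner rows as they arrive. Output order follows the inner rows.
func hashJoinTask(outer, inner []Row) []Joined {
	table := make(map[int][]Row, len(outer))
	for _, o := range outer {
		table[o.Key] = append(table[o.Key], o)
	}
	var out []Joined
	for _, in := range inner {
		// A missing key yields a nil slice, so the loop body is skipped.
		for _, o := range table[in.Key] {
			out = append(out, Joined{Outer: o, Inner: in})
		}
	}
	return out
}

func main() {
	outer := []Row{{1, "o1"}, {2, "o2"}, {2, "o2b"}}
	inner := []Row{{2, "i2"}, {3, "i3"}, {1, "i1"}}
	for _, j := range hashJoinTask(outer, inner) {
		fmt.Println(j.Outer.Val, j.Inner.Val)
	}
}
```

In a real operator the probe loop would presumably emit results a Chunk at a time instead of collecting them all into one slice.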
Will fix the Index Join feature in 3 PRs:
1. Add indexhashjoin for queries without ORDER BY; keep the old indexlookupjoin for ORDER BY queries #8661
2. Add a new joiner to fix the column-order problems with trytomatch/onmissmatch in indexhashjoin
3. Add indexmergejoin to solve the ORDER BY situation
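For the order-preserving case in step 3, a per-task merge join might look roughly like the sketch below. It assumes both the outer rows of the task and the fetched inner rows are already sorted by the join key; `Row`, `Joined`, and `mergeJoinTask` are hypothetical names for illustration and are not taken from the TiDB code base.

```go
package main

import "fmt"

// Row is a hypothetical stand-in for a table row: a join key plus a payload.
type Row struct {
	Key int
	Val string
}

// Joined pairs one outer row with one matching inner row.
type Joined struct {
	Outer, Inner Row
}

// mergeJoinTask joins two inputs that are both sorted by Key.
// The output stays ordered by the outer join key.
func mergeJoinTask(outer, inner []Row) []Joined {
	var out []Joined
	i, j := 0, 0
	for i < len(outer) && j < len(inner) {
		switch {
		case outer[i].Key < inner[j].Key:
			i++
		case outer[i].Key > inner[j].Key:
			j++
		default:
			key := outer[i].Key
			// Find the end of the inner group sharing this key.
			end := j
			for end < len(inner) && inner[end].Key == key {
				end++
			}
			// Pair every outer row of this key with the whole inner group.
			for i < len(outer) && outer[i].Key == key {
				for k := j; k < end; k++ {
					out = append(out, Joined{Outer: outer[i], Inner: inner[k]})
				}
				i++
			}
			j = end
		}
	}
	return out
}

func main() {
	outer := []Row{{1, "o1"}, {2, "o2"}, {2, "o2b"}, {4, "o4"}}
	inner := []Row{{2, "i2"}, {2, "i2b"}, {3, "i3"}, {4, "i4"}}
	for _, r := range mergeJoinTask(outer, inner) {
		fmt.Println(r.Outer.Val, r.Inner.Val)
	}
}
```

Unlike the hash variant, this needs no hash table at all, at the cost of requiring sorted input on both sides.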
Feature Request
Describe alternatives you've considered:
No
Teachability, Documentation, Adoption, Migration Strategy:
After discussing offline, @yu34po will work on this issue.