[Enhancement]: query optimization: reduce io when retrieve output fields #31822

Closed
1 task done
longjiquan opened this issue Apr 2, 2024 · 1 comment

Comments

@longjiquan
Contributor

Is there an existing issue for this?

  • I have searched the existing issues

What would you like to be added?

[Figure: flowchart]

The figure above illustrates how Milvus handles a query request:

    1. Every segment executes the filter and returns all the required output fields;
    2. Shard delegators reduce the segment results and return them to the proxy;
    3. The proxy reduces the delegator results and returns them to the user.

We have already optimized steps 1 and 2 by pushing the limit operator down to the segment, which reduces the sorting work in step 2.
However, step 1 can still be very inefficient. With mmap enabled, retrieving the output fields may cost far more I/O than reading raw data from memory. Yet many of the candidates in the segment results are discarded during the reduce phase, so the I/O spent fetching their output fields is wasted. The same applies to the proxy's reduce.
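To make the wasted I/O concrete, here is a minimal Go sketch of the current eager flow. It is not Milvus code: `eagerQuery` is a hypothetical name, each segment is modeled as a list of matching primary keys sorted in ascending order, and the reduce is assumed to keep the `limit` smallest keys (deduplication is ignored).

```go
package main

import "sort"

// eagerQuery models the current flow under the assumptions above: every
// segment fetches output fields for up to `limit` matching rows (limit
// pushdown), and only afterwards does the reduce decide which rows survive.
// It returns the surviving primary keys and the number of rows whose output
// fields were fetched.
func eagerQuery(segments [][]int64, limit int) (kept []int64, fetched int) {
	var candidates []int64
	for _, pks := range segments {
		n := len(pks)
		if n > limit {
			n = limit
		}
		// Output fields are read here, before the reduce runs.
		fetched += n
		candidates = append(candidates, pks[:n]...)
	}
	// Reduce: keep only the `limit` smallest primary keys overall.
	sort.Slice(candidates, func(i, j int) bool { return candidates[i] < candidates[j] })
	if len(candidates) > limit {
		candidates = candidates[:limit]
	}
	return candidates, fetched
}
```

With four segments of four matches each and a limit of 3, this sketch fetches output fields for 12 rows and keeps only 3 of them, so three quarters of the fetches are discarded by the reduce.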

So I want to optimize the query workflow by changing how the required output fields are retrieved:

    1. Retrieve the output fields after the local reduce is done, whether or not the field has mmap enabled. This introduces more cgo calls, but much less data is retrieved. We can assume this optimization always pays off, since the cost of the extra cgo calls should be small.
    2. If the limit is very large and any output field has mmap enabled, apply the same idea as change 1 at the proxy level: the proxy removes the mmapped fields when forwarding the query request to the querynode, and retrieves them after its reduce. Compared with the current logic, this introduces more RPCs, and an RPC can itself be time-consuming, so the trade-off between RPC cost and I/O cost should be considered. For now, I suggest the proxy make this change only when all of the conditions below are satisfied:
    • The limit is very large; the threshold can be configured dynamically and defaults to 1000;
    • The output fields contain at least one mmapped field.
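The two changes above can be sketched roughly as follows. This is an illustration under stated assumptions, not the Milvus implementation: `lateQuery`, `shouldDeferMmapFields`, and the `fetch` callback are hypothetical names, segments are modeled as ascending lists of matching primary keys, and "very large" is read as "limit at or above the threshold", which the proposal does not specify.

```go
package main

import "sort"

// lateQuery sketches change 1: reduce on primary keys first, then fetch the
// output fields only for the rows that survive. `fetch` stands in for the
// extra cgo call (or RPC) that retrieves one row's output fields.
func lateQuery(segments [][]int64, limit int, fetch func(pk int64) string) (rows []string, fetched int) {
	var candidates []int64
	for _, pks := range segments {
		n := len(pks)
		if n > limit {
			n = limit
		}
		// Only primary keys are collected here; no output fields are read yet.
		candidates = append(candidates, pks[:n]...)
	}
	sort.Slice(candidates, func(i, j int) bool { return candidates[i] < candidates[j] })
	if len(candidates) > limit {
		candidates = candidates[:limit]
	}
	// Late materialization: at most `limit` fetches in total.
	for _, pk := range candidates {
		rows = append(rows, fetch(pk))
		fetched++
	}
	return rows, fetched
}

// shouldDeferMmapFields sketches the proxy-side condition of change 2: defer
// only when the limit is large (assumed here: limit >= threshold) and at
// least one requested output field is mmapped.
func shouldDeferMmapFields(limit, threshold int, outputFields []string, mmapped map[string]bool) bool {
	if limit < threshold {
		return false
	}
	for _, f := range outputFields {
		if mmapped[f] {
			return true
		}
	}
	return false
}
```

For the same four segments of four matches each and a limit of 3, `lateQuery` calls `fetch` 3 times instead of reading output fields for 12 rows. The price is the extra round of calls after the reduce, which is why the proxy-side variant is gated by `shouldDeferMmapFields`.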

Why is this needed?

Reduce the I/O when retrieving output fields and thus increase QPS.

Anything else?

No response

@longjiquan longjiquan added the kind/enhancement Issues or changes related to enhancement label Apr 2, 2024
@longjiquan longjiquan self-assigned this Apr 2, 2024
sre-ci-robot pushed a commit that referenced this issue Apr 25, 2024
issue: #31822

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>

stale bot commented May 3, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

@stale stale bot added the stale indicates no udpates for 30 days label May 3, 2024
czs007 pushed a commit that referenced this issue May 10, 2024

issue: #31822

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
UnyieldingOrca pushed a commit to UnyieldingOrca/milvus that referenced this issue May 10, 2024
@stale stale bot closed this as completed May 12, 2024
sre-ci-robot pushed a commit that referenced this issue May 15, 2024
issue: #31822

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
longjiquan added a commit to longjiquan/milvus that referenced this issue May 23, 2024
issue: milvus-io#31822

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
sre-ci-robot pushed a commit that referenced this issue May 23, 2024
Cherry-pick from master
pr: #32945 
issue: #31822

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>