enhance: Avoid assign too much segment/channels to new querynode (#34096) #34245

Merged
merged 1 commit into from
Jul 1, 2024

Conversation

weiliu1031

issue: #34095
pr: #34096

When a new query node comes online, the segment_checker, channel_checker, and balance_checker simultaneously attempt to allocate segments to it. If this occurs during the execution of a load task and the distribution of the new query node hasn't been updated, the query coordinator may mistakenly view the new query node as empty. As a result, it assigns segments or channels to it, potentially overloading the new query node with more segments or channels than expected.

This PR counts the workload of the tasks still executing against the target query node, so that the query coordinator does not assign an excessive number of segments or channels to it.
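As an illustration of the idea only, here is a minimal sketch in Go. It is not the code in this PR: `NodeLoad`, `effectiveRows`, and `assign` are hypothetical names, and row count as the unit of workload is an assumption. The point it shows is that a node's placement cost includes the work of tasks still executing against it, so a node whose reported distribution is still empty stops looking empty after the first few assignments.

```go
package main

import "fmt"

// NodeLoad is a hypothetical view of one query node as seen by the coordinator.
type NodeLoad struct {
	NodeID        int64
	ReportedRows  int64 // rows in the node's last reported distribution
	ExecutingRows int64 // rows carried by tasks still executing against this node
}

// effectiveRows is the workload used for placement decisions:
// what the node has reported plus what is already in flight to it.
func (n NodeLoad) effectiveRows() int64 {
	return n.ReportedRows + n.ExecutingRows
}

// assign places each segment (given by its row count) on the node with the
// lowest effective workload and records the segment as executing work on that
// node. It returns how many segments each node ID received.
// Ties go to the node that appears first in the slice.
func assign(nodes []NodeLoad, segmentRows []int64) map[int64]int {
	counts := make(map[int64]int)
	if len(nodes) == 0 {
		return counts
	}
	for _, rows := range segmentRows {
		best := 0
		for i := range nodes {
			if nodes[i].effectiveRows() < nodes[best].effectiveRows() {
				best = i
			}
		}
		// Account for the in-flight task immediately, before the node's
		// reported distribution has had a chance to catch up.
		nodes[best].ExecutingRows += rows
		counts[nodes[best].NodeID]++
	}
	return counts
}

func main() {
	// Node 3 has just come online and has reported nothing yet.
	nodes := []NodeLoad{
		{NodeID: 1, ReportedRows: 300},
		{NodeID: 2, ReportedRows: 300},
		{NodeID: 3},
	}
	segments := []int64{100, 100, 100, 100, 100, 100}
	fmt.Println(assign(nodes, segments))
}
```

If `ExecutingRows` were ignored in this sketch, all six segments would land on node 3, because its reported distribution stays at zero while the load tasks run. With in-flight work counted, node 3 only receives segments until its effective workload catches up with the other nodes.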


…vus-io#34096)

issue: milvus-io#34095

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
@sre-ci-robot sre-ci-robot added the size/L Denotes a PR that changes 100-499 lines. label Jun 27, 2024
@mergify mergify bot added dco-passed DCO check passed. kind/enhancement Issues or changes related to enhancement labels Jun 27, 2024

mergify bot commented Jun 27, 2024

@weiliu1031 E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.


codecov bot commented Jun 27, 2024

Codecov Report

Attention: Patch coverage is 96.55172% with 2 lines in your changes missing coverage. Please review.

Project coverage is 81.70%. Comparing base (1c6e850) to head (ffc3cc3).
Report is 7 commits behind head on 2.4.

Additional details and impacted files

@@            Coverage Diff             @@
##              2.4   #34245      +/-   ##
==========================================
- Coverage   85.61%   81.70%   -3.92%     
==========================================
  Files         761     1015     +254     
  Lines      107715   130465   +22750     
==========================================
+ Hits        92222   106590   +14368     
- Misses      11490    19873    +8383     
+ Partials     4003     4002       -1     
| Files | Coverage Δ |
|---|---|
| internal/querycoordv2/balance/balance.go | 96.96% <100.00%> (ø) |
| ...al/querycoordv2/balance/rowcount_based_balancer.go | 97.63% <100.00%> (+0.03%) ⬆️ |
| ...ernal/querycoordv2/balance/score_based_balancer.go | 100.00% <100.00%> (+1.82%) ⬆️ |
| internal/querycoordv2/task/scheduler.go | 88.42% <95.65%> (+4.10%) ⬆️ |

... and 277 files with indirect coverage changes

@weiliu1031

/run-cpu-e2e


mergify bot commented Jun 27, 2024

@weiliu1031 E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

@mergify mergify bot added the ci-passed label Jun 28, 2024

@XuanYang-cn XuanYang-cn left a comment

/lgtm


@congqixia congqixia left a comment

/approve

@sre-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: congqixia, weiliu1031, XuanYang-cn

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sre-ci-robot sre-ci-robot merged commit b18de95 into milvus-io:2.4 Jul 1, 2024
12 checks passed
Labels
approved, ci-passed, dco-passed, kind/enhancement, lgtm, size/L

4 participants