pkg: make speed estimation more accurate #4963

Merged
merged 3 commits into from May 17, 2022


rleungx (Member) commented May 17, 2022

Signed-off-by: Ryan Leung <rleungx@gmail.com>

What problem does this PR solve?

Issue Number: Ref #4640.

What is changed and how does it work?

This PR changes how the progress speed is calculated. After this PR, the speed is derived from a sliding window: it is the delta in progress over a fixed time interval.

The monitor now looks like this:
[Screenshot: Screen Shot 2022-05-17 at 11 37 45 AM]
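
To illustrate the sliding-window idea described above, here is a minimal sketch, not the PR's actual code. The type `speedWindow` and its methods are hypothetical names; the sketch assumes samples of the remaining work arrive exactly once per update interval and that the window keeps one more sample than the number of intervals it spans.

```go
package main

import (
	"container/list"
	"time"
)

// speedWindow is a hypothetical helper used only for illustration.
type speedWindow struct {
	interval time.Duration // assumed time between two consecutive samples
	limit    int           // number of intervals the window spans
	samples  *list.List    // remaining-work samples, oldest at the front
}

// newSpeedWindow seeds the window with one initial sample.
func newSpeedWindow(window, interval time.Duration, initialRemaining float64) *speedWindow {
	w := &speedWindow{
		interval: interval,
		limit:    int(window / interval),
		samples:  list.New(),
	}
	w.samples.PushBack(initialRemaining)
	return w
}

// observe records the latest remaining amount and returns the speed
// (units per second) as the delta across the window divided by the
// time the window currently covers.
func (w *speedWindow) observe(remaining float64) float64 {
	w.samples.PushBack(remaining)
	// Keep at most limit+1 samples so that they span `limit` intervals.
	if w.samples.Len() > w.limit+1 {
		w.samples.Remove(w.samples.Front())
	}
	elapsed := time.Duration(w.samples.Len()-1) * w.interval
	if elapsed == 0 {
		return 0 // window shorter than one interval: no speed available
	}
	oldest := w.samples.Front().Value.(float64)
	return (oldest - remaining) / elapsed.Seconds()
}
```

With a 2-second window and a 1-second interval, a stalled update lowers the reported speed gradually, because the oldest sample slides out of the window instead of the average covering the whole run.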

Check List

Tests

  • Unit test

Release note

None

Signed-off-by: Ryan Leung <rleungx@gmail.com>
ti-chi-bot (Member) commented May 17, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • AndreMouche
  • CabinfeverB

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

p.lastTimeRemaining = remaining

// We use window size / update interval to get the history length.
historyLen := int(speedStatisticalWindow / updateInterval)

Member

Could we set historyLen as a constant (an attribute of Manager) once the Manager is created?

// We use window size / update interval to get the history length.
historyLen := int(speedStatisticalWindow / updateInterval)
if len(p.history) > historyLen {
p.history = p.history[1:]

Member

Would this cause a GC problem? It seems that once the length reaches historyLen, this operation runs as often as UpdateProgress is called. How about using a fixed-length slice to implement the FIFO queue? For example, assume historyLen stays at 5 and the slice is:
1,2,3,4,5 (first=0, last=4)
When 6 arrives, the slice becomes:
6,2,3,4,5 (first=1, last=0)
and when 7 arrives:
6,7,3,4,5 (first=2, last=1)
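
The fixed-length FIFO suggested in this comment can be sketched as a ring buffer. This is only an illustration of the reviewer's example, not code from the PR; `ringBuffer` and its methods are hypothetical names, and the sketch assumes a capacity greater than zero.

```go
package main

// ringBuffer is a hypothetical fixed-capacity FIFO backed by one slice
// that is allocated once and then overwritten in place.
type ringBuffer struct {
	buf   []float64
	first int // index of the oldest element
	size  int // number of stored elements
}

// newRingBuffer allocates the backing slice; capacity must be > 0.
func newRingBuffer(capacity int) *ringBuffer {
	return &ringBuffer{buf: make([]float64, capacity)}
}

// push appends v, overwriting the oldest element once the buffer is full.
func (r *ringBuffer) push(v float64) {
	if r.size < len(r.buf) {
		r.buf[(r.first+r.size)%len(r.buf)] = v
		r.size++
		return
	}
	r.buf[r.first] = v
	r.first = (r.first + 1) % len(r.buf)
}

// oldest returns the oldest element; the buffer must not be empty.
func (r *ringBuffer) oldest() float64 { return r.buf[r.first] }

// newest returns the most recent element; the buffer must not be empty.
func (r *ringBuffer) newest() float64 {
	return r.buf[(r.first+r.size-1)%len(r.buf)]
}
```

Pushing 1 through 5 and then 6 into a buffer of capacity 5 leaves the slice as 6,2,3,4,5 with the oldest element at index 1, matching the example in the comment.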

Signed-off-by: Ryan Leung <rleungx@gmail.com>
codecov bot commented May 17, 2022

Codecov Report

Merging #4963 (c6bb46f) into master (562586c) will decrease coverage by 0.19%.
The diff coverage is 91.66%.

@@            Coverage Diff             @@
##           master    #4963      +/-   ##
==========================================
- Coverage   75.45%   75.25%   -0.20%     
==========================================
  Files         298      298              
  Lines       29532    29542      +10     
==========================================
- Hits        22282    22231      -51     
- Misses       5318     5362      +44     
- Partials     1932     1949      +17     
Flag        Coverage Δ
unittests   75.25% <91.66%> (-0.20%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files                            Coverage Δ
pkg/progress/progress.go                  92.75% <85.71%> (-4.13%) ⬇️
server/cluster/cluster.go                 83.70% <100.00%> (-0.98%) ⬇️
server/cluster/metrics.go                 100.00% <100.00%> (ø)
server/schedulers/shuffle_hot_region.go   55.55% <0.00%> (-10.11%) ⬇️
pkg/etcdutil/etcdutil.go                  85.05% <0.00%> (-3.45%) ⬇️
server/tso/tso.go                         66.66% <0.00%> (-3.39%) ⬇️
server/member/member.go                   64.21% <0.00%> (-3.16%) ⬇️
server/election/leadership.go             75.25% <0.00%> (-2.07%) ⬇️
server/cluster/coordinator.go             72.01% <0.00%> (-1.65%) ⬇️
server/schedulers/utils.go                92.53% <0.00%> (-1.50%) ⬇️
... and 12 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 562586c...c6bb46f. Read the comment docs.

if p.lastTimeRemaining < remaining {
p.lastTimeRemaining = remaining

if p.history.Len() > p.historyLen {

Member

Suggested change
- if p.history.Len() > p.historyLen {
+ if p.history.Len() >= p.historyLen {

Member Author

This is on purpose, because we put one record into the history at initialization.

Member

So the actual length limit of p.history is historyLen+1?

Member

> So the actual length limit of p.history is historyLen+1?

IMO, historyLen describes the time window. For example, if updateInterval is one minute, then historyLen is 10, so the window takes 11 points.
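
The arithmetic behind this reply can be written out as a small sketch. The function `windowPoints` is a hypothetical name, and the 10-minute window used in the note below is an assumption inferred from the example in the comment (a one-minute interval giving a historyLen of 10).

```go
package main

import "time"

// windowPoints is a hypothetical helper: a window that spans n update
// intervals is bounded by n+1 sample points (fence posts vs. fence gaps).
func windowPoints(window, interval time.Duration) (intervals, points int) {
	intervals = int(window / interval)
	return intervals, intervals + 1
}
```

Under that assumption, `windowPoints(10*time.Minute, time.Minute)` yields 10 intervals and 11 points, which is why the check uses `>` rather than `>=`.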

Member Author

change it to windowLengthLimit

@rleungx rleungx requested a review from AndreMouche May 17, 2022 06:14
Signed-off-by: Ryan Leung <rleungx@gmail.com>
AndreMouche (Member) left a comment

LGTM

@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label May 17, 2022
@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels May 17, 2022
nolouch (Contributor) commented May 17, 2022

/merge

ti-chi-bot (Member)

@nolouch: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

ti-chi-bot (Member)

This pull request has been accepted and is ready to merge.

Commit hash: c6bb46f

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label May 17, 2022
nolouch (Contributor) commented May 17, 2022

/merge

ti-chi-bot (Member)

@nolouch: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot ti-chi-bot merged commit 7088c95 into tikv:master May 17, 2022
@rleungx rleungx deleted the add-speed branch May 17, 2022 09:12