raftstore: Add slow log for peer and store msg #16605

Merged
merged 9 commits into from
Mar 12, 2024

Conversation

Connor1996
Member

What is changed and how it works?

Issue Number: Ref #16600

What's Changed:

Add slow log for peer and store msg

Related changes

  • PR to update pingcap/docs / pingcap/docs-cn:
  • Need to cherry-pick to the release branch

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Release note

Add slow log for peer and store msg

Signed-off-by: Connor1996 <zbk602423539@gmail.com>
Contributor

ti-chi-bot bot commented Mar 5, 2024

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • SpadeA-Tang
  • overvenus

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@@ -710,12 +716,20 @@ where
}
}
self.on_loop_finished();
let elapsed = timer.saturating_elapsed();
slow_log!(
Contributor

@LykxSassinator Mar 6, 2024

Is slow_log the better choice here? Why not use metrics?

The default slow-log threshold is 1s, which is too long to surface slow PeerMsg handling.

Member Author

With metrics, we would have to measure the duration of every message, which may add overhead on the hot path.
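
To illustrate the trade-off in the reply above, here is a minimal sketch, not the PR's code. It takes one timestamp pair around the whole batch, so the only per-message cost is a counter increment, and it reports only when the batch exceeds a threshold. `handle_batch`, its parameters, and the `u32` message type are hypothetical stand-ins; TiKV's `slow_log!` macro and `TiInstant` are not used here.

```rust
use std::time::{Duration, Instant};

/// Handles a batch with a single timer around the whole loop.
/// Returns a log line only when the batch took at least `threshold`.
/// (Hypothetical helper for illustration; not part of raftstore.)
pub fn handle_batch<F: FnMut(u32)>(
    msgs: &[u32],
    threshold: Duration,
    mut handle: F,
) -> Option<String> {
    let timer = Instant::now();
    let mut count = 0usize;
    for &m in msgs {
        // Per-message work: no clock reads, only the handler and a counter.
        handle(m);
        count += 1;
    }
    let elapsed = timer.elapsed();
    if elapsed >= threshold {
        Some(format!("handled {} msgs in {:?}", count, elapsed))
    } else {
        None
    }
}
```

A per-message histogram would instead need a clock read before and after every `handle` call, which is the overhead the reply is concerned about.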

Member Author

Let me change the slow threshold for this.

Member

The distribution would be printed as something like [x, y, z, ...]. Maybe we can wrap it and implement a fmt for it to make the output easier to read?
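
One way the suggested wrapper could look, as a sketch under assumptions: `Distribution`, its fields, and the variant names in the usage note are hypothetical and do not come from the PR. It pairs each counter with a name and skips zero counts so the log line stays short.

```rust
use std::fmt;

/// Hypothetical wrapper around per-variant counters.
/// `names[i]` is the label for `counts[i]`.
pub struct Distribution<'a> {
    pub names: &'a [&'static str],
    pub counts: &'a [usize],
}

impl fmt::Display for Distribution<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let mut first = true;
        for (name, count) in self.names.iter().zip(self.counts) {
            // Skip variants that did not occur in this batch.
            if *count == 0 {
                continue;
            }
            if !first {
                f.write_str(", ")?;
            }
            write!(f, "{}: {}", name, count)?;
            first = false;
        }
        Ok(())
    }
}
```

With names `["RaftMessage", "Tick", "Noop"]` and counts `[3, 0, 1]`, formatting the wrapper yields `RaftMessage: 3, Noop: 1` instead of a bare `[3, 0, 1]`.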

Member

How about adding to metrics if an event exceeds 100ms?

Member Author

It is already observed by the metric below.

@@ -619,7 +620,12 @@ where
    pub fn handle_msgs(&mut self, msgs: &mut Vec<PeerMsg<EK>>) {
        let timer = TiInstant::now_coarse();
        let count = msgs.len();
        let mut distribution = hash_map_with_capacity(std::mem::variant_count::<PeerMsg<EK>>());
Member

How about putting it in PollContext, so that we avoid creating a hashmap on every call?
It could also be a fixed-size array, since variant_count::<PeerMsg> is a constant.

Member Author

Switched to a fixed-size array. Since it should be allocated on the stack, I didn't put it in PollContext.
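
A minimal sketch of the fixed-size-array approach discussed above, assuming a simplified message type. `Msg`, `MSG_VARIANTS`, `index`, and `count_by_variant` are hypothetical names, not TiKV's. The variant count is kept as a hand-written constant because `std::mem::variant_count` is only available on nightly Rust.

```rust
/// Hypothetical, simplified message type; not TiKV's PeerMsg.
pub enum Msg {
    Raft(Vec<u8>),
    Tick,
    Noop,
}

/// Number of variants in `Msg`, maintained by hand in this sketch.
pub const MSG_VARIANTS: usize = 3;

impl Msg {
    /// Maps each variant to a dense index in `0..MSG_VARIANTS`.
    pub fn index(&self) -> usize {
        match self {
            Msg::Raft(_) => 0,
            Msg::Tick => 1,
            Msg::Noop => 2,
        }
    }
}

/// Counts messages per variant in a stack-allocated array,
/// so no heap allocation or hashing happens per batch.
pub fn count_by_variant(msgs: &[Msg]) -> [usize; MSG_VARIANTS] {
    let mut distribution = [0usize; MSG_VARIANTS];
    for m in msgs {
        distribution[m.index()] += 1;
    }
    distribution
}
```

Because the array is small and its size is known at compile time, creating it on each call is cheap, which is consistent with the author's choice not to cache it in a longer-lived context.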

Comment on lines 840 to 842
    pub fn discriminant(&self) -> u8 {
        unsafe { *(self as *const Self as *const u8) }
    }
Member

Could you add some comments about how it works?
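
For readers with the same question, here is a sketch of why reading the first byte can work, under one assumption that the excerpt does not show: the enum is declared with `#[repr(u8)]`. With that attribute, Rust lays the value out with a `u8` tag as its first field, so a pointer to the enum can be reinterpreted as a pointer to that tag. Without a primitive `repr`, the layout is unspecified and this cast would not be reliable. `TaggedMsg` is a hypothetical type for illustration.

```rust
/// Hypothetical enum for illustration; not TiKV's PeerMsg.
/// `#[repr(u8)]` fixes the layout: a `u8` tag comes first, then the payload.
#[repr(u8)]
pub enum TaggedMsg {
    Raft(Vec<u8>), // tag 0
    Tick,          // tag 1
    Noop,          // tag 2
}

impl TaggedMsg {
    /// Returns the variant's tag as a small integer.
    pub fn discriminant(&self) -> u8 {
        // SAFETY: because of `#[repr(u8)]`, the value starts with its `u8` tag,
        // so reading one byte at the enum's address reads the tag.
        unsafe { *(self as *const Self as *const u8) }
    }
}
```

The returned value can then be used directly as an index into a per-variant counter array.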

Member Author

PTAL again

Signed-off-by: Connor1996 <zbk602423539@gmail.com>
Signed-off-by: Connor1996 <zbk602423539@gmail.com>

@ti-chi-bot ti-chi-bot bot added size/L and removed size/M labels Mar 11, 2024
Signed-off-by: Connor1996 <zbk602423539@gmail.com>
@ti-chi-bot ti-chi-bot bot added the status/LGT1 Status: PR - There is already 1 approval label Mar 11, 2024
Member

@overvenus left a comment

Rest LGTM

@ti-chi-bot ti-chi-bot bot added status/LGT2 Status: PR - There are already 2 approvals and removed status/LGT1 Status: PR - There is already 1 approval labels Mar 12, 2024
@Connor1996
Member Author

/merge

Contributor

ti-chi-bot bot commented Mar 12, 2024

@Connor1996: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

You only need to trigger /merge once, and if the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

If you have any questions about the PR merge process, please refer to pr process.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

Contributor

ti-chi-bot bot commented Mar 12, 2024

This pull request has been accepted and is ready to merge.

Commit hash: aaba689

@ti-chi-bot ti-chi-bot bot added the status/can-merge Status: Can merge to base branch label Mar 12, 2024
Signed-off-by: Connor1996 <zbk602423539@gmail.com>
@ti-chi-bot ti-chi-bot bot removed the status/can-merge Status: Can merge to base branch label Mar 12, 2024
@Connor1996
Member Author

/merge

Contributor

ti-chi-bot bot commented Mar 12, 2024

This pull request has been accepted and is ready to merge.

Commit hash: 2398fee

@ti-chi-bot ti-chi-bot bot added status/can-merge Status: Can merge to base branch and removed do-not-merge/needs-triage-completed labels Mar 12, 2024
@ti-chi-bot ti-chi-bot bot merged commit 8ab7350 into tikv:master Mar 12, 2024
7 checks passed
@ti-chi-bot ti-chi-bot bot added this to the Pool milestone Mar 12, 2024
dbsid pushed a commit to dbsid/tikv that referenced this pull request Mar 24, 2024
ref tikv#16600

Add slow log for peer and store msg

Signed-off-by: Connor1996 <zbk602423539@gmail.com>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
Signed-off-by: dbsid <chenhuansheng@pingcap.com>
ti-chi-bot pushed a commit to ti-chi-bot/tikv that referenced this pull request Mar 25, 2024
ref tikv#16600

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
@ti-chi-bot
Member

In response to a cherrypick label: new pull request created to branch release-6.5: #16692.

ti-chi-bot bot pushed a commit that referenced this pull request Mar 29, 2024
ref #16600

Add slow log for peer and store msg

Signed-off-by: Connor <zbk602423539@gmail.com>
Signed-off-by: Connor1996 <zbk602423539@gmail.com>

Co-authored-by: Connor <zbk602423539@gmail.com>
Co-authored-by: Connor1996 <zbk602423539@gmail.com>
@ti-chi-bot
Member

In response to a cherrypick label: new pull request created to branch release-7.1: #16803.

ti-chi-bot pushed a commit to ti-chi-bot/tikv that referenced this pull request Apr 11, 2024
ref tikv#16600

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
ti-chi-bot bot pushed a commit that referenced this pull request Apr 18, 2024
ref #16600

Add slow log for peer and store msg

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
Signed-off-by: Qi Xu <tonyxuqqi@outlook.com>

Co-authored-by: Connor <zbk602423539@gmail.com>
Co-authored-by: Qi Xu <tonyxuqqi@outlook.com>
Co-authored-by: tonyxuqqi <tonyxuqi@outlook.com>
@ti-chi-bot
Member

In response to a cherrypick label: new pull request could not be created: failed to create pull request against tikv/tikv#release-7.5 from head ti-chi-bot:cherry-pick-16605-to-release-7.5: the GitHub API request returns a 403 error: {"message":"You have exceeded a secondary rate limit and have been temporarily blocked from content creation. Please retry your request again later. If you reach out to GitHub Support for help, please include the request ID 98EE:21F144:4FEBA32:80A6538:6639C0F0 and timestamp 2024-05-07 05:49:36 UTC.","documentation_url":"https://docs.github.com/rest/overview/rate-limits-for-the-rest-api#about-secondary-rate-limits"}

ti-chi-bot pushed a commit to ti-chi-bot/tikv that referenced this pull request May 7, 2024
ref tikv#16600

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
@wuhuizuo
Contributor

/cherry-pick release-7.5

@ti-chi-bot
Member

@wuhuizuo: new pull request created to branch release-7.5: #17035.

In response to this:

/cherry-pick release-7.5

ti-chi-bot pushed a commit to ti-chi-bot/tikv that referenced this pull request May 21, 2024
ref tikv#16600

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
ti-chi-bot bot added a commit that referenced this pull request May 28, 2024
ref #16600

Add slow log for peer and store msg

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
Signed-off-by: Connor1996 <zbk602423539@gmail.com>

Co-authored-by: Connor <zbk602423539@gmail.com>
Co-authored-by: Connor1996 <zbk602423539@gmail.com>
Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
6 participants