Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Vrank metrics and logs #1891

Merged
merged 22 commits into from Jul 26, 2023
Merged

Implement Vrank metrics and logs #1891

merged 22 commits into from Jul 26, 2023

Conversation

ian0371
Copy link
Contributor

@ian0371 ian0371 commented Jul 21, 2023

Proposed changes

We collect all commits between two preprepares.

collect      start                                     stop
---------------|--------------|------------|-------------|
state:     preprepared      prepared     commit      preprepared

Existing implementation discards "late commit messages" that we receive after turning into the "commit" state.
To receive those commit messages, we must define an independent data structure, hence Vrank struct.

Metrics

  • target: the time elapsed between the preprepared time (start time) and the followings:
    • the first commit arrival time
    • the [2f+1]-th commit arrival time
    • the average arrival times from the first commit to the [2f+1]-th commit
    • the last commit before stop

See further comments for an examplery metric output.

Logs

We assess each validator in the committee by how fast it sent its commit message:

  • early: received before threshold
  • late: received after threshold
  • not received

The threshold that divides early and late commits is defined as follows:

  • threshold = min( [2f+1]-th commit arrival time, preprepared time + 300ms )

This is logged as bitmap.
In addition, the arrival times of late commits are logged as late.

See further comments for log interpretation.

Types of changes

Please put an x in the boxes related to your change.

  • Bugfix
  • New feature or enhancement
  • Others

Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

  • I have read the CONTRIBUTING GUIDELINES doc
  • I have signed the CLA
  • Lint and unit tests pass locally with my changes ($ make test)
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)
  • Any dependent changes have been merged and published in downstream modules

Related issues

  • Please leave the issue numbers or links related to this PR here.

Further comments

Example metric output:

klaytn_vrank_avg_commit_within_quorum 1.944361e+06
klaytn_vrank_first_commit 1.520958e+06
klaytn_vrank_last_commit 6.891958e+06
klaytn_vrank_quorum_commit 2.290458e+06

Example log:

INFO[07/21,22:12:16 +09] [25|consensus/istanbul/core/vrank.go:118]  VRank     bitmap=01 late=[2]

It means that the fourth validator in the committee is late, and that its arrival time is 2ms.
Interpretation of bitmap: 01 => 0x01 => 0b00_00_00_01 => fourth 2-bit is vrankArrivedLate, others are vrankArrivedEarly.

Q. Why compress the log?
A. We wanted to reduce the log size as much as possible and reduce verbosity. If the threshold is [2f+1]-th commit arrival time, the size of late is f. If the threshold is preprepared time + 300ms, the size of late is more than f.

Q. Why not use existing variables such as consensusTimestamp, or currentView?
A. They are reset at startNewRound(), which is called almost immediately after turning into the "commit" state.

Q. Why threshold is 300ms?
A. In an ideal network, a validator is supposed to receive 2f+1 number of commits within 300ms.

consensus/istanbul/core/commit.go Show resolved Hide resolved
consensus/istanbul/core/vrank.go Outdated Show resolved Hide resolved
blukat29
blukat29 previously approved these changes Jul 24, 2023
hyeonLewis
hyeonLewis previously approved these changes Jul 24, 2023
@aidan-kwon aidan-kwon added the need to merge Need to merge for the next time label Jul 24, 2023
consensus/istanbul/core/vrank.go Show resolved Hide resolved
consensus/istanbul/core/vrank.go Outdated Show resolved Hide resolved
consensus/istanbul/core/vrank.go Outdated Show resolved Hide resolved
consensus/istanbul/core/vrank.go Show resolved Hide resolved
consensus/istanbul/core/vrank.go Outdated Show resolved Hide resolved
consensus/istanbul/core/vrank.go Show resolved Hide resolved
consensus/istanbul/core/vrank.go Outdated Show resolved Hide resolved
consensus/istanbul/core/vrank.go Outdated Show resolved Hide resolved
consensus/istanbul/core/vrank.go Outdated Show resolved Hide resolved
consensus/istanbul/core/vrank.go Show resolved Hide resolved
consensus/istanbul/core/vrank_test.go Outdated Show resolved Hide resolved
consensus/istanbul/core/vrank.go Outdated Show resolved Hide resolved
@ian0371 ian0371 dismissed stale reviews from hyeonLewis and blukat29 via 28665ae July 24, 2023 09:04
@JayChoi1736 JayChoi1736 mentioned this pull request Jul 24, 2023
20 tasks
hyunsooda
hyunsooda previously approved these changes Jul 25, 2023
yoomee1313
yoomee1313 previously approved these changes Jul 25, 2023
blukat29
blukat29 previously approved these changes Jul 25, 2023
hyeonLewis
hyeonLewis previously approved these changes Jul 25, 2023
jiseongnoh
jiseongnoh previously approved these changes Jul 26, 2023
@aidan-kwon aidan-kwon merged commit e14f74d into klaytn:dev Jul 26, 2023
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
need to merge Need to merge for the next time
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants