Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Metric to Track Greatest Epoch Final View #698

Merged
merged 30 commits into from Jun 15, 2021

Conversation

danuio
Copy link
Contributor

@danuio danuio commented May 10, 2021

@danuio danuio marked this pull request as draft May 10, 2021 10:06
@danuio danuio requested review from jordanschalm and removed request for jordanschalm May 10, 2021 10:07
@danuio danuio requested a review from jordanschalm May 10, 2021 10:07
@danuio danuio marked this pull request as ready for review May 12, 2021 09:11
Copy link
Member

@jordanschalm jordanschalm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One note to use the existing metrics field, otherwise looks great 👍

state/protocol/badger/mutator.go Outdated Show resolved Hide resolved
state/protocol/badger/mutator.go Outdated Show resolved Hide resolved
state/protocol/badger/mutator.go Outdated Show resolved Hide resolved
@franklywatson
Copy link
Contributor

Hi Danu. Would we not want to have unit tests for these changes?

state/protocol/badger/state.go Outdated Show resolved Hide resolved
state/protocol/badger/state.go Outdated Show resolved Hide resolved
@jordanschalm
Copy link
Member

jordanschalm commented May 13, 2021

Hey Danu, you can include tests for opening the state here and for bumping the view upon committing an epoch in the epoch flow test by using the mock metrics object rather than the noop.

danuio and others added 2 commits May 13, 2021 07:14
Co-authored-by: Jordan Schalm <jordan@dapperlabs.com>
Co-authored-by: Jordan Schalm <jordan@dapperlabs.com>
Copy link
Contributor

@vishalchangrani vishalchangrani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm - just a reminder, you will have to Greenlist the metric as well by following this: https://www.notion.so/dapperlabs/Prometheus-metric-greenlisting-da74d2352b8f440185b1113270ba8bc5

I am also doing it for a metric that I have added - https://github.com/dapperlabs/dapper-flow-hosting/pull/174

@danuio danuio changed the base branch from master to feature/epochs May 14, 2021 14:59
@danuio danuio requested a review from zhangchiqing as a code owner May 14, 2021 14:59
@danuio danuio changed the base branch from feature/epochs to master May 14, 2021 14:59
Copy link
Member

@AlexHentschel AlexHentschel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some comments. I would like to flag that I think the metric will not have a value once we restart the consensus nodes (we sometimes do this during mainnet fires):

  • the metric is only set during bootstrapping or once an EpochCommit service event is encountered. The latter could happen only once during an Epoch.
  • So if we restart the consensus node, the value for the metric might not be set anymore for the remainder of the epoch, which kind of defeats its purpose.

The suggestion I made will partially mitigate this issue, as the metric is then updated on every finalized block. However, we might also consider setting the metric value when recovering from a node crash, i.e. in this method:

func OpenState(

state/protocol/badger/state.go Outdated Show resolved Hide resolved
state/protocol/badger/mutator.go Outdated Show resolved Hide resolved
Copy link
Member

@AlexHentschel AlexHentschel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the revisions. Looks good. Just a couple minor comments.

Another suggested I wanted to make is:

state/protocol/badger/state.go Outdated Show resolved Hide resolved
state/protocol/badger/state.go Outdated Show resolved Hide resolved
@codecov-commenter
Copy link

Codecov Report

❗ No coverage uploaded for pull request base (master@746b50c). Click here to learn what that means.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff            @@
##             master     #698   +/-   ##
=========================================
  Coverage          ?   56.43%           
=========================================
  Files             ?      423           
  Lines             ?    24816           
  Branches          ?        0           
=========================================
  Hits              ?    14006           
  Misses            ?     8915           
  Partials          ?     1895           
Flag Coverage Δ
unittests 56.43% <0.00%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.


Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 746b50c...ccc7b1c. Read the comment docs.

@jordanschalm
Copy link
Member

@jordanschalm jordanschalm merged commit c628937 into master Jun 15, 2021
@jordanschalm jordanschalm deleted the danu/5373/add-metric-epoch-failure branch June 15, 2021 00:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants