Monitor service #9933

Merged
merged 19 commits into from
Nov 30, 2021

Conversation

Contributor

@potuz potuz commented Nov 24, 2021

This PR implements a validator monitor within the beacon node. It adds a feature flag --monitor-indices to the beacon-chain command, which takes a list of validator indices (the validators need not be connected to this beacon node). The beacon node then logs messages and emits metrics on the following events:

A beacon block is proposed by a tracked validator
A beacon block includes an attestation by a tracked validator
A beacon block includes a slashing of a tracked validator
A beacon block includes a voluntary exit of a tracked validator
A beacon block includes a sync-committee contribution by a tracked validator
An unaggregated attestation by a tracked validator was processed by the beacon node
An aggregated attestation by a tracked validator was processed by the beacon node
An aggregate attestation whose aggregator was a tracked validator was processed by the beacon node
A voluntary exit by a tracked validator was processed by the beacon node
It also keeps a structure for aggregated performance since launch, but currently does not log it.

Collaboration with @terencechain

Notes for reviewers:

  • context and cancel usage in this service may be completely bogus and unnecessary.

@potuz added the Ready For Review label (a pull request ready for code review) on Nov 24, 2021
@potuz mentioned this pull request on Nov 27, 2021
beacon-chain/monitor/service.go
Comment on lines 115 to 118
log.WithFields(logrus.Fields{
	"ValidatorIndices": tracked,
}).Info("Started service")
s.isRunning = false
Contributor

We log out that the service started, but s.isRunning = false. I find this a bit contradictory.

Contributor Author

The service starts, but it does not log anything until the beacon node syncs. The error message on localhost:3500/healthz is "not running". I don't mind changing this, though.

Member

+1 to Radek's comment

Contributor Author

I changed the log to "Starting service" and the boolean to "isLogging". Would that be more accurate?

beacon-chain/monitor/service.go
		trackedSyncCommitteeIndices: make(map[types.ValidatorIndex][]types.CommitteeIndex),
	}
	for _, idx := range tracked {
		r.TrackedValidators[idx] = nil
Contributor

This map is not filled with real data in this PR. I presume it's done somewhere else?

Contributor Author

The fact that the key is there is used to check that the idx is tracked.

Member

Instead of assigning nil, why don't we assign the actual struct?

Contributor Author

You mean one of the performance structures? Because those take a long time to become available, this would block service start until we are fully synced, at the minimum.

Member

I think we should just change make(map[types.ValidatorIndex]interface{}) to make(map[types.ValidatorIndex]bool). bool is used most of the time in Prysm for tracking existence.
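
To illustrate the suggestion above, here is a minimal sketch of a bool-valued map used as a membership set. `ValidatorIndex` and `newTrackedSet` are hypothetical stand-ins written for this example; they are not taken from the PR, and the real `types.ValidatorIndex` is only assumed to be an unsigned integer type.

```go
package main

import "fmt"

// ValidatorIndex is a stand-in for Prysm's types.ValidatorIndex
// (assumption: an unsigned integer type).
type ValidatorIndex uint64

// newTrackedSet (hypothetical helper) builds a set of tracked indices.
// A true value marks membership.
func newTrackedSet(indices []ValidatorIndex) map[ValidatorIndex]bool {
	set := make(map[ValidatorIndex]bool, len(indices))
	for _, idx := range indices {
		set[idx] = true
	}
	return set
}

func main() {
	tracked := newTrackedSet([]ValidatorIndex{1, 12, 15})
	// A missing key yields the zero value false, so a plain lookup
	// doubles as the "is this index tracked?" check.
	fmt.Println(tracked[12], tracked[99]) // prints "true false"
}
```

With a nil-valued interface{} map, the same check would need the two-value form `_, ok := m[idx]`, which is presumably why the bool form was preferred here.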

rkapka previously approved these changes Nov 30, 2021
Member

@terencechain terencechain left a comment

Required changes:

  • Feedback on the Start function (i.e. refactor, goroutine, etc.)

Optional changes:

  • Tests

func (s *Service) Start() {
	tracked := make([]types.ValidatorIndex, len(s.TrackedValidators))
	i := 0
	for idx := range s.TrackedValidators {
Member

for i:=0; i < len(s.TrackedValidators ).... is better for this imo

Contributor Author

I'm sorry, I didn't get this. How do I grab the keys from the map?
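
For context on the question above: Go maps cannot be indexed by position, so collecting keys requires ranging over the map. The sketch below shows one common way to do it. `ValidatorIndex` and `trackedKeys` are hypothetical names for this example, and the sort is an addition of the sketch (map iteration order is unspecified in Go), not something the PR is known to do.

```go
package main

import (
	"fmt"
	"sort"
)

// ValidatorIndex is a stand-in for Prysm's types.ValidatorIndex (assumption).
type ValidatorIndex uint64

// trackedKeys (hypothetical helper) returns the keys of the map as a slice.
// The result is sorted because map iteration order is not deterministic.
func trackedKeys(m map[ValidatorIndex]bool) []ValidatorIndex {
	keys := make([]ValidatorIndex, 0, len(m))
	for idx := range m {
		keys = append(keys, idx)
	}
	sort.Slice(keys, func(i, j int) bool { return keys[i] < keys[j] })
	return keys
}

func main() {
	m := map[ValidatorIndex]bool{15: true, 1: true, 12: true}
	fmt.Println(trackedKeys(m)) // prints "[1 12 15]"
}
```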

Comment on lines 123 to 159
if err := s.waitForSync(stateChannel, stateSub); err != nil {
	log.WithError(err)
	return
}
state, err := s.config.HeadFetcher.HeadState(s.ctx)
if err != nil {
	log.WithError(err).Error("Could not get head state")
	return
}
if state == nil {
	log.Error("Head state is nil")
	return
}
epoch := slots.ToEpoch(state.Slot())
log.WithField("Epoch", epoch).Debug("Synced to head epoch, starting reporting performance")

s.Lock()
for idx := range s.TrackedValidators {
	balance, err := state.BalanceAtIndex(idx)
	if err != nil {
		log.WithError(err).WithField("ValidatorIndex", idx).Error(
			"Could not fetch starting balance, skipping aggregated logs.")
		balance = 0
	}
	s.aggregatedPerformance[idx] = ValidatorAggregatedPerformance{
		startEpoch:   epoch,
		startBalance: balance,
	}
	s.latestPerformance[idx] = ValidatorLatestPerformance{
		balance: balance,
	}
}
s.Unlock()
s.updateSyncCommitteeTrackedVals(state)
s.Lock()
s.isRunning = true
s.Unlock()
Member

I don't think these should be in the background routine.

Contributor Author

I couldn't solve the tests without sending these to the background, there are lots of races when subscribing to the channels.
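
The pattern under discussion (return from the start call immediately, wait for sync in a goroutine, then flip a flag under a lock) can be sketched roughly as follows. Everything here is a simplified assumption for illustration: `monitor`, `start`, and `logging` are hypothetical names, and a plain channel stands in for the state feed subscription used in the PR.

```go
package main

import (
	"fmt"
	"sync"
)

// monitor is a hypothetical, stripped-down stand-in for the PR's Service.
type monitor struct {
	sync.RWMutex
	isLogging bool
}

// start returns immediately. A goroutine waits for the synced channel to be
// closed and then sets isLogging. The returned channel is closed once the
// goroutine has finished, which gives tests something to wait on.
func (m *monitor) start(synced <-chan struct{}) <-chan struct{} {
	done := make(chan struct{})
	go func() {
		defer close(done)
		<-synced // block in the background until sync is reported
		m.Lock()
		m.isLogging = true
		m.Unlock()
	}()
	return done
}

// logging reports the flag under a read lock.
func (m *monitor) logging() bool {
	m.RLock()
	defer m.RUnlock()
	return m.isLogging
}

func main() {
	m := &monitor{}
	synced := make(chan struct{})
	done := m.start(synced)
	fmt.Println("before sync:", m.logging()) // prints "before sync: false"
	close(synced)
	<-done
	fmt.Println("after sync:", m.logging()) // prints "after sync: true"
}
```

Exposing a completion signal such as `done` is one way to avoid sleeping in tests; whether it would help with the subscription races mentioned above is not something this sketch can confirm.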

Comment on lines 140 to 154
for idx := range s.TrackedValidators {
	balance, err := state.BalanceAtIndex(idx)
	if err != nil {
		log.WithError(err).WithField("ValidatorIndex", idx).Error(
			"Could not fetch starting balance, skipping aggregated logs.")
		balance = 0
	}
	s.aggregatedPerformance[idx] = ValidatorAggregatedPerformance{
		startEpoch:   epoch,
		startBalance: balance,
	}
	s.latestPerformance[idx] = ValidatorLatestPerformance{
		balance: balance,
	}
}
Member

Consider refactoring this into its own helper

Contributor Author

Nice, moving this to its own helper allows me to add an easy test for it.
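
A rough sketch of what such a helper might look like, and why it becomes easy to test, is given below. It is not the helper that was merged: `initPerformance`, `balanceFn`, `aggregatedPerformance`, and `errUnknownIndex` are hypothetical names, and the beacon state is replaced by a plain lookup function so the example stays self-contained.

```go
package main

import (
	"errors"
	"fmt"
)

// Stand-ins for Prysm's types (assumptions for this sketch).
type ValidatorIndex uint64
type Epoch uint64

// aggregatedPerformance mirrors the two fields set in the quoted loop.
type aggregatedPerformance struct {
	startEpoch   Epoch
	startBalance uint64
}

// balanceFn abstracts the balance lookup so a test can supply a fake.
type balanceFn func(ValidatorIndex) (uint64, error)

// errUnknownIndex is a hypothetical error for indices without a balance.
var errUnknownIndex = errors.New("unknown validator index")

// initPerformance (hypothetical helper) records the starting epoch and
// balance for every tracked index. As in the quoted code, a failed lookup
// falls back to a balance of zero.
func initPerformance(tracked map[ValidatorIndex]bool, epoch Epoch, balanceAt balanceFn) map[ValidatorIndex]aggregatedPerformance {
	out := make(map[ValidatorIndex]aggregatedPerformance, len(tracked))
	for idx := range tracked {
		balance, err := balanceAt(idx)
		if err != nil {
			balance = 0
		}
		out[idx] = aggregatedPerformance{startEpoch: epoch, startBalance: balance}
	}
	return out
}

func main() {
	balances := map[ValidatorIndex]uint64{1: 32000000000}
	lookup := func(idx ValidatorIndex) (uint64, error) {
		b, ok := balances[idx]
		if !ok {
			return 0, errUnknownIndex
		}
		return b, nil
	}
	perf := initPerformance(map[ValidatorIndex]bool{1: true, 2: true}, 10, lookup)
	fmt.Println(perf[1].startBalance, perf[2].startBalance) // prints "32000000000 0"
}
```

Because the helper takes its inputs as arguments and returns its result, a test can call it directly without starting the service or subscribing to any channel.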

Comment on lines 115 to 117
log.WithFields(logrus.Fields{
	"ValidatorIndices": tracked,
}).Info("Started service")
Member

Should we log this at the end?

Contributor Author

In my runtime tests it was reassuring getting the confirmation at launch that I was tracking the right set of validators. Vide the reply above about changing this message.

Member

@terencechain terencechain left a comment

💯

@@ -35,6 +35,7 @@ func (s *Service) processBlock(ctx context.Context, b block.SignedBeaconBlock) {
	}
	state := s.config.StateGen.StateByRootIfCachedNoCopy(root)
	if state == nil {
		log.Info("Pingo")
Member

🐧 ?

@potuz merged commit afbe026 into develop Nov 30, 2021
The delete-merged-branch bot deleted the monitor_service branch November 30, 2021 22:27