-
Notifications
You must be signed in to change notification settings - Fork 919
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Monitor service #9933
Monitor service #9933
Conversation
beacon-chain/monitor/service.go
Outdated
log.WithFields(logrus.Fields{ | ||
"ValidatorIndices": tracked, | ||
}).Info("Started service") | ||
s.isRunning = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We log out that the service started, but s.isRunning = false
. I find this a bit contradictory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The service starts, but it is not logging anything until the beacon syncs. The error message on localhost:3500/healtz is "not running". I don't mind changing this though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to Radek's comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed the log to "Starting service" and the boolean to "isLogging", would that be more accurate?
beacon-chain/monitor/service.go
Outdated
trackedSyncCommitteeIndices: make(map[types.ValidatorIndex][]types.CommitteeIndex), | ||
} | ||
for _, idx := range tracked { | ||
r.TrackedValidators[idx] = nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This map is not filled with real data in this PR. I presume it's done somewhere else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fact that the key is there is used to check that the idx
is tracked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of assigning nil, why don't we assign the actual struct
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean one of the performance structures? because those take a long time to be available, this would block service start until we are fully synced at the minimum.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should just change make(map[types.ValidatorIndex]interface{}
to make(map[types.ValidatorIndex]bool
. bool
is used most of the time in prysm for tracking existence
0f2ce65
to
15f91ea
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Required changes:
- Feedbacks with start function (ie refactor, go routine... etc)
Optional changes:
- Tests
func (s *Service) Start() { | ||
tracked := make([]types.ValidatorIndex, len(s.TrackedValidators)) | ||
i := 0 | ||
for idx := range s.TrackedValidators { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for i:=0; i < len(s.TrackedValidators )....
is better for this imo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry didn't get this, how do I grab the keys from the map?
beacon-chain/monitor/service.go
Outdated
log.WithFields(logrus.Fields{ | ||
"ValidatorIndices": tracked, | ||
}).Info("Started service") | ||
s.isRunning = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to Radek's comment
beacon-chain/monitor/service.go
Outdated
if err := s.waitForSync(stateChannel, stateSub); err != nil { | ||
log.WithError(err) | ||
return | ||
} | ||
state, err := s.config.HeadFetcher.HeadState(s.ctx) | ||
if err != nil { | ||
log.WithError(err).Error("Could not get head state") | ||
return | ||
} | ||
if state == nil { | ||
log.Error("Head state is nil") | ||
return | ||
} | ||
epoch := slots.ToEpoch(state.Slot()) | ||
log.WithField("Epoch", epoch).Debug("Synced to head epoch, starting reporting performance") | ||
|
||
s.Lock() | ||
for idx := range s.TrackedValidators { | ||
balance, err := state.BalanceAtIndex(idx) | ||
if err != nil { | ||
log.WithError(err).WithField("ValidatorIndex", idx).Error( | ||
"Could not fetch starting balance, skipping aggregated logs.") | ||
balance = 0 | ||
} | ||
s.aggregatedPerformance[idx] = ValidatorAggregatedPerformance{ | ||
startEpoch: epoch, | ||
startBalance: balance, | ||
} | ||
s.latestPerformance[idx] = ValidatorLatestPerformance{ | ||
balance: balance, | ||
} | ||
} | ||
s.Unlock() | ||
s.updateSyncCommitteeTrackedVals(state) | ||
s.Lock() | ||
s.isRunning = true | ||
s.Unlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dont think these should be in the background routine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't solve the tests without sending these to the background, there are lots of races when subscribing to the channels.
beacon-chain/monitor/service.go
Outdated
trackedSyncCommitteeIndices: make(map[types.ValidatorIndex][]types.CommitteeIndex), | ||
} | ||
for _, idx := range tracked { | ||
r.TrackedValidators[idx] = nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of assigning nil, why don't we assign the actual struct
beacon-chain/monitor/service.go
Outdated
for idx := range s.TrackedValidators { | ||
balance, err := state.BalanceAtIndex(idx) | ||
if err != nil { | ||
log.WithError(err).WithField("ValidatorIndex", idx).Error( | ||
"Could not fetch starting balance, skipping aggregated logs.") | ||
balance = 0 | ||
} | ||
s.aggregatedPerformance[idx] = ValidatorAggregatedPerformance{ | ||
startEpoch: epoch, | ||
startBalance: balance, | ||
} | ||
s.latestPerformance[idx] = ValidatorLatestPerformance{ | ||
balance: balance, | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider refactoring this into its own helper
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, moving this to it's own helper allows me to add an easy test for it.
beacon-chain/monitor/service.go
Outdated
log.WithFields(logrus.Fields{ | ||
"ValidatorIndices": tracked, | ||
}).Info("Started service") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we log this at the end?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my runtime tests it was reassuring getting the confirmation at launch that I was tracking the right set of validators. Vide the reply above about changing this message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💯
beacon-chain/monitor/service.go
Outdated
trackedSyncCommitteeIndices: make(map[types.ValidatorIndex][]types.CommitteeIndex), | ||
} | ||
for _, idx := range tracked { | ||
r.TrackedValidators[idx] = nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should just change make(map[types.ValidatorIndex]interface{}
to make(map[types.ValidatorIndex]bool
. bool
is used most of the time in prysm for tracking existence
@@ -35,6 +35,7 @@ func (s *Service) processBlock(ctx context.Context, b block.SignedBeaconBlock) { | |||
} | |||
state := s.config.StateGen.StateByRootIfCachedNoCopy(root) | |||
if state == nil { | |||
log.Info("Pingo") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🐧 ?
This PR implements a validator monitor within the beacon node. It adds a feature flag --monitor-indices to the beacon-chain command taking a list of validator indices (not necessarily connected to the beacon). The beacon then logs messages and emits metrics in the following events
A beacon block is proposed by a tracked validator
A beacon block includes an attestation by a tracked validator
A beacon block includes a slashing of a tracked validator
A beacon block includes a voluntary exit of a tracked validator
A beacon block includes a sync-committee contribution by a tracked validator
An unaggregated attestation by a tracked validator was processed by the beacon node
An aggregated attestation by a tracked validator was processed by the beacon node
An aggregate attestation where the aggregator was a tracked validator, was processed by the beacon node
A voluntary exit by a tracked validator was processed by the beacon node
It also keeps a structure for aggregated performance since launch, but currently it does not log them
Collaboration with @terencechain
Notes for reviewers: