Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Back Error/Debug Logs #7922

Merged
merged 4 commits into from
Nov 23, 2020
Merged

Add Back Error/Debug Logs #7922

merged 4 commits into from
Nov 23, 2020

Conversation

nisdas
Copy link
Member

@nisdas nisdas commented Nov 23, 2020

What type of PR is this?

UX

What does this PR do? Why is it needed?

-[x] In #7914 some important sync logs were removed or pushed down to debug. This PR adds them back.

Which issues(s) does this PR fix?

N.A

Other notes for review

farazdagi
farazdagi previously approved these changes Nov 23, 2020
@@ -92,6 +93,7 @@ func (s *Service) validateBeaconBlockPubSub(ctx context.Context, pid peer.ID, ms
s.pendingQueueLock.RUnlock()

if err := helpers.VerifySlotTime(uint64(s.chain.GenesisTime().Unix()), blk.Block.Slot, params.BeaconNetworkConfig().MaximumGossipClockDisparity); err != nil {
log.WithError(err).WithField("blockSlot", blk.Block.Slot).Debug("Rejecting incoming block.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we revert this line?

1.) It's odd to have Rejecting incoming block just for this time check. If we do want this then we should add this rejecting incoming back to every validation step. (Even for debug)

2.) This hasn't been relevant since we had the time bug

3.) Maybe we also can downgrade it to trace? and something like this for all the validation steps?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its pretty important to know if a block has been rejected, unlike attestations blocks occur only once every 12 seconds. Node operators will have very little idea if their local clocks are in sync with the rest of the network if these logs disappear. They would see poor attestation performance, voting on the wrong head etc but wouldn't be able to tie it in with an out of sync clock if this does happen. Rejecting a block happens rarely enough that it is fine to keep it in debug. On 2) , that isn't really true, a lot of users still see that log locally when their clock is out of sync.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair. My point is we should value consistency here. Not just have debug logging for 1-2 of the steps (due to previous incidents) and not all the steps. What do you think about?

1.) Add debug logging for rest of the steps (e.g. block descendants, signature... etc)

2.) Conform to the same prefix. (e.g. Could not validate incoming back: )

Let me know if you want to do that here? I'm happy to do it as well either push it to this PR or open a new PR

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good 👍 , agree with standardizing logs for all our debug messages here otherwise its ugly looking. I can fix this tomorrow too, or you are welcome to also clean it up if you are free.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great. I can help. Thank you!

@@ -80,6 +80,7 @@ func (s *Service) validateBeaconBlockPubSub(ctx context.Context, pid peer.ID, ms
}
// Check if parent is a bad block and then reject the block.
if s.hasBadBlock(bytesutil.ToBytes32(blk.Block.ParentRoot)) {
log.Debugf("Received block with root %#x that has an invalid parent %#x", blockRoot, blk.Block.ParentRoot)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we revert this line?

1.) It's odd to have a particular debug just for hasBadBlock. If we do want to keep it then we should add it for the rest of the validation step

2.) I think this was only needed for debugging a bug, this hasn't shown up in a while

3.) Maybe we also can downgrade it to trace? and something like this for all the validation steps?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is still relevant, users have reported seeing this log and right after their node being stuck before resync saves it. We still have an unresolved issue with the proposer index cache which leads to any of its descendant blocks being rejected despite all blocks being valid. Removing this debug log, makes it hard to really know what is happening.

@terencechain terencechain added the Ready For Review A pull request ready for code review label Nov 23, 2020
@terencechain terencechain changed the base branch from master to v1.1.0 November 23, 2020 17:12
@terencechain terencechain changed the base branch from v1.1.0 to master November 23, 2020 17:13
@prestonvanloon prestonvanloon added this to the Diamond milestone Nov 23, 2020
@prylabs-bulldozer prylabs-bulldozer bot merged commit 8c3faaa into master Nov 23, 2020
@delete-merged-branch delete-merged-branch bot deleted the addBackLogs branch November 23, 2020 18:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Ready For Review A pull request ready for code review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants