Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Logging improvements #929

Merged
merged 3 commits into from
Mar 29, 2022
Merged

Logging improvements #929

merged 3 commits into from
Mar 29, 2022

Conversation

AnomalRoil
Copy link
Member

  • Using With for prefixes
  • Logs cleanups
  • Trying to add more useful information in logs, including node address and index

Before:

2022-03-07T18:13:08.781+0100	DEBUG	beacon/chain.go:144		{"beacon_id": "", "store_partial": "127.0.0.1:37900", "round": 1, "len_partials": "2/2"}
2022-03-07T18:13:08.781+0100	DEBUG	beacon/node.go:106		{"beacon_id": "", "received": "request", "from": "127.0.0.1:38166", "round": 1}
2022-03-07T18:13:08.784+0100	DEBUG	beacon/node.go:136		{"beacon_id": "", "process_partial": "127.0.0.1:38166", "prev_sig": "bffcfc", "curr_round": 1, "msg_sign": "e38cc7", "from_node": "127.0.0.1:34863", "status": "OK"}
2022-03-07T18:13:08.785+0100	DEBUG	beacon/node.go:136		{"beacon_id": "", "process_partial": "127.0.0.1:40218", "prev_sig": "bffcfc", "curr_round": 1, "msg_sign": "e38cc7", "from_node": "127.0.0.1:34863", "status": "OK"}
2022-03-07T18:13:08.791+0100	INFO	beacon/chain.go:169		{"beacon_id": "", "aggregated_beacon": 1}
2022-03-07T18:13:08.805+0100	INFO	beacon/store.go:120		{"beacon_id": "", "NEW_BEACON_STORED": "{ round: 1, sig: 9430a5, prevSig: bffcfc }", "time_discrepancy_ms": 805.320942}
2022-03-07T18:13:08.805+0100	DEBUG	beacon/chain.go:127		{"beacon_id": "", "ignoring_partial": 1, "last_beacon_stored": 1}
2022-03-07T18:13:08.806+0100	INFO	beacon/store.go:120		{"beacon_id": "", "NEW_BEACON_STORED": "{ round: 1, sig: 9430a5, prevSig: bffcfc }", "time_discrepancy_ms": 806.974191}
2022-03-07T18:13:08.807+0100	DEBUG	beacon/chain.go:127		{"beacon_id": "", "ignoring_partial": 1, "last_beacon_stored": 

After:

2022-03-07T19:50:32.663+0100	DEBUG	beacon/chain.go:144		{"idx": 0, "addr": "127.0.0.1:43151", "beacon_id": "", "store_partial": "127.0.0.1:43151", "round": 1, "len_partials": "1/2"}
2022-03-07T19:50:32.663+0100	DEBUG	beacon/node.go:105		{"idx": 1, "addr": "127.0.0.1:37497", "beacon_id": "", "received": "request", "from": "127.0.0.1:39508", "round": 1}
2022-03-07T19:50:32.671+0100	DEBUG	beacon/node.go:135		{"idx": 0, "addr": "127.0.0.1:43151", "beacon_id": "", "process_partial": "127.0.0.1:60694", "prev_sig": "b70359", "curr_round": 1, "msg_sign": "d4a4c2", "from_node": "127.0.0.1:37497", "status": "OK"}
2022-03-07T19:50:32.671+0100	DEBUG	beacon/node.go:135		{"idx": 2, "addr": "127.0.0.1:35643", "beacon_id": "", "process_partial": "127.0.0.1:50120", "prev_sig": "b70359", "curr_round": 1, "msg_sign": "d4a4c2", "from_node": "127.0.0.1:37497", "status": "OK"}
2022-03-07T19:50:32.671+0100	DEBUG	beacon/chain.go:144		{"idx": 0, "addr": "127.0.0.1:43151", "beacon_id": "", "store_partial": "127.0.0.1:60694", "round": 1, "len_partials": "2/2"}
2022-03-07T19:50:32.671+0100	DEBUG	beacon/node.go:135		{"idx": 1, "addr": "127.0.0.1:37497", "beacon_id": "", "process_partial": "127.0.0.1:39510", "prev_sig": "b70359", "curr_round": 1, "msg_sign": "d4a4c2", "from_node": "127.0.0.1:35643", "status": "OK"}
2022-03-07T19:50:32.678+0100	INFO	beacon/chain.go:169		{"idx": 1, "addr": "127.0.0.1:37497", "beacon_id": "", "aggregated_beacon": 1}
2022-03-07T19:50:32.693+0100	INFO	beacon/store.go:119		{"idx": 1, "addr": "127.0.0.1:37497", "beacon_id": "", "NEW_BEACON_STORED": "{ round: 1, sig: ade005, prevSig: b70359 }", "time_discrepancy_ms": 693.710468}
2022-03-07T19:50:32.693+0100	DEBUG	beacon/chain.go:127		{"idx": 1, "addr": "127.0.0.1:37497", "beacon_id": "", "ignoring_partial": 1, "last_beacon_stored": 1}

@AnomalRoil
Copy link
Member Author

Notice I'm not sure I've covered all cases when resharing happens, espcially in the case of an existing node.

See:

drand/core/drand_beacon.go

Lines 232 to 244 in d9099c9

// tell the current beacon to stop just before the new network starts
if oldPresent {
bp.beacon.TransitionNewGroup(newShare, newGroup)
} else {
b, err := bp.newBeacon()
if err != nil {
bp.log.Fatalw("", "beacon_id", beaconID, "transition", "new_node", "err", err)
}
if err := b.Transition(oldGroup); err != nil {
bp.log.Errorw("", "beacon_id", beaconID, "sync_before", err)
}
bp.log.Infow("", "beacon_id", beaconID, "transition_new", "done")
}

I might need to adjust to use the proper prefixes in the TransitionNewGroup function, what do you think?

The index has a good chance to change, as far as I understand it, but I'm not sure about the other things such as the address or the beacon_id, I don't think so.

@AnomalRoil
Copy link
Member Author

Seems I have triggered some race conditions on our loggers, probably because the DefaultLogger seems to be a unique, global logger, and I might need to create new loggers instead of copying that one around.

@nikkolasg
Copy link
Collaborator

I might need to adjust to use the proper prefixes in the TransitionNewGroup function, what do you think?

That's correct.
As a suggestion, we could try to think as a general event that the transition is happening, where we can attach actions to it. One action is changing the log. Another one later on could be maybe signalling off-drand (telegram, etc) that the transition happened (thinking about the dashboard, more interactivity, more feedback from drand etc).

@AnomalRoil
Copy link
Member Author

I think your idea @nikkolasg to have an event signalling the transition might be nice, but I couldn't get around the race conditions in the loggers if I let the logger change while go routines relying on the former logger still exist. So I'm doing the index changes "a bit too early" right now, instead of doing them at the time the transition occurs.

log/log.go Outdated Show resolved Hide resolved
@willscott
Copy link
Member

I see this pr removing the beacon_id annotations from individual log lines. Where is the with that adds that annotation back in?

@AnomalRoil
Copy link
Member Author

AnomalRoil commented Mar 25, 2022

It's added to all logs using Named now.

So a Named logger is e.g.:
101.201.13.104:35156.default.1
Means this IP, 101.201.13.104:35156 running the beacon id default and is the node number 1 in its group file.

@AnomalRoil
Copy link
Member Author

So, contrarily to the initial message in this PR, the logs now looks like this:

2022-03-28T11:05:48.778+0200	INFO	127.0.0.1:43953.default.0	beacon/chain.go:169		{"aggregated_beacon": 2}
2022-03-28T11:05:48.785+0200	INFO	127.0.0.1:38459.default.2	beacon/store.go:123		{"NEW_BEACON_STORED": "{ round: 2, sig: aee814, prevSig: 8fee9e }", "time_discrepancy_ms": 785.904003}
2022-03-28T11:05:48.786+0200	DEBUG	127.0.0.1:38459.default.2	beacon/chain.go:127		{"ignoring_partial": 2, "last_beacon_stored": 2}
2022-03-28T11:05:48.786+0200	INFO	127.0.0.1:43953.default.0	beacon/store.go:123		{"NEW_BEACON_STORED": "{ round: 2, sig: aee814, prevSig: 8fee9e }", "time_discrepancy_ms": 786.192475}
2022-03-28T11:05:48.786+0200	DEBUG	127.0.0.1:43953.default.0	beacon/chain.go:127		{"ignoring_partial": 2, "last_beacon_stored": 2}
2022-03-28T11:05:48.788+0200	INFO	127.0.0.1:43787.default.1	beacon/chain.go:169		{"aggregated_beacon": 2}
2022-03-28T11:05:48.792+0200	INFO	127.0.0.1:43787.default.1	beacon/store.go:123		{"NEW_BEACON_STORED": "{ round: 2, sig: aee814, prevSig: 8fee9e }", "time_discrepancy_ms": 792.552396}
2022-03-28T11:05:48.792+0200	DEBUG	127.0.0.1:43787.default.1	beacon/chain.go:127		{"ignoring_partial": 2, "last_beacon_stored": 2}
2022-03-28T11:05:52.657+0200	DEBUG	127.0.0.1:43787.default.1	core/drand_beacon_control.go:932		{"phaser_finished": "deal"}

@willscott
Copy link
Member

note that changing to logging format are breaking changes, because we need all members to update how they're parsing logs to handle the new format.

@AnomalRoil
Copy link
Member Author

@willscott In any case we are breaking the log format with master by moving from Kit logger to Zap.

Here is how the Kit logger currently logs things:

Mar 28 07:50:00 drand.mainnet3.drand drand level=error ts="Mon, 28 Mar 2022 14:50:00 UTC" call=protocol_grpc.pb.go:265 process_partial=10.0.16.7:50538 err="bls: invalid signature" prev_sig=b1f054 curr_round=1768266 msg_sign=0f3afb short_pub=8a7f9
Mar 28 07:50:00 drand.mainnet3.drand drand level=info ts="Mon, 28 Mar 2022 14:50:00 UTC" call=:0 aggregated_beacon=1768266
Mar 28 07:50:00 drand.mainnet3.drand drand level=info ts="Mon, 28 Mar 2022 14:50:00 UTC" call=chain.go:180 NEW_BEACON_STORED="{ round: 1768266, sig: 89155a, prevSig: 8e67fd }" time_discrepancy_ms=817.535187

@AnomalRoil AnomalRoil merged commit c90dc31 into master Mar 29, 2022
@AnomalRoil AnomalRoil deleted the log/With branch March 29, 2022 19:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants