chore: improve tracing in network monitor #1366
@bobbinth @Mirko-von-Leipzig
Not sure if the plan is still to merge this and patch release it, but LGTM! Sorry for the delay in the review.
Not super familiar with the frequencies of some processes. Doesn't seem like these changes would be very spammy but it's probably something to watch out for.
debug!(target: COMPONENT, "Initializing monitoring tasks");

// Initialize the RPC Status endpoint checker task.
debug!(target: COMPONENT, "Initializing RPC status checker");
nit: Emitting 2 events here feels a bit redundant/unnecessary
I removed the first one. For the debug! level I left one log per component.
    level = "info",
    ret(level = "debug")
)]
pub(crate) async fn check_remote_prover_status(
Something that could be nice for troubleshooting is to raise info events for service status changes. Something like
let mut last = Status::Unknown;
// ...
let status = check_remote_prover_status(...).await;
if status.status != last {
info!(target: COMPONENT, prover = %name, status = ?status.status, "Remote prover status changed");
last = status.status;
}
But maybe this is already easily filterable with the current traces.
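For reference, the status-change idea above could be sketched roughly as follows. This is only a std-only illustration, not the monitor's actual code: `Status` and `StatusTracker` are hypothetical names, and `println!` stands in for the `info!` event so the snippet runs without the `tracing` crate.

```rust
// Hypothetical stand-in for the monitor's status type; the real enum may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Unknown,
    Healthy,
    Unhealthy,
}

/// Remembers the last observed status and reports transitions only.
struct StatusTracker {
    last: Status,
}

impl StatusTracker {
    fn new() -> Self {
        Self { last: Status::Unknown }
    }

    /// Returns `Some((previous, current))` when the status changed, `None` otherwise.
    fn observe(&mut self, current: Status) -> Option<(Status, Status)> {
        if current == self.last {
            return None;
        }
        let previous = std::mem::replace(&mut self.last, current);
        Some((previous, current))
    }
}

fn main() {
    let mut tracker = StatusTracker::new();
    // Simulated sequence of check results for one prover.
    for status in [Status::Healthy, Status::Healthy, Status::Unhealthy] {
        if let Some((from, to)) = tracker.observe(status) {
            // In the monitor this would be an `info!` event rather than `println!`.
            println!("Remote prover status changed: {from:?} -> {to:?}");
        }
    }
}
```

With this shape, repeated identical results stay at the existing `debug!` level and only transitions (for example healthy to unhealthy) would surface as `info!` events.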
In line 443 we have this:
debug!(target: COMPONENT, prover_name = %name, remote_prover_status = ?status, "Remote prover status check successful");
Is that what you meant?
Mostly meant emitting events when the status changes (so for example, when a service goes from healthy to unhealthy or vice versa). It feels like it could be useful for troubleshooting, but maybe not.
Probably, but the idea of the monitor is not troubleshooting; it is to give a quick glance at the overall state of the components.
Yeah, maybe now we want to change the base to
Maybe this can be controlled by changing the sample rate if we decide that we need to reduce it? That would probably require some minor changes in the utils crate.
This PR uses `main` as base branch.

Improves the instrumentation in the `miden-network-monitor` binary.

This is the current view of the traces (in Jaeger):