feature: add metrics for chunk and system storage space #360
Comments
@joshuef I'm replying here to your request for feedback on logfile messages, to keep things in one place. The following are the message strings I am currently matching for the given metrics, while the OP suggests additional metrics that it may be useful to have but which we did have at one stage. PUT: "Wrote record to disk" I have a crude mechanism for categorising node status as: Stopped|Connecting|Connected|Disconnected. I don't know if a node could write a more definitive state message to the logfile periodically (not just on change), but if so that would improve accuracy. I'm not sure what would be helpful to add in other areas both for devs, if you use Of course I'm open to any suggestions for things that your team would like.
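To make the string-matching approach above concrete, here is a minimal sketch of how a log watcher might count PUTs and derive a coarse status. Only the "Wrote record to disk" marker comes from this thread; the other marker strings, the `NodeMetrics` type and the `ingest_line` method are hypothetical names invented for illustration, not what `vdash` or `safenode` actually use.

```rust
// Sketch only. PUT_MARKER is the string quoted in the comment above; the
// other three markers are HYPOTHETICAL placeholders, not real safenode output.
const PUT_MARKER: &str = "Wrote record to disk";
const CONNECTED_MARKER: &str = "HYPOTHETICAL connected marker";
const DISCONNECTED_MARKER: &str = "HYPOTHETICAL disconnected marker";
const STOPPED_MARKER: &str = "HYPOTHETICAL stopped marker";

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum NodeStatus {
    Stopped,
    Connecting,
    Connected,
    Disconnected,
}

#[derive(Debug)]
struct NodeMetrics {
    puts: u64,
    status: NodeStatus,
}

impl NodeMetrics {
    fn new() -> Self {
        // Assumption: until a marker says otherwise, treat the node as connecting.
        Self { puts: 0, status: NodeStatus::Connecting }
    }

    /// Update the counters and status from a single logfile line.
    fn ingest_line(&mut self, line: &str) {
        if line.contains(PUT_MARKER) {
            self.puts += 1;
        }
        if line.contains(CONNECTED_MARKER) {
            self.status = NodeStatus::Connected;
        } else if line.contains(DISCONNECTED_MARKER) {
            self.status = NodeStatus::Disconnected;
        } else if line.contains(STOPPED_MARKER) {
            self.status = NodeStatus::Stopped;
        }
    }
}
```

Because the status is inferred only from lines that happen to appear, it can go stale between state changes, which is why a periodic state message would improve accuracy.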
Would be hard, as they're all records now (hard as in reading from disk). So I'd be inclined to just leave that to a sys level check of the We have eg: https://github.com/maidsafe/safe_network/blob/main/sn_node/src/log_markers.rs#L21 Connecting would just be everything before that. Though perhaps we can add an initial message when we start attempting to connect to the first peers (could be done via #518)
Hmm, outwith connecting/stopped it would (in theory) just always be connected. Not sure if that's that useful? (Or are you imagining more states? We have some can't say we as a team use vdash as yet. Everything is headless, and we're just looking at grabbing basic stats of nodes to determine any major issues that may be in play thus far.
Yes, whatever is easy and useful. For state (starting, connecting(ed), etc.) I'm envisaging it as a proxy for "things seem to be ok, or not": losing connectivity, being stopped, or anything else that might reasonably happen that the operator might want to know about. So while connected, a periodic "all ok" type message could be logged, and I would flag it as a problem if this wasn't seen for too long, as well as showing any other states beforehand. I'm really not sure what is best here, but I think it is useful to have something in the dashboard that the operator can look at and instantly go, "oh, that's not right". Showing number of peers sounds good. Any other suggestions welcome, as I don't spend much time analysing logs or thinking about this ATM. I can work with what we have, but wanted to see if you thought it worthwhile exposing more general state-like info. Thanks for looking at this.
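As a rough illustration of the "flag it if the all-ok message hasn't been seen for too long" idea, the check could look something like the sketch below. No such periodic message exists yet as far as this thread shows, so `HeartbeatWatch`, `Health` and the timeout value are all hypothetical.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, PartialEq, Eq)]
enum Health {
    Ok,
    Stale,
}

/// Tracks when a (hypothetical) periodic "all ok" log message was last seen.
struct HeartbeatWatch {
    last_seen: Option<Instant>,
    max_silence: Duration,
}

impl HeartbeatWatch {
    fn new(max_silence: Duration) -> Self {
        Self { last_seen: None, max_silence }
    }

    /// Call this whenever the "all ok" message is matched in the logfile.
    fn record(&mut self, at: Instant) {
        self.last_seen = Some(at);
    }

    /// Stale if no heartbeat has ever been seen, or the last one is too old.
    fn health(&self, now: Instant) -> Health {
        match self.last_seen {
            Some(t) if now.saturating_duration_since(t) <= self.max_silence => Health::Ok,
            _ => Health::Stale,
        }
    }
}
```

Passing `now` in explicitly, rather than calling `Instant::now()` inside `health`, keeps the check easy to exercise with made-up timestamps.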
For the mo, I think the kbucket logs are a proxy there. If we lose everything / are in decline, something is up. As we get to know the network, the kbucket may fluctuate a bit, but really should not be decreasing in peer count. Anyone out should be replaced as long as the network is healthy eg.
So I could display peer count and a max peer count, maybe turning red on some condition. What would you suggest?
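Purely as a strawman for discussion, one possible "red" condition is the current peer count dropping below some fraction of the highest count seen so far. The `PeerTracker` type and the `min_fraction` threshold below are hypothetical; nothing in this thread settles what the right condition is.

```rust
/// Tracks the latest and the highest peer count parsed from the logs.
struct PeerTracker {
    current: usize,
    max_seen: usize,
}

impl PeerTracker {
    fn new() -> Self {
        Self { current: 0, max_seen: 0 }
    }

    /// Record a freshly observed peer count.
    fn update(&mut self, peers: usize) {
        self.current = peers;
        self.max_seen = self.max_seen.max(peers);
    }

    /// True when the current count has fallen below `min_fraction` of the
    /// maximum seen so far. Never alarms before any peers have been seen.
    fn is_alarming(&self, min_fraction: f64) -> bool {
        self.max_seen > 0 && (self.current as f64) < (self.max_seen as f64) * min_fraction
    }
}
```

A relative threshold tolerates the small fluctuations mentioned above while still catching a sustained decline; an absolute minimum would be a reasonable alternative.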
There is a new |
The logs used to include the following metrics, which I displayed in `vdash` and think would be useful to have again. So I wonder if they can be added to the metrics module in `safenode`:
I can't get these from the local system because there may be multiple `safenode` processes running, and `vdash` monitors multiple nodes by displaying any one of any number of available `safenode.log` files.