Define initial metrics to export to Prometheus as MVP #5
Comments
Since Prometheus will only gather numeric metrics, there are some things to consider when modeling the metrics.
Should be a label attached to all metrics.
Should also be a label, one for each node we get (with the nodes being the top-level objects in the dict we are currently fetching).
This would happen outside of the exporter, either in Prometheus through Alertmanager, or in Grafana.
Should work as
Here I wonder how the values are measured. Ideally, we could just record the total requests in a Gauge and let Prometheus infer the other metrics. Otherwise, histograms for throughput might be fine; we just have to be careful to avoid statistically incorrect double aggregations, such as averaging values that are already averages.
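To illustrate what "let Prometheus infer the other metrics" could mean, here is a rough sketch of deriving a per-second throughput from two scrapes of a cumulative request total. The function name and the sample layout are assumptions made for this example only. Prometheus' own `rate()` does more than this (it handles counter resets and extrapolates to the window boundaries), so this only shows the basic idea.

```python
def per_second_rate(samples):
    """Approximate throughput from cumulative totals.

    `samples` is a list of (timestamp_seconds, cumulative_total) pairs,
    oldest first. This layout is hypothetical, not the exporter's format.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("timestamps must be increasing")
    # Difference of the totals divided by the elapsed time.
    return (v1 - v0) / (t1 - t0)
```

If the exporter only publishes the running total, averages over any window can be computed at query time, which avoids aggregating pre-averaged values a second time.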
Clearly a gauge with a label per node.
Diff against time of the and record as
This could be a label, same as the node name.
Gauge with pool name as label.
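As a rough sketch of how gauges with network and node labels might look once exported, the snippet below renders samples in the Prometheus text exposition format using only the standard library. The metric name `indy_node_uptime_seconds`, the label names `network` and `node`, and the helper functions are all hypothetical and chosen for illustration; a real exporter would more likely use a Prometheus client library than build the text by hand.

```python
def escape_label_value(value):
    # Label values must escape backslash, double quote, and newline.
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_gauge(name, help_text, samples):
    """Render one gauge family as Prometheus exposition text.

    `samples` is a list of (labels_dict, numeric_value) pairs.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        label_str = ",".join(
            f'{key}="{escape_label_value(str(val))}"'
            for key, val in sorted(labels.items())
        )
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical usage: one sample per node, network name attached to every sample.
text = render_gauge(
    "indy_node_uptime_seconds",
    "Node uptime in seconds (illustrative metric name)",
    [
        ({"network": "testnet", "node": "Node1"}, 120),
        ({"network": "testnet", "node": "Node2"}, 95),
    ],
)
```

Keeping the network and node names as labels rather than baking them into metric names lets Prometheus or Grafana filter and aggregate across nodes with a single query.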
One question regarding freshness status: When I have a test network with 4 nodes, I get 3 freshness values, as you have posted above:
What do these numbers used as keys (0, 1, 2) represent, and how should we interpret them?
Refactor PoolCollection
These metrics should be available on the auto-provisioned dashboards supplied with the monitoring stack. If anything else is needed or anything is missing, a separate issue can be opened.
Prometheus MVP Metrics
The Sovrin Network name being monitored
Should be able to get this from the pool being connected to
Node alias name
Detect when a node is inaccessible and produce standard output for that situation.
Should generate a timeout when trying to pull validator_info from inaccessible nodes.
Detect any nodes that are accessible but that are "unreachable" to some or all of the other Indy nodes.
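The timeout requirement above could be approached roughly as follows: wrap each per-node fetch in a timeout and turn the outcome into a numeric 0/1 value that Prometheus can scrape. This is only a sketch. The `fetch` coroutine stands in for whatever actually pulls validator_info from a node, and the names `probe_node`, `probe_all`, and the `up` field are assumptions made for this example.

```python
import asyncio


async def probe_node(fetch, node, timeout_s=5.0):
    """Call the (hypothetical) fetch coroutine with a timeout.

    Returns a dict with up=1 and the fetched info on success,
    or up=0 and info=None when the node does not answer in time.
    """
    try:
        info = await asyncio.wait_for(fetch(node), timeout=timeout_s)
        return {"node": node, "up": 1, "info": info}
    except asyncio.TimeoutError:
        return {"node": node, "up": 0, "info": None}


async def probe_all(fetch, nodes, timeout_s=5.0):
    # Probe all nodes concurrently so one slow node does not delay the rest.
    return await asyncio.gather(
        *(probe_node(fetch, node, timeout_s) for node in nodes)
    )
```

A node that times out would then show up as a gauge value of 0 for that node's label, giving a standard output for the inaccessible case. Detecting nodes that are accessible but unreachable from their peers would need the contents of validator_info itself and is not covered by this sketch.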