Per-component metrics #1060

Closed
achingbrain opened this issue Dec 11, 2021 · 1 comment · Fixed by #1061
Labels
need/triage Needs initial labeling and prioritization

Comments

@achingbrain
Member

achingbrain commented Dec 11, 2021

I'd like to get some insight into what's going on inside a libp2p node. At the moment the metrics are a bit limited, only focussing on bandwidth stats for peers and protocols and a few other bits, from what I can see.

It would be really useful to be able to get component-specific metrics to really see what's going on.

How about something like:

libp2p.metrics.updateComponentMetric(component: string, metric: string, value: number)

This would set per-component metrics, eg:

libp2p.metrics.updateComponentMetric('kad-dht', 'active-queries', 1)
libp2p.metrics.updateComponentMetric('dialler', 'pending-dials', 10)
// etc

Then getting metrics out:

libp2p.metrics.getComponentMetrics() => Map<string, Map<string, number>>
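
A minimal sketch of how the two proposed methods could behave, assuming a plain nested `Map` as storage. The class name `ComponentMetricsTracker` is hypothetical and not part of libp2p; only the method names and signatures come from the proposal above.

```typescript
// Hypothetical illustration of the proposed API; not the libp2p implementation.
class ComponentMetricsTracker {
  // component name -> (metric name -> latest value)
  private readonly componentMetrics = new Map<string, Map<string, number>>()

  // Overwrite the latest value of a metric for a component.
  updateComponentMetric (component: string, metric: string, value: number): void {
    let metrics = this.componentMetrics.get(component)

    if (metrics == null) {
      // First metric reported by this component: create its inner map.
      metrics = new Map<string, number>()
      this.componentMetrics.set(component, metrics)
    }

    metrics.set(metric, value)
  }

  // Return all metrics grouped by component.
  getComponentMetrics (): Map<string, Map<string, number>> {
    return this.componentMetrics
  }
}

const tracker = new ComponentMetricsTracker()
tracker.updateComponentMetric('kad-dht', 'active-queries', 1)
tracker.updateComponentMetric('dialler', 'pending-dials', 10)
tracker.updateComponentMetric('dialler', 'pending-dials', 7) // replaces 10
```

Each call stores only the latest value, which matches the idea of leaving averaging and history to the external tool.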

We'd then report these metrics in the /debug/metrics/prometheus endpoint in ipfs-http-api-server, which would allow graphing these stats over time when IPFS_MONITORING is set in the environment, similar to how we do with the number of connected peers.
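
As a rough sketch of what reporting could look like, the nested map can be flattened into the Prometheus text exposition format, with every value exposed as a gauge. The function names, the `libp2p_` prefix and the naming scheme are assumptions for illustration, not what the endpoint actually emits.

```typescript
// Hypothetical helpers; the `libp2p_` prefix and naming scheme are assumptions.
function toPrometheusName (component: string, metric: string): string {
  // Prometheus metric names may only contain [a-zA-Z0-9_:],
  // so characters such as '-' are replaced with '_'.
  return `libp2p_${component}_${metric}`.replace(/[^a-zA-Z0-9_:]/g, '_')
}

function formatComponentMetrics (componentMetrics: Map<string, Map<string, number>>): string {
  const lines: string[] = []

  componentMetrics.forEach((metrics, component) => {
    metrics.forEach((value, metric) => {
      const name = toPrometheusName(component, metric)
      // Each metric is a point-in-time value, so report it as a gauge.
      lines.push(`# TYPE ${name} gauge`)
      lines.push(`${name} ${value}`)
    })
  })

  return lines.join('\n') + '\n'
}
```

With this scheme, the `kad-dht` / `active-queries` example would be reported as `libp2p_kad_dht_active_queries 1`.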

I don't think we need moving averages, or anything fancy like that - the tool we use to examine the stats can figure all of that out.

@achingbrain achingbrain added the need/triage Needs initial labeling and prioritization label Dec 11, 2021
achingbrain added a commit that referenced this issue Dec 13, 2021
Implements the idea from #1060 - allows us to get some insight into
what's happening in a libp2p node outside of just bandwidth stats.
@achingbrain achingbrain linked a pull request Dec 13, 2021 that will close this issue
@vasco-santos
Member

I think this is super valuable! For further iteration, we should look into getting this closer to what we want to have with https://github.com/libp2p/observer-toolkit/tree/master/packages/proto

During a previous hack week, I did a small POC in the daemon libp2p/js-libp2p-daemon#45 and I think this will likely be something we want to use in the daemon. But there are useful things that we should extract into the libp2p repo that would make it easier to consume things in IPFS and the observer toolkit.

achingbrain added a commit that referenced this issue Dec 15, 2021
Implements the idea from #1060 - allows us to get some insight into what's happening in a libp2p node outside of just bandwidth stats.

Configures a few default metrics if metrics are enabled - current connections, the state of the dial queue, etc.

Also makes the `Metrics` class not depend on the `ConnectionManager` class, otherwise we can't collect simple metrics from the connection manager class due to the circular dependency.