Expose Prometheus metrics per node for current total pins queued or errored #1470
Labels: effort/days, exp/intermediate, need/triage, P1, status/ready
We'd like to be able to scrape totals from each cluster node for pins that are in the `pin_queued` or `pin_error` states, so we can chart them over time and alert on them when the numbers get high.

Right now we run `ipfs-cluster-ctl` manually when someone reports degraded service, and those are the things we check for, so it would be great to be able to automate it.
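
To make the request concrete, here is a rough sketch of the output we have in mind: one gauge per node, labelled by pin state, in the Prometheus text exposition format. The metric name `cluster_pins_by_status`, the `peer` and `status` label names, and the sample data are all made up for illustration; none of this reflects how ipfs-cluster exposes metrics today, and a real implementation would presumably live in the cluster's own metrics code.

```python
from collections import Counter

# Hypothetical metric name, chosen for this sketch only.
METRIC_NAME = "cluster_pins_by_status"
# The two states named in this issue.
TRACKED_STATUSES = ("pin_queued", "pin_error")


def count_pins(statuses):
    """Count how many pins are in each tracked state; other states are ignored."""
    counts = Counter(s for s in statuses if s in TRACKED_STATUSES)
    # Always report every tracked state so that an absent state shows up as 0.
    return {status: counts.get(status, 0) for status in TRACKED_STATUSES}


def render_metrics(peer_name, statuses):
    """Render the counts as gauges in the Prometheus text exposition format.

    Assumes peer_name contains no characters that need escaping in a label value.
    """
    lines = [
        f"# HELP {METRIC_NAME} Current number of pins in a given state on this peer.",
        f"# TYPE {METRIC_NAME} gauge",
    ]
    for status, total in count_pins(statuses).items():
        lines.append(f'{METRIC_NAME}{{peer="{peer_name}",status="{status}"}} {total}')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up per-pin states for one node.
    sample = ["pinned", "pin_queued", "pin_error", "pin_queued"]
    print(render_metrics("node-1", sample), end="")
```

Gauges seem like the right fit because these are current totals that go up and down, not cumulative counts. With something like this exposed on each node, we could chart the values and alert when either stays above a threshold.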