Expose prometheus metrics per node for current total pins queued or errored #1470

olizilla · 2021-09-15T09:50:59Z

We'd like to be able to scrape totals from each cluster node for pins that are in pin_queued or pin_error states, so we can chart them over time and alert on them when the numbers get high.

Right now we run ipfs-cluster-ctl manually when someone reports degraded service, and the queued and errored pin counts are what we check, so it would be great to automate that.
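To make the request concrete, here is a rough sketch of the kind of output we'd want to scrape from each node. It tallies pin states and renders them as gauges in the Prometheus text exposition format. The metric name `cluster_pins`, the `peer` and `state` labels, and the function `render_pin_gauges` are all made up for illustration; whatever the cluster ends up exporting may look different.

```python
from collections import Counter
from typing import Iterable

# Hypothetical metric name, not something ipfs-cluster is known to export.
METRIC = "cluster_pins"
# The two states named in this issue.
TRACKED = ("pin_queued", "pin_error")


def render_pin_gauges(statuses: Iterable[str], peer: str) -> str:
    """Render per-state pin counts as Prometheus text-format gauges.

    `statuses` is one status string per pin tracked on this peer;
    `peer` is an illustrative label value identifying the node.
    """
    counts = Counter(statuses)
    lines = [
        f"# HELP {METRIC} Current number of pins in a given state on this peer.",
        f"# TYPE {METRIC} gauge",
    ]
    for state in TRACKED:
        # Counter returns 0 for states with no pins, so every tracked
        # state always gets a sample and charts have no gaps.
        lines.append(f'{METRIC}{{peer="{peer}",state="{state}"}} {counts[state]}')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = ["pinned", "pin_queued", "pin_queued", "pin_error"]
    print(render_pin_gauges(sample, peer="node-a"), end="")
```

Using gauges rather than counters matches the "current total" wording above: the values should go back down as the queue drains or errored pins recover.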

olizilla · 2022-04-22T14:38:14Z

🎉 nice!

olizilla added the need/triage (Needs initial labeling and prioritization) label Sep 15, 2021

hsanjuan added the effort/days (Estimated to take multiple days, but less than a week), exp/intermediate (Prior experience is likely helpful), P1 (High: Likely tackled by core team if no one steps up), and status/ready (Ready to be worked) labels Sep 15, 2021

hsanjuan added this to the Release v1.0.0 milestone Apr 22, 2022

hsanjuan added a commit that referenced this issue Apr 22, 2022

metrics: track total pins, queued, pinning, pin error.

3169fba

This fixes #1470 and #1187.

hsanjuan mentioned this issue Apr 22, 2022

metrics: track total pins, queued, pinning, pin error. #1637

Merged

hsanjuan closed this as completed in #1637 Apr 22, 2022