Skip to content
This repository has been archived by the owner on Apr 18, 2024. It is now read-only.

Revamp caboose metrics #41

Merged
merged 11 commits into from
Feb 24, 2023
Merged

Conversation

aarshkshah1992
Copy link
Contributor

fixes #34

@aarshkshah1992 aarshkshah1992 marked this pull request as draft February 22, 2023 13:52
Base automatically changed from feat/downvote-based-return-codes to lock-cleanup February 22, 2023 17:44
Copy link
Contributor

@lidel lidel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @aarshkshah1992, some initial feedback

metrics.go Outdated Show resolved Hide resolved
metrics.go Outdated Show resolved Hide resolved
metrics.go Outdated Show resolved Hide resolved
pool.go Outdated Show resolved Hide resolved
metrics.go Outdated Show resolved Hide resolved
pool.go Outdated Show resolved Hide resolved
@aarshkshah1992 aarshkshah1992 changed the title [WIP] Revamp caboose metrics Revamp caboose metrics Feb 23, 2023
@aarshkshah1992 aarshkshah1992 marked this pull request as ready for review February 23, 2023 13:22
@aarshkshah1992
Copy link
Contributor Author

@lidel Have addressed your review and added all the remaining metrics enlisted in #34.

This will make things easier when we have to add cars.
This gives us more useful buckets (same width)
It is a good practice to avoid defining buckets with values that may
change. By hardcoding bucket definitions this way, we avoid breaking
grafana boards when someone decides to adjust a timeout.

If we ever need to change buckets, or add new one with bigger max,
but keep old ones intact, we can define them by hand.
Copy link
Contributor

@lidel lidel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @aarshkshah1992!

I've pushed cosmetic changes:

  • small renames to make it clear which metrics are about blocks. this is in preparation for future CAR/graph fetches (Support non-block requests #45) that will be done in addition to block ones
  • linear buckets for size (we will most likely see majority being ~256 KiB and ~1 MiB blocks on heatmap)
  • some precautions to not break grafana boards using histograms (see ℹ️ below)

Comment on lines +8 to +9
// Size buckets from 256 KiB (default chunk in Kubo) to 4MiB (maxBlockSize), 256 KiB wide each
blockSizeHistogram = prometheus.LinearBuckets(262144, 262144, 16)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ℹ️ This gives us more useful buckets (same width), and we can plot them in Grafana using heatmap widget

Comment on lines +11 to +17
// TODO: Speed max bucket could use some further refinement,
// for now we don't expect speed being bigger than transfering 4MiB (max block) in 500ms
speedHistogram = prometheus.ExponentialBucketsRange(1, 4194304/500, 20)

// Duration max bucket is informed by the timeouts per block and per peer request/retry
durationPerBlockHistogram = prometheus.ExponentialBucketsRange(1, 60000, 10)
durationPerBlockPerPeerHistogram = prometheus.ExponentialBucketsRange(1, 20000, 10)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ℹ️ I've just remembered that it is a good practice to avoid defining buckets with values that may
change. By hardcoding bucket definitions this way, we avoid breaking grafana boards when someone decides to adjust a timeout.

If we ever need to change buckets, or add new one with bigger max, but keep old ones intact, we can define them as explicit list of values (like we did with legacy ones in kubo a while ago):

[]float64{0.05, 0.1, 0.25, 0.5, 1, 2, 5, 10, 30, 60}

@willscott willscott merged commit 8b0465e into lock-cleanup Feb 24, 2023
@willscott willscott deleted the feat/metrics-collection-work branch February 24, 2023 02:33
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants