Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

safekeeper: eviction metrics #8348

Merged
merged 3 commits into from
Jul 11, 2024
Merged

safekeeper: eviction metrics #8348

merged 3 commits into from
Jul 11, 2024

Conversation

jcsp
Copy link
Collaborator

@jcsp jcsp commented Jul 10, 2024

Problem

Follow up to #8335, to improve observability of how many evict/restores we are doing.

Summary of changes

  • Add safekeeper_eviction_events_started_total and safekeeper_eviction_events_completed_total, with a "kind" label of evict or restore. This gives us rates, and also ability to calculate how many are in progress.
  • Generalize SafekeeperMetrics test type to use the same helpers as pageserver, and enable querying any metric.
  • Read the new metrics at the end of the eviction test.

Checklist before requesting a review

  • I have performed a self-review of my code.
  • If it is a core feature, I have added thorough tests.
  • Do we need to implement analytics? if so did you add the relevant metrics to the dashboard?
  • If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.

Checklist before merging

  • Do not forget to reformat commit message to not include the above checklist

@jcsp jcsp added c/storage/safekeeper Component: storage: safekeeper a/tech_debt Area: related to tech debt labels Jul 10, 2024
@jcsp jcsp requested a review from arssher July 10, 2024 21:24
@jcsp jcsp changed the title Jcsp/safekeeper eviction metrics safekeeper: eviction metrics Jul 10, 2024
Copy link

github-actions bot commented Jul 10, 2024

3042 tests run: 2927 passed, 0 failed, 115 skipped (full report)


Flaky tests (1)

Postgres 16

  • test_tenant_creation_fails: debug

Code coverage* (full report)

  • functions: 32.7% (6969 of 21344 functions)
  • lines: 50.1% (54797 of 109473 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
4f70485 at 2024-07-11T08:20:19.197Z :recycle:

@jcsp jcsp force-pushed the jcsp/safekeeper-eviction-metrics branch from 94b79d6 to 4f70485 Compare July 11, 2024 07:24
@jcsp jcsp marked this pull request as ready for review July 11, 2024 12:05
@jcsp jcsp requested a review from a team as a code owner July 11, 2024 12:05
@jcsp jcsp merged commit 0159ae9 into main Jul 11, 2024
66 checks passed
@jcsp jcsp deleted the jcsp/safekeeper-eviction-metrics branch July 11, 2024 16:05
skyzh pushed a commit that referenced this pull request Jul 15, 2024
## Problem

Follow up to #8335, to improve
observability of how many evict/restores we are doing.

## Summary of changes

- Add `safekeeper_eviction_events_started_total` and
`safekeeper_eviction_events_completed_total`, with a "kind" label of
evict or restore. This gives us rates, and also ability to calculate how
many are in progress.
- Generalize SafekeeperMetrics test type to use the same helpers as
pageserver, and enable querying any metric.
- Read the new metrics at the end of the eviction test.
jcsp added a commit that referenced this pull request Jul 18, 2024
## Problem

This test would occasionally fail its metric check. This could happen in
the rare case that the nodes had all been restarted before their most
recent eviction.

The metric check was added in
#8348

## Summary of changes

- Check metrics before each restart, accumulate into a bool that we
assert on at the end of the test
problame pushed a commit that referenced this pull request Jul 22, 2024
## Problem

This test would occasionally fail its metric check. This could happen in
the rare case that the nodes had all been restarted before their most
recent eviction.

The metric check was added in
#8348

## Summary of changes

- Check metrics before each restart, accumulate into a bool that we
assert on at the end of the test
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
a/tech_debt Area: related to tech debt c/storage/safekeeper Component: storage: safekeeper
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants