pageserver: jitter secondary periods #7544

Merged (2 commits) on May 3, 2024

Conversation

jcsp
Contributor

@jcsp jcsp commented Apr 29, 2024

Problem

After some time the load from heatmap uploads gets rather spiky. They're unintentionally synchronising.

Chart (does this make a boing sound in anyone else's head?):
image

Summary of changes

  • Add a helper period_jitter and apply a 5% jitter in the downloader and heatmap_uploader when updating the next runtime at the end of an iteration.
  • Refactor the existing places where we pick a startup interval to use period_warmup, so that the intent is obvious.
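
For readers without the diff at hand, here is a minimal sketch of what the two helpers described above might look like. Only the names period_jitter and period_warmup and the 5% figure come from this PR. The signatures, the symmetric jitter range, and the tiny XorShift64 generator (used only so the sketch needs no external crates) are assumptions for illustration, not the merged code.

```rust
use std::time::Duration;

/// Hypothetical stand-in PRNG (xorshift64) so this sketch has no dependencies.
/// The seed must be non-zero. Real code would more likely use a proper RNG crate.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Pseudo-random float in [0, 1), built from the top 53 bits.
    fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Assumed behaviour: scale `period` by a random factor within +/- `pct` percent,
/// so that tasks sharing the same nominal period drift apart over time.
fn period_jitter(period: Duration, pct: u32, rng: &mut XorShift64) -> Duration {
    let spread = f64::from(pct) / 100.0;
    let factor = 1.0 - spread + 2.0 * spread * rng.next_f64();
    period.mul_f64(factor)
}

/// Assumed behaviour: pick the delay before the first run somewhere between
/// zero and one full period, so tasks started together do not all fire at once.
fn period_warmup(period: Duration, rng: &mut XorShift64) -> Duration {
    period.mul_f64(rng.next_f64())
}
```

Under these assumptions, a scheduler would call period_warmup once when a task is first registered, and period_jitter(period, 5, ..) each time it computes the next runtime at the end of an iteration. With a 60 second period, the latter yields delays of roughly 57 to 63 seconds.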

Checklist before requesting a review

  • I have performed a self-review of my code.
  • If it is a core feature, I have added thorough tests.
  • Do we need to implement analytics? if so did you add the relevant metrics to the dashboard?
  • If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.

Checklist before merging

  • Do not forget to reformat commit message to not include the above checklist

@jcsp jcsp added c/storage/pageserver Component: storage: pageserver a/tech_debt Area: related to tech debt labels Apr 29, 2024
@jcsp jcsp requested a review from a team as a code owner April 29, 2024 16:53
@jcsp jcsp requested a review from koivunej April 29, 2024 16:53

github-actions bot commented Apr 29, 2024

2868 tests run: 2747 passed, 0 failed, 121 skipped (full report)


Flaky tests (2)

Postgres 15

  • test_vm_bit_clear_on_heap_lock: debug

Postgres 14

  • test_partial_evict_tenant[relative_spare]: release

Code coverage* (full report)

  • functions: 28.0% (6618 of 23601 functions)
  • lines: 46.7% (47046 of 100674 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
093dcb3 at 2024-05-03T12:46:57.409Z :recycle:

Contributor

@koivunej koivunej left a comment

It's difficult to say whether these changes will actually get rid of the problem, since I don't fully understand the scheduling framework, but I cannot see how they could hurt.

@jcsp jcsp enabled auto-merge (squash) May 3, 2024 11:56
@jcsp jcsp merged commit 8b4dd5d into main May 3, 2024
48 of 49 checks passed
@jcsp jcsp deleted the jcsp/pageserver-secondary-jitter branch May 3, 2024 12:31
conradludgate pushed a commit that referenced this pull request May 8, 2024
## Problem

After some time the load from heatmap uploads gets rather spiky. They're
unintentionally synchronising.

Chart (does this make a _boing_ sound in anyone else's head?):

![image](https://github.com/neondatabase/neon/assets/944640/18829fc8-c5b7-4739-9a9b-491b5d6fcade)


## Summary of changes

- Add a helper `period_jitter` and apply a 5% jitter in the downloader and
heatmap_uploader when updating the next runtime at the end of an
iteration.
- Refactor existing places that we pick a startup interval into
`period_warmup`, so that the intent is obvious.
Labels
a/tech_debt Area: related to tech debt c/storage/pageserver Component: storage: pageserver
2 participants