
Add memory pooling for chunks fetched from cache when fine-grained caching is enabled #4255

Merged: 1 commit into main on Mar 7, 2023

Conversation

pracucci
Collaborator

@pracucci pracucci commented Feb 20, 2023

What this PR does

We've just rolled out the fine-grained chunks fetching and caching in zone-a of a dev cluster, and all metrics look better than in the other two zones (where the new feature is disabled). This is good!

However, I noticed that the allocated bytes/sec in zone-a are a bit higher than in the other zones. After some profiling, it looks like the main difference comes from memcache.Client.GetMulti():

[Screenshot: allocation profile highlighting memcache.Client.GetMulti()]

The old implementation uses a memory pool for the data fetched from the cache, while the new one doesn't. I think (though I'm not sure) that @dimitarvdimitrov experimented with it and didn't see any real benefit. However, I would like to challenge that and experiment again with a memory pool (see my test results in #4255 (comment)).
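To illustrate the idea, here is a minimal sketch of slab-based pooling for bytes fetched from the cache: many small values are carved out of a few large, reused slabs, so each cache hit no longer triggers its own heap allocation. This is an assumption-laden sketch (the `slabPool` type and its methods are invented for illustration), not Mimir's actual pool implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// slabPool hands out byte sub-slices carved from larger, pooled slabs.
// Hypothetical type for illustration; not the real Mimir pool.
type slabPool struct {
	slabSize int
	pool     *sync.Pool // reuses slabs across requests
	slabs    []*[]byte  // slabs handed out for the current request
}

func newSlabPool(slabSize int) *slabPool {
	return &slabPool{
		slabSize: slabSize,
		pool: &sync.Pool{New: func() any {
			b := make([]byte, 0, slabSize)
			return &b
		}},
	}
}

// get returns a slice of length n carved from the current slab,
// taking a fresh slab from the pool when the current one is full.
func (p *slabPool) get(n int) []byte {
	if n > p.slabSize {
		return make([]byte, n) // too big for a slab: allocate directly
	}
	if len(p.slabs) == 0 || len(*p.slabs[len(p.slabs)-1])+n > p.slabSize {
		slab := p.pool.Get().(*[]byte)
		*slab = (*slab)[:0]
		p.slabs = append(p.slabs, slab)
	}
	slab := p.slabs[len(p.slabs)-1]
	off := len(*slab)
	*slab = (*slab)[:off+n]
	return (*slab)[off : off+n]
}

// release returns all slabs to the pool. Slices previously returned
// by get must not be used after this point.
func (p *slabPool) release() {
	for _, slab := range p.slabs {
		p.pool.Put(slab)
	}
	p.slabs = nil
}

func main() {
	p := newSlabPool(1024)
	a := p.get(100) // e.g. copy of one cached chunk's bytes
	b := p.get(200) // second chunk shares the same slab
	fmt.Println(len(a), len(b), len(p.slabs)) // prints: 100 200 1
	p.release()
}
```

The trade-off, discussed below in the review, is that whole slabs are held for the lifetime of a request, so the slab size has to match typical request sizes.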

Which issue(s) this PR fixes or relates to

N/A

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@pracucci
Collaborator Author

I've deployed it for 75 minutes in a dev cluster under normal load (no load testing running):

  • zone-a: main version, fine-grained chunks caching enabled
  • zone-b: this PR, fine-grained chunks caching enabled

[Screenshot: memory metrics comparing zone-a (main) and zone-b (this PR)]

We can also see the impact of the memory pooling looking at continuous profiling:

[Screenshot: continuous profiling showing reduced allocations with pooling]

@pracucci pracucci marked this pull request as ready for review February 20, 2023 15:37
@pracucci pracucci requested a review from a team as a code owner February 20, 2023 15:37
Member

@pstibrany pstibrany left a comment


All PRs should come with graphs.

@pracucci
Collaborator Author

Since I'm not in a hurry, I will wait for Dimitar to come back from PTO. I would like his opinion too.

Contributor

@dimitarvdimitrov dimitarvdimitrov left a comment


LGTM

The problem I saw when using a pool was that, for requests returning only a few series, the slab size was too large and caused extra heap usage. But for most of that time I was running with too large a slab size (1.6 MB).

In the scenario you ran, the requests returned between 200 and 400 series on average. I think that qualifies as few series.
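The heap-usage problem described above can be sketched with some back-of-the-envelope arithmetic: a request holds whole slabs until it releases them, so a request fetching only a few small chunks still pins an entire slab. This is an illustrative sketch under assumed numbers (the `slabOverhead` helper and the 256-byte chunk size are invented for illustration; only the 1.6 MB slab size comes from the comment above).

```go
package main

import "fmt"

// slabOverhead computes how many bytes a request holds (whole slabs)
// versus how many it actually uses. Hypothetical helper for illustration.
func slabOverhead(slabSize, chunkSize, numChunks int) (held, used int) {
	perSlab := slabSize / chunkSize               // chunks that fit in one slab
	slabs := (numChunks + perSlab - 1) / perSlab  // slabs needed (ceiling)
	return slabs * slabSize, numChunks * chunkSize
}

func main() {
	// A small request: 10 chunks of 256 bytes against a 1.6 MB slab.
	// The whole 1.6 MB slab is held while only 2,560 bytes are used.
	held, used := slabOverhead(1_600_000, 256, 10)
	fmt.Printf("held=%d used=%d utilization=%.4f%%\n",
		held, used, 100*float64(used)/float64(held))
}
```

With requests in the 200-400 series range (and several chunks per series), utilization of each slab improves substantially, which is consistent with the benchmark results reported above.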

Resolved review threads: CHANGELOG.md, pkg/storegateway/series_chunks.go (2)
Add memory pooling for chunks fetched from cache when fine-grained caching is enabled

Signed-off-by: Marco Pracucci <marco@pracucci.com>
Co-authored-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
@pracucci pracucci force-pushed the add-pooling-to-chunks-fetched-from-cache branch from 3094e58 to 6f7e652 Compare March 7, 2023 15:55
@pracucci pracucci enabled auto-merge (squash) March 7, 2023 16:02
@pracucci pracucci merged commit 7179166 into main Mar 7, 2023
@pracucci pracucci deleted the add-pooling-to-chunks-fetched-from-cache branch March 7, 2023 16:13