
store-gateway: Use a pooling memory allocator for loading chunks from memcached #3772

Closed · 4 of 7 tasks
56quarters opened this issue Dec 19, 2022 · 7 comments

@56quarters (Contributor) commented Dec 19, 2022

Store-gateways spend a significant portion of their time reading data from memcached. Due to the way the memcached client we use works, each read requires the client to allocate enough space for the data being read. This memory is then thrown away after the request finishes. That creates a lot of garbage, and a lot of extra GC work, for memory that is only used briefly.

Modeling our approach on the work done in #3756, we should attempt to pool the memory used for reading from the chunks cache, to reduce the garbage generated and improve request speed.

From a dev cell, ~1/3 of memory allocated is by the memcached client:
[Screenshot: Phlare memory profile from fire-dev-001 (Grafana Explore), showing memcached client allocations]
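To make the idea concrete, a pooling allocator along these lines could hand out reusable buffers for cache reads instead of allocating fresh ones per read. This is a minimal sketch with hypothetical names, not Mimir's actual implementation:

```go
package chunkspool

import "sync"

// Allocator hands out byte buffers of at least the requested size and
// takes them back once the caller is finished with them.
type Allocator interface {
	Get(size int) *[]byte
	Put(buf *[]byte)
}

// pooledAllocator is a sync.Pool-backed Allocator. Buffers are reused
// across cache reads, so short-lived results generate far less garbage.
type pooledAllocator struct {
	pool sync.Pool
}

func NewPooledAllocator() Allocator {
	return &pooledAllocator{
		pool: sync.Pool{
			New: func() interface{} {
				// 16 KiB starting capacity is an arbitrary assumption.
				buf := make([]byte, 0, 16*1024)
				return &buf
			},
		},
	}
}

func (a *pooledAllocator) Get(size int) *[]byte {
	buf := a.pool.Get().(*[]byte)
	if cap(*buf) < size {
		// Buffer too small for this read; grow it.
		*buf = make([]byte, size)
	}
	*buf = (*buf)[:size]
	return buf
}

func (a *pooledAllocator) Put(buf *[]byte) {
	a.pool.Put(buf)
}
```

Storing `*[]byte` rather than `[]byte` avoids allocating a new slice header each time a buffer is placed back into the interface-typed pool.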



@56quarters (Contributor, Author) commented Dec 19, 2022

Some initial notes:

  • It makes sense for the allocator to be passed to caching methods as a parameter: the caching layer doesn't know about the lifecycle of the objects it's working with, and adding cleanup methods to the caching layer (à la .Put([]byte)) would be hugely invasive. See the sketch after this list.
  • The chunks memcached instances do far more reads (in terms of bandwidth) than the index, results, or metadata caches, so we should focus on the chunks cache first within Mimir.
  • This will require changes in the gomemcache client and the dskit wrappers.
  • We should probably remove the pooling added to gomemcache in grafana/gomemcache#5 to avoid confusion. Better to have only a single way of doing something.
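As a rough sketch of that first point (hypothetical names, not the actual dskit API), the cache read interface would accept the allocator per call and never own the resulting buffers:

```go
package cache

import "context"

// Allocator is the hypothetical interface the caller supplies. The
// cache layer allocates result buffers through it but never frees
// them; lifecycle control stays with the caller.
type Allocator interface {
	Get(size int) *[]byte
}

// Cache sketches a read interface that takes the allocator as a
// per-call parameter instead of exposing a Put([]byte)-style cleanup
// method on the caching layer itself.
type Cache interface {
	GetMulti(ctx context.Context, keys []string, alloc Allocator) map[string][]byte
}
```

Because the cache layer only ever calls Get, the caller remains free to decide when, and whether, buffers go back to a pool.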

@charleskorn (Contributor):

> We should probably remove the pooling added to gomemcache in grafana/gomemcache#5 to avoid confusion. Better to have only a single way of doing something.

Are you thinking we remove the Pool field from Client and instead pass a Pool (or Allocator or whatever we call it) instance to Get(), GetMulti() etc.? Or something else?

If the former, how many other projects use this client? Do we care about making a breaking change like this?
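To make the question concrete, the two shapes being contrasted might look like this (hypothetical names and signatures, not the real gomemcache API):

```go
package memcache

// BytesPool stands in for the pooling interface added in
// grafana/gomemcache#5; the name and methods here are assumptions.
type BytesPool interface {
	Get(size int) *[]byte
	Put(buf *[]byte)
}

// Current shape: pooling is configured once, as a field on the client,
// and buffers have to find their way back to the client's pool.
type ClientWithPoolField struct {
	Pool BytesPool
}

// Proposed shape: no Pool field; the allocator travels with each call,
// so the caller decides when the memory is reclaimed.
type ClientWithPerCallAllocator interface {
	GetMulti(keys []string, alloc BytesPool) (map[string][]byte, error)
}
```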

@pracucci (Collaborator):

> We should probably remove the pooling added to gomemcache in grafana/gomemcache#5 to avoid confusion. Better to have only a single way of doing something.

We should talk to Bryan and other projects using that fork (Mimir currently uses an older commit that doesn't include it). I think that style of pooling would be much harder to use for the Mimir case than passing the allocator directly to GetMulti() as a functional parameter.

@56quarters (Contributor, Author):

> > We should probably remove the pooling added to gomemcache in grafana/gomemcache#5 to avoid confusion. Better to have only a single way of doing something.

> We should talk to Bryan and other projects using that fork (Mimir currently uses an older commit that doesn't include it). I think that style of pooling would be much harder to use for the Mimir case than passing the allocator directly to GetMulti() as a functional parameter.

Yeah, let's definitely talk to the other projects using this. I'm not sure anyone at Grafana is using the pooling behavior yet, so it might be pretty easy to remove without disruption.

My $0.02 is that the lifecycle of the memory being used for cache results is very request-oriented, so it makes more sense to have the caller control the lifecycle of the memory instead of needing to return pooled bytes to the memcached client. The client will end up hidden behind many layers of abstraction (dskit wrappers, store-gateway wrappers). Additionally, passing the allocator (or whatever we call it) per call allows us to do arena-style memory management, as sketched below.
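For a concrete picture of that request-oriented lifecycle, here is a minimal arena-style sketch with hypothetical names, assuming the per-call-allocator API described above: the request handler owns one allocator and releases every buffer in a single call when the request completes.

```go
package main

import "sync"

// arenaAllocator is a hypothetical request-scoped allocator: it tracks
// every buffer it hands out and returns all of them to a shared pool
// in a single Release call when the request finishes.
type arenaAllocator struct {
	pool *sync.Pool
	used []*[]byte
}

func newArenaAllocator(pool *sync.Pool) *arenaAllocator {
	return &arenaAllocator{pool: pool}
}

func (a *arenaAllocator) Get(size int) *[]byte {
	buf := a.pool.Get().(*[]byte)
	if cap(*buf) < size {
		*buf = make([]byte, size)
	}
	*buf = (*buf)[:size]
	a.used = append(a.used, buf)
	return buf
}

// Release returns every outstanding buffer at once; the memcached
// client never has to hand buffers back individually.
func (a *arenaAllocator) Release() {
	for _, buf := range a.used {
		a.pool.Put(buf)
	}
	a.used = nil
}

// Typical use in a request handler (bufPool and chunksCache are
// assumed to exist elsewhere):
//
//	alloc := newArenaAllocator(bufPool)
//	defer alloc.Release()
//	hits := chunksCache.GetMulti(ctx, keys, alloc)
```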

@pracucci (Collaborator):

> My $0.02 is that the lifecycle of the memory being used for cache results is very request-oriented, so it makes more sense to have the caller control the lifecycle of the memory instead of needing to return pooled bytes to the memcached client.

💯

@bboreham (Contributor):

This idea was raised in #285.
#2831 is an earlier attempt to use it in store-gateway.

56quarters added a commit to grafana/dskit that referenced this issue Jan 4, 2023
This change adds the ability to pass a variable number of Option callbacks to read cache calls. Currently, this allows the underlying cache client to make use of a caller-supplied memory allocator. Only the memcached client does so at the moment.

See grafana/mimir#3772

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
56quarters added a commit to grafana/dskit that referenced this issue Jan 5, 2023
This change adds the ability to pass a memory allocator to caching methods via their context argument. This allows the underlying client to use memory allocators better suited to its workload than the garbage-collected heap. This is only used by the Memcached client at the moment.

See grafana/mimir#3772

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
56quarters added a commit to grafana/dskit that referenced this issue Jan 9, 2023
This change adds the ability to pass a memory allocator to caching methods using one or more Option arguments. This allows the underlying client to use memory allocators better suited to its workload than the garbage-collected heap. This is only used by the Memcached client at the moment.

See grafana/mimir#3772

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
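The commit messages above don't quote the API itself; the per-call Option pattern they describe typically looks something like this in Go (a sketch with assumed names, not necessarily dskit's exact signatures):

```go
package cache

import "context"

// Allocator is the hypothetical caller-supplied allocator interface.
type Allocator interface {
	Get(size int) *[]byte
	Put(buf *[]byte)
}

// options collects per-call settings; each Option mutates it.
type options struct {
	alloc Allocator
}

type Option func(*options)

// WithAllocator asks the underlying client to allocate result buffers
// through the supplied allocator instead of the garbage-collected heap.
func WithAllocator(alloc Allocator) Option {
	return func(o *options) { o.alloc = alloc }
}

// Cache reads accept a variable number of Option values, so callers
// that don't care about pooling pass nothing and see no API change.
type Cache interface {
	GetMulti(ctx context.Context, keys []string, opts ...Option) map[string][]byte
}
```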
56quarters added a commit that referenced this issue Jan 9, 2023
This commit of dskit adds the ability to pass per-call options
to cache calls. This allows callers to use specific memory
allocators better suited to their workloads than GC.

See #3772

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
pracucci pushed a commit that referenced this issue Jan 10, 2023
This commit of dskit adds the ability to pass per-call options
to cache calls. This allows callers to use specific memory
allocators better suited to their workloads than GC.

See #3772

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
@56quarters (Contributor, Author) commented Jan 17, 2023

This is incidentally being done as part of #3968, so I will focus on testing memory pooling with the index cache. Completed in #4074 while #3968 is being tuned and worked on.
