Streaming store-gateway: make room for tenant ID in postings cache key #3839

dimitarvdimitrov · 2023-01-02T18:25:41Z

Background + Why?

The main purpose of this PR is to make enough room in the memcached key for the tenant ID. The key in question stores the series for a set of postings, and it is only used in the streaming store-gateway implementation.

Mimir supports tenant IDs of up to 150 characters (I assume ASCII).

Memcached keys are limited to 250 bytes. With the current implementation we cannot fit all components in the cache key when each is at its maximum length. This isn't a problem in practice since tenant IDs don't go anywhere near 150 bytes.

Here is what the cache key length allocation looks like on main:

key component	size in bytes
`SP2`: identifier of the cache key version	3
`:`: colon separators	5
block ULID	26
`1_of_3`: sharding selector	6-10 (assuming max 1000 shards)
tenantID	max 150
postings key (32-byte hash of a slice of postings; base64 encoded)	44 (32 * 4/3 ≈ 42.7; rounded up to the next multiple of 4 = 44)
matchers key (32-byte hash of the matchers; base64 encoded)	44 (32 * 4/3 ≈ 42.7; rounded up to the next multiple of 4 = 44)
total (worst case)	282
extra allocated (over the 250-byte limit)	32

Since most of these components are pretty much fixed in length, the extra 32 bytes have to come out of the tenant ID, which leaves it at most 118 bytes.
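
The arithmetic in the table above can be reproduced with a short Go sketch. This is only an illustration: `keyComponent`, `totalKeyLen` and `mainKeyLayout` are hypothetical names that do not exist in Mimir, and the sizes are the worst-case values from the table (padded base64 is assumed for the hashes, which matches the 44-byte figure).

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// memcachedMaxKeyLen is the memcached key limit in bytes.
const memcachedMaxKeyLen = 250

// keyComponent is a hypothetical helper type used only in this sketch.
type keyComponent struct {
	name string
	size int // worst-case size in bytes
}

// totalKeyLen sums the worst-case sizes of all key components.
func totalKeyLen(parts []keyComponent) int {
	total := 0
	for _, p := range parts {
		total += p.size
	}
	return total
}

// mainKeyLayout returns the worst-case layout described for main.
func mainKeyLayout() []keyComponent {
	// A 32-byte hash encoded with padded base64 takes 44 bytes.
	hashLen := base64.StdEncoding.EncodedLen(32)
	return []keyComponent{
		{"version prefix (SP2)", 3},
		{"colon separators", 5},
		{"block ULID", 26},
		{"shard selector", 10},
		{"tenant ID", 150},
		{"postings key", hashLen},
		{"matchers key", hashLen},
	}
}

func main() {
	total := totalKeyLen(mainKeyLayout())
	fmt.Printf("worst case: %d bytes, over the limit by %d bytes\n",
		total, total-memcachedMaxKeyLen)
}
```

Dropping any single 44-byte hash from this layout is enough to bring the worst case back under 250 bytes.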

Changes

This PR makes a few notable changes:

removes shards from the cached entry
- previously we'd store the shard inside the cache as well as in the cache key; on a cache hit we'd compare the requested shard with the shard inside the cache entry to detect collisions. This comparison was inherited from the other caching keys, but I don't think it brings any value.
- keeps shard selector in the cache key
removes matchers from the cached entry and cache key
- different matchers can select the same set of postings (e.g. {a="1", b="2"} and {c="3"} can both select the same series)
- so further separating the series cache by the matchers isn't useful

stores diff-encoded and snappy-compressed postings within the cache entry
- in order to detect cache collisions we now also store all the postings in the cache entry itself
- we also store their hash in the cache key
- storing and fetching the extra postings to/from the cache doesn't seem to affect CPU usage or memory allocations

So now the cache key length allocation looks like this:

key component	size in bytes
`SP2`: identifier of the cache key version	3
`:`: colon separators	4
block ULID	26
`1_of_3`: sharding selector	6-10 (assuming max 1000 shards)
tenantID	max 150
postings key (32-byte hash of a slice of postings; base64 encoded)	44 (32 * 4/3 ≈ 42.7; rounded up to the next multiple of 4 = 44)
total	237
unallocated	13

I also considered increasing the hash size for postings in order to get as close to 250 as possible. Using a custom hash size means that instead of blake2b.Sum we would need blake2b.New. The latter does extra allocations for the digest, whereas the former allocates only a slice for the hash. So I decided not to do it; we can reconsider if collisions turn out to be common.

Benchmarks

There are some marginal savings from no longer having to print and hash the matchers when storing the series. The 13% slowdown in Bucket_Series_WithSkipChunks/1SeriesWith10000000Samples/with_default_options/10000000of10000000 is a fluke because that benchmark doesn't use any of the code changed in this PR.

The baseline was 748c19e

name                                                                                                                                old time/op    new time/op    delta
pkg:github.com/grafana/mimir/pkg/storegateway goos:darwin goarch:arm64
Bucket_Series_WithSkipChunks/1SeriesWith10000000Samples/with_default_options/10000000of10000000-10                                     303µs ± 1%     342µs ±13%  +13.07%  (p=0.016 n=4+5)
Bucket_Series_WithSkipChunks/1000000SeriesWith1Samples/with_series_streaming_(10K_per_batch)/1000000of1000000-10                       1.25s ± 0%     1.29s ± 1%   +2.65%  (p=0.008 n=5+5)
Bucket_Series_WithSkipChunks/100000SeriesWith100Samples/with_series_streaming_(10K_per_batch)/10000000of10000000-10                    176ms ± 1%     180ms ± 1%   +2.25%  (p=0.008 n=5+5)
Bucket_Series_WithSkipChunks/100000SeriesWith100Samples/with_series_streaming_and_index_cache_(1K_per_batch)/10000000of10000000-10     137ms ± 1%     139ms ± 1%   +1.92%  (p=0.008 n=5+5)
StoreCachedSeriesForPostings/6000_series_with_6_labels_with_more_repetitions-10                                                       1.04ms ± 1%    1.05ms ± 0%   +1.25%  (p=0.016 n=5+5)
pkg:github.com/grafana/mimir/pkg/storegateway/indexcache goos:darwin goarch:arm64
CanonicalPostingsKey/1000_postings-10                                                                                                 11.1µs ± 0%    11.2µs ± 0%   +0.97%  (p=0.008 n=5+5)
pkg:github.com/grafana/mimir/pkg/storegateway goos:darwin goarch:arm64
Bucket_Series_WithSkipChunks/1000000SeriesWith1Samples/with_default_options/1000000of1000000-10                                        1.66s ± 1%     1.68s ± 7%     ~     (p=0.690 n=5+5)
Bucket_Series_WithSkipChunks/1000000SeriesWith1Samples/with_series_streaming_(1K_per_batch)/1000000of1000000-10                        1.28s ± 0%     1.29s ±11%     ~     (p=0.690 n=5+5)
Bucket_Series_WithSkipChunks/1000000SeriesWith1Samples/with_series_streaming_and_index_cache_(1K_per_batch)/1000000of1000000-10        1.15s ± 0%     1.14s ± 1%     ~     (p=0.114 n=4+4)
Bucket_Series_WithSkipChunks/100000SeriesWith100Samples/with_default_options/10000000of10000000-10                                     174ms ± 1%     173ms ± 2%     ~     (p=0.548 n=5+5)
Bucket_Series_WithSkipChunks/100000SeriesWith100Samples/with_series_streaming_(1K_per_batch)/10000000of10000000-10                     135ms ± 1%     135ms ± 2%     ~     (p=1.000 n=5+5)
Bucket_Series_WithSkipChunks/1SeriesWith10000000Samples/with_series_streaming_(1K_per_batch)/10000000of10000000-10                     442µs ± 6%     443µs ± 3%     ~     (p=0.421 n=5+5)
Bucket_Series_WithSkipChunks/1SeriesWith10000000Samples/with_series_streaming_(10K_per_batch)/10000000of10000000-10                    482µs ±45%     401µs ± 1%     ~     (p=0.095 n=5+5)
Bucket_Series_WithSkipChunks/1SeriesWith10000000Samples/with_series_streaming_and_index_cache_(1K_per_batch)/10000000of10000000-10    60.8µs ± 4%    60.1µs ± 0%     ~     (p=0.151 n=5+5)
StoreCachedSeriesForPostings/1000_series_with_1_matcher-10                                                                             110µs ± 1%     110µs ± 2%     ~     (p=1.000 n=5+4)
FetchCachedSeriesForPostings/6000_series_with_6_labels_each-10                                                                        2.49ms ± 1%    2.45ms ± 5%     ~     (p=0.151 n=5+5)
pkg:github.com/grafana/mimir/pkg/storegateway/indexcache goos:darwin goarch:arm64
CanonicalPostingsKey/10_postings-10                                                                                                    352ns ± 1%     349ns ± 1%     ~     (p=0.421 n=5+5)
CanonicalPostingsKey/100_postings-10                                                                                                  1.40µs ± 1%    1.41µs ± 1%     ~     (p=0.079 n=5+5)
CanonicalPostingsKey/10000_postings-10                                                                                                 107µs ± 1%     107µs ± 2%     ~     (p=0.841 n=5+5)
CanonicalPostingsKey/1000000_postings-10                                                                                              10.5ms ± 0%    10.4ms ± 0%     ~     (p=0.222 n=5+5)
CanonicalPostingsKey/100000_postings-10                                                                                               1.05ms ± 0%    1.04ms ± 0%   -0.28%  (p=0.032 n=5+5)
pkg:github.com/grafana/mimir/pkg/storegateway goos:darwin goarch:arm64
StoreCachedSeriesForPostings/6000_series_with_6_labels_each-10                                                                        1.09ms ± 1%    1.08ms ± 1%   -0.97%  (p=0.008 n=5+5)
FetchCachedSeriesForPostings/6000_series_with_6_labels_with_more_repetitions-10                                                       2.51ms ± 0%    2.42ms ± 0%   -3.67%  (p=0.016 n=5+4)
FetchCachedSeriesForPostings/1000_series_with_1_matcher-10                                                                             249µs ± 0%     238µs ± 1%   -4.19%  (p=0.008 n=5+5)
StoreCachedSeriesForPostings/without_sharding-10                                                                                       395ns ± 0%     265ns ± 0%  -32.89%  (p=0.008 n=5+5)
StoreCachedSeriesForPostings/with_sharding-10                                                                                          434ns ± 5%     271ns ± 1%  -37.50%  (p=0.008 n=5+5)

name                                                                                                                                old alloc/op   new alloc/op   delta
pkg:github.com/grafana/mimir/pkg/storegateway goos:darwin goarch:arm64
StoreCachedSeriesForPostings/6000_series_with_6_labels_each-10                                                                        1.27MB ± 0%    1.28MB ± 0%   +0.64%  (p=0.016 n=5+4)
Bucket_Series_WithSkipChunks/100000SeriesWith100Samples/with_series_streaming_and_index_cache_(1K_per_batch)/10000000of10000000-10    81.4MB ± 1%    81.8MB ± 0%   +0.50%  (p=0.032 n=5+5)
Bucket_Series_WithSkipChunks/1000000SeriesWith1Samples/with_series_streaming_and_index_cache_(1K_per_batch)/1000000of1000000-10        807MB ± 0%     810MB ± 0%   +0.40%  (p=0.029 n=4+4)
FetchCachedSeriesForPostings/6000_series_with_6_labels_each-10                                                                        2.57MB ± 0%    2.57MB ± 0%   +0.33%  (p=0.016 n=4+5)
Bucket_Series_WithSkipChunks/1000000SeriesWith1Samples/with_series_streaming_(1K_per_batch)/1000000of1000000-10                       1.15GB ± 0%    1.15GB ± 0%   +0.19%  (p=0.008 n=5+5)
Bucket_Series_WithSkipChunks/100000SeriesWith100Samples/with_series_streaming_(1K_per_batch)/10000000of10000000-10                     114MB ± 0%     114MB ± 0%   +0.18%  (p=0.008 n=5+5)
FetchCachedSeriesForPostings/6000_series_with_6_labels_with_more_repetitions-10                                                       2.59MB ± 0%    2.59MB ± 0%   +0.01%  (p=0.008 n=5+5)
Bucket_Series_WithSkipChunks/1SeriesWith10000000Samples/with_default_options/10000000of10000000-10                                     710kB ± 0%     710kB ± 0%   +0.00%  (p=0.024 n=5+5)
Bucket_Series_WithSkipChunks/1000000SeriesWith1Samples/with_default_options/1000000of1000000-10                                       1.95GB ± 0%    1.95GB ± 0%     ~     (p=0.548 n=5+5)
Bucket_Series_WithSkipChunks/1000000SeriesWith1Samples/with_series_streaming_(10K_per_batch)/1000000of1000000-10                      1.05GB ± 0%    1.05GB ± 0%     ~     (p=0.095 n=5+5)
Bucket_Series_WithSkipChunks/100000SeriesWith100Samples/with_default_options/10000000of10000000-10                                     179MB ± 0%     179MB ± 0%     ~     (p=0.841 n=5+5)
Bucket_Series_WithSkipChunks/100000SeriesWith100Samples/with_series_streaming_(10K_per_batch)/10000000of10000000-10                    106MB ± 0%     106MB ± 0%     ~     (p=0.095 n=5+5)
Bucket_Series_WithSkipChunks/1SeriesWith10000000Samples/with_series_streaming_(1K_per_batch)/10000000of10000000-10                     719kB ± 0%     721kB ± 0%     ~     (p=0.056 n=5+5)
Bucket_Series_WithSkipChunks/1SeriesWith10000000Samples/with_series_streaming_(10K_per_batch)/10000000of10000000-10                    831kB ± 1%     832kB ± 0%     ~     (p=0.421 n=5+5)
Bucket_Series_WithSkipChunks/1SeriesWith10000000Samples/with_series_streaming_and_index_cache_(1K_per_batch)/10000000of10000000-10    21.8kB ± 2%    21.8kB ± 2%     ~     (p=1.000 n=5+5)
pkg:github.com/grafana/mimir/pkg/storegateway/indexcache goos:darwin goarch:arm64
CanonicalPostingsKey/10_postings-10                                                                                                    96.0B ± 0%     96.0B ± 0%     ~     (all equal)
CanonicalPostingsKey/100_postings-10                                                                                                   96.0B ± 0%     96.0B ± 0%     ~     (all equal)
CanonicalPostingsKey/1000_postings-10                                                                                                  96.0B ± 0%     96.0B ± 0%     ~     (all equal)
CanonicalPostingsKey/10000_postings-10                                                                                                 96.0B ± 0%     96.0B ± 0%     ~     (all equal)
CanonicalPostingsKey/100000_postings-10                                                                                                96.0B ± 0%     96.0B ± 0%     ~     (all equal)
CanonicalPostingsKey/1000000_postings-10                                                                                               96.0B ± 0%     96.0B ± 0%     ~     (all equal)
pkg:github.com/grafana/mimir/pkg/storegateway goos:darwin goarch:arm64
StoreCachedSeriesForPostings/6000_series_with_6_labels_with_more_repetitions-10                                                       1.31MB ± 0%    1.31MB ± 0%   -0.01%  (p=0.008 n=5+5)
FetchCachedSeriesForPostings/1000_series_with_1_matcher-10                                                                             272kB ± 0%     272kB ± 0%   -0.01%  (p=0.008 n=5+5)
StoreCachedSeriesForPostings/1000_series_with_1_matcher-10                                                                             131kB ± 0%     131kB ± 0%   -0.06%  (p=0.008 n=5+5)
StoreCachedSeriesForPostings/without_sharding-10                                                                                        181B ± 0%      112B ± 0%  -38.12%  (p=0.008 n=5+5)
StoreCachedSeriesForPostings/with_sharding-10                                                                                           197B ± 0%      112B ± 0%  -43.15%  (p=0.008 n=5+5)

name                                                                                                                                old allocs/op  new allocs/op  delta
pkg:github.com/grafana/mimir/pkg/storegateway goos:darwin goarch:arm64
Bucket_Series_WithSkipChunks/1000000SeriesWith1Samples/with_default_options/1000000of1000000-10                                        11.0M ± 0%     11.0M ± 0%     ~     (p=0.690 n=5+5)
Bucket_Series_WithSkipChunks/1000000SeriesWith1Samples/with_series_streaming_and_index_cache_(1K_per_batch)/1000000of1000000-10        9.00M ± 0%     9.00M ± 0%     ~     (p=0.457 n=4+4)
Bucket_Series_WithSkipChunks/100000SeriesWith100Samples/with_default_options/10000000of10000000-10                                     1.10M ± 0%     1.10M ± 0%     ~     (p=0.794 n=5+5)
Bucket_Series_WithSkipChunks/100000SeriesWith100Samples/with_series_streaming_and_index_cache_(1K_per_batch)/10000000of10000000-10      901k ± 0%      901k ± 0%     ~     (p=0.571 n=5+5)
Bucket_Series_WithSkipChunks/1SeriesWith10000000Samples/with_default_options/10000000of10000000-10                                       802 ± 0%       802 ± 0%     ~     (all equal)
Bucket_Series_WithSkipChunks/1SeriesWith10000000Samples/with_series_streaming_and_index_cache_(1K_per_batch)/10000000of10000000-10       352 ± 0%       352 ± 0%     ~     (all equal)
pkg:github.com/grafana/mimir/pkg/storegateway/indexcache goos:darwin goarch:arm64
CanonicalPostingsKey/10_postings-10                                                                                                     2.00 ± 0%      2.00 ± 0%     ~     (all equal)
CanonicalPostingsKey/100_postings-10                                                                                                    2.00 ± 0%      2.00 ± 0%     ~     (all equal)
CanonicalPostingsKey/1000_postings-10                                                                                                   2.00 ± 0%      2.00 ± 0%     ~     (all equal)
CanonicalPostingsKey/10000_postings-10                                                                                                  2.00 ± 0%      2.00 ± 0%     ~     (all equal)
CanonicalPostingsKey/100000_postings-10                                                                                                 2.00 ± 0%      2.00 ± 0%     ~     (all equal)
CanonicalPostingsKey/1000000_postings-10                                                                                                2.00 ± 0%      2.00 ± 0%     ~     (all equal)
pkg:github.com/grafana/mimir/pkg/storegateway goos:darwin goarch:arm64
Bucket_Series_WithSkipChunks/1000000SeriesWith1Samples/with_series_streaming_(10K_per_batch)/1000000of1000000-10                       10.0M ± 0%     10.0M ± 0%   -0.00%  (p=0.008 n=5+5)
Bucket_Series_WithSkipChunks/100000SeriesWith100Samples/with_series_streaming_(10K_per_batch)/10000000of10000000-10                    1.00M ± 0%     1.00M ± 0%   -0.00%  (p=0.016 n=5+5)
Bucket_Series_WithSkipChunks/100000SeriesWith100Samples/with_series_streaming_(1K_per_batch)/10000000of10000000-10                     1.01M ± 0%     1.01M ± 0%   -0.03%  (p=0.008 n=5+5)
Bucket_Series_WithSkipChunks/1000000SeriesWith1Samples/with_series_streaming_(1K_per_batch)/1000000of1000000-10                        10.1M ± 0%     10.1M ± 0%   -0.03%  (p=0.008 n=5+5)
FetchCachedSeriesForPostings/6000_series_with_6_labels_each-10                                                                         6.02k ± 0%     6.02k ± 0%   -0.05%  (p=0.008 n=5+5)
FetchCachedSeriesForPostings/6000_series_with_6_labels_with_more_repetitions-10                                                        6.02k ± 0%     6.02k ± 0%   -0.05%  (p=0.008 n=5+5)
FetchCachedSeriesForPostings/1000_series_with_1_matcher-10                                                                             1.02k ± 0%     1.02k ± 0%   -0.29%  (p=0.008 n=5+5)
Bucket_Series_WithSkipChunks/1SeriesWith10000000Samples/with_series_streaming_(1K_per_batch)/10000000of10000000-10                       664 ± 0%       652 ± 0%   -1.81%  (p=0.008 n=5+5)
Bucket_Series_WithSkipChunks/1SeriesWith10000000Samples/with_series_streaming_(10K_per_batch)/10000000of10000000-10                      662 ± 0%       650 ± 0%   -1.81%  (p=0.008 n=5+5)
StoreCachedSeriesForPostings/with_sharding-10                                                                                           6.00 ± 0%      3.00 ± 0%  -50.00%  (p=0.008 n=5+5)
StoreCachedSeriesForPostings/without_sharding-10                                                                                        6.00 ± 0%      3.00 ± 0%  -50.00%  (p=0.008 n=5+5)
StoreCachedSeriesForPostings/6000_series_with_6_labels_each-10                                                                          6.00 ± 0%      3.00 ± 0%  -50.00%  (p=0.008 n=5+5)
StoreCachedSeriesForPostings/6000_series_with_6_labels_with_more_repetitions-10                                                         6.00 ± 0%      3.00 ± 0%  -50.00%  (p=0.008 n=5+5)
StoreCachedSeriesForPostings/1000_series_with_1_matcher-10                                                                              6.00 ± 0%      3.00 ± 0%  -50.00%  (p=0.008 n=5+5)

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

dimitarvdimitrov · 2023-01-02T18:34:52Z

i also did a comparison of the size of entries in memcached before and after this change

on main: (series + shard index + shard count) | protobufMarshal | snappy

on this PR: (series + (postings | diffEncode | snappy)) | protobufMarshal | snappy

these are the common benchmarks between the two. It looks like the postings add almost nothing to the entry size. This could be because of the way they are generated in tests. Not sure if I should investigate

BenchmarkFetchCachedSeriesForPostings/6000_series_with_6_labels_each
    length_without_postings 47686 length_with_postings 47713
BenchmarkFetchCachedSeriesForPostings/1000_series_with_1_matcher
    length_without_postings 6108  length_with_postings 6121
BenchmarkFetchCachedSeriesForPostings/6000_series_with_6_labels_with_more_repetitions
    length_without_postings 48850 length_with_postings 48877
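
To illustrate the `postings | diffEncode` step and the collision check it enables, here is a minimal Go sketch. It is not the code from this PR: the function names are hypothetical, the encoding (uvarint deltas) is an assumption about what "diff-encoded" means here, and the snappy and protobuf steps are left out to keep the example within the standard library. It also hints at one possible reason for the small size difference: when postings are consecutive, every delta fits in a single byte and the result is highly repetitive.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// diffEncode stores each posting as the uvarint delta from the previous one.
// The postings must be sorted in ascending order.
func diffEncode(postings []uint64) []byte {
	buf := make([]byte, 0, len(postings))
	var tmp [binary.MaxVarintLen64]byte
	prev := uint64(0)
	for _, p := range postings {
		n := binary.PutUvarint(tmp[:], p-prev)
		buf = append(buf, tmp[:n]...)
		prev = p
	}
	return buf
}

// diffDecode reverses diffEncode by accumulating the deltas.
func diffDecode(data []byte) ([]uint64, error) {
	var out []uint64
	prev := uint64(0)
	for len(data) > 0 {
		delta, n := binary.Uvarint(data)
		if n <= 0 {
			return nil, errors.New("malformed varint in encoded postings")
		}
		prev += delta
		out = append(out, prev)
		data = data[n:]
	}
	return out, nil
}

// postingsMatch reports whether a cached entry was stored for exactly the
// requested postings. A mismatch means the key collided and the entry
// should be treated as a cache miss.
func postingsMatch(cached []byte, requested []uint64) bool {
	decoded, err := diffDecode(cached)
	if err != nil || len(decoded) != len(requested) {
		return false
	}
	for i := range decoded {
		if decoded[i] != requested[i] {
			return false
		}
	}
	return true
}

func main() {
	requested := []uint64{3, 10, 300, 70000}
	entry := diffEncode(requested)
	fmt.Println(len(entry), postingsMatch(entry, requested),
		postingsMatch(entry, []uint64{3, 10, 301, 70000}))
}
```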

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

dimitarvdimitrov · 2023-01-02T19:12:30Z

@colega @pracucci can you please review this when you find time?

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

pracucci

Good job, LGTM! I left a few minor comments. I agree with your reasoning in the PR description.

pkg/storegateway/indexcache/memcached.go

pkg/storegateway/series_refs.go

dimitarvdimitrov · 2023-01-03T13:50:13Z

i ran a quick test with this in our dev cluster. The test scenario was running many requests, each selecting ~50K series without chunks. zone-a was running r218 with streaming and without mmap; zone-b was running with streaming, without mmap, and with the changes in this PR; zone-c was running the default implementations

i didn't observe any negative impact on CPU or memory utilization, or latency

pracucci · 2023-01-03T13:55:06Z

@dimitarvdimitrov Thanks! Looks great. Can you confirm the store-gateway-load-test results comparison hasn't reported any issue, please?

dimitarvdimitrov · 2023-01-03T15:48:06Z

the store-gateway-load-test results weren't affected. Also, the SeriesForPostings cache hit rate was ~100% for zone-b (which means the cache was actually used)

pracucci · 2023-01-04T07:22:21Z

Let's gooooo! 😂

dimitarvdimitrov added 6 commits January 2, 2023 16:30

Add postings to cache entry

8321658

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

Remove shard from cache entry

3dbe9ff

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

Remove increased hash size

caeda31

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

Remove commented debug line

8a73885

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

Remove extra comment in memcached.go

20875b5

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

Run BenchmarkFetchCachedSeriesForPostings with postings

69cc728

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

dimitarvdimitrov added the component/store-gateway label Jan 2, 2023

dimitarvdimitrov requested a review from a team as a code owner January 2, 2023 18:25

Change file mode

e20beaf

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

dimitarvdimitrov requested review from colega and pracucci January 2, 2023 19:11

Add CHANGELOG.md entry

823753a

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

pracucci approved these changes Jan 3, 2023

pracucci merged commit 4a80c31 into main Jan 4, 2023

pracucci deleted the dimitar/streaming-series-caching-keys branch January 4, 2023 07:22

dimitarvdimitrov mentioned this pull request Jan 4, 2023

store-gateway series caching: address comments from PR 3839 #3845

Merged
