Streaming store-gateway: make room for tenant ID in postings cache key #3839

Merged 8 commits into main from dimitar/streaming-series-caching-keys on Jan 4, 2023

dimitarvdimitrov
Contributor

Background + Why?

The main purpose of this PR is to make enough room in the memcached key for the tenant ID. The key in question is the one used for storing the series for a set of postings. It is only used in the streaming store-gateway implementation.

Mimir supports tenant IDs of up to 150 characters (I assume ASCII).

Memcached keys are limited to 250 bytes. With the current implementation, the worst-case cache key does not fit within that limit. This isn't a problem in practice, since tenant IDs don't come anywhere near 150 bytes.

Here is what the cache key length allocation looks like on main:

| key component | size in bytes |
|---|---|
| SP2: identifier of the cache key version | 3 |
| :: colon separators | 5 |
| block ULID | 26 |
| 1_of_3: sharding selector | 6-10 (assuming max 1000 shards) |
| tenantID | max 150 |
| postings key (32-byte hash of a slice of postings; base64 encoded) | 44 (32 * 4/3 = 42.67; rounded up to the next multiple of 4 = 44) |
| matchers key (32-byte hash of the matchers; base64 encoded) | 44 (32 * 4/3 = 42.67; rounded up to the next multiple of 4 = 44) |
| total (worst case) | 282 |
| over the 250-byte limit | 32 |

Since most of these components are essentially fixed in length, the 32 bytes over the limit would have to come out of the tenant ID.
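The arithmetic behind the allocation on main can be checked with a short Go sketch. The type and function names below are hypothetical and exist only for illustration; the sizes are the worst-case values listed above, and the hash length comes from padded base64 encoding of a 32-byte sum.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// memcachedMaxKeyLen is the memcached key length limit in bytes.
const memcachedMaxKeyLen = 250

// component is one piece of the cache key with its worst-case size in bytes.
// This type is hypothetical and only used for the illustration.
type component struct {
	name string
	size int
}

// mainKeyLayout lists the worst-case component sizes of the key on main.
func mainKeyLayout() []component {
	// A 32-byte hash in padded base64 takes 44 characters.
	hashLen := base64.StdEncoding.EncodedLen(32)
	return []component{
		{"version prefix (SP2)", 3},
		{"colon separators", 5},
		{"block ULID", 26},
		{"sharding selector", 10},
		{"tenant ID", 150},
		{"postings key", hashLen},
		{"matchers key", hashLen},
	}
}

// totalLen sums the sizes of all components.
func totalLen(cs []component) int {
	total := 0
	for _, c := range cs {
		total += c.size
	}
	return total
}

func main() {
	total := totalLen(mainKeyLayout())
	// Prints: worst case: 282 bytes, over the limit by 32
	fmt.Printf("worst case: %d bytes, over the limit by %d\n", total, total-memcachedMaxKeyLen)
}
```

Dropping the matchers key and one separator from this list gives 282 - 44 - 1 = 237 bytes, which fits within the limit.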

Changes

This PR makes a few notable changes:

  • removes shards from the cached entry
    • previously we'd store the shard inside the cache as well as in the cache key; on a cache hit we'd compare the requested shard with the shard inside the cache entry to detect collisions. This comparison was inherited from the other caching keys, but I don't think it brings any value.
    • keeps shard selector in the cache key
  • removes matchers from the cached entry and cache key
    • different matchers can select the same set of postings (e.g. {a="1", b="2"} and {c="3"} can both select the same series)
    • so further separating the series cache by the matchers isn't useful
  • stores diff-encoded and snappy compressed postings within the cache entry
    • in order to detect cache collisions we now also store all the postings in the cache entry itself
    • we also store their hash in the cache key
    • storing and fetching the extra postings to/from the cache doesn't seem to be affecting CPU and memory allocations
    • so the cache key length allocation now looks like this:

      | key component | size in bytes |
      |---|---|
      | SP2: identifier of the cache key version | 3 |
      | :: colon separators | 4 |
      | block ULID | 26 |
      | 1_of_3: sharding selector | 6-10 (assuming max 1000 shards) |
      | tenantID | max 150 |
      | postings key (32-byte hash of a slice of postings; base64 encoded) | 44 (32 * 4/3 = 42.67; rounded up to the next multiple of 4 = 44) |
      | total (worst case) | 237 |
      | unallocated | 13 |
    • I also considered increasing the hash size for postings in order to get as close to 250 as possible. Using a custom hash size means that instead of using blake2b.Sum we need to use blake2b.New. The latter does extra allocations for the digest, whereas the former allocates only a slice for the hash. So I decided not to do it; we can reconsider if collisions turn out to be common.
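A minimal sketch of the new key layout and the collision check on a cache hit follows. It rests on several assumptions: the function and type names are hypothetical, the order of the key components is a guess, and SHA-256 stands in for the 32-byte blake2b sum so that the example needs only the standard library. It is not the code from this PR.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"encoding/binary"
	"fmt"
	"strings"
)

// postingsHash hashes the postings into a fixed 44-character string.
// Assumption: SHA-256 is a stand-in for the 32-byte blake2b sum.
func postingsHash(postings []uint64) string {
	buf := make([]byte, 8*len(postings))
	for i, p := range postings {
		binary.BigEndian.PutUint64(buf[8*i:], p)
	}
	sum := sha256.Sum256(buf)
	return base64.StdEncoding.EncodeToString(sum[:])
}

// seriesForPostingsKey joins the key components with four colon separators.
// The component order is hypothetical.
func seriesForPostingsKey(tenantID, blockULID, shard string, postings []uint64) string {
	return "SP2:" + tenantID + ":" + blockULID + ":" + shard + ":" + postingsHash(postings)
}

// cachedEntry is a simplified stand-in for the cached value: the series
// together with the postings they were computed for.
type cachedEntry struct {
	postings []uint64
	series   []string
}

// validHit reports whether a fetched entry belongs to the requested postings.
// It guards against two different postings lists hashing to the same key.
func validHit(e cachedEntry, requested []uint64) bool {
	if len(e.postings) != len(requested) {
		return false
	}
	for i := range requested {
		if e.postings[i] != requested[i] {
			return false
		}
	}
	return true
}

func main() {
	tenant := strings.Repeat("t", 150) // longest tenant ID considered above
	key := seriesForPostingsKey(tenant, "01ARZ3NDEKTSV4RRFFQ69G5FAV", "999_of_999", []uint64{8, 16, 24})
	fmt.Println(len(key)) // prints 237

	entry := cachedEntry{postings: []uint64{8, 16, 24}, series: []string{`{a="1"}`}}
	fmt.Println(validHit(entry, []uint64{8, 16, 24})) // prints true
	fmt.Println(validHit(entry, []uint64{8, 16, 32})) // prints false
}
```

Comparing the full postings on a hit is what makes it safe to keep only a hash of them in the key: a hash collision then results in a cache miss rather than in wrong series being returned.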

Benchmarks

There are some marginal savings from no longer having to print and hash the matchers when storing the series. The 13% slowdown in Bucket_Series_WithSkipChunks/1SeriesWith10000000Samples/with_default_options/10000000of10000000 is a fluke, because that benchmark doesn't use any of the code changed in this PR.

The baseline was 748c19e

name                                                                                                                                old time/op    new time/op    delta
pkg:github.com/grafana/mimir/pkg/storegateway goos:darwin goarch:arm64
Bucket_Series_WithSkipChunks/1SeriesWith10000000Samples/with_default_options/10000000of10000000-10                                     303µs ± 1%     342µs ±13%  +13.07%  (p=0.016 n=4+5)
Bucket_Series_WithSkipChunks/1000000SeriesWith1Samples/with_series_streaming_(10K_per_batch)/1000000of1000000-10                       1.25s ± 0%     1.29s ± 1%   +2.65%  (p=0.008 n=5+5)
Bucket_Series_WithSkipChunks/100000SeriesWith100Samples/with_series_streaming_(10K_per_batch)/10000000of10000000-10                    176ms ± 1%     180ms ± 1%   +2.25%  (p=0.008 n=5+5)
Bucket_Series_WithSkipChunks/100000SeriesWith100Samples/with_series_streaming_and_index_cache_(1K_per_batch)/10000000of10000000-10     137ms ± 1%     139ms ± 1%   +1.92%  (p=0.008 n=5+5)
StoreCachedSeriesForPostings/6000_series_with_6_labels_with_more_repetitions-10                                                       1.04ms ± 1%    1.05ms ± 0%   +1.25%  (p=0.016 n=5+5)
pkg:github.com/grafana/mimir/pkg/storegateway/indexcache goos:darwin goarch:arm64
CanonicalPostingsKey/1000_postings-10                                                                                                 11.1µs ± 0%    11.2µs ± 0%   +0.97%  (p=0.008 n=5+5)
pkg:github.com/grafana/mimir/pkg/storegateway goos:darwin goarch:arm64
Bucket_Series_WithSkipChunks/1000000SeriesWith1Samples/with_default_options/1000000of1000000-10                                        1.66s ± 1%     1.68s ± 7%     ~     (p=0.690 n=5+5)
Bucket_Series_WithSkipChunks/1000000SeriesWith1Samples/with_series_streaming_(1K_per_batch)/1000000of1000000-10                        1.28s ± 0%     1.29s ±11%     ~     (p=0.690 n=5+5)
Bucket_Series_WithSkipChunks/1000000SeriesWith1Samples/with_series_streaming_and_index_cache_(1K_per_batch)/1000000of1000000-10        1.15s ± 0%     1.14s ± 1%     ~     (p=0.114 n=4+4)
Bucket_Series_WithSkipChunks/100000SeriesWith100Samples/with_default_options/10000000of10000000-10                                     174ms ± 1%     173ms ± 2%     ~     (p=0.548 n=5+5)
Bucket_Series_WithSkipChunks/100000SeriesWith100Samples/with_series_streaming_(1K_per_batch)/10000000of10000000-10                     135ms ± 1%     135ms ± 2%     ~     (p=1.000 n=5+5)
Bucket_Series_WithSkipChunks/1SeriesWith10000000Samples/with_series_streaming_(1K_per_batch)/10000000of10000000-10                     442µs ± 6%     443µs ± 3%     ~     (p=0.421 n=5+5)
Bucket_Series_WithSkipChunks/1SeriesWith10000000Samples/with_series_streaming_(10K_per_batch)/10000000of10000000-10                    482µs ±45%     401µs ± 1%     ~     (p=0.095 n=5+5)
Bucket_Series_WithSkipChunks/1SeriesWith10000000Samples/with_series_streaming_and_index_cache_(1K_per_batch)/10000000of10000000-10    60.8µs ± 4%    60.1µs ± 0%     ~     (p=0.151 n=5+5)
StoreCachedSeriesForPostings/1000_series_with_1_matcher-10                                                                             110µs ± 1%     110µs ± 2%     ~     (p=1.000 n=5+4)
FetchCachedSeriesForPostings/6000_series_with_6_labels_each-10                                                                        2.49ms ± 1%    2.45ms ± 5%     ~     (p=0.151 n=5+5)
pkg:github.com/grafana/mimir/pkg/storegateway/indexcache goos:darwin goarch:arm64
CanonicalPostingsKey/10_postings-10                                                                                                    352ns ± 1%     349ns ± 1%     ~     (p=0.421 n=5+5)
CanonicalPostingsKey/100_postings-10                                                                                                  1.40µs ± 1%    1.41µs ± 1%     ~     (p=0.079 n=5+5)
CanonicalPostingsKey/10000_postings-10                                                                                                 107µs ± 1%     107µs ± 2%     ~     (p=0.841 n=5+5)
CanonicalPostingsKey/1000000_postings-10                                                                                              10.5ms ± 0%    10.4ms ± 0%     ~     (p=0.222 n=5+5)
CanonicalPostingsKey/100000_postings-10                                                                                               1.05ms ± 0%    1.04ms ± 0%   -0.28%  (p=0.032 n=5+5)
pkg:github.com/grafana/mimir/pkg/storegateway goos:darwin goarch:arm64
StoreCachedSeriesForPostings/6000_series_with_6_labels_each-10                                                                        1.09ms ± 1%    1.08ms ± 1%   -0.97%  (p=0.008 n=5+5)
FetchCachedSeriesForPostings/6000_series_with_6_labels_with_more_repetitions-10                                                       2.51ms ± 0%    2.42ms ± 0%   -3.67%  (p=0.016 n=5+4)
FetchCachedSeriesForPostings/1000_series_with_1_matcher-10                                                                             249µs ± 0%     238µs ± 1%   -4.19%  (p=0.008 n=5+5)
StoreCachedSeriesForPostings/without_sharding-10                                                                                       395ns ± 0%     265ns ± 0%  -32.89%  (p=0.008 n=5+5)
StoreCachedSeriesForPostings/with_sharding-10                                                                                          434ns ± 5%     271ns ± 1%  -37.50%  (p=0.008 n=5+5)

name                                                                                                                                old alloc/op   new alloc/op   delta
pkg:github.com/grafana/mimir/pkg/storegateway goos:darwin goarch:arm64
StoreCachedSeriesForPostings/6000_series_with_6_labels_each-10                                                                        1.27MB ± 0%    1.28MB ± 0%   +0.64%  (p=0.016 n=5+4)
Bucket_Series_WithSkipChunks/100000SeriesWith100Samples/with_series_streaming_and_index_cache_(1K_per_batch)/10000000of10000000-10    81.4MB ± 1%    81.8MB ± 0%   +0.50%  (p=0.032 n=5+5)
Bucket_Series_WithSkipChunks/1000000SeriesWith1Samples/with_series_streaming_and_index_cache_(1K_per_batch)/1000000of1000000-10        807MB ± 0%     810MB ± 0%   +0.40%  (p=0.029 n=4+4)
FetchCachedSeriesForPostings/6000_series_with_6_labels_each-10                                                                        2.57MB ± 0%    2.57MB ± 0%   +0.33%  (p=0.016 n=4+5)
Bucket_Series_WithSkipChunks/1000000SeriesWith1Samples/with_series_streaming_(1K_per_batch)/1000000of1000000-10                       1.15GB ± 0%    1.15GB ± 0%   +0.19%  (p=0.008 n=5+5)
Bucket_Series_WithSkipChunks/100000SeriesWith100Samples/with_series_streaming_(1K_per_batch)/10000000of10000000-10                     114MB ± 0%     114MB ± 0%   +0.18%  (p=0.008 n=5+5)
FetchCachedSeriesForPostings/6000_series_with_6_labels_with_more_repetitions-10                                                       2.59MB ± 0%    2.59MB ± 0%   +0.01%  (p=0.008 n=5+5)
Bucket_Series_WithSkipChunks/1SeriesWith10000000Samples/with_default_options/10000000of10000000-10                                     710kB ± 0%     710kB ± 0%   +0.00%  (p=0.024 n=5+5)
Bucket_Series_WithSkipChunks/1000000SeriesWith1Samples/with_default_options/1000000of1000000-10                                       1.95GB ± 0%    1.95GB ± 0%     ~     (p=0.548 n=5+5)
Bucket_Series_WithSkipChunks/1000000SeriesWith1Samples/with_series_streaming_(10K_per_batch)/1000000of1000000-10                      1.05GB ± 0%    1.05GB ± 0%     ~     (p=0.095 n=5+5)
Bucket_Series_WithSkipChunks/100000SeriesWith100Samples/with_default_options/10000000of10000000-10                                     179MB ± 0%     179MB ± 0%     ~     (p=0.841 n=5+5)
Bucket_Series_WithSkipChunks/100000SeriesWith100Samples/with_series_streaming_(10K_per_batch)/10000000of10000000-10                    106MB ± 0%     106MB ± 0%     ~     (p=0.095 n=5+5)
Bucket_Series_WithSkipChunks/1SeriesWith10000000Samples/with_series_streaming_(1K_per_batch)/10000000of10000000-10                     719kB ± 0%     721kB ± 0%     ~     (p=0.056 n=5+5)
Bucket_Series_WithSkipChunks/1SeriesWith10000000Samples/with_series_streaming_(10K_per_batch)/10000000of10000000-10                    831kB ± 1%     832kB ± 0%     ~     (p=0.421 n=5+5)
Bucket_Series_WithSkipChunks/1SeriesWith10000000Samples/with_series_streaming_and_index_cache_(1K_per_batch)/10000000of10000000-10    21.8kB ± 2%    21.8kB ± 2%     ~     (p=1.000 n=5+5)
pkg:github.com/grafana/mimir/pkg/storegateway/indexcache goos:darwin goarch:arm64
CanonicalPostingsKey/10_postings-10                                                                                                    96.0B ± 0%     96.0B ± 0%     ~     (all equal)
CanonicalPostingsKey/100_postings-10                                                                                                   96.0B ± 0%     96.0B ± 0%     ~     (all equal)
CanonicalPostingsKey/1000_postings-10                                                                                                  96.0B ± 0%     96.0B ± 0%     ~     (all equal)
CanonicalPostingsKey/10000_postings-10                                                                                                 96.0B ± 0%     96.0B ± 0%     ~     (all equal)
CanonicalPostingsKey/100000_postings-10                                                                                                96.0B ± 0%     96.0B ± 0%     ~     (all equal)
CanonicalPostingsKey/1000000_postings-10                                                                                               96.0B ± 0%     96.0B ± 0%     ~     (all equal)
pkg:github.com/grafana/mimir/pkg/storegateway goos:darwin goarch:arm64
StoreCachedSeriesForPostings/6000_series_with_6_labels_with_more_repetitions-10                                                       1.31MB ± 0%    1.31MB ± 0%   -0.01%  (p=0.008 n=5+5)
FetchCachedSeriesForPostings/1000_series_with_1_matcher-10                                                                             272kB ± 0%     272kB ± 0%   -0.01%  (p=0.008 n=5+5)
StoreCachedSeriesForPostings/1000_series_with_1_matcher-10                                                                             131kB ± 0%     131kB ± 0%   -0.06%  (p=0.008 n=5+5)
StoreCachedSeriesForPostings/without_sharding-10                                                                                        181B ± 0%      112B ± 0%  -38.12%  (p=0.008 n=5+5)
StoreCachedSeriesForPostings/with_sharding-10                                                                                           197B ± 0%      112B ± 0%  -43.15%  (p=0.008 n=5+5)

name                                                                                                                                old allocs/op  new allocs/op  delta
pkg:github.com/grafana/mimir/pkg/storegateway goos:darwin goarch:arm64
Bucket_Series_WithSkipChunks/1000000SeriesWith1Samples/with_default_options/1000000of1000000-10                                        11.0M ± 0%     11.0M ± 0%     ~     (p=0.690 n=5+5)
Bucket_Series_WithSkipChunks/1000000SeriesWith1Samples/with_series_streaming_and_index_cache_(1K_per_batch)/1000000of1000000-10        9.00M ± 0%     9.00M ± 0%     ~     (p=0.457 n=4+4)
Bucket_Series_WithSkipChunks/100000SeriesWith100Samples/with_default_options/10000000of10000000-10                                     1.10M ± 0%     1.10M ± 0%     ~     (p=0.794 n=5+5)
Bucket_Series_WithSkipChunks/100000SeriesWith100Samples/with_series_streaming_and_index_cache_(1K_per_batch)/10000000of10000000-10      901k ± 0%      901k ± 0%     ~     (p=0.571 n=5+5)
Bucket_Series_WithSkipChunks/1SeriesWith10000000Samples/with_default_options/10000000of10000000-10                                       802 ± 0%       802 ± 0%     ~     (all equal)
Bucket_Series_WithSkipChunks/1SeriesWith10000000Samples/with_series_streaming_and_index_cache_(1K_per_batch)/10000000of10000000-10       352 ± 0%       352 ± 0%     ~     (all equal)
pkg:github.com/grafana/mimir/pkg/storegateway/indexcache goos:darwin goarch:arm64
CanonicalPostingsKey/10_postings-10                                                                                                     2.00 ± 0%      2.00 ± 0%     ~     (all equal)
CanonicalPostingsKey/100_postings-10                                                                                                    2.00 ± 0%      2.00 ± 0%     ~     (all equal)
CanonicalPostingsKey/1000_postings-10                                                                                                   2.00 ± 0%      2.00 ± 0%     ~     (all equal)
CanonicalPostingsKey/10000_postings-10                                                                                                  2.00 ± 0%      2.00 ± 0%     ~     (all equal)
CanonicalPostingsKey/100000_postings-10                                                                                                 2.00 ± 0%      2.00 ± 0%     ~     (all equal)
CanonicalPostingsKey/1000000_postings-10                                                                                                2.00 ± 0%      2.00 ± 0%     ~     (all equal)
pkg:github.com/grafana/mimir/pkg/storegateway goos:darwin goarch:arm64
Bucket_Series_WithSkipChunks/1000000SeriesWith1Samples/with_series_streaming_(10K_per_batch)/1000000of1000000-10                       10.0M ± 0%     10.0M ± 0%   -0.00%  (p=0.008 n=5+5)
Bucket_Series_WithSkipChunks/100000SeriesWith100Samples/with_series_streaming_(10K_per_batch)/10000000of10000000-10                    1.00M ± 0%     1.00M ± 0%   -0.00%  (p=0.016 n=5+5)
Bucket_Series_WithSkipChunks/100000SeriesWith100Samples/with_series_streaming_(1K_per_batch)/10000000of10000000-10                     1.01M ± 0%     1.01M ± 0%   -0.03%  (p=0.008 n=5+5)
Bucket_Series_WithSkipChunks/1000000SeriesWith1Samples/with_series_streaming_(1K_per_batch)/1000000of1000000-10                        10.1M ± 0%     10.1M ± 0%   -0.03%  (p=0.008 n=5+5)
FetchCachedSeriesForPostings/6000_series_with_6_labels_each-10                                                                         6.02k ± 0%     6.02k ± 0%   -0.05%  (p=0.008 n=5+5)
FetchCachedSeriesForPostings/6000_series_with_6_labels_with_more_repetitions-10                                                        6.02k ± 0%     6.02k ± 0%   -0.05%  (p=0.008 n=5+5)
FetchCachedSeriesForPostings/1000_series_with_1_matcher-10                                                                             1.02k ± 0%     1.02k ± 0%   -0.29%  (p=0.008 n=5+5)
Bucket_Series_WithSkipChunks/1SeriesWith10000000Samples/with_series_streaming_(1K_per_batch)/10000000of10000000-10                       664 ± 0%       652 ± 0%   -1.81%  (p=0.008 n=5+5)
Bucket_Series_WithSkipChunks/1SeriesWith10000000Samples/with_series_streaming_(10K_per_batch)/10000000of10000000-10                      662 ± 0%       650 ± 0%   -1.81%  (p=0.008 n=5+5)
StoreCachedSeriesForPostings/with_sharding-10                                                                                           6.00 ± 0%      3.00 ± 0%  -50.00%  (p=0.008 n=5+5)
StoreCachedSeriesForPostings/without_sharding-10                                                                                        6.00 ± 0%      3.00 ± 0%  -50.00%  (p=0.008 n=5+5)
StoreCachedSeriesForPostings/6000_series_with_6_labels_each-10                                                                          6.00 ± 0%      3.00 ± 0%  -50.00%  (p=0.008 n=5+5)
StoreCachedSeriesForPostings/6000_series_with_6_labels_with_more_repetitions-10                                                         6.00 ± 0%      3.00 ± 0%  -50.00%  (p=0.008 n=5+5)
StoreCachedSeriesForPostings/1000_series_with_1_matcher-10                                                                              6.00 ± 0%      3.00 ± 0%  -50.00%  (p=0.008 n=5+5)

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
@dimitarvdimitrov
Contributor Author

I also compared the size of the entries in memcached before and after this change.

on main: (series + shard index + shard count) | protobufMarshal | snappy

on this PR: (series + (postings | diffEncode | snappy)) | protobufMarshal | snappy

These are the benchmarks common to both. It looks like the postings add almost nothing. That could be because of the way they are generated in tests. Not sure if I should investigate.

BenchmarkFetchCachedSeriesForPostings/6000_series_with_6_labels_each
    length_without_postings 47686 length_with_postings 47713
BenchmarkFetchCachedSeriesForPostings/1000_series_with_1_matcher
    length_without_postings 6108  length_with_postings 6121
BenchmarkFetchCachedSeriesForPostings/6000_series_with_6_labels_with_more_repetitions
    length_without_postings 48850 length_with_postings 48877
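One possible explanation for the small difference can be sketched in Go, assuming the test postings are evenly spaced and assuming the diff encoding uses unsigned varints (neither is confirmed here). The function names are hypothetical and the snappy step is left out to keep the example within the standard library.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// diffEncode writes each posting as the gap to the previous one, encoded as an
// unsigned varint. The postings must be sorted in ascending order.
func diffEncode(postings []uint64) []byte {
	out := make([]byte, 0, len(postings))
	tmp := make([]byte, binary.MaxVarintLen64)
	prev := uint64(0)
	for _, p := range postings {
		n := binary.PutUvarint(tmp, p-prev)
		out = append(out, tmp[:n]...)
		prev = p
	}
	return out
}

// diffDecode reverses diffEncode.
func diffDecode(b []byte) ([]uint64, error) {
	var out []uint64
	prev := uint64(0)
	for len(b) > 0 {
		d, n := binary.Uvarint(b)
		if n <= 0 {
			return nil, fmt.Errorf("malformed varint")
		}
		prev += d
		out = append(out, prev)
		b = b[n:]
	}
	return out, nil
}

func main() {
	// Hypothetical test-like input: 6000 evenly spaced series references.
	postings := make([]uint64, 6000)
	for i := range postings {
		postings[i] = uint64(16 * (i + 1))
	}
	enc := diffEncode(postings)
	fmt.Println(8*len(postings), len(enc)) // prints 48000 6000
}
```

With constant gaps, every encoded byte is identical, so a general-purpose compressor applied afterwards has very little left to store. Postings with irregular gaps would likely compress less well.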

@dimitarvdimitrov
Contributor Author

@colega @pracucci can you please review this when you find time?

Collaborator

@pracucci pracucci left a comment

Good job, LGTM! I left a few minor comments. I agree with your reasoning in the PR description.

Resolved review comments: pkg/storegateway/indexcache/memcached.go (1), pkg/storegateway/series_refs.go (6)
@dimitarvdimitrov
Contributor Author

I ran a quick test with this in our dev cluster. The test scenario was running many requests, each selecting ~50K series without chunks. zone-a was running r218 with streaming and without mmap; zone-b was running with streaming, without mmap, and with the changes in this PR; zone-c was running the default implementations.

I didn't observe any negative impact on CPU or memory utilization, or on latency.

Screenshot 2023-01-03 at 14 45 32
Screenshot 2023-01-03 at 14 45 47

@pracucci
Collaborator

pracucci commented Jan 3, 2023

@dimitarvdimitrov Thanks! Looks great. Can you confirm the store-gateway-load-test results comparison hasn't reported any issue, please?

@dimitarvdimitrov
Contributor Author

The store-gateway-load-test results weren't affected. Also, the SeriesForPostings cache hit rate was ~100% for zone-b, which means the cache was actually used.
Screenshot 2023-01-03 at 16 47 10

@pracucci
Collaborator

pracucci commented Jan 4, 2023

Let's gooooo! 😂

@pracucci pracucci merged commit 4a80c31 into main Jan 4, 2023
@pracucci pracucci deleted the dimitar/streaming-series-caching-keys branch January 4, 2023 07:22
2 participants