Skip to content

Commit

Permalink
Remove store-gateway rejecting gate experiment (#3706)
Browse files Browse the repository at this point in the history
* Remove store-gateway rejecting gate experiment

Remove the ability to reject new requests once the number of in-flight
object storage requests exceeds a configured limit. This setting didn't
prove to be any help preventing store-gateway OOM kills.

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Changelog

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Update CHANGELOG.md

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com>
  • Loading branch information
56quarters and pracucci committed Dec 13, 2022
1 parent c275dbd commit 5b65335
Show file tree
Hide file tree
Showing 7 changed files with 3 additions and 30 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@

### Grafana Mimir

* [CHANGE] Store-gateway: Remove experimental `-blocks-storage.bucket-store.max-concurrent-reject-over-limit` flag. #3706
* [FEATURE] Store-gateway: streaming of series. The store-gateway can now stream results back to the querier instead of buffering them. This is expected to greatly reduce peak memory consumption while keeping latency the same. You can enable this feature by setting `-blocks-storage.bucket-store.batch-series-size` to a value in the high thousands (5000-10000). This is still an experimental feature and is subject to a changing API and instability. #3540 #3546 #3587 #3620 #3645 #3355
* [ENHANCEMENT] Added new metric `thanos_shipper_last_successful_upload_time`: Unix timestamp (in seconds) of the last successful TSDB block uploaded to the bucket. #3627
* [ENHANCEMENT] Ruler: Added `-ruler.alertmanager-client.tls-enabled` configuration for alertmanager client. #3432 #3597
Expand Down
11 changes: 0 additions & 11 deletions cmd/mimir/config-descriptor.json
Original file line number Diff line number Diff line change
Expand Up @@ -5694,17 +5694,6 @@
"fieldValue": null,
"fieldDefaultValue": null
},
{
"kind": "field",
"name": "max_concurrent_reject_over_limit",
"required": false,
"desc": "True to reject queries above the max number of concurrent queries to execute against long-term storage. If false, queries will block until they are able to run.",
"fieldValue": null,
"fieldDefaultValue": false,
"fieldFlag": "blocks-storage.bucket-store.max-concurrent-reject-over-limit",
"fieldType": "boolean",
"fieldCategory": "experimental"
},
{
"kind": "field",
"name": "streaming_series_batch_size",
Expand Down
2 changes: 0 additions & 2 deletions cmd/mimir/help-all.txt.tmpl
Original file line number Diff line number Diff line change
Expand Up @@ -347,8 +347,6 @@ Usage of ./cmd/mimir/mimir:
Max size - in bytes - of a chunks pool, used to reduce memory allocations. The pool is shared across all tenants. 0 to disable the limit. (default 2147483648)
-blocks-storage.bucket-store.max-concurrent int
Max number of concurrent queries to execute against the long-term storage. The limit is shared across all tenants. (default 100)
-blocks-storage.bucket-store.max-concurrent-reject-over-limit
[experimental] True to reject queries above the max number of concurrent queries to execute against long-term storage. If false, queries will block until they are able to run.
-blocks-storage.bucket-store.meta-sync-concurrency int
Number of Go routines to use when syncing block meta files from object storage per tenant. (default 20)
-blocks-storage.bucket-store.metadata-cache.backend string
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -93,7 +93,6 @@ The following features are currently experimental:
- Store-gateway
- `-blocks-storage.bucket-store.index-header.map-populate-enabled`
- `-blocks-storage.bucket-store.index-header.stream-reader-enabled`
- `-blocks-storage.bucket-store.max-concurrent-reject-over-limit`
- `-blocks-storage.bucket-store.batch-series-size`
- Blocks Storage, Alertmanager, and Ruler support for partitioning access to the same storage bucket
- `-alertmanager-storage.storage-prefix`
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -3021,12 +3021,6 @@ bucket_store:
# CLI flag: -blocks-storage.bucket-store.index-header.stream-reader-enabled
[stream_reader_enabled: <boolean> | default = false]

# (experimental) True to reject queries above the max number of concurrent
# queries to execute against long-term storage. If false, queries will block
# until they are able to run.
# CLI flag: -blocks-storage.bucket-store.max-concurrent-reject-over-limit
[max_concurrent_reject_over_limit: <boolean> | default = false]

# (experimental) If larger than 0, this option enables store-gateway series
# streaming. The store-gateway will load series from the bucket in batches
# instead of buffering them all in memory before returning to the querier.
Expand Down
5 changes: 1 addition & 4 deletions pkg/storage/tsdb/config.go
Original file line number Diff line number Diff line change
Expand Up @@ -298,9 +298,7 @@ type BucketStoreConfig struct {
// Controls experimental options for index-header file reading.
IndexHeader indexheader.Config `yaml:"index_header" category:"experimental"`

// Controls what to do when MaxConcurrent is exceeded: fail immediately or wait for a slot to run.
MaxConcurrentRejectOverLimit bool `yaml:"max_concurrent_reject_over_limit" category:"experimental"`
StreamingBatchSize int `yaml:"streaming_series_batch_size" category:"experimental"`
StreamingBatchSize int `yaml:"streaming_series_batch_size" category:"experimental"`
}

// RegisterFlags registers the BucketStore flags
Expand All @@ -318,7 +316,6 @@ func (cfg *BucketStoreConfig) RegisterFlags(f *flag.FlagSet) {
f.IntVar(&cfg.ChunkPoolMaxBucketSizeBytes, "blocks-storage.bucket-store.chunk-pool-max-bucket-size-bytes", ChunkPoolDefaultMaxBucketSize, "Size - in bytes - of the largest chunks pool bucket.")
f.Uint64Var(&cfg.SeriesHashCacheMaxBytes, "blocks-storage.bucket-store.series-hash-cache-max-size-bytes", uint64(1*units.Gibibyte), "Max size - in bytes - of the in-memory series hash cache. The cache is shared across all tenants and it's used only when query sharding is enabled.")
f.IntVar(&cfg.MaxConcurrent, "blocks-storage.bucket-store.max-concurrent", 100, "Max number of concurrent queries to execute against the long-term storage. The limit is shared across all tenants.")
f.BoolVar(&cfg.MaxConcurrentRejectOverLimit, "blocks-storage.bucket-store.max-concurrent-reject-over-limit", false, "True to reject queries above the max number of concurrent queries to execute against long-term storage. If false, queries will block until they are able to run.")
f.IntVar(&cfg.TenantSyncConcurrency, "blocks-storage.bucket-store.tenant-sync-concurrency", 10, "Maximum number of concurrent tenants synching blocks.")
f.IntVar(&cfg.BlockSyncConcurrency, "blocks-storage.bucket-store.block-sync-concurrency", 20, "Maximum number of concurrent blocks synching per tenant.")
f.IntVar(&cfg.MetaSyncConcurrency, "blocks-storage.bucket-store.meta-sync-concurrency", 20, "Number of Go routines to use when syncing block meta files from object storage per tenant.")
Expand Down
7 changes: 1 addition & 6 deletions pkg/storegateway/bucket_stores.go
Original file line number Diff line number Diff line change
Expand Up @@ -91,12 +91,7 @@ func NewBucketStores(cfg tsdb.BlocksStorageConfig, shardingStrategy ShardingStra

// The number of concurrent queries against the tenants BucketStores are limited.
queryGateReg := prometheus.WrapRegistererWithPrefix("cortex_bucket_stores_", reg)
var queryGate gate.Gate
if cfg.BucketStore.MaxConcurrentRejectOverLimit {
queryGate = gate.NewRejecting(cfg.BucketStore.MaxConcurrent)
} else {
queryGate = gate.NewBlocking(cfg.BucketStore.MaxConcurrent)
}
queryGate := gate.NewBlocking(cfg.BucketStore.MaxConcurrent)
queryGate = gate.NewInstrumented(queryGateReg, cfg.BucketStore.MaxConcurrent, queryGate)

u := &BucketStores{
Expand Down

0 comments on commit 5b65335

Please sign in to comment.