Long Term Storage Improvements [Tracking Issue] #1705

Closed · 3 of 34 tasks · bwplotka opened this issue Nov 1, 2019 · 5 comments

bwplotka (Member) commented Nov 1, 2019

This is the "index" issue that helps track issues, initiatives, and ideas that might improve how long term storage of metrics is used, on both the read and the storage side. It currently works, but there are many things we can improve. The goal of this ticket is to be clearer, give an overview of what's happening, and compare potential improvement ideas! Targeted mostly at contributors who want to help us with some challenging problems.

Overall we want to encourage more collaboration and contributions on this! So please jump on anything interesting and propose new ideas! (:

Let's keep the discussion about each idea in its own GitHub issue. If no GitHub issue exists yet, please create one and link it here if it relates to the problems we want to solve! I will try to keep this issue updated as we progress.

Store Gateway: Syncing Blocks

Things to improve

Overview

Current logic:

Syncing has the biggest impact during startup, especially with an empty local Store Gateway directory; however, it is also performed on a 3-minute interval (configurable by flag). This means that any improvement to syncing will improve both startup time and the overall baseline memory used.

The main goal of the block sync process is to let the Store Gateway access block data from object storage and return it to the Querier on Series gRPC requests. The process looks as follows (a rough Go sketch follows the list):

  • Iterate over all blocks in object storage:
  • For every block not seen before:
    • Download meta.json
      • Skip if younger than the consistency delay
      • Skip if outside of the time partition
      • Skip if relabelling ignores the block (sharding)
    • Check if index-cache.json is present on disk. If not:
      • Download the whole index file and mmap all of it (!)
      • Calculate index-cache.json
    • Load the whole index-cache into memory
    • Remove index-cache JSON files (let's call them index-headers) of blocks that no longer exist from disk.
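
To make the shape of this loop concrete, here is a minimal Go sketch; all type and method names are hypothetical stand-ins, not the real Thanos code, and error handling for filters is simplified:

```go
package storegateway // hypothetical package, for illustration only

import "time"

type BlockMeta struct {
	ULID       string
	UploadedAt time.Time
	// time range, external labels, resolution, ... elided
}

type Bucket interface {
	ListBlocks() ([]string, error)
	DownloadMeta(id string) (BlockMeta, error) // meta.json
	DownloadIndex(id string) ([]byte, error)   // whole index file (mmapped in reality!)
}

type LocalStore interface {
	Loaded(id string) bool
	HasIndexHeader(id string) bool
	WriteIndexHeader(id string, header []byte) error
	LoadIndexHeader(id string) error // whole index-header kept in memory
	DropIndexHeadersNotIn(keep map[string]bool) error
}

// buildIndexHeader would extract symbols, label values and posting offsets
// from the full TSDB index; elided here.
func buildIndexHeader(index []byte) []byte { return nil }

func syncBlocks(bkt Bucket, local LocalStore, now time.Time,
	consistencyDelay time.Duration, keep func(BlockMeta) bool) error {

	ids, err := bkt.ListBlocks()
	if err != nil {
		return err
	}
	seen := map[string]bool{}
	for _, id := range ids {
		seen[id] = true
		if local.Loaded(id) {
			continue // block already synced before
		}
		meta, err := bkt.DownloadMeta(id)
		if err != nil {
			return err
		}
		// Skip blocks younger than the consistency delay or filtered out by
		// time partitioning / relabelling (sharding).
		if now.Sub(meta.UploadedAt) < consistencyDelay || !keep(meta) {
			continue
		}
		if !local.HasIndexHeader(id) {
			idx, err := bkt.DownloadIndex(id)
			if err != nil {
				return err
			}
			if err := local.WriteIndexHeader(id, buildIndexHeader(idx)); err != nil {
				return err
			}
		}
		if err := local.LoadIndexHeader(id); err != nil {
			return err
		}
	}
	// Remove index-headers for blocks that no longer exist in object storage.
	return local.DropIndexHeadersNotIn(seen)
}
```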

The mentioned index-cache.json (index-header) holds the block's symbol table, label values, and posting offsets.
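
As a rough illustration only (field names are mine, not the exact on-disk schema):

```go
// Approximate shape of what index-cache.json (the index-header) keeps per block.
type byteRange struct{ Start, End int64 }

type indexHeader struct {
	Version     int
	Symbols     []string             // symbol table for label name/value strings
	LabelValues map[string][]string  // label name -> sorted label values
	Postings    map[string]byteRange // "name=value" -> offsets of its postings list in the full index
}
```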

Initiatives/Ideas

  • Switch to a different format for the on-disk index-cache: issue
  • Work purely on symbols instead of strings; only look up strings afterwards.
  • Fix regression introduced in v0.8: issue
  • Mmap/load the index-header into memory on demand
  • Optimize constructing the index-cache from the index: issue
  • Make the index more object storage friendly (:

Querying

Things to improve

Overview

Current logic:

So how does querying work in Thanos? The query is delivered through different components (top down):

  • Optional Cortex response cacher
    • Time-aligns the request
    • Splits it by day (!)
    • Caches responses
  • Querier
    • Performs PromQL evaluation
    • Fans out to each StoreAPI
    • Merges all responses series by series, in sorted order
  • Store Gateway
    • Chooses blocks to query based on time, external labels and resolution
    • Per block (see the sketch after this list):
      • Get matching postings in a partitioned way.
      • Get the series pointed to by those postings in a partitioned way
      • Choose the chunks within each series that match the time range within the block
      • Fetch chunks in a partitioned way
    • Merge the data into a SeriesSet
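
A minimal sketch of the Store Gateway per-block path, with partitioning and caching elided; all names are hypothetical stand-ins for the real code:

```go
type labelMatcher struct{ Name, Value string }

type chunkRef struct {
	MinTime, MaxTime int64 // chunk time range within the block
	Offset           uint64
}

type series struct {
	Labels map[string]string
	Chunks []chunkRef
}

// blockReader abstracts the partitioned GetRange fetches against one block's
// index and chunk files in object storage.
type blockReader interface {
	MatchingPostings(ms []labelMatcher) ([]uint64, error)
	SeriesByRef(refs []uint64) ([]series, error)
	FetchChunks(cs []chunkRef) error
}

func blockSeries(b blockReader, ms []labelMatcher, mint, maxt int64) ([]series, error) {
	// 1. Resolve matchers to postings (series references).
	refs, err := b.MatchingPostings(ms)
	if err != nil {
		return nil, err
	}
	// 2. Fetch the series entries those postings point to.
	all, err := b.SeriesByRef(refs)
	if err != nil {
		return nil, err
	}
	// 3. Keep only chunks overlapping the requested time range, then fetch them.
	var out []series
	for _, s := range all {
		var keep []chunkRef
		for _, c := range s.Chunks {
			if c.MaxTime >= mint && c.MinTime <= maxt {
				keep = append(keep, c)
			}
		}
		if len(keep) == 0 {
			continue
		}
		if err := b.FetchChunks(keep); err != nil {
			return nil, err
		}
		out = append(out, series{Labels: s.Labels, Chunks: keep})
	}
	// 4. The caller merges per-block results into one sorted SeriesSet.
	return out, nil
}
```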

Important facts:

  • We partition index-data fetches at every step, combining multiple object storage GetRange requests against the same index file into bigger requests to avoid rate limiting. This partitioning somewhat blocks streaming of postings and series fetches (see the sketch below).
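
A minimal sketch of the partitioning idea, assuming the ranges are already sorted; the gap threshold and names are hypothetical:

```go
// rng is a [start, end) byte range within an index or chunk file.
type rng struct{ start, end int64 }

// partition merges sorted ranges whose gap is at most maxGap bytes into one
// bigger GetRange request, trading some over-fetching for fewer requests
// (and thus less chance of hitting object storage rate limits).
func partition(ranges []rng, maxGap int64) []rng {
	var out []rng
	for _, r := range ranges {
		if n := len(out); n > 0 && r.start-out[n-1].end <= maxGap {
			if r.end > out[n-1].end {
				out[n-1].end = r.end
			}
			continue
		}
		out = append(out, r)
	}
	return out
}
```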

We currently don't cache anything on disk other than the index-cache.json mentioned in the Sync section.

Initiatives/Ideas

  • Store GW:
    • Avoid startup syncing; move to lazy block loading: proposal
    • Time sharding
    • Block sharding: proposal
    • Index Caching in memory:
      • Cache only per block.
      • Evict a block's items from the cache once the block no longer exists.
      • Use ristretto for caching instead of our naive LRU: PR
      • Shared index cache across instances, e.g. with Memcached
    • Disk caching
  • Rate limit / Limit memory (we have some concurrency limits and sample limits currently)
    • We do that partially, but we can be smarter about it, e.g. limiting bytes/chunks and detecting overload early.
    • Count allocations (roughly) per user/query: proposal
  • StoreAPI: Split gRPC Series stream frames into smaller ones, as recommended by gRPC: todo
  • Unify labels: issue
  • Querier
    • Prevent known-bad queries
    • Response caching: tracking issue
    • Distributed Queries: issue
    • Does splitting by day even make sense? Maybe split more, as discussed here
    • Relax the StoreAPI.Series contract and merge more on the Querier side

Testability/Observability

Initiatives/Ideas

Stability/Maintainability

Initiatives/Ideas

  • Improved panic handling in run group: PR
  • Eventual consistency handling between store GW and writers: Proposal, Tracking issue
  • Smooth partial upload logic for Compactor: issue

Downsampling

Things to improve

  • It still causes a bit of confusion, e.g. its purpose, cost, and usage (e.g. choosing the range in rate[X])

Initiatives/Ideas

  • Different query auto-downsampling logic: issue
  • Step-based auto-downsampling: PR (see the sketch after this list)
  • Staleness + downsampling: issue
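
As a hypothetical sketch of the step-based idea (the divisor and thresholds here are assumptions, not the merged logic): pick the coarsest resolution that still yields a few samples per query step.

```go
// Thanos resolutions: raw (0), 5 minutes and 1 hour, in milliseconds.
const (
	resRaw = int64(0)
	res5m  = int64(5 * 60 * 1000)
	res1h  = int64(60 * 60 * 1000)
)

// autoMaxResolution picks the max source resolution for a query step so that
// roughly 5 downsampled samples still fall into each step.
func autoMaxResolution(stepMillis int64) int64 {
	switch target := stepMillis / 5; {
	case target >= res1h:
		return res1h
	case target >= res5m:
		return res5m
	default:
		return resRaw
	}
}
```
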
bwplotka (Member, Author) commented Nov 1, 2019

Tried and rejected:

* Simple global memory limiter in Querier. Rejected. attempt

bwplotka changed the title from "Long Term Storage Improvements [Tracking Issues]" to "Long Term Storage Improvements [Tracking Issue]" on Nov 28, 2019
stale bot commented Jan 11, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the stale label on Jan 11, 2020
GiedriusS (Member) commented:

Not stale, very much a work in progress.

stale bot removed the stale label on Jan 13, 2020
stale bot commented Feb 12, 2020

This issue/PR has been automatically marked as stale because it has not had recent activity. Please comment on status otherwise the issue will be closed in a week. Thank you for your contributions.

stale bot added the stale label on Feb 12, 2020
bwplotka (Member, Author) commented:

Let's close it. It was super not useful. Lesson learned. (: Milestones and separate issues work much better.
