Added proposal for moving to index-header binary format. #1839

bwplotka · 2019-12-04T20:52:50Z

Some verifications are still TODO; otherwise, feedback welcome.

Definitely not a silver bullet here, but rather a change in a good direction (:

Signed-off-by: Bartek Plotka <bwplotka@gmail.com>

docs/proposals/201912_thanos_binary_index_header.md

brancz · 2019-12-05T15:33:24Z

Love the gist of this. I haven't had the chance to do a very thorough review, but I think this is only natural to do. Nice job!

GiedriusS · 2019-12-05T20:47:14Z

docs/proposals/201912_thanos_binary_index_header.md

+  * Remove building `index-cache.json` in compactor.
+Step 2: Load on demand in query time:
+  * Allow Store Gateway to load `index-header` on demand from disk in query time.
+  * Add background job to unmmap/unload `index-header` for blocks that are unused (LRU).


Not just unused but the least recently used blocks if the memory limit is violated, right?

How would we know this?

Sorry, I meant some kind of limit in terms of their size on disk, because it would be really nice to avoid RAM usage blowing up in case of some queries. Having all of it in RAM is horrible, yes, but OTOH it gives users relative certainty: RAM usage stays more or less constant over time because of the retention policies applied by the Thanos Compact component. Unless I am missing something, this seems to be true.
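To make the LRU question above concrete, here is a minimal sketch of size-bounded eviction for loaded `index-header` files. It is an illustration only, not the proposal's design: the type `headerLRU`, its method `Touch`, and the byte budget `maxBytes` are all hypothetical names, and the sketch is not safe for concurrent use.

```go
package main

import "container/list"

// headerEntry describes one loaded index-header (hypothetical type).
type headerEntry struct {
	blockID string
	size    int64
}

// headerLRU tracks loaded index-headers and evicts the least recently used
// ones once their combined size exceeds maxBytes. All names are assumptions.
type headerLRU struct {
	maxBytes int64
	curBytes int64
	order    *list.List               // front = most recently used
	items    map[string]*list.Element // block ID -> element in order
}

func newHeaderLRU(maxBytes int64) *headerLRU {
	return &headerLRU{
		maxBytes: maxBytes,
		order:    list.New(),
		items:    map[string]*list.Element{},
	}
}

// Touch records that a block's index-header was loaded or used by a query and
// returns the IDs of the blocks whose headers should now be unloaded.
func (c *headerLRU) Touch(blockID string, size int64) []string {
	if el, ok := c.items[blockID]; ok {
		c.order.MoveToFront(el)
	} else {
		c.items[blockID] = c.order.PushFront(headerEntry{blockID: blockID, size: size})
		c.curBytes += size
	}

	var evicted []string
	// Never evict the entry that was just touched, even if it alone exceeds the budget.
	for c.curBytes > c.maxBytes && c.order.Len() > 1 {
		el := c.order.Back()
		e := el.Value.(headerEntry)
		c.order.Remove(el)
		delete(c.items, e.blockID)
		c.curBytes -= e.size
		evicted = append(evicted, e.blockID)
	}
	return evicted
}
```

With a budget like this, total usage stays bounded regardless of which blocks queries happen to hit, which is the predictability concern raised above. A background job that unloads unused blocks could call the same eviction path on a timer.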

GiedriusS · 2019-12-05T20:53:43Z

docs/proposals/201912_thanos_binary_index_header.md

+
+TSDB index is in binary [format](https://github.com/prometheus/prometheus/blob/master/tsdb/docs/format/index.md). 
+
+To allow reduced resource consumption and effort when building (1), (2), "index-header" for blocks we plan to reuse similar format for sections like symbols, label indices and 


Not sure if I have missed this detail but what is the actual format that this proposes? Protobufs?

A custom binary format was planned, composed of the sections mentioned; however, I may have found a blocker to this ):

Signed-off-by: Bartek Plotka <bwplotka@gmail.com>

bwplotka

Thanks all for the review! Updated with the new info; not good news ):

bwplotka · 2019-12-06T18:11:20Z

docs/proposals/201912_thanos_binary_index_header.md

+
+## Risks
+
+### Posting size to fetch


Added this, which we initially missed in this design. Let me know if it's clear.

TL;DR: we scan postings from the TSDB index when building index-cache, and without changing the TSDB index this will still be required when building index-header.

@brancz @GiedriusS

We got some answers (:

pracucci · 2019-12-11T15:33:17Z

docs/proposals/201912_thanos_binary_index_header.md

+  * Check if index-cache.json is present in the bucket. If not: 
+    * Download the whole TSDB index file, mmap all of it.
+    * Build index-cache.json
+    else:


Silly thing: once this proposal is rendered as markdown, the `else:` won't be formatted the way I guess you want.

pracucci · 2019-12-11T15:34:53Z

docs/proposals/201912_thanos_binary_index_header.md

+* Check if index-cache.json is present on the disk. If not: 
+  * Check if index-cache.json is present in the bucket. If not: 
+    * Download the whole TSDB index file, mmap all of it.
+    * Build index-cache.json


After this one, I would also add another point, `* Delete downloaded TSDB index`, to highlight the waste of resources in downloading the whole TSDB index file just to build the cache (the fully downloaded TSDB index won't be re-used).
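The lookup order quoted above (local disk, then bucket, then a full index download) can be sketched as follows. This is a hedged illustration: `headerStore`, `memStore`, `loader`, and the `downloadIndex` and `build` hooks are hypothetical names, not Thanos APIs. The `else:` branch is elided in the quoted diff, so fetching the cached file from the bucket is an assumption.

```go
package main

import "errors"

var errNotFound = errors.New("not found")

// headerStore is a hypothetical abstraction over the local disk or the bucket.
type headerStore interface {
	Get(blockID string) ([]byte, error) // returns errNotFound when absent
	Put(blockID string, data []byte) error
}

// memStore is an in-memory stand-in used only to exercise the sketch.
type memStore map[string][]byte

func (m memStore) Get(id string) ([]byte, error) {
	d, ok := m[id]
	if !ok {
		return nil, errNotFound
	}
	return d, nil
}

func (m memStore) Put(id string, d []byte) error { m[id] = d; return nil }

// loader wires the stores together with two hypothetical hooks.
type loader struct {
	disk, bucket  headerStore
	downloadIndex func(blockID string) ([]byte, error) // fetches the whole TSDB index
	build         func(index []byte) ([]byte, error)   // builds the cache from it
}

// Load returns the cached data for a block, trying disk, then bucket, and
// only then downloading the full index to build the cache.
func (l *loader) Load(blockID string) ([]byte, error) {
	data, err := l.disk.Get(blockID)
	if err == nil {
		return data, nil
	}
	if !errors.Is(err, errNotFound) {
		return nil, err
	}

	data, err = l.bucket.Get(blockID) // assumed content of the elided else branch
	if errors.Is(err, errNotFound) {
		var index []byte
		if index, err = l.downloadIndex(blockID); err != nil {
			return nil, err
		}
		// The downloaded index is used once here and then dropped.
		data, err = l.build(index)
	}
	if err != nil {
		return nil, err
	}
	if err := l.disk.Put(blockID, data); err != nil {
		return nil, err
	}
	return data, nil
}
```

The sketch makes the waste visible: the slowest path downloads the entire index, uses it for a single `build` call, and then discards it.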

pracucci · 2019-12-11T15:40:33Z

docs/proposals/201912_thanos_binary_index_header.md

+## Goals
+
+* Reduce confusion between index-cache.json and IndexCache for series and postings.
+* Do not keep large pieces of the blocks (e.g symbols, label values) in the memory for all blocks. Be able to load it quickly in query time from disk.


`in query time`: I guess you mean `at query time`. If this comment makes sense, I would suggest looking for `in query time` elsewhere in the doc too.

Will do, thanks.

pracucci · 2019-12-11T15:41:30Z

docs/proposals/201912_thanos_binary_index_header.md

+## No Goals
+
+* Removing initial startup for Thanos Store Gateway completely as designed in [Cortex, no initial block sync](https://github.com/thanos-io/thanos/issues/1813)
+  * However this proposal might a step towards that as we might be able to load index-cache/index quickly on demand from disk.


might > might be

pracucci · 2019-12-11T15:49:24Z

docs/proposals/201912_thanos_binary_index_header.md

+The process for building this will be as follows:
+
+* Thanks to https://github.com/thanos-io/thanos/pull/1792 we can check final size of index and scan for TOC file.
+* With TOC:


For clarity, I think you also need to add a new bullet point at the end for writing the index-header TOC. I guess you need it, right? Otherwise I'm not sure how you could read back from this file.
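For context on what "scan for TOC" involves, here is a small sketch of decoding a TSDB index TOC from the tail of an index file. It rests on an assumption drawn from my reading of the TSDB index format document linked in this PR: the TOC is the last 52 bytes, made of six big-endian `uint64` section offsets followed by a CRC32 (Castagnoli) checksum. The names `toc`, `parseTOC`, and `encodeTOC` are hypothetical, and the layout of the index-header's own TOC is not specified in this thread.

```go
package main

import (
	"encoding/binary"
	"errors"
	"hash/crc32"
)

// tocLen assumes six uint64 offsets followed by a CRC32 checksum.
const tocLen = 6*8 + crc32.Size

var castagnoli = crc32.MakeTable(crc32.Castagnoli)

// toc holds the section offsets, in the order assumed from the format doc.
type toc struct {
	Symbols, Series, LabelIndices, LabelIndicesTable, Postings, PostingsTable uint64
}

// parseTOC decodes the TOC from the last tocLen bytes of data, so only the
// tail of the index file needs to be fetched.
func parseTOC(data []byte) (toc, error) {
	if len(data) < tocLen {
		return toc{}, errors.New("data too short to contain a TOC")
	}
	b := data[len(data)-tocLen:]
	payload, sum := b[:6*8], binary.BigEndian.Uint32(b[6*8:])
	if crc32.Checksum(payload, castagnoli) != sum {
		return toc{}, errors.New("TOC checksum mismatch")
	}
	return toc{
		Symbols:           binary.BigEndian.Uint64(payload[0:8]),
		Series:            binary.BigEndian.Uint64(payload[8:16]),
		LabelIndices:      binary.BigEndian.Uint64(payload[16:24]),
		LabelIndicesTable: binary.BigEndian.Uint64(payload[24:32]),
		Postings:          binary.BigEndian.Uint64(payload[32:40]),
		PostingsTable:     binary.BigEndian.Uint64(payload[40:48]),
	}, nil
}

// encodeTOC writes a TOC in the same layout. A new file format would need an
// equivalent step so that readers can locate its sections again.
func encodeTOC(t toc) []byte {
	b := make([]byte, tocLen)
	offsets := []uint64{t.Symbols, t.Series, t.LabelIndices, t.LabelIndicesTable, t.Postings, t.PostingsTable}
	for i, v := range offsets {
		binary.BigEndian.PutUint64(b[i*8:], v)
	}
	binary.BigEndian.PutUint32(b[6*8:], crc32.Checksum(b[:6*8], castagnoli))
	return b
}
```

Because the TOC sits at a fixed distance from the end of the file, knowing the final index size is enough to fetch it with a single ranged read.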

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

bwplotka

All updated (:

bwplotka · 2020-01-06T11:08:22Z

docs/proposals/201912_thanos_binary_index_header.md

+
+## Risks
+
+### Posting size to fetch


We got some answers (:

bwplotka · 2020-01-06T15:22:07Z

Starting implementation (:

cc @GiedriusS @pracucci if you want to take another look.

pracucci

All good to me! Can't wait to see it live!

pracucci · 2020-01-07T08:34:57Z

docs/proposals/201912_thanos_binary_index_header.md

+While idea of combing different pieces of TSDB index as our index-header is great, unfortunately we heavily rely 
+on information about size of each posting represented as `postingRange.End`. 
+
+We need to know that apriori to know how to partition and how many bytes we need to fetch from the storage to get each posting: https://github.com/thanos-io/thanos/blob/7e11afe64af0c096743a3de8a594616abf52be45/pkg/store/bucket.go#L1567


"We need to know that apriori to know how to partition"

I guess you mean "We need to know apriori how to partition".

Will fix in next PR (:
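To illustrate why the end offset of each posting matters, here is a sketch of gap-based partitioning: byte ranges that lie close together are merged so they can be fetched in one request. It is not the code behind the `bucket.go` link above; `byteRange`, `partition`, and `maxGap` are hypothetical names. Without a known `End`, neither the gap between ranges nor the number of bytes to fetch can be computed.

```go
package main

import "sort"

// byteRange is a [Start, End) span inside the index file (hypothetical type).
type byteRange struct{ Start, End int64 }

// partition merges ranges whose gap is at most maxGap bytes and returns the
// resulting spans in ascending order. The input slice is left untouched.
func partition(ranges []byteRange, maxGap int64) []byteRange {
	if len(ranges) == 0 {
		return nil
	}
	sorted := append([]byteRange(nil), ranges...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Start < sorted[j].Start })

	parts := []byteRange{sorted[0]}
	for _, r := range sorted[1:] {
		last := &parts[len(parts)-1]
		if r.Start-last.End <= maxGap {
			// Close enough: extend the current part instead of issuing a new request.
			if r.End > last.End {
				last.End = r.End
			}
			continue
		}
		parts = append(parts, r)
	}
	return parts
}
```

Each returned span costs one storage request of `End - Start` bytes, so a larger `maxGap` trades extra bytes read for fewer requests.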

Added proposal for moving to index-header binary format.

2328da6

Signed-off-by: Bartek Plotka <bwplotka@gmail.com>

bwplotka requested review from metalmatze, GiedriusS, brancz and squat December 4, 2019 20:52

metalmatze reviewed Dec 5, 2019

GiedriusS reviewed Dec 5, 2019

bwplotka mentioned this pull request Dec 6, 2019

Series fetch issue #146

Closed

Addressed comments, added impactful risk.

dfd21c9

Signed-off-by: Bartek Plotka <bwplotka@gmail.com>

bwplotka force-pushed the index-cache-binary branch from 9dad9bf to dfd21c9 Compare December 6, 2019 18:09

bwplotka commented Dec 6, 2019

pracucci mentioned this pull request Dec 11, 2019

Migrate from JSON to Protobuf+Snappy format for index cache #1013

Closed

pracucci reviewed Dec 11, 2019

This was referenced Dec 11, 2019

Reduce memory used by postings offset table. prometheus/prometheus#6418

Merged

index.cache.json files way bigger than common indexes #1873

Closed

pracucci reviewed Dec 13, 2019

bwplotka mentioned this pull request Jan 6, 2020

Please upgrade to 2.15.1 for Prometheus dependency of go Mod #1935

Closed

More comments.

3770760

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

bwplotka commented Jan 6, 2020

bwplotka mentioned this pull request Jan 6, 2020

Hide usage of index-cache.json under interface. #1943

Merged

bwplotka requested a review from GiedriusS January 6, 2020 15:22

bwplotka requested a review from metalmatze January 6, 2020 15:22

pracucci approved these changes Jan 7, 2020

bwplotka merged commit d9e4e0e into master Jan 7, 2020

bwplotka deleted the index-cache-binary branch January 7, 2020 14:57

bwplotka mentioned this pull request Jan 8, 2020

Added binary index header implementation. #1952

Merged
