Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added proposal for moving to index-header binary format. #1839

Merged
merged 3 commits into from
Jan 7, 2020

Conversation

bwplotka
Copy link
Member

@bwplotka bwplotka commented Dec 4, 2019

Some verifications in TODO, otherwise. Feedback welcome.

Definitely not a silver bullet here, but rather change in a good direction (:

Signed-off-by: Bartek Plotka bwplotka@gmail.com

Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
@brancz
Copy link
Member

brancz commented Dec 5, 2019

Love the gist of this. I haven't had the chance to do a very thorough review, but I think this is only natural to do. Nice job!

* Remove building `index-cache.json` in compactor.
Step 2: Load on demand in query time:
* Allow Store Gateway to load `index-header` on demand from disk in query time.
* Add background job to unmmap/unload `index-header` for blocks that are unused (LRU).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not just unused but the least recently used blocks if the memory limit is violated, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How we would know this?

Copy link
Member

@GiedriusS GiedriusS Dec 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I meant some kind of limit in terms of their size on disk because it would be really nice to avoid RAM usage blowing up in case of some queries. Having all of it in RAM is horrible, yes, but OTOH it gives the users a relatively certain thing: we know that at least that RAM usage is more or less equal over a period of time due to the retention policies being applied by the Thanos Compact component. Unless I am missing something but it seems that this is true.


TSDB index is in binary [format](https://github.com/prometheus/prometheus/blob/master/tsdb/docs/format/index.md).

To allow reduced resource consumption and effort when building (1), (2), "index-header" for blocks we plan to reuse similar format for sections like symbols, label indices and
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if I have missed this detail but what is the actual format that this proposes? Protobufs?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A custom binary format was planned, composed with those sections as mentioned, however, I might find a blocker to this ):

@bwplotka bwplotka mentioned this pull request Dec 6, 2019
Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
Copy link
Member Author

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks all for the review! Updated, with the new info, not a good news ):


## Risks

### Posting size to fetch
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added this which we missed initially in this design. Let me know if it's clear.

TL;DR: we scan postings form TSDB index when building index-cache and without changing TSDB index this will be still required building index-header

@brancz @GiedriusS

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We got some answers (:

* Check if index-cache.json is present in the bucket. If not:
* Download the whole TSDB index file, mmap all of it.
* Build index-cache.json
else:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Silly thing: once this proposal will be displayed as markdown, the else: won't be formatted the way I guess you want.

* Check if index-cache.json is present on the disk. If not:
* Check if index-cache.json is present in the bucket. If not:
* Download the whole TSDB index file, mmap all of it.
* Build index-cache.json
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After this one, I would also add another point * Delete downloaded TSDB index to outline the waste of resources to download the TSDB index file just to build the cache (it won't be re-used the fully downloaded TSDB index).

## Goals

* Reduce confusion between index-cache.json and IndexCache for series and postings.
* Do not keep large pieces of the blocks (e.g symbols, label values) in the memory for all blocks. Be able to load it quickly in query time from disk.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in query time: I guess you mean at query time. If this comment makes sense, then I would suggest you to look for in query time around the doc too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will do thanks

## No Goals

* Removing initial startup for Thanos Store Gateway completely as designed in [Cortex, no initial block sync](https://github.com/thanos-io/thanos/issues/1813)
* However this proposal might a step towards that as we might be able to load index-cache/index quickly on demand from disk.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might > might be

The process for building this will be as follows:

* Thanks to https://github.com/thanos-io/thanos/pull/1792 we can check final size of index and scan for TOC file.
* With TOC:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For clarity, I think you also need to add a new bullet point at the end for writing the index-header TOC. I guess you need it, right? Otherwise I'm not sure how you could read back from this file.

docs/proposals/201912_thanos_binary_index_header.md Outdated Show resolved Hide resolved
docs/proposals/201912_thanos_binary_index_header.md Outdated Show resolved Hide resolved
docs/proposals/201912_thanos_binary_index_header.md Outdated Show resolved Hide resolved
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
Copy link
Member Author

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All updated (:

docs/proposals/201912_thanos_binary_index_header.md Outdated Show resolved Hide resolved

## Risks

### Posting size to fetch
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We got some answers (:

@bwplotka
Copy link
Member Author

bwplotka commented Jan 6, 2020

Starting implementation (:

cc @GiedriusS @pracucci if you want to take another look.

Copy link
Contributor

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All good to me! Can't wait to see it live!

While idea of combing different pieces of TSDB index as our index-header is great, unfortunately we heavily rely
on information about size of each posting represented as `postingRange.End`.

We need to know that apriori to know how to partition and how many bytes we need to fetch from the storage to get each posting: https://github.com/thanos-io/thanos/blob/7e11afe64af0c096743a3de8a594616abf52be45/pkg/store/bucket.go#L1567
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to know that apriori to know how to partition

I guess you mean We need to know apriori how to partition

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix in next PR (:

@bwplotka bwplotka merged commit d9e4e0e into master Jan 7, 2020
@bwplotka bwplotka deleted the index-cache-binary branch January 7, 2020 14:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants