Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tsdb: stop saving a copy of last 4 samples in memSeries #11296

Merged
merged 5 commits into from
Sep 27, 2022

Conversation

bboreham
Copy link
Member

@bboreham bboreham commented Sep 11, 2022

This extra copy of the last 4 samples was introduced to avoid a race condition between reading the last byte of the chunk and writing to it. However this race condition was fixed in f42ed03, so we don't need the copy any more.

EDIT: But we can copy just the last byte in bstreamReader instead.

This change saves 56 bytes per series, which is very worthwhile when you have millions of series.

(#11280 and #11288 each remove another 8 bytes)

I kept the head snapshot binary compatible by writing zeros where the samples used to be and ignoring any values found, but perhaps we should bump the version and save the work.

@bboreham bboreham marked this pull request as draft September 12, 2022 08:19
@bboreham
Copy link
Member Author

Sadly the unit tests say that the race is not avoided. I will look some more.

@pracucci
Copy link
Contributor

I don't think my change to bstreamReader fixes the race condition you talk about in this PR.

My change assumes the caller never tries to read the last byte if it's unsafe to do. So, in bstreamReader.loadNextBuffer() I make sure to never read from the last byte unless it was explicitly requested to do (via the number of nbits passed in input). The assumption that the caller never tries to read the last byte if it's unsafe to do, is because we keep the last 4 samples uncompressed in memory, which is what you're removing in this PR.

@bboreham bboreham force-pushed the remove-4-sample-copy branch 2 times, most recently from c67f1ab to 4b030a6 Compare September 17, 2022 11:16
@bboreham bboreham marked this pull request as ready for review September 17, 2022 11:16
@bboreham
Copy link
Member Author

I have updated this PR to take a copy of the last byte in bstreamReader.
Also added comments to clarify that concurrent Iterator and Appender are allowed, but the chunk must not be modified while an Iterator is created. (This was already the case, in order to copy the bstream slice header.)

Because the data is stored as a bit-stream, the last byte in the stream
could change if the stream is appended to after an Iterator is obtained.
Copy the last byte when the Iterator is created, so we don't have to
read it later.

Clarify in comments that concurrent Iterator and Appender are allowed,
but the chunk must not be modified while an Iterator is created.
(This was already the case, in order to copy the bstream slice header.)

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
This extra copy of the last 4 samples was introduced to avoid a race
condition between reading the last byte of the chunk and writing to it.

But now we have fixed that by having `bstreamReader` copy the last byte,
we don't need to copy the last 4 samples.

This change saves 56 bytes per series, which is very worthwhile when
you have millions or tens of millions of series.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Previous changes have left this code duplicating some lines; pull
them out to a separate function and tidy up.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Copy link
Contributor

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job, Bryan! I double checking the locking when appending samples to the chunk and creating the iterator, and I can't find any issue 👏

The behaviour has changed so chunk iterators are only wrapped when
transaction isolation requires them to stop short of the end.
This makes tests fail which are checking the type.

Tests should check the observable behaviour, not the type.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
@codesome codesome mentioned this pull request Sep 23, 2022
8 tasks
Copy link
Member

@codesome codesome left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! LGTM

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants