return err instead of panic for corrupted chunk #6040

Conversation

krasi-georgiev
Contributor

@krasi-georgiev krasi-georgiev commented Sep 19, 2019

check that the chunk segment has enough data to read all chunk pieces.

fixes: #5991
fixes: thanos-io/thanos#1467

@zhulongcheng - while reviewing your PR in #5991 there were many things I didn't understand, so I had to refactor and rename a few variables to make the workflow clearer. Feel free to copy and paste the code from here or just review this PR.

Again, sorry for hijacking the PR. I made so many changes to understand the code properly that it only made sense to open a separate PR.

check that the chunk segment has enough data to read all chunk pieces.

fixes: prometheus#5991
Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
@zhulongcheng
Contributor

zhulongcheng commented Sep 19, 2019

Nice, and thanks for helping with this. 👍

(I just added some comments. Sorry if they disrupt this PR.)

@krasi-georgiev krasi-georgiev self-assigned this Sep 19, 2019
Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
@krasi-georgiev
Contributor Author

@zhulongcheng updated, thanks for the review!

Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
Member

@codesome codesome left a comment

If I understand correctly, the panic is avoided by

    if chkEnd > sgmBytes.Len() {
        return nil, errors.Errorf("segment doesn't include enough bytes to read the chunk - required:%v, available:%v", chkEnd, sgmBytes.Len())
    }

and adding +maxChunkLengthFieldSize here

    if sgmChunkStart+maxChunkLengthFieldSize > sgmBytes.Len() {
        return nil, errors.Errorf("segment doesn't include enough bytes to read the chunk size data field - required:%v, available:%v", sgmChunkStart+maxChunkLengthFieldSize, sgmBytes.Len())
    }

right?

PS: I haven't checked the changes in the tests yet
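
For readers following the thread, the idea behind the two guards can be sketched in isolation. This is a minimal illustration, not the code from tsdb/chunks/chunks.go: `readChunk` and its simplified layout (a uvarint length followed by the data bytes) are assumptions made for the example, and it relies on `binary.Uvarint` reporting a short buffer instead of the `maxChunkLengthFieldSize` comparison quoted in the review.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// readChunk is a hypothetical helper, not the Prometheus implementation.
// Assumed layout at offset start: <uvarint data length><data bytes>.
// It returns an error instead of panicking when the segment is too short.
func readChunk(segment []byte, start int) ([]byte, error) {
	if start < 0 || start > len(segment) {
		return nil, fmt.Errorf("chunk start %d is outside the segment (len %d)", start, len(segment))
	}

	// Guard 1: the length field itself must be readable.
	// binary.Uvarint returns n == 0 for a short buffer and n < 0 on overflow.
	dataLen, n := binary.Uvarint(segment[start:])
	if n <= 0 {
		return nil, fmt.Errorf("cannot read the chunk length field at offset %d", start)
	}

	// Guard 2: the segment must hold all the bytes the length field promises.
	dataStart := start + n
	available := uint64(len(segment) - dataStart)
	if dataLen > available {
		return nil, fmt.Errorf("segment doesn't include enough bytes to read the chunk - required:%v, available:%v", dataLen, available)
	}

	return segment[dataStart : dataStart+int(dataLen)], nil
}

func main() {
	good := []byte{3, 'a', 'b', 'c'}
	data, err := readChunk(good, 0)
	fmt.Println(string(data), err)

	// The length field claims 5 bytes but only 2 follow.
	truncated := []byte{5, 'a', 'b'}
	_, err = readChunk(truncated, 0)
	fmt.Println(err)
}
```

The point of both guards is the same: every slice expression is preceded by a length comparison, so a corrupted or truncated segment surfaces as an error the caller can handle.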

@krasi-georgiev
Contributor Author

If I understand correctly, the panic is avoided by...

correct

Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
…idioms-comments

Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
Contributor Author

@krasi-georgiev krasi-georgiev left a comment

I tried to simplify it a bit and added a test to ensure the correct behavior.

Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
@krasi-georgiev
Contributor Author

ping @codesome

Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
@krasi-georgiev
Contributor Author

@codesome @zhulongcheng would appreciate one final review before merging this.

@codesome codesome self-assigned this Oct 9, 2019
@codesome
Member

Taking a look at this today

Member

@codesome codesome left a comment

1 possible enhancement, LGTM otherwise

@krasi-georgiev
Contributor Author

Thanks, appreciated.

…ioms-comments

Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
@krasi-georgiev
Contributor Author

@codesome I just resolved the conflict with master, so this is ready for a review when you find the time.

@krasi-georgiev
Contributor Author

ping @zhulongcheng @codesome

Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
…ioms-comments

Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
Contributor

@zhulongcheng zhulongcheng left a comment

LGTM 👍

…ioms-comments

Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
Member

@bwplotka bwplotka left a comment

LGTM, just some suggestions 👍

Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
@krasi-georgiev
Contributor Author

all comments addressed, thanks!

@krasi-georgiev krasi-georgiev merged commit 549164f into prometheus:master Dec 4, 2019
@krasi-georgiev krasi-georgiev deleted the demistify-chunks-idioms-comments branch December 4, 2019 07:37
@krasi-georgiev
Contributor Author

@codesome if you see any other problems, please ping me and I will open another PR.

@codesome
Member

codesome commented Dec 4, 2019

Was planning to review today, but the changes looked fine earlier and the above approvals should be enough :)

@krasi-georgiev
Contributor Author

It is not too late :) If you see anything, let me know and I will open another PR.

Member

@codesome codesome left a comment

Here you go! Hopefully, we can make this small change before the next release

            batchID++
            batchSize = chkSize
        }
        batches[batchID] = chks[batchStart : i+1]
Member

This should ideally be done only when (1) we cut a new batch or (2) it is the last chunk. Otherwise it is going to create a new slice header for every chunk.

Additionally (not a blocker), we could write the chunks as soon as we hit this case instead of collecting multiple batches. What do you say?

Contributor Author

This should ideally be done only when (1) we cut a new batch or (2) it is the last chunk. Otherwise it is going to create a new slice header for every chunk.

Hm, I am not sure I understand the idea. Can you give an example, or maybe even test it, and if it passes the tests just open a PR and I will review it quickly.

Additionally (not a blocker), we could write the chunks as soon as we hit this case instead of collecting multiple batches. What do you say?

I think I tried this, but there was some other problem there; I can't remember exactly. Maybe try it again, and if it passes the tests then I will review it quickly.

Member

I thought getting a sub-slice in every iteration would cause extra allocations, but running BenchmarkCompaction shows me that apparently it does not. Additionally, I see some regression in performance in ns/op of nearly 7-9%, but I cannot say if it is this PR itself, so that would need some pprof action I suppose.
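
The observation about allocations matches how Go slices work in general: a slice expression produces a new slice header (pointer, length, capacity) over the same backing array and does not copy elements. A small sketch with made-up values, where `sharesBacking` is a hypothetical name used only for this example:

```go
package main

import "fmt"

// sharesBacking reports whether a sub-slice is a view onto the original
// backing array rather than a copy of its elements.
func sharesBacking() bool {
	chks := []int{10, 20, 30, 40}
	batch := chks[1:3] // new slice header only; elements are not copied
	batch[0] = 99      // write through the sub-slice

	// The write is visible through chks, and the capacity of batch runs
	// from index 1 to the end of the original array.
	return chks[1] == 99 && len(batch) == 2 && cap(batch) == 3
}

func main() {
	fmt.Println(sharesBacking()) // prints true
}
```

This says nothing about the 7-9% ns/op difference mentioned above, which would still need profiling to attribute.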

Contributor Author

Thanks for running the benchmark. I did it this way as it was easier to follow, and I also ran a prombench test and didn't see any difference in performance.

The regression might be due to the extra checksum checking.

Development

Successfully merging this pull request may close these issues.

comapct: panics with error: panic: runtime error: slice bounds out of range
5 participants