No repair mechanism for Overlapped Blocks #611

Closed
Keshav0690 opened this issue Nov 3, 2018 · 4 comments

Comments

@Keshav0690

Thanos, Prometheus and Golang version used

Prometheus: 2.4.3
Thanos docker image: improbable/thanos:master-2018-10-11-a09d41d

What happened
The Thanos compactor failed with a "time ranges overlapped" error. This happened because Thanos uploaded overlapping Prometheus blocks to the S3 bucket.
I then ran bucket verify with --repair, but got the following message:

msg="repair is not implemented for this issue" issue=overlapped_blocks

What you expected to happen
Repair should delete the overlapped blocks from the bucket.

How to reproduce it (as minimally and precisely as possible):
No idea how to generate the overlapped blocks manually.

Full logs to relevant components

Logs

/var/thanos/storage # /bin/thanos bucket verify "--log.level=debug" "--objstore.config-file=/bucket_config/bucket.yaml" "--repair" "--objstore-backup.config-file=/bucket1.yaml"
level=info ts=2018-11-02T22:01:47.363727486Z caller=factory.go:34 msg="loading bucket configuration"
level=info ts=2018-11-02T22:01:47.364131365Z caller=factory.go:34 msg="loading bucket configuration"
level=warn ts=2018-11-02T22:01:47.36429269Z caller=verify.go:49 msg="GLOBAL COMPACTOR SHOULD __NOT__ BE RUNNING ON THE SAME BUCKET" issues=2 repair=true
level=info ts=2018-11-02T22:01:47.364317625Z caller=index_issue.go:27 msg="started verifying issue" with-repair=true issue=index_issue
level=info ts=2018-11-02T22:28:54.561310649Z caller=index_issue.go:128 msg="verified issue" with-repair=true issue=index_issue
level=info ts=2018-11-02T22:28:54.561761213Z caller=overlapped_blocks.go:25 msg="started verifying issue" with-repair=true issue=overlapped_blocks
level=warn ts=2018-11-02T22:39:01.278844978Z caller=overlapped_blocks.go:38 msg="found overlapped blocks" group="0@{replica=\"A\"}" overlap="[mint: 1541095200000, maxt: 1541097000000, range: 30m0s, blocks: 686]:

level=warn ts=2018-11-02T22:39:01.36066036Z caller=overlapped_blocks.go:42 msg="repair is not implemented for this issue" issue=overlapped_blocks
level=info ts=2018-11-02T22:39:01.360782302Z caller=verify.go:68 msg="verify completed" issues=2 repair=true
level=info ts=2018-11-02T22:39:01.361189129Z caller=main.go:173 msg=exiting
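
For readers trying to locate the affected window, the `mint` and `maxt` values in the "found overlapped blocks" warning above appear to be Unix timestamps in milliseconds. A minimal Python sketch, assuming that interpretation, converts them to UTC; the helper name `describe_range` is hypothetical and not part of Thanos:

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def describe_range(mint_ms: int, maxt_ms: int):
    """Convert millisecond timestamps (assumed Unix epoch, UTC) to datetimes.

    Hypothetical helper for reading the log line; not a Thanos API.
    """
    start = EPOCH + timedelta(milliseconds=mint_ms)
    end = EPOCH + timedelta(milliseconds=maxt_ms)
    return start, end, end - start


# Values copied from the overlapped_blocks warning in the log above.
start, end, span = describe_range(1541095200000, 1541097000000)
print(start.isoformat(), end.isoformat(), span)
```

Under that assumption the overlap covers 18:00 to 18:30 UTC on 2018-11-01, which matches the `range: 30m0s` reported in the same log line.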

Anything else we need to know


Environment:

  • OS (e.g. from /etc/os-release): oraclelinux:7-slim
  • Kernel (e.g. uname -a):
  • Others:

Any help is appreciated.

Thanks
Keshav

@bwplotka
Member

bwplotka commented Nov 3, 2018 via email

@Keshav0690
Author

Hi,

Yes, I did change the min block time and then reverted it to 2h. That seems to be the only reason for so many overlapped blocks :)
And yes, deleting is the only way to fix this. I am now using the AWS CLI to delete the blocks. Let's see whether that fixes it.
Question: couldn't this be added as a repair mechanism to the bucket verify option when it finds overlapped blocks?

Thanks,
Keshav Sharma

@Keshav0690
Author

Hi Bartek,

I deleted the blocks from the S3 bucket, and the compactor now runs successfully.
Thanks.

Regards,
Keshav Sharma

@GiedriusS
Member

Hello there, I am going through old issues here. We don't want to delete all of the blocks that overlap; this should be a manual operation that you do carefully after some deliberation. If we were to delete all of the overlapping blocks, some information would be lost, which is not what you want in most cases. I'm glad that you figured this out with the flags and by deleting the blocks. This should explicitly tell you when this will happen in the future: #838. Closing for now, thank you for the report!
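
To illustrate the "review manually before deleting" approach described above, here is a rough Python sketch that only reports overlapping blocks and deletes nothing. It is not how Thanos detects overlaps. The `Block` type, the block IDs, and the half-open `[min_time, max_time)` convention are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Block:
    block_id: str   # hypothetical identifier, e.g. the block's directory name
    min_time: int   # start of the block, milliseconds (assumed inclusive)
    max_time: int   # end of the block, milliseconds (assumed exclusive)


def find_overlaps(blocks: List[Block]) -> List[Tuple[str, str]]:
    """Report each block that starts before an earlier block has ended.

    Each flagged block is paired with the earlier block that reaches
    furthest in time, so not every overlapping pair is listed.
    """
    ordered = sorted(blocks, key=lambda b: (b.min_time, b.max_time))
    overlaps: List[Tuple[str, str]] = []
    widest: Optional[Block] = None  # earlier block with the largest max_time
    for block in ordered:
        if widest is not None and block.min_time < widest.max_time:
            overlaps.append((widest.block_id, block.block_id))
        if widest is None or block.max_time > widest.max_time:
            widest = block
    return overlaps


# Two back-to-back 2h blocks plus a 30m block that overlaps the second one.
blocks = [
    Block("block-a", 0, 7_200_000),
    Block("block-b", 7_200_000, 14_400_000),
    Block("block-c", 7_200_000, 9_000_000),
]
overlaps = find_overlaps(blocks)
for earlier, later in overlaps:
    print(f"review before deleting: {earlier} overlaps {later}")
```

Producing a report like this, rather than deleting automatically, leaves the decision about which block to keep with the operator, which is the concern raised in the comment above.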
