No repair mechanism for Overlapped Blocks #611
Comments
Hi,
It is unlikely that this happens: "This happens because overlapped blocks in prometheus has been pushed to S3 bucket by thanos." unless something was misconfigured.
How did the overlap happen? I can see there are quite a few blocks that overlap by 30min... have you changed the minimum block time? It should be 2h (: We need to figure out why you ended up with overlaps. In a healthy setup there should be none.
One pretty common misconfiguration is setting the same external labels for a couple of Prometheus instances.
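To illustrate that point: every Prometheus instance uploading to the same bucket should carry a unique external label set, typically via a replica-style label. A sketch of what that looks like in the Prometheus config (file names and label values here are made up for illustration):

```yaml
# prometheus-a.yml -- first replica
global:
  external_labels:
    cluster: prod
    replica: A

# prometheus-b.yml -- second replica scraping the same targets;
# it must differ in at least one external label, e.g. replica: B
```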
Overall there is no automated way to fix an overlap other than... deleting one of the overlapping blocks. How would you expect Thanos to fix this? It sounds like the overlapping blocks were duplicated, but you never know (:
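As background on what verify detects here: overlap detection boils down to sorting blocks by their minimum time and flagging any block that starts before its predecessor ends. A minimal sketch in Go (the `Block` struct, ULIDs, and sample values are illustrative, not Thanos's actual types; checking adjacent pairs after sorting is enough to tell whether any overlap exists, though not to enumerate every overlapping pair):

```go
package main

import (
	"fmt"
	"sort"
)

// Block is a simplified stand-in for a TSDB block's time range.
// MinT is inclusive and MaxT exclusive, both in milliseconds.
type Block struct {
	ULID string
	MinT int64
	MaxT int64
}

// findOverlaps sorts blocks by start time and returns each pair of
// neighbours whose ranges intersect.
func findOverlaps(blocks []Block) [][2]Block {
	sort.Slice(blocks, func(i, j int) bool {
		if blocks[i].MinT != blocks[j].MinT {
			return blocks[i].MinT < blocks[j].MinT
		}
		return blocks[i].MaxT < blocks[j].MaxT
	})
	var overlaps [][2]Block
	for i := 1; i < len(blocks); i++ {
		// An overlap exists iff this block starts before the
		// previous block (in sorted order) has ended.
		if blocks[i].MinT < blocks[i-1].MaxT {
			overlaps = append(overlaps, [2]Block{blocks[i-1], blocks[i]})
		}
	}
	return overlaps
}

func main() {
	// The mint/maxt of the middle block mirror the 30m overlap
	// window reported in the logs below; the ULIDs are made up.
	blocks := []Block{
		{"01A", 1541088000000, 1541095200000},
		{"01B", 1541095200000, 1541097000000},
		{"01C", 1541095200000, 1541102400000},
	}
	for _, p := range findOverlaps(blocks) {
		fmt.Printf("overlap: %s and %s\n", p[0].ULID, p[1].ULID)
		// prints: overlap: 01B and 01C
	}
}
```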
Hi, yes, I did change the minimum block time and then reverted it back to 2h. That seems to be the only reason for so many overlapped blocks :) Thanks,
Hi Bartek, I deleted the overlapping blocks from the S3 bucket, and the compactor now runs successfully. Regards,
Hello there, I am going through old issues here. We don't want to automatically delete all of the blocks that overlap; this should be a manual operation that you perform carefully after deliberating for a bit. If we were to delete all of the overlapping blocks, some information would be lost, which is not what you want in most cases. I'm glad that you figured this out with the flags and by deleting the blocks. In the future, this should tell you explicitly when it will happen: #838. Closing for now, thank you for the report!
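For anyone who finds this later: the manual fix is to read the block ULIDs out of the "found overlapped blocks" log line, decide which block is redundant, and delete it from object storage yourself. A hedged sketch using the AWS CLI (bucket name and ULID are placeholders; inspect the block's meta.json first and be certain before removing anything, since deletion is irreversible):

```shell
# Confirm the block's time range and source labels before touching it.
aws s3 cp s3://<your-thanos-bucket>/<BLOCK_ULID>/meta.json -

# Then remove the whole block directory (the ULID prefix).
aws s3 rm s3://<your-thanos-bucket>/<BLOCK_ULID>/ --recursive
```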
Thanos, Prometheus and Golang version used
Prometheus: 2.4.3
Thanos docker image: improbable/thanos:master-2018-10-11-a09d41d
What happened
Thanos compactor failed with the error "time ranges overlapped". This happens because overlapped blocks from Prometheus have been pushed to the S3 bucket by Thanos.
After that I tried to run bucket verify with repair = true, but I get the following error:
msg="repair is not implemented for this issue" issue=overlapped_blocks
What you expected to happen
Repair should delete the overlapped blocks from bucket
How to reproduce it (as minimally and precisely as possible):
No idea how to generate the overlapped blocks manually.
Full logs to relevant components
/var/thanos/storage # /bin/thanos bucket verify "--log.level=debug" "--objstore.config-file=/bucket_config/bucket.yaml" "--repair" "--objstore-backup.config-file=/bucket1.yaml"
level=info ts=2018-11-02T22:01:47.363727486Z caller=factory.go:34 msg="loading bucket configuration"
level=info ts=2018-11-02T22:01:47.364131365Z caller=factory.go:34 msg="loading bucket configuration"
level=warn ts=2018-11-02T22:01:47.36429269Z caller=verify.go:49 msg="GLOBAL COMPACTOR SHOULD __NOT__ BE RUNNING ON THE SAME BUCKET" issues=2 repair=true
level=info ts=2018-11-02T22:01:47.364317625Z caller=index_issue.go:27 msg="started verifying issue" with-repair=true issue=index_issue
level=info ts=2018-11-02T22:28:54.561310649Z caller=index_issue.go:128 msg="verified issue" with-repair=true issue=index_issue
level=info ts=2018-11-02T22:28:54.561761213Z caller=overlapped_blocks.go:25 msg="started verifying issue" with-repair=true issue=overlapped_blocks
level=warn ts=2018-11-02T22:39:01.278844978Z caller=overlapped_blocks.go:38 msg="found overlapped blocks" group="0@{replica=\"A\"}" overlap="[mint: 1541095200000, maxt: 1541097000000, range: 30m0s, blocks: 686]:
level=warn ts=2018-11-02T22:39:01.36066036Z caller=overlapped_blocks.go:42 msg="repair is not implemented for this issue" issue=overlapped_blocks
level=info ts=2018-11-02T22:39:01.360782302Z caller=verify.go:68 msg="verify completed" issues=2 repair=true
level=info ts=2018-11-02T22:39:01.361189129Z caller=main.go:173 msg=exiting
Anything else we need to know
Environment:
- OS (e.g. from /etc/os-release): oraclelinux:7-slim
- Kernel (e.g. uname -a):
- Others:
Any help is appreciated.
Thanks
Keshav