"Opening storage failed" err="invalid block sequence" #3714
I think it may be similar to Prometheus 2.0 fails to start up after couple of restarts #3191. Tell me if you need more details, thanks. |
logs on k8s node here:
|
|
Probably like Prom2: crash on opening WAL block #2795 |
Any updates here? |
I am also seeing this in Prometheus v2.2.0. Edit: I will add more info as I uncover it. |
I just hit this same error with Prometheus v2.2.0 (I installed this version fresh a few days ago). Details:
|
I just checked the logs further, and the issue had already appeared at runtime before, without crashing:
|
We also run 2.2.0, and this issue has a few additional symptoms:
I hope this helps diagnose the problem. |
I'm also seeing this problem in my setup, running Prometheus v2.2.0 with an empty DB. After some hours (2–3) Prometheus starts generating this error.
It was then unable to recover and failed again and again, showing the error below:
|
Sorry about that, this is a bug; the fix is here: prometheus-junkyard/tsdb#299. A new bug-fix release will be out soon. |
@gouthamve I hit it again, but this time rolling back the data to some point in the past (ZFS snapshots) wouldn't work, as Prometheus started compacting a block after startup and hit the issue after a couple of seconds. So I grabbed the linked patch, compiled Prometheus, and am now running master plus the patch; it's fine so far, thanks. |
Hit the same issue with Prometheus v2.2.0:
|
Please try 2.2.1. |
@brian-brazil It worked fine after I deleted all the old data when updating Prometheus from v2.2.0 to v2.2.1. |
Dupe of #3943. |
@brian-brazil Hi, I'm hitting this issue with v2.2.1. Does this issue need to be re-opened?
Thanks. |
Is there a way to recover from this error without flushing the data out? I don't want to lose a chunk of my metrics data because of this :| |
@bamb00 Any update about this? |
@zhanglijingisme I have not heard back from the prometheus team. |
After upgrading from v2.2.1 to v2.3.0 I got this error:
I have kept the old data via Note: the thing which prompted the upgrade was that Prometheus had started doing much more disk I/O than expected, and was saturating the underlying hard drives. It's a relatively small set of time series which are being monitored -
I have duplicated the behavior reported by @candlerb when upgrading from 2.2.1 to 2.3.1. |
Here's how it went for me (running docker container with
Note the last two directories; they're the heaviest. If you check
Started Prometheus again and, voilà, it works again, and the data is there and accessible (I can see it by running queries from the very beginning of the monitoring history). Hope it helps somebody. |
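The recovery described above can be sketched as a small shell helper: rank the block directories in the TSDB data dir by size and move the heaviest ones aside so Prometheus can start, then move them back once it is healthy. This is only an illustration of the workaround, not an official procedure; the function name, the `01*` block-directory glob (Prometheus block dirs are ULID-named), and the example paths are assumptions — adjust them to your own `--storage.tsdb.path`.

```shell
#!/bin/sh
# Hypothetical sketch of the workaround above: move the N heaviest TSDB block
# directories out of the data dir. Stop Prometheus before running this.
move_heaviest_blocks() {
    data_dir=$1
    backup_dir=$2
    n=${3:-2}
    mkdir -p "$backup_dir"
    # du -s prints "<size>\t<dir>" per block dir; sort largest first, take top N.
    du -s "$data_dir"/01* 2>/dev/null | sort -rn | head -n "$n" | cut -f2 |
    while IFS= read -r d; do
        mv "$d" "$backup_dir"/
    done
}

# Example usage (paths are illustrative, not from the thread):
#   move_heaviest_blocks /prometheus/data /prometheus/blocks-aside 2
#   ...start Prometheus; once healthy, move blocks back one at a time.
```

As the later comment notes, moving the blocks back afterwards fills the resulting gaps in dashboards.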
I had the same issue in a Windows environment with 2.3.*. I updated to version 2.4.3 and it still didn't work. |
We also faced a similar issue with 2.3.2. We had to move the data from the existing path set in storage.tsdb.path to a new location and restart Prometheus. |
I faced the same issue in version 2.3.2. I tried deleting duplicated chunks and restarting; it didn't work. Finally I had to move the whole data directory to a different folder, create a new empty data folder, and restart the Prometheus service to make it work. |
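The "move the whole data directory aside and start fresh" workaround from the last two comments could look roughly like this. It is a minimal sketch under assumptions: the function name and example path are made up, and it keeps the old (possibly corrupted) data with a timestamped suffix rather than deleting it, in case you want to attempt recovery later.

```shell
#!/bin/sh
# Hypothetical sketch: rename the TSDB data dir out of the way and recreate it
# empty, so Prometheus starts with a fresh database. Stop Prometheus first.
reset_tsdb_dir() {
    data_dir=$1
    stamp=$(date +%Y%m%d-%H%M%S)
    # Keep the old data around instead of deleting it outright.
    mv "$data_dir" "${data_dir}.bad-${stamp}"
    mkdir -p "$data_dir"
}

# Example usage (path is illustrative):
#   reset_tsdb_dir /var/lib/prometheus/data
#   ...start Prometheus again with the same --storage.tsdb.path.
```

Note this loses access to the old metrics unless the saved blocks are later moved back or repaired.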
Same in 2.7.0. The fix from @uncleNight of moving the bad files helped. |
Had the same issue after killing Prometheus. I removed all *.tmp files and all the directories reported in the log. |
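That cleanup, deleting leftover `*.tmp` entries plus the block directories the log complains about, can be sketched as below. The function name is hypothetical and the block names passed in are illustrative; in practice you would copy the directory names straight out of the "invalid block sequence" log lines.

```shell
#!/bin/sh
# Hypothetical sketch of the cleanup above. Stop Prometheus before running it,
# and consider backing up the data dir first.
clean_tmp_and_bad_blocks() {
    data_dir=$1
    shift
    # Remove compaction leftovers: *.tmp files and directories at the top level.
    find "$data_dir" -maxdepth 1 -name '*.tmp' -exec rm -rf {} +
    # Remove each block directory named in the error log (passed as arguments).
    for block in "$@"; do
        rm -rf "$data_dir/$block"
    done
}

# Example usage (block name is illustrative, not from the thread):
#   clean_tmp_and_bad_blocks /prometheus/data 01C9ZEXAMPLEBLOCKID
```

Removing a block directory discards the samples stored in it, so this trades a gap in history for a Prometheus that starts again.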
@uncleNight's solution worked for me! After applying it as described, I saw large gaps in the data (I use Grafana for dashboarding). To fill those, I moved back all of the latest (heaviest) directories. Now all the data looks great! BTW, the overall problem for me was triggered by running out of disk space. |
What did you do?
I ran Prometheus 2.0.0 on Kubernetes v1.8.5.
What did you expect to see?
Everything went well.
What did you see instead? Under which circumstances?
Everything went well at the beginning. But several hours later, the pods' statuses turned to "CrashLoopBackOff" and all Prometheus instances became unavailable. After creating the pods, I didn't do anything else.
Environment
System information:
Prometheus version:
v2.0.0
Prometheus configuration file:
Any suggestions?