Missing data / Prometheus data store corruption? #3233
rbobrovnikov commented Oct 10, 2017
I see the same behavior. Prometheus "forgets" previous data, and sometimes it gets stuck after a few days and becomes unreachable.
The
There are
You should definitely upgrade to the latest 1.x release (currently 1.8.0) and then see what problems there still are. If you still see problems, a deeper investigation might be in order.
To be clear: If you have gigabytes of data in
The nastiest data corruption caused by a bug in Prometheus happened in 1.5.0 and 1.5.1. That corruption could lie dormant in your storage for a while before suddenly becoming apparent (resulting in quarantine, even while running on later releases). But in your case, I suspect some external corruption, as even this bug only corrupted a small fraction of series.
brian-brazil added the kind/question label Dec 8, 2017
Closing this as won't-fix, as it is superseded by the new 2.0 storage.
gouthamve closed this Jan 18, 2018
lock bot commented Mar 23, 2019
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
cubranic commented Oct 2, 2017
We noticed that our Prometheus database seems to be missing data going back more than a few days. This coincides with the time we had to restart the Prometheus systemd service, which had gone unresponsive.
I'm not familiar with the structure of the files in /var/lib/prometheus/data, but I noticed that there is 2.2 GB of data in /var/lib/prometheus/data/orphans, all dated from around the time shortly before the restart, when the server had gone unresponsive.
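For anyone wanting to reproduce this kind of check, here is a minimal sketch that totals the file sizes under a directory and reports the oldest and newest modification times. The path is the one quoted above; the helper name `summarize_dir` is hypothetical and only for illustration, and nothing here inspects the Prometheus storage format itself.

```python
import os
import stat
from datetime import datetime, timezone


def summarize_dir(root):
    """Return (total_bytes, oldest_mtime, newest_mtime) for regular files under root.

    The mtimes are POSIX timestamps, or None if no regular files were found.
    """
    total = 0
    oldest = None
    newest = None
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                # File vanished or is unreadable; skip it.
                continue
            if not stat.S_ISREG(st.st_mode):
                continue
            total += st.st_size
            if oldest is None or st.st_mtime < oldest:
                oldest = st.st_mtime
            if newest is None or st.st_mtime > newest:
                newest = st.st_mtime
    return total, oldest, newest


def fmt(ts):
    """Format a POSIX timestamp as UTC, or '-' when there is none."""
    if ts is None:
        return "-"
    return datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()


if __name__ == "__main__":
    # Path taken from the report above; adjust for your own installation.
    root = "/var/lib/prometheus/data/orphans"
    total, oldest, newest = summarize_dir(root)
    print(f"{root}: {total / 1e9:.2f} GB, files dated {fmt(oldest)} to {fmt(newest)}")
```

Running it as the user that owns the data directory avoids permission errors; unreadable files are silently skipped, so the total may be an undercount otherwise.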
Environment
The journal only shows entries since the service was last restarted, and there is nothing in it indicating data errors or corruption: