Data loss after restart #4028
Comments
From my observations, this seems to happen primarily after about two hours of running and collecting data, but I'm going to try to pin that down tonight.
As long as the server didn't restart, it retained its data. As soon as I ran
I think one of the worst things about this is that it produces zero errors.
Hmm, could you share the service file for Prometheus too?
From a brief look over my Prometheus UI, it looks like storage.tsdb.min-block-duration lines up with my loss timeline, which is about every 2 hours.
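One way to test that hunch is to check how close each loss timestamp falls to a block boundary. A minimal sketch, assuming the default 2h value for storage.tsdb.min-block-duration and assuming blocks are cut on multiples of that duration counted from the Unix epoch in UTC. block_window and seconds_since_boundary are hypothetical helper names, not Prometheus APIs:

```python
from datetime import datetime, timedelta, timezone

# Assumption: the default storage.tsdb.min-block-duration of 2h.
BLOCK_DURATION = timedelta(hours=2)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def block_window(ts, duration=BLOCK_DURATION):
    """Return (start, end) of the epoch-aligned window that contains ts."""
    width = int(duration.total_seconds())
    elapsed = int((ts - EPOCH).total_seconds())
    # Round down to the nearest multiple of the block duration.
    start = EPOCH + timedelta(seconds=(elapsed // width) * width)
    return start, start + duration


def seconds_since_boundary(ts, duration=BLOCK_DURATION):
    """Seconds elapsed between the start of ts's window and ts itself."""
    start, _ = block_window(ts, duration)
    return (ts - start).total_seconds()
```

If the restarts that lose data cluster shortly after a boundary while the ones that keep data do not, that would support the connection. It would not prove it.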
I added a discovery folder to /var/lib/prometheus and I'm getting some new stuff:
Those base folders with random strings and integers never existed before, so that's new. discovery also has two new folders. Before, all that existed was lock and wal, with some files inside wal.
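For reference, a rough way to tell what those entries are: as far as I know, Prometheus 2.x stores each persisted block in its own directory named by a ULID, with a meta.json inside, next to the wal directory and the lock file. A minimal sketch, assuming meta.json carries minTime and maxTime in milliseconds since the Unix epoch. describe_data_dir is a hypothetical helper:

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def describe_data_dir(data_dir):
    """Classify each direct child of a Prometheus data directory."""
    rows = []
    for entry in sorted(Path(data_dir).iterdir()):
        meta = entry / "meta.json"
        if entry.is_dir() and meta.is_file():
            info = json.loads(meta.read_text())
            # Assumption: minTime and maxTime are epoch milliseconds.
            start = datetime.fromtimestamp(info["minTime"] / 1000, tz=timezone.utc)
            end = datetime.fromtimestamp(info["maxTime"] / 1000, tz=timezone.utc)
            rows.append((entry.name, "block", start, end))
        else:
            # Anything without a meta.json, e.g. wal or lock.
            rows.append((entry.name, "other", None, None))
    return rows
```

Calling describe_data_dir("/var/lib/prometheus") and printing the rows before and after a restart would show whether whole blocks disappear or only data that was still in the WAL.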
Perms for tsdb data dir
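A minimal sketch for collecting that kind of information, assuming a Unix host. list_perms is a hypothetical helper, and it only reports mode and owner for the directory and its direct children:

```python
import os
import pwd
import stat


def list_perms(data_dir):
    """Return (mode string, owner, path) for data_dir and its direct children."""
    paths = [data_dir] + [
        os.path.join(data_dir, name) for name in sorted(os.listdir(data_dir))
    ]
    rows = []
    for path in paths:
        st = os.lstat(path)
        try:
            owner = pwd.getpwuid(st.st_uid).pw_name
        except KeyError:
            # The uid has no passwd entry (common in containers).
            owner = str(st.st_uid)
        rows.append((stat.filemode(st.st_mode), owner, path))
    return rows
```

Running it on /var/lib/prometheus and comparing the owner with the user named in the service file would show whether the Prometheus process can write to every entry.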
I tried restarting the server after not having restarted it for 12 hours, and it restarted fine, retaining all data. I'm not sure what led up to this.
I'll close this issue for now; however, a root cause definitely would have been nice to find.
mattouille closed this on Mar 30, 2018
TroubleshooteRJ commented Aug 7, 2018
Hey mattouille, did you find the root cause? I am scared to restart Prometheus after my change to increase the disk size.
lock bot commented Mar 22, 2019
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
mattouille commented Mar 30, 2018 • edited
What did you do?
Use systemd to restart Prometheus (systemctl restart prometheus)
What did you expect to see?
Prometheus restarts and data is retained
What did you see instead? Under which circumstances?
Prometheus restarts and at first I see all my data from when I started Prometheus; then, after a few seconds, my data is cut down to about five minutes.
Environment
System information:
Linux 4.4.0-87-generic x86_64
Prometheus version:
prometheus, version 2.2.1 (branch: HEAD, revision: bc6058c)
build user: root@149e5b3f0829
build date: 20180314-14:15:45
go version: go1.10
alertmanager, version 0.14.0 (branch: HEAD, revision: 30af4d051b37ce817ea7e35b56c57a0e2ec9dbb0)
build user: root@37b6a49ebba9
build date: 20180213-08:16:42
go version: go1.9.2
N/a
No errors appear in the logs that I can see, just very weird behavior. I've now deleted my WAL and am trying to isolate the issue further.