What did you do?
I let Prometheus run. One day I turned my system on and Prometheus would not start because of #1967: the metadata/indexes stored in LevelDB had been corrupted. Prometheus, in conjunction with LevelDB, had made itself unstartable.
What did you expect to see?
Some means to keep Prometheus running or to recover it. I still have all the data files for Prometheus, and I expect to be able to use the actual data to reconstruct the indexes that were lost.
What did you see instead? Under which circumstances?
No way to recover the system. I had to nuke the instance and lost all the metrics.
Environment
Debian/testing
System information:
Linux 4.4.1
Prometheus version:
prometheus, version 1.3.1 (branch: master, revision: 9b7e097a76034989212c752921a80cf73a4c3ff0)
build user: rektide@sunstripe
build date: 20161120-01:49:56
go version: go1.7.3
Alertmanager version:
N/A
Prometheus configuration file:
global:
  scrape_interval: 15s # By default, scrape targets every 15 seconds.
  evaluation_interval: 15s # Evaluate rules every 15 seconds.
  # scrape_timeout is set to the global default (10s).
  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: 'example'
# Load and evaluate rules in this file every 'evaluation_interval' seconds.
rule_files:
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  - job_name: node
    static_configs:
      - targets:
        - localhost:9100
  - job_name: prometheus
    static_configs:
      - targets:
        - localhost:9090
Alertmanager configuration file:
N/A
Logs:
Nov 19 20:52:13 sunstripe systemd[1]: Started prometheus-main.
Nov 19 20:52:13 sunstripe prometheus[3135]: time="2016-11-19T20:52:13-05:00" level=info msg="Starting prometheus (version=1.3.1, branch=master, revision=9b7e097a76034989212c752921a80cf73a4c3ff0)" source="main.go:75"
Nov 19 20:52:13 sunstripe prometheus[3135]: time="2016-11-19T20:52:13-05:00" level=info msg="Build context (go=go1.7.3, user=rektide@sunstripe, date=20161120-01:49:56)" source="main.go:76"
Nov 19 20:52:13 sunstripe prometheus[3135]: time="2016-11-19T20:52:13-05:00" level=info msg="Loading configuration file /etc/opt/prometheus-main/prometheus.yml" source="main.go:247"
Nov 19 20:52:13 sunstripe prometheus[3135]: time="2016-11-19T20:52:13-05:00" level=error msg="Error opening memory series storage: leveldb/storage: corrupted or incomplete meta file" source="main.go:181"
This isn't possible: the chunks contain only time series values and timestamps, not the name or labels of the time series. That data is kept only in LevelDB.
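To make that concrete, here is a toy model in Python. It is not Prometheus code, only a sketch under the assumption that each series' samples are stored under a hash of its label set while the labels themselves live in a separate index. The names `ToyStorage` and `fingerprint` and the choice of hash are hypothetical stand-ins.

```python
from __future__ import annotations

FNV_OFFSET = 0xCBF29CE484222325
FNV_PRIME = 0x100000001B3
MASK_64 = 0xFFFFFFFFFFFFFFFF


def fingerprint(labels: dict[str, str]) -> int:
    """Stand-in one-way hash (64-bit FNV-1a) over the sorted label pairs."""
    h = FNV_OFFSET
    for name, value in sorted(labels.items()):
        for byte in f"{name}\x00{value}\x00".encode("utf-8"):
            h ^= byte
            h = (h * FNV_PRIME) & MASK_64
    return h


class ToyStorage:
    """Hypothetical two-part store: an index plus per-series sample chunks."""

    def __init__(self) -> None:
        # Plays the role of the LevelDB index: fingerprint -> label set.
        self.index: dict[int, dict[str, str]] = {}
        # Plays the role of the chunk files: fingerprint -> (timestamp, value).
        self.chunks: dict[int, list[tuple[int, float]]] = {}

    def append(self, labels: dict[str, str], ts: int, value: float) -> None:
        fp = fingerprint(labels)
        self.index.setdefault(fp, dict(labels))
        self.chunks.setdefault(fp, []).append((ts, value))

    def query(self, metric_name: str) -> dict[int, list[tuple[int, float]]]:
        # Lookups by name must go through the index; chunks carry no labels.
        return {
            fp: self.chunks[fp]
            for fp, labels in self.index.items()
            if labels.get("__name__") == metric_name
        }


if __name__ == "__main__":
    store = ToyStorage()
    store.append({"__name__": "up", "job": "node"}, 1479606733, 1.0)
    store.append({"__name__": "up", "job": "prometheus"}, 1479606733, 1.0)

    print(len(store.query("up")))   # both series are found via the index

    store.index.clear()             # simulate losing the index
    print(len(store.chunks))        # the samples are all still on disk
    print(len(store.query("up")))   # but nothing can be looked up by name
```

In this model, the surviving chunks are keyed only by an opaque hash, so after the index is gone there is nothing to say which samples belonged to `up{job="node"}`; the values and timestamps alone do not identify their series.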