Recreate meta-data from data #2209

rektide · 2016-11-20T02:19:02Z

What did you do?

I let Prometheus run. One day I turned my system on and Prometheus failed to start because of #1967: the meta-data/indexes (stored in LevelDB) had been corrupted. Prometheus, in conjunction with LevelDB, had made itself un-startable.

What did you expect to see?

Some means to continue running or recovering Prometheus. I still have all the data files for Prometheus, and I expect to be able to use the actual data to reconstruct the data indexes that have been lost.

What did you see instead? Under which circumstances?

No way to recover the system. I had to nuke the instance & lose all the metrics.

Environment

Debian/testing

System information:

Linux 4.4.1

Prometheus version:

prometheus, version 1.3.1 (branch: master, revision: 9b7e097a76034989212c752921a80cf73a4c3ff0)
  build user:       rektide@sunstripe
  build date:       20161120-01:49:56
  go version:       go1.7.3

Alertmanager version:

N/A

Prometheus configuration file:

global:
  scrape_interval:     15s # By default, scrape targets every 15 seconds.
  evaluation_interval: 15s # By default, evaluate rules every 15 seconds.
  # scrape_timeout is set to the global default (10s).

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
      monitor: 'example'

# Load and evaluate rules in this file every 'evaluation_interval' seconds.
rule_files:


# Scrape configurations containing two endpoints to scrape:
# a node target and Prometheus itself.
scrape_configs:
  - job_name: node
    static_configs:
    -   targets:
        - localhost:9100

  - job_name: prometheus
    static_configs:
    -   targets:
        - localhost:9090

Alertmanager configuration file:

N/A

Logs:

Nov 19 20:52:13 sunstripe systemd[1]: Started prometheus-main.
Nov 19 20:52:13 sunstripe prometheus[3135]: time="2016-11-19T20:52:13-05:00" level=info msg="Starting prometheus (version=1.3.1, branch=master, revision=9b7e097a76034989212c752921a80cf73a4c3ff0)" source="main.go:75"
Nov 19 20:52:13 sunstripe prometheus[3135]: time="2016-11-19T20:52:13-05:00" level=info msg="Build context (go=go1.7.3, user=rektide@sunstripe, date=20161120-01:49:56)" source="main.go:76"
Nov 19 20:52:13 sunstripe prometheus[3135]: time="2016-11-19T20:52:13-05:00" level=info msg="Loading configuration file /etc/opt/prometheus-main/prometheus.yml" source="main.go:247"
Nov 19 20:52:13 sunstripe prometheus[3135]: time="2016-11-19T20:52:13-05:00" level=error msg="Error opening memory series storage: leveldb/storage: corrupted or incomplete meta file" source="main.go:181"

brian-brazil · 2016-11-20T02:33:01Z

This isn't possible: the chunks contain only time series values and timestamps, not the name or labels of the time series. That data is kept only in LevelDB.
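
To illustrate the point above, here is a minimal toy model in Python. It is not Prometheus's actual on-disk format: the `ToyStore` class, the `series_key` helper, and the idea of keying sample data by a hash of the label set are all assumptions made for illustration. It only shows why samples stored under an opaque key cannot be mapped back to names and labels once the separate index is gone.

```python
import hashlib


def series_key(labels):
    """Hypothetical opaque key: a one-way hash of the sorted label pairs."""
    canonical = ",".join(f"{k}={v}" for k, v in sorted(labels.items()))
    return hashlib.sha256(canonical.encode()).hexdigest()[:16]


class ToyStore:
    """Toy storage: sample data and label metadata live in separate maps."""

    def __init__(self):
        self.chunks = {}  # key -> list of (timestamp, value); no labels here
        self.index = {}   # key -> labels; stands in for the LevelDB indexes

    def append(self, labels, timestamp, value):
        key = series_key(labels)
        self.index.setdefault(key, dict(labels))
        self.chunks.setdefault(key, []).append((timestamp, value))

    def query(self, **matchers):
        """Return the chunks of every series whose labels match."""
        return {
            key: self.chunks[key]
            for key, labels in self.index.items()
            if all(labels.get(k) == v for k, v in matchers.items())
        }


store = ToyStore()
store.append({"__name__": "up", "job": "node"}, 1000, 1.0)
store.append({"__name__": "up", "job": "prometheus"}, 1000, 1.0)

found = store.query(job="node")     # the index tells us which chunks match
store.index.clear()                 # simulate losing the index
orphaned = store.query(job="node")  # samples still exist, but nothing matches
```

After the index is cleared, `store.chunks` still holds every sample, but each series is identified only by an opaque key, so there is nothing from which to rebuild the metric name or labels.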

lock · 2019-03-24T03:19:48Z

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

brian-brazil closed this as completed Nov 20, 2016

lock bot locked and limited conversation to collaborators Mar 24, 2019
