Metrics corruption #388

Closed
grobie opened this issue Apr 11, 2014 · 4 comments

grobie commented Apr 11, 2014

Steps:

  • renamed some metrics (due to rollout of prometheus/haproxy_exporter 0.3.0)
  • renamed some rules in a haproxy related rules file
  • restarted server in order to pick up the changed rules file

Result:

  • server was crashlooping until the storage directory was removed

Expected result:

  • renamed metrics are missing, server restarts without a problem

Prometheus: 0.2.1
OS: debian wheezy 7.0 amd64

Log output:

2014-04-11_19:41:46.86469 E0411 19:41:46.863897 21988 leveldb.go:214] Could not open storage: Corruption: 5 missing files; e.g.: /srv/prometheus/storage/label_name_and_value_pairs_by_fingerprint/000041.ldb
2014-04-11_19:41:46.86547 E0411 19:41:46.864869 21988 leveldb.go:214] Could not open storage: Corruption: 7 missing files; e.g.: /srv/prometheus/storage/fingerprints_by_label_name_and_value_pair/000180.ldb
2014-04-11_19:41:46.86611 E0411 19:41:46.865546 21988 leveldb.go:214] Could not open storage: Corruption: 4 missing files; e.g.: /srv/prometheus/storage/high_watermarks_by_fingerprint/001267.ldb
2014-04-11_19:41:46.86670 E0411 19:41:46.866173 21988 leveldb.go:214] Could not open storage: Corruption: 3 missing files; e.g.: /srv/prometheus/storage/curation_remarks/000079.ldb
2014-04-11_19:41:46.86731 E0411 19:41:46.866772 21988 leveldb.go:214] Could not open storage: Corruption: 1 missing files; e.g.: /srv/prometheus/storage/metric_membership_index/000033.ldb
2014-04-11_19:41:46.86741 E0411 19:41:46.867378 21988 leveldb.go:214] Could not open storage: Corruption: 5773 missing files; e.g.: /srv/prometheus/storage/samples_by_fingerprint/108948.ldb
2014-04-11_19:41:46.86816 F0411 19:41:46.868052 21988 main.go:206] Error opening storage: unable to open metric persistence
2014-04-11_19:41:46.86974 goroutine 1 [running]:
2014-04-11_19:41:46.87015 github.com/golang/glog.stacks(0xc21007e600, 0xc21002c240, 0x62, 0x107)
2014-04-11_19:41:46.87054       /home/julius/prometheus/.build/root/gopath/src/github.com/golang/glog/glog.go:726 +0xb1
2014-04-11_19:41:46.87104 github.com/golang/glog.(*loggingT).output(0x109f6e0, 0xc200000003, 0xc2100770c0)
2014-04-11_19:41:46.87153       /home/julius/prometheus/.build/root/gopath/src/github.com/golang/glog/glog.go:677 +0x1ff
2014-04-11_19:41:46.87204 github.com/golang/glog.(*loggingT).print(0x109f6e0, 0x7f0e00000003, 0x7f0ef3580f00, 0x2, 0x2)
2014-04-11_19:41:46.87942       /home/julius/prometheus/.build/root/gopath/src/github.com/golang/glog/glog.go:626 +0x12f
2014-04-11_19:41:46.87966 github.com/golang/glog.Fatal(0x7f0ef3580f00, 0x2, 0x2)
2014-04-11_19:41:46.88000       /home/julius/prometheus/.build/root/gopath/src/github.com/golang/glog/glog.go:1019 +0x50
2014-04-11_19:41:46.88040 main.main()
2014-04-11_19:41:46.88064       /home/julius/prometheus/main.go:206 +0x39d

@juliusv: I made a backup of the corrupted storage folder. Ping me for details.
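
For anyone triaging a similar crash loop, the corruption lines above can be tallied per index directory. This is a minimal sketch, not part of Prometheus: the function name `summarize_missing_files` and the regular expression are assumptions based only on the message format shown in this log, and the sample input is two lines copied from it.

```python
import re
from pathlib import PurePosixPath

# Matches the "Corruption: N missing files; e.g.: <path>" part of a log line.
# The pattern is inferred from the log output in this issue only.
CORRUPTION_RE = re.compile(
    r"Corruption: (?P<count>\d+) missing files; e\.g\.: (?P<path>\S+)"
)


def summarize_missing_files(log_text):
    """Return {index directory name: number of missing files reported}."""
    summary = {}
    for line in log_text.splitlines():
        match = CORRUPTION_RE.search(line)
        if match is None:
            continue  # not a corruption line (stack trace, info message, ...)
        # The index name is the directory that holds the example .ldb file.
        index_name = PurePosixPath(match.group("path")).parent.name
        summary[index_name] = int(match.group("count"))
    return summary


# Two lines copied from the log output above.
SAMPLE_LOG = """\
2014-04-11_19:41:46.86469 E0411 19:41:46.863897 21988 leveldb.go:214] Could not open storage: Corruption: 5 missing files; e.g.: /srv/prometheus/storage/label_name_and_value_pairs_by_fingerprint/000041.ldb
2014-04-11_19:41:46.86741 E0411 19:41:46.867378 21988 leveldb.go:214] Could not open storage: Corruption: 5773 missing files; e.g.: /srv/prometheus/storage/samples_by_fingerprint/108948.ldb
"""

if __name__ == "__main__":
    for name, count in sorted(summarize_missing_files(SAMPLE_LOG).items()):
        print(f"{name}: {count} missing")
```

Run over the full log, a summary like this makes it easy to see that every index is affected and that `samples_by_fingerprint` accounts for almost all of the missing files.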

grobie added the bug label Apr 11, 2014

juliusv commented Apr 12, 2014

Wow, this is internal LevelDB corruption. My first guess would be that there was an unclean shutdown of some sort (but even then, unrecoverable corruption is uncommon). Did you happen to check the logs to see whether Prometheus shut down cleanly? In any case, this doesn't seem like a bug caused by us storing wrong data in LevelDB.

grobie commented Apr 12, 2014

The log output including the previous shutdown info:

2014-04-11_19:30:44.23743 W0411 19:30:44.234495 04450 main.go:97] Received SIGINT/SIGTERM; Exiting gracefully...
2014-04-11_19:30:44.23745 I0411 19:30:44.234538 04450 targetmanager.go:120] Target manager exiting...
2014-04-11_19:30:44.23746 I0411 19:30:44.234555 04450 tiered.go:199] Triggering drain...
2014-04-11_19:30:44.23746 I0411 19:30:44.234569 04450 manager.go:95] Rule manager exiting...
2014-04-11_19:30:44.23747 I0411 19:30:44.234581 04450 targetpool.go:67] TargetPool exiting...
2014-04-11_19:30:44.23748 I0411 19:30:44.234603 04450 targetpool.go:67] TargetPool exiting...
2014-04-11_19:30:44.23749 I0411 19:30:44.234614 04450 targetpool.go:67] TargetPool exiting...
2014-04-11_19:30:44.23749 I0411 19:30:44.234625 04450 targetpool.go:67] TargetPool exiting...
2014-04-11_19:30:44.23750 I0411 19:30:44.234636 04450 targetmanager.go:120] Target manager exiting...
2014-04-11_19:30:44.23751 I0411 19:30:44.234646 04450 tiered.go:306] Flushing samples to disk...
2014-04-11_19:30:45.22758 I0411 19:30:45.227199 04450 tiered.go:317] Writing 765079 samples...
2014-04-11_19:31:11.45521 I0411 19:31:11.455120 04450 tiered.go:321] Done flushing.
2014-04-11_19:31:12.17591 prometheus, version 0.2.1 (branch: master, revision: 71d2ff4)
2014-04-11_19:31:12.17648   build user:       julius@[redacted]
2014-04-11_19:31:12.17678   build date:       20140326-14:51:19
2014-04-11_19:31:12.17699   go version:       1.2
2014-04-11_19:31:12.17717   leveldb version:  1.14.0
2014-04-11_19:31:12.17736   protobuf version: 2.5.0
2014-04-11_19:31:12.17754   snappy version:   1.1.0  
2014-04-11_19:31:12.96897 E0411 19:31:12.968176 14287 leveldb.go:214] Could not open storage: Corruption: 5 missing files; e.g.: /srv/prometheus/storage/label_name_and_value_pairs_by_fingerprint/000041.ldb  
2014-04-11_19:31:12.97086 E0411 19:31:12.970373 14287 leveldb.go:214] Could not open storage: Corruption: 7 missing files; e.g.: /srv/prometheus/storage/fingerprints_by_label_name_and_value_pair/000180.ldb
2014-04-11_19:31:12.97123 E0411 19:31:12.970391 14287 leveldb.go:214] Could not open storage: Corruption: 3 missing files; e.g.: /srv/prometheus/storage/curation_remarks/000079.ldb  
2014-04-11_19:31:12.97155 E0411 19:31:12.970400 14287 leveldb.go:214] Could not open storage: Corruption: 1 missing files; e.g.: /srv/prometheus/storage/metric_membership_index/000033.ldb  
2014-04-11_19:31:12.97202 E0411 19:31:12.970409 14287 leveldb.go:214] Could not open storage: Corruption: 4 missing files; e.g.: /srv/prometheus/storage/high_watermarks_by_fingerprint/001267.ldb  
2014-04-11_19:31:12.97252 E0411 19:31:12.970416 14287 leveldb.go:214] Could not open storage: Corruption: 5773 missing files; e.g.: /srv/prometheus/storage/samples_by_fingerprint/108948.ldb
2014-04-11_19:31:12.97299 F0411 19:31:12.970426 14287 main.go:206] Error opening storage: unable to open metric persistence
2014-04-11_19:31:12.97327 goroutine 1 [running]:
2014-04-11_19:31:12.97335 github.com/golang/glog.stacks(0xc2101be400, 0xc21002b7e0, 0x62, 0x107)
2014-04-11_19:31:12.97355       /home/julius/prometheus/.build/root/gopath/src/github.com/golang/glog/glog.go:726 +0xb1
2014-04-11_19:31:12.97381 github.com/golang/glog.(*loggingT).output(0x109f6e0, 0xc200000003, 0xc2100b0000)
2014-04-11_19:31:12.97405       /home/julius/prometheus/.build/root/gopath/src/github.com/golang/glog/glog.go:677 +0x1ff
2014-04-11_19:31:12.97431 github.com/golang/glog.(*loggingT).print(0x109f6e0, 0x7f1800000003, 0x7f18e4469f00, 0x2, 0x2)
2014-04-11_19:31:12.97460       /home/julius/prometheus/.build/root/gopath/src/github.com/golang/glog/glog.go:626 +0x12f
2014-04-11_19:31:12.97591 github.com/golang/glog.Fatal(0x7f18e4469f00, 0x2, 0x2)
2014-04-11_19:31:12.97593       /home/julius/prometheus/.build/root/gopath/src/github.com/golang/glog/glog.go:1019 +0x50
2014-04-11_19:31:12.97594 main.main()
2014-04-11_19:31:12.97594       /home/julius/prometheus/main.go:206 +0x39d
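
The question of whether the shutdown was clean can be checked mechanically against a log like the one above. The sketch below is an assumption-laden helper, not a Prometheus tool: the name `check_shutdown` and the three marker strings are taken only from the messages visible in this log, and it does not verify that the markers appear in order or that LevelDB itself closed its files.

```python
from datetime import datetime

# Marker strings copied from the Prometheus 0.2.1 log lines in this issue.
GRACEFUL_MARKER = "Exiting gracefully"
FLUSH_START_MARKER = "Flushing samples to disk"
FLUSH_DONE_MARKER = "Done flushing"


def _timestamp(line):
    """Parse the leading 'YYYY-MM-DD_HH:MM:SS.fffff' prefix of a log line."""
    return datetime.strptime(line.split(" ", 1)[0], "%Y-%m-%d_%H:%M:%S.%f")


def check_shutdown(log_text):
    """Return (clean, flush_seconds) for the shutdown messages in log_text.

    clean is True only if the graceful-exit message and both flush
    messages are present; flush_seconds is None otherwise.
    """
    saw_graceful = False
    flush_start = None
    flush_done = None
    for line in log_text.splitlines():
        if GRACEFUL_MARKER in line:
            saw_graceful = True
        elif FLUSH_START_MARKER in line:
            flush_start = _timestamp(line)
        elif FLUSH_DONE_MARKER in line:
            flush_done = _timestamp(line)
    clean = saw_graceful and flush_start is not None and flush_done is not None
    seconds = (flush_done - flush_start).total_seconds() if clean else None
    return clean, seconds


# Four lines copied from the shutdown log above.
SAMPLE_LOG = """\
2014-04-11_19:30:44.23743 W0411 19:30:44.234495 04450 main.go:97] Received SIGINT/SIGTERM; Exiting gracefully...
2014-04-11_19:30:44.23751 I0411 19:30:44.234646 04450 tiered.go:306] Flushing samples to disk...
2014-04-11_19:30:45.22758 I0411 19:30:45.227199 04450 tiered.go:317] Writing 765079 samples...
2014-04-11_19:31:11.45521 I0411 19:31:11.455120 04450 tiered.go:321] Done flushing.
"""

if __name__ == "__main__":
    clean, seconds = check_shutdown(SAMPLE_LOG)
    print(f"clean shutdown messages present: {clean}, flush took {seconds}s")
```

On the lines above, all three messages are present and the flush took about 27 seconds by the outer timestamps, so the log itself gives no sign of an unclean shutdown before the first failed start.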

juliusv commented Dec 10, 2014

Closing this, as the storage has been completely rewritten in the meantime.

juliusv closed this Dec 10, 2014
