
First entry for each fingerprint in leveldb is lost #385

Closed
beorn7 opened this Issue Mar 27, 2014 · 5 comments

beorn7 commented Mar 27, 2014

If a Prometheus server is started from scratch, the first entry for each fingerprint in the leveldb either never makes it into the leveldb or is later not read.

The former is more likely: if the data were actually somewhere in the leveldb but merely not read, something in the read path would have to be severely broken, because even the 'dumper' tool can't find the missing entry.

The effect in practice is that every time series is missing the first 500 entries from the lifetime of the Prometheus server.

@beorn7 beorn7 added the bug label Mar 27, 2014

matttproud commented Mar 27, 2014

A couple of ideas:

  1. It wouldn't take too long to build a more low-level dumper with the C++
     bindings, but then you would barely be any lower level than what Levigo
     (the C++ LevelDB library's C exports wrapped in very little Go code)
     provides; a minimal sketch follows below.
  2. You may want to validate whether this occurs if the compaction curation
     mechanism is disabled on a fresh data set. That is the one part of the
     codebase that is unequivocally the worst.

#1 would help you rule out anomalies in LevelDB, which we know is good but
not infallible. #2 seems most likely, but it would be hair-pullingly
difficult to validate this without ruling out #1.
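
For what it's worth, a dumper along the lines of #1 only needs the stock LevelDB C++ API. Below is a minimal, hypothetical sketch: the database path is a placeholder, and keys and values are printed raw rather than decoded into Prometheus fingerprints and samples, so it only shows whether entries are physically present in the LevelDB at all.

```cpp
// Minimal stand-alone LevelDB dumper using the C++ bindings.
// Sketch only: the default path below is an assumption, and keys/values
// are emitted undecoded.
#include <cassert>
#include <iostream>
#include <string>
#include "leveldb/db.h"

int main(int argc, char** argv) {
  // Placeholder path; pass the real storage directory as the first argument.
  const std::string path = argc > 1 ? argv[1] : "/tmp/metrics/samples";

  leveldb::DB* db = nullptr;
  leveldb::Options options;
  options.create_if_missing = false;  // only inspect existing data
  leveldb::Status status = leveldb::DB::Open(options, path, &db);
  if (!status.ok()) {
    std::cerr << "open failed: " << status.ToString() << std::endl;
    return 1;
  }

  // Walk every entry in key order; a missing first sample per fingerprint
  // would show up as a gap at the low end of that fingerprint's key range.
  leveldb::Iterator* it = db->NewIterator(leveldb::ReadOptions());
  for (it->SeekToFirst(); it->Valid(); it->Next()) {
    std::cout << it->key().ToString() << " => "
              << it->value().size() << " value bytes" << std::endl;
  }
  assert(it->status().ok());  // surface any error hit during iteration

  delete it;
  delete db;
  return 0;
}
```

Compile it against the LevelDB headers and library (e.g. g++ dump.cc -o dump -lleveldb) and point it at a copy of the storage directory.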


juliusv commented Mar 27, 2014

Yeah, I would also look at the compaction/deletion processors as the most likely culprits.

beorn7 commented Mar 27, 2014

Thanks for the feedback. Actually, I filed the issue mainly to keep it on record (once I realized there is no quick fix). It's quite possible we'll get rid of leveldb first (but that's a different story... ;)

juliusv commented Dec 10, 2014

The old storage is gone. Closing this.

@juliusv juliusv closed this Dec 10, 2014

