Fingerprint collision handling is O(n^2) in the number of mappings #1513

brian-brazil commented Mar 30, 2016

The fingerprint mappings write out a checkpoint at each mapping, syncing it to disk (prometheus/storage/local/persistence.go, line 1352 in e83f05f). This is done while the raw fingerprint is locked (prometheus/storage/local/storage.go, line 589 in e83f05f). If you have many new collisions, this can block scraping for quite a while. The example I'm looking at seems to have tens of thousands of collisions, which took an hour to work through before scraping worked again.

I recommend switching to appending new mappings rather than writing out a whole new file. We should also look at coalescing the write()s and sync()s together, otherwise we'll be bottlenecked there too.
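To make the append idea concrete, here is a minimal Go sketch. The record layout, file name, and function names are invented for illustration and are not the actual checkpoint format from persistence.go; the point is only that each new mapping costs one small append instead of a full rewrite of all n existing mappings (which is what makes the current behaviour O(n^2) overall).

```go
package main

import (
	"bufio"
	"encoding/binary"
	"os"
)

// mappingRecord is a simplified stand-in for one fingerprint mapping:
// a raw (colliding) fingerprint and the unique fingerprint it maps to.
// The real checkpoint format in persistence.go is more involved.
type mappingRecord struct {
	Raw, Mapped uint64
}

// appendMapping appends a single record to the mappings file instead of
// rewriting the whole file. Total cost for n mappings is O(n) appends
// rather than the O(n^2) of rewriting the file once per new mapping.
func appendMapping(f *os.File, rec mappingRecord) error {
	w := bufio.NewWriter(f)
	if err := binary.Write(w, binary.BigEndian, rec.Raw); err != nil {
		return err
	}
	if err := binary.Write(w, binary.BigEndian, rec.Mapped); err != nil {
		return err
	}
	if err := w.Flush(); err != nil {
		return err
	}
	// One sync per append here; see the coalescing sketch further down
	// for batching several appends under a single fsync.
	return f.Sync()
}

func main() {
	f, err := os.OpenFile("mappings.append", os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o644)
	if err != nil {
		panic(err)
	}
	defer f.Close()
	if err := appendMapping(f, mappingRecord{Raw: 0xdeadbeef, Mapped: 0xfeedface}); err != nil {
		panic(err)
	}
}
```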
Comments
brian-brazil added the bug label Mar 30, 2016
oops
This case was with synthetic data for a load test, so there were more collisions, and at a higher rate, than usual. We might get away with just the coalescing, which would also avoid a data format change.
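A rough sketch of what coalescing could look like: queue the encoded records and pay for a single fsync per batch. The type and method names here are invented for the sketch and do not come from the Prometheus codebase.

```go
package main

import (
	"os"
	"sync"
)

// syncCoalescer batches pending mapping writes so that many enqueued
// records share a single fsync instead of paying one sync() each.
type syncCoalescer struct {
	mu      sync.Mutex
	f       *os.File
	pending [][]byte
}

// enqueue queues one encoded mapping record for the next flush.
func (c *syncCoalescer) enqueue(rec []byte) {
	c.mu.Lock()
	c.pending = append(c.pending, rec)
	c.mu.Unlock()
}

// flush writes every queued record and finishes with one Sync(),
// amortizing the expensive disk sync across the whole batch. A real
// implementation would trigger this on a timer or a size threshold.
func (c *syncCoalescer) flush() error {
	c.mu.Lock()
	batch := c.pending
	c.pending = nil
	c.mu.Unlock()
	for _, rec := range batch {
		if _, err := c.f.Write(rec); err != nil {
			return err
		}
	}
	return c.f.Sync()
}

func main() {
	f, err := os.OpenFile("mappings.append", os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o644)
	if err != nil {
		panic(err)
	}
	defer f.Close()

	c := &syncCoalescer{f: f}
	for i := 0; i < 1000; i++ {
		c.enqueue([]byte("mapping record\n"))
	}
	// One fsync for 1000 records instead of 1000 fsyncs.
	if err := c.flush(); err != nil {
		panic(err)
	}
}
```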
Ah, I'd missed how the fingerprintLocker is implemented. This will hold up all scraping, as it works with locks for 1024 buckets rather than one per time series.
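For readers unfamiliar with the pattern: the locker stripes locks across a fixed number of buckets, so two unrelated fingerprints that hash to the same bucket contend with each other. The real fingerprintLocker lives in storage/local; the sketch below is a simplified reconstruction with made-up field names, showing why a slow checkpoint holding one fingerprint's lock can stall scrapes of other series.

```go
package main

import (
	"fmt"
	"sync"
)

// fingerprintLocker sketch: locks are striped across a fixed number of
// buckets rather than held per time series, so different fingerprints
// that fall into the same bucket block each other.
type fingerprintLocker struct {
	mutexes []sync.Mutex
}

func newFingerprintLocker(buckets uint64) *fingerprintLocker {
	return &fingerprintLocker{mutexes: make([]sync.Mutex, buckets)}
}

func (l *fingerprintLocker) Lock(fp uint64)   { l.mutexes[fp%uint64(len(l.mutexes))].Lock() }
func (l *fingerprintLocker) Unlock(fp uint64) { l.mutexes[fp%uint64(len(l.mutexes))].Unlock() }

func main() {
	l := newFingerprintLocker(1024)
	l.Lock(42)
	// Fingerprint 1066 (= 42 + 1024) maps to the same bucket, so a
	// scrape touching it would block until the slow checkpoint holding
	// fingerprint 42's lock finishes — which is how collision handling
	// ends up stalling scraping across unrelated series.
	fmt.Println("bucket of 42:", 42%1024, "bucket of 1066:", 1066%1024)
	l.Unlock(42)
}
```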
beorn7 referenced this issue Apr 14, 2016: Checkpoint fingerprint mappings only upon shutdown #1555 (merged)
beorn7 closed this in #1555 Apr 14, 2016
lock bot commented Mar 24, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
lock bot locked and limited conversation to collaborators Mar 24, 2019