
Don't do the same fingerprint->metric lookups from disk multiple times per query #292

Closed
juliusv opened this Issue Jun 11, 2013 · 6 comments

juliusv commented Jun 11, 2013

Now that view queue wait times and disk extraction are optimized, getting values from the built memory view is increasingly becoming a bottleneck. Profile this and consider creating reusable array iterators for in-memory series instead of doing a binary search over the whole series array for every point, since queries usually step through the data linearly.
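The reusable-iterator idea can be sketched roughly as follows. This is a hypothetical Python illustration, not the actual Go code; the `SeriesIterator` name and `value_at_or_before` method are made up for the sketch:

```python
import bisect

class SeriesIterator:
    """Reusable iterator over a sorted array of (timestamp, value) samples.

    Instead of binary-searching the whole array for every requested point,
    it remembers its last position and advances linearly, which is cheap
    when query steps move forward monotonically.
    """
    def __init__(self, samples):
        self.samples = samples  # sorted by timestamp
        self.pos = 0

    def value_at_or_before(self, ts):
        # Fast path: advance linearly from the remembered position.
        while (self.pos + 1 < len(self.samples)
               and self.samples[self.pos + 1][0] <= ts):
            self.pos += 1
        # Slow path (non-monotonic access or ts before the first sample):
        # fall back to a binary search over the whole array.
        if self.samples[self.pos][0] > ts:
            i = bisect.bisect_right(self.samples, (ts, float("inf"))) - 1
            if i < 0:
                return None  # no sample at or before ts
            self.pos = i
        return self.samples[self.pos][1]
```

For a typical range query that evaluates the series at increasing timestamps, each step then costs a few comparisons instead of a full O(log n) search.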

@ghost ghost assigned juliusv Jun 11, 2013

juliusv commented Jun 17, 2013

Did some timing on a query that reported 15s spent in viewAdapter.GetValueAtTime(). It turns out the time is not spent calling view.GetValueAtTime(), but in storage.GetMetricForFingerprint(). We should fetch this information only once per timeseries per query, not for every data point on a timeseries.
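The once-per-series fix amounts to memoizing the lookup for the lifetime of a query. A minimal Python sketch, assuming a per-query context object (the `QueryContext` name is made up; the wrapped function stands in for storage.GetMetricForFingerprint):

```python
class QueryContext:
    """Per-query cache so the fingerprint->metric lookup hits storage
    at most once per timeseries, not once per data point."""
    def __init__(self, storage_lookup):
        self._lookup = storage_lookup  # e.g. storage.GetMetricForFingerprint
        self._cache = {}

    def metric_for_fingerprint(self, fp):
        if fp not in self._cache:
            self._cache[fp] = self._lookup(fp)  # expensive path, taken once
        return self._cache[fp]
```

Evaluating a series at thousands of points then pays the storage cost exactly once per fingerprint.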

juliusv commented Jun 18, 2013

So it turns out that GetMetricForFingerprint() is usually really fast if the timeseries is in memory. If, however, there are timeseries that are not in memory yet (or anymore) but are still within the query's staleness limit, then we go to disk to look up the fingerprint. This can happen in two cases:

  • Prometheus has recently restarted and doesn't have the current timeseries in memory yet
  • there are timeseries that are just really stale and won't be updated into memory again, but a query might still ask for them if it goes back far enough.

Proposal: when a memory lookup for GetMetricForFingerprint() fails and we fall back to disk, we should then populate the memory storage with the retrieved info. Garbage collection of these empty memory series could then happen together with flushing from memory to disk. @matttproud
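The proposal is essentially a read-through cache in front of the disk index. A hedged Python sketch (the `MetricIndex` name and the dict-backed "memory storage" are illustrative stand-ins, not the real data structures):

```python
class MetricIndex:
    """Read-through lookup: if the fingerprint is not found in memory,
    fetch it from disk once and populate the in-memory storage so that
    subsequent lookups stay in memory."""
    def __init__(self, memory, disk_lookup):
        self.memory = memory        # dict: fingerprint -> metric
        self.disk_lookup = disk_lookup  # expensive fallback function

    def get_metric_for_fingerprint(self, fp):
        metric = self.memory.get(fp)
        if metric is None:
            metric = self.disk_lookup(fp)   # slow path: hit disk once
            if metric is not None:
                self.memory[fp] = metric    # populate memory storage
        return metric
```

The cost of this approach is the empty in-memory series entries it creates, which is why the proposal ties their garbage collection to the existing flush-to-disk cycle.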

matttproud commented Jun 18, 2013

The latter proposal sounds fine.

juliusv commented Jun 18, 2013

Hmm... the memory series storage is not super ideal just as a cache for metric fingerprints, but it'll do for now. For each empty timeseries that we put into it just for fp->metric lookups, we also need to store the indexes pointing at it, even if that timeseries is never written to again (but just in case it is). All the more reason to expunge unused timeseries upon flush to disk.
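The expunge-on-flush step could look roughly like this. A hypothetical Python sketch under the assumption that the flush loop already walks all in-memory series; the `flush_and_expunge` name and dict layout are made up:

```python
def flush_and_expunge(series_map):
    """During flush-to-disk, drop series that hold no samples: these are
    entries that were created only to cache fp->metric lookups.

    series_map: dict mapping fingerprint -> {"samples": [(ts, value), ...]}
    Returns (flushed_fps, expunged_fps)."""
    flushed, expunged = [], []
    for fp, series in list(series_map.items()):
        if series["samples"]:
            flushed.append(fp)          # real code would write samples to disk
            series["samples"].clear()
        else:
            expunged.append(fp)
            del series_map[fp]          # real code would also drop its index entries
    return flushed, expunged
```

This keeps the cache bounded: any series that only ever served fingerprint lookups disappears at the next flush cycle.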

juliusv commented Jun 28, 2013

Done in 71199e2

@juliusv juliusv closed this Jun 28, 2013

lock bot commented Mar 25, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators Mar 25, 2019
