Low-resolution querying may drop long time ranges #2776

Closed
fabxc opened this Issue May 26, 2017 · 10 comments

3 participants
@fabxc
Member

fabxc commented May 26, 2017

When querying Prometheus with "low" resolution (the default is often low enough), minute-long time ranges are often missing from the results. The gaps seem to correlate exactly with blocks, but that is not confirmed yet.

Setting an explicit low query resolution makes the expected data reappear in the results.

@fabxc fabxc added the dev-2.0 label May 26, 2017

@fabxc

Member Author

fabxc commented May 26, 2017

I'm seeing this pretty consistently, while @gouthamve cannot reproduce it. It would be good to know whether others can. @mwitkow @grobie @beorn7

@mwitkow

Contributor

mwitkow commented May 27, 2017

@fabxc I'm not sure what you mean by low resolution. A narrow time window?
Can you give an example of a /query param to try to reproduce this?

@fabxc

Member Author

fabxc commented May 29, 2017

Just range queries with a coarse-ish step size.
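
For anyone trying to reproduce this, a range query with an explicit step can be built roughly as below. This is only a sketch: the server address, the expression `up`, and the timestamps are placeholders, and `rangeQueryURL` is a hypothetical helper, not something from the Prometheus code base. The `step` parameter of `/api/v1/query_range` is the resolution, so a larger value means a coarser query.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// rangeQueryURL is a hypothetical helper that builds a URL for the
// /api/v1/query_range endpoint. start and end are Unix timestamps in
// seconds; step is the query resolution in seconds.
func rangeQueryURL(base, expr string, start, end, step int64) string {
	v := url.Values{}
	v.Set("query", expr)
	v.Set("start", strconv.FormatInt(start, 10))
	v.Set("end", strconv.FormatInt(end, 10))
	// A larger step yields fewer, coarser data points over the same range.
	v.Set("step", strconv.FormatInt(step, 10))
	return base + "/api/v1/query_range?" + v.Encode()
}

func main() {
	// Placeholder values: a six-hour window queried at two resolutions.
	const end = int64(1496059200)
	const start = end - 6*60*60
	for _, step := range []int64{15, 300} {
		fmt.Println(rangeQueryURL("http://localhost:9090", "up", start, end, step))
	}
}
```

Comparing the responses for the two step values over the same window is one way to check whether data disappears only at the coarser resolution.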

@mwitkow

Contributor

mwitkow commented May 30, 2017

Can confirm. Querying with 5m intervals works, but anything like 1m or 30s rates shows holes in the data.

Take a look at this. There is no gap around 11:00 when using a 5m rate:
image

vs. the massive gap in served responses (the cursor "skips over" it) around 11:00 when using 30s:

image

@fabxc fabxc added the kind/bug label May 30, 2017

@fabxc

Member Author

fabxc commented May 30, 2017

Thanks for confirming – now to find out where it all went wrong.

@gouthamve

Member

gouthamve commented Jun 7, 2017

I can reproduce this locally now, some early findings:

  1. Low resolution is not the cause of it: Low res vs High res
  2. The same resolution over different time ranges can cause it: shorter range vs longer range

And the worst part is that in 2) above the break happened at 1800, while the block spans from 1315 to 2000, so this is definitely not due to block boundaries. Digging further.

@gouthamve

Member

gouthamve commented Jun 7, 2017

This is caused by this commit of mine. When we cut a chunk, we do not set its mint, so it defaults to 0.

Later we seek to the right chunk using the chunk's MinTime, which makes us seek to the wrong chunks and skip some chunks. Fixing the mintime when cutting a chunk should fix this IMO.
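
To make the failure mode concrete, here is a toy illustration. It is not the tsdb code: `chunkMeta` and `seekChunk` are hypothetical names, the timestamps are made up, and the seek is assumed to pick the last chunk whose MinTime is not after the requested time.

```go
package main

import "fmt"

// chunkMeta is a hypothetical stand-in for a chunk's time bounds.
// It is not the actual tsdb type.
type chunkMeta struct {
	MinTime, MaxTime int64
}

// seekChunk returns the index of the last chunk whose MinTime is <= t.
// This assumes chunks are ordered by time and that MinTime is set correctly.
func seekChunk(chunks []chunkMeta, t int64) int {
	idx := 0
	for i, c := range chunks {
		if c.MinTime <= t {
			idx = i
		}
	}
	return idx
}

func main() {
	// Correct metadata: every chunk carries its real MinTime.
	good := []chunkMeta{{1000, 1099}, {1100, 1199}, {1200, 1299}}
	// Second chunk was cut without setting MinTime, so it defaults to 0.
	bad := []chunkMeta{{1000, 1099}, {0, 1199}, {1200, 1299}}

	t := int64(1050)
	fmt.Println("good metadata: seek lands on chunk", seekChunk(good, t))
	fmt.Println("bad metadata:  seek lands on chunk", seekChunk(bad, t))
}
```

With the correct metadata the seek for 1050 lands on the first chunk. With the zero MinTime it lands on the second chunk, whose samples actually start at 1100, so reading forward from there skips the samples between 1050 and 1099.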

@fabxc

Member Author

fabxc commented Jun 7, 2017

The chunk's min timestamp is fixed as part of prometheus/tsdb#94. Will check whether I can still reproduce the issue.

@mwitkow

Contributor

mwitkow commented Jun 13, 2017

Can't reproduce it from our side with today's dev-2.0 cut so 🍾

@fabxc fabxc closed this Jul 3, 2017

@lock


lock bot commented Mar 23, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators Mar 23, 2019
