Promql won't time-out queries in matrixSelector #4288
Comments
jacksontj added a commit to jacksontj/prometheus that referenced this issue on Jun 19, 2018
Thanks for the report. I'm aware of this, and it's something I intend to clean up now that the big PromQL change is in.
brian-brazil added the kind/bug and component/promql labels on Jun 19, 2018
jacksontj referenced this issue on Jun 19, 2018: Promql won't time-out queries in expandSeriesSet #4289 (Closed)
@brian-brazil Something like 29294b2? Or were you already thinking of something more holistic?
@jacksontj Have you verified somehow that the bulk of the time (and memory) here is spent after the expansion of the series set for the matrix selector (Line 511 in 404abe0)?
(Of course, there should still be timeout handling in the other pieces of the engine.)
In the example I have, both spots are very time-intensive (at least on 2.2), meaning it doesn't time out in either place (I can see it spend lots of time in both). I'm verifying master right now on my test host; I should have an answer in ~5-10m (prom takes FOREVER to start/stop).
Yup, makes sense.
Whoa, how come?
From the little looking I've done, it's the TSDB; we have ~5TB of data on this box. I'll probably spend some time looking into that in the future, but that's for another issue/PR :)
Oh yeah, I would assume it's the TSDB, but which part would be interesting. Anyway, let's leave that one for another issue :)
No, it should be in the range vector function evaluation. There's math to be done to see how often we're required to do this, and it turns out that, due to edge cases (really long durations), it has to be at every time series.
jacksontj referenced this issue on Jun 21, 2018: Check for timeout in each iteration of matrixSelector #4300 (Merged)
jacksontj added a commit to jacksontj/prometheus that referenced this issue on Jun 21, 2018
brian-brazil closed this in #4300 on Jun 21, 2018
brian-brazil added a commit that referenced this issue on Jun 21, 2018
jacksontj added a commit to jacksontj/prometheus that referenced this issue on Jun 28, 2018
brian-brazil added a commit that referenced this issue on Jul 11, 2018
mknapphrt added a commit to mknapphrt/prometheus that referenced this issue on Jul 26, 2018
gouthamve added a commit to gouthamve/prometheus that referenced this issue on Aug 1, 2018
bobmshannon pushed a commit to bobmshannon/prometheus that referenced this issue on Nov 19, 2018
lock bot commented Mar 22, 2019
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
jacksontj commented Jun 19, 2018 (edited)
Bug Report
What did you do?
Someone sent a large query such as {__name__=~".+"}[5d] to our Prometheus host.
What did you expect to see?
I expect the query to time out before completing, while consuming some memory and CPU along the way.
What did you see instead? Under which circumstances?
The query runs until it exhausts all available memory on the host and the OOM killer gets it. I have set it up in a repro env locally and the query has been running for ~60m (it's still going) even though I have the timeout set to the default 2m.
Environment
2.2