Using queryRange with range-vector queries and start and end dates. #2378

Closed
freshteapot opened this Issue Jan 30, 2017 · 13 comments

freshteapot commented Jan 30, 2017

When Grafana or a third party queries Prometheus via query_range:

/api/v1/query_range

queryRange

it may pass, in addition to the query:

  • start
  • end
  • step

Following the code a little, I wonder whether it should check range-vector selectors against start and end and, if they do not match, alter the "range duration".

The query below works in Prometheus /graph. However, when it is used via 'query_range', it would be great if [1m] were changed to reflect end - start.

sum(
    count(
        rate(vsr_unique_id[1m])
    ) by (unique_id)
)

Not sure where it would be best to manipulate this (if that is desired).

queryRange
NewRangeQuery

How did I get here

I am trying to build a top X based on uniqueness.

sum(
    count(
        rate(vsr_unique_id[1m])
    ) by (unique_id)
)

This query works great via Prometheus /graph, but when used in combination with Grafana the data is wrong because of Grafana's "time range". However, after looking into the API requests, I wonder whether the solution should be on the Prometheus side rather than in the third party.

Member

brian-brazil commented Jan 30, 2017

This is something you need to do in the calling code. Look at the interval variable in Grafana.

Author

freshteapot commented Jan 30, 2017

@brian-brazil Thank you for the quick response. I think I tried this by setting "$interval" via templating, with "auto" enabled.

However, with or without auto enabled, when I use a large interval and "time range" I end up with a "Multiple Series Error" and can see results coming back as null. Via "value mapping" we tried setting null -> 0, but no joy.

Query in use

sum(
    count(
        rate(vsr_unique_id[$interval])
    ) by (unique_id)
)

Could this be a sign that I am using Prometheus wrong?

Author

freshteapot commented Jan 30, 2017

I stand corrected: using the "sum" method reduces the data coming back, but I feel I need to tune my "auto" settings better.

Author

freshteapot commented Jan 30, 2017

It works and it doesn't. When I give it date ranges in which I know data was added, it works. However, it also shows data for time periods in which no data was added.

An example:

If I visit a URL once on Monday and once on Wednesday, as the same user both times:

If I query across Sunday -> Thursday it works and says 1 unique user.
If I query across Tuesday morning -> Tuesday evening it shows 1 unique user, which is wrong for that time period.

Author

freshteapot commented Jan 31, 2017

From issue 1334:

"The fundamental problem is that PromQL has no notion of that fixed point in the past. It's all relative and sliding window. Perhaps somebody else has an idea, but I believe that we needed to introduce a fundamentally new concept into PromQL to satisfy your request."

I think, if the above is correct, then what I am trying to do needs some way of resetting my counter / gauge on the collecting side. Otherwise the sliding window will keep adding the result.

Do you agree? :)

Member

brian-brazil commented Jan 31, 2017

I think you might be over-complicating this. What exactly is it you're trying to do at a high level?

Author

freshteapot commented Jan 31, 2017

We record the "unique_id" of a user in a label per request, currently via a counter.

vsr_unique_id{respStatus="200",unique_id="a123456789012456789"} 3

Once in prometheus, we are querying to get the "total unique users" over time.

sum(
    count(
        rate(vsr_unique_id[1m])
    ) by (unique_id)
)

When changing the time, it causes unexpected results.

Author

freshteapot commented Jan 31, 2017

If it helps, this is against "singlestat".

Member

brian-brazil commented Jan 31, 2017

First off, this is something that Prometheus really isn't suited to; a high-cardinality field like user ID is better handled by event logging.

sum(rate(my_var[1d]) > bool 0) should do it, set range as needed.
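
To illustrate why the comparison matters: count() counts every series that rate() returns, including series whose rate is zero, whereas "> bool 0" turns each series into 1 or 0 before summing. The following is a toy simulation under stated assumptions, not Prometheus code. The sample data is invented, and the helper increase is a simplification of rate() that does no extrapolation and ignores counter resets.

```go
package main

import "fmt"

// sample is one scraped counter value. hour is the number of hours since
// Sunday 00:00 (invented, illustrative data).
type sample struct {
	hour  float64
	value float64
}

// exampleData models one user who visits on Monday and again on Wednesday.
func exampleData() map[string][]sample {
	return map[string][]sample{
		"a123": {{36, 1}, {48, 1}, {60, 1}, {72, 1}, {84, 2}, {96, 2}},
	}
}

// increase returns last-minus-first over the samples inside [from, to] and
// reports whether the window held at least two samples.
func increase(series []sample, from, to float64) (float64, bool) {
	var first, last sample
	n := 0
	for _, s := range series {
		if s.hour < from || s.hour > to {
			continue
		}
		if n == 0 {
			first = s
		}
		last = s
		n++
	}
	if n < 2 {
		return 0, false
	}
	return last.value - first.value, true
}

// countUsers returns two numbers for the window [from, to]:
// present mimics count(rate(...)): every series with data is counted;
// active mimics sum(rate(...) > bool 0): only series that increased are counted.
func countUsers(all map[string][]sample, from, to float64) (present, active int) {
	for _, series := range all {
		inc, ok := increase(series, from, to)
		if !ok {
			continue
		}
		present++
		if inc > 0 {
			active++
		}
	}
	return present, active
}

func main() {
	present, active := countUsers(exampleData(), 48, 72) // Tuesday only
	fmt.Println("Tuesday:", present, active)
	present, active = countUsers(exampleData(), 0, 120) // Sunday -> Thursday
	fmt.Println("Sunday -> Thursday:", present, active)
}
```

In this toy model the Tuesday window still contains the user's series, so a plain count reports a user even though the counter did not move, while the "> bool 0" variant does not.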

Author

freshteapot commented Jan 31, 2017

Thank you @brian-brazil for the openness about what Prometheus is not suited for. Great tip on using bool; I didn't know I could do that.

We are trying a slightly ugly concept (which has flaws!).

Currently we are trying to wrap "promhttp.Handler()" and then reset specific gauges.

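// Needs "log" and the client_golang "prometheus" package imported.
// uniqueID is presumably the *prometheus.GaugeVec registered as vsr_unique_id.
metrics, err := prometheus.DefaultGatherer.Gather()
if err != nil {
    // Gather may still return partial results, so log and carry on.
    log.Printf("gathering metrics failed: %v", err)
}
for _, metric := range metrics {
    if metric.GetName() != "vsr_unique_id" {
        continue
    }
    for _, modelMetric := range metric.GetMetric() {
        if modelMetric.GetGauge() == nil {
            continue
        }
        // Rebuild the label set so With() addresses the same child gauge.
        labels := prometheus.Labels{}
        for _, pair := range modelMetric.GetLabel() {
            labels[pair.GetName()] = pair.GetValue()
        }
        // Reset the gauge for this label combination to 0.
        uniqueID.With(labels).Set(0)
    }
}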

We will see how good or bad this idea will be tomorrow.

Member

brian-brazil commented Jan 31, 2017

I'm not sure what that's trying to do, but I don't think it's going to work.

Author

freshteapot commented Jan 31, 2017

It is trying to reset the "unique gauges" back to 0 in the handler that serves requests for "/metrics".

Mostly to explore whether it helps with graphing. It certainly is far from ideal, not to mention very easy to break.

I am not disagreeing with you on "I don't think it's going to work".

@grobie grobie closed this Mar 6, 2017


lock bot commented Mar 23, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators Mar 23, 2019
