feat: instrument the query layer to track rate-limited queries #3894

jtlisi · 2021-03-02T04:16:54Z

What this PR does:

Instrument the query-frontend and query-scheduler to track the queries that were discarded because of the configured outstanding queries rate limiter. I pre-emptively added a reason label to allow for flexibility in adding other reasons for why a query may be discarded.

Checklist

CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>

pstibrany

We already have a metric to monitor queue length per user, but it doesn't tell us the max queue length, so I can see how this new metric could be useful.

I'd suggest starting without the "reason" label; we can always add it later, when needed.

pstibrany · 2021-03-02T07:11:39Z

pkg/scheduler/queue/queue.go

@@ -91,6 +95,7 @@ func (q *RequestQueue) EnqueueRequest(userID string, req Request, maxQueriers in
 		}
 		return nil
 	default:
+		q.discardedQueries.WithLabelValues(userID, validation.RateLimited).Inc()


I think "rate limited" is misleading here. We only drop the request if the queue is full, which can happen for various reasons (slow queries, crashed queriers, no querier connected, or all queriers busy). I'd suggest starting without the "reason" label.

Also, I don't think we should use "validation" reasons (used for validating incoming samples) here.

Makes sense.
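To illustrate the point above, here is a minimal sketch of a bounded per-user queue that drops a request only when the queue is full and counts the drop per user, with no "reason" label. It is not the Cortex implementation: `boundedQueue`, `discardCounter`, and `errTooManyRequests` are hypothetical names, and the mutex-guarded map is a standard-library stand-in for the Prometheus counter vector used in the PR.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errTooManyRequests is a hypothetical error for a full per-user queue.
var errTooManyRequests = errors.New("too many outstanding requests")

// discardCounter is a stdlib stand-in for a counter vector labelled by user.
type discardCounter struct {
	mu     sync.Mutex
	counts map[string]int
}

func newDiscardCounter() *discardCounter {
	return &discardCounter{counts: map[string]int{}}
}

func (c *discardCounter) inc(userID string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[userID]++
}

func (c *discardCounter) get(userID string) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[userID]
}

// boundedQueue keeps one buffered channel per user, capped at maxOutstanding.
type boundedQueue struct {
	mu             sync.Mutex
	maxOutstanding int
	queues         map[string]chan string
	discarded      *discardCounter
}

func newBoundedQueue(maxOutstanding int) *boundedQueue {
	return &boundedQueue{
		maxOutstanding: maxOutstanding,
		queues:         map[string]chan string{},
		discarded:      newDiscardCounter(),
	}
}

// enqueue adds req to the user's queue without blocking. If the queue is
// full, whatever the underlying cause, the request is dropped and counted.
func (q *boundedQueue) enqueue(userID, req string) error {
	q.mu.Lock()
	defer q.mu.Unlock()

	ch, ok := q.queues[userID]
	if !ok {
		ch = make(chan string, q.maxOutstanding)
		q.queues[userID] = ch
	}

	select {
	case ch <- req:
		return nil
	default:
		// Queue full: record the discard for this user only.
		q.discarded.inc(userID)
		return errTooManyRequests
	}
}

func main() {
	q := newBoundedQueue(2)
	for i := 0; i < 3; i++ {
		err := q.enqueue("tenant-a", fmt.Sprintf("query-%d", i))
		fmt.Println(i, err)
	}
	fmt.Println("discarded:", q.discarded.get("tenant-a"))
}
```

Because the counter is incremented only in the `default` branch, it says nothing about why the queue was full, which is the argument for leaving out a "reason" label until a second discard path exists.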

pkg/frontend/v1/frontend.go

pstibrany · 2021-03-02T07:12:52Z

pkg/scheduler/scheduler.go

@@ -100,7 +101,12 @@ func NewScheduler(cfg Config, limits Limits, log log.Logger, registerer promethe
 		Name: "cortex_query_scheduler_queue_length",
 		Help: "Number of queries in the queue.",
 	}, []string{"user"})
-	s.requestQueue = queue.NewRequestQueue(cfg.MaxOutstandingPerTenant, s.queueLength)
+
+	s.discardedQueries = promauto.With(registerer).NewCounterVec(prometheus.CounterOpts{


Please clean up this metric in cleanupMetricsForInactiveUser and add it to the tests to verify that.
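A rough sketch of what this request amounts to, under stated assumptions: the per-user series is removed once the user becomes inactive, and a test checks that it is gone while other users' series remain. `sketchScheduler` and `userCounters` are hypothetical, and the map is a standard-library stand-in for the counter vector; the body of `cleanupMetricsForInactiveUser` shown here is illustrative only. With a real Prometheus `CounterVec`, the removal would presumably go through `DeleteLabelValues`.

```go
package main

import (
	"fmt"
	"sync"
)

// userCounters is a stdlib stand-in for a counter vector labelled by user.
type userCounters struct {
	mu     sync.Mutex
	series map[string]float64
}

func newUserCounters() *userCounters {
	return &userCounters{series: map[string]float64{}}
}

func (c *userCounters) inc(user string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.series[user]++
}

// remove drops the series for one user, so it is no longer exported.
func (c *userCounters) remove(user string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.series, user)
}

func (c *userCounters) has(user string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	_, ok := c.series[user]
	return ok
}

// sketchScheduler is a hypothetical, stripped-down scheduler that owns
// only the discarded-requests metric.
type sketchScheduler struct {
	discardedRequests *userCounters
}

// cleanupMetricsForInactiveUser removes every per-user series for the
// given user. Only the discarded-requests metric is modelled here.
func (s *sketchScheduler) cleanupMetricsForInactiveUser(user string) {
	s.discardedRequests.remove(user)
}

func main() {
	s := &sketchScheduler{discardedRequests: newUserCounters()}
	s.discardedRequests.inc("tenant-a")
	s.discardedRequests.inc("tenant-b")

	s.cleanupMetricsForInactiveUser("tenant-a")

	// prints "false true"
	fmt.Println(s.discardedRequests.has("tenant-a"), s.discardedRequests.has("tenant-b"))
}
```

Without this cleanup, a labelled series for a user who has stopped sending queries would stay in the registry for the lifetime of the process.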

Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>

pstibrany

LGTM, thank you.

CHANGELOG.md

Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>

…xproject#3894)

* feat: instrument the query layer to track rate-limited queries
  Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>
* chore: update changelog
  Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>
* fix: fix goimports linting error
  Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>
* fix per PR comments
  Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>
* rename discarded_queries --> discarded_requests
  Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>
* fix lint
  Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>

feat: instrument the query layer to track rate-limited queries

78d9845

Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>

pull-request-size bot added the size/M label Mar 2, 2021

jtlisi added 2 commits March 1, 2021 23:19

chore: update changelog

3c3486d

Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>

fix: fix goimports linting error

c90f62b

Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>

pstibrany reviewed Mar 2, 2021


fix per PR comments

1af2c30

Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>

pstibrany approved these changes Mar 2, 2021


CHANGELOG.md

jtlisi added 2 commits March 2, 2021 14:39

rename discarded_queries --> discarded_requests

fa5d96a

Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>

fix lint

f6b1ff3

Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>

jtlisi merged commit 03d4318 into cortexproject:master Mar 2, 2021

jtlisi deleted the 20210301_instrument_discarded_queries branch March 2, 2021 22:02
