Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: instrument the query layer to track rate-limited queries #3894

Merged

Conversation

jtlisi
Copy link
Contributor

@jtlisi jtlisi commented Mar 2, 2021

What this PR does:

Instrument the query-frontend and query-scheduler to track the queries that were discarded because of the configured outstanding queries rate limiter. I pre-emptively added a reason label to allow for flexibility in adding other reasons for why a query may be discarded.

Checklist

  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>
Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>
Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>
Copy link
Contributor

@pstibrany pstibrany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have metric to monitor queue length per user, but that doesn't tell us max queue length, so I can see how this new metric could be useful.

I'd suggest to start without "reason" label – we can always add it later, when needed.

@@ -91,6 +95,7 @@ func (q *RequestQueue) EnqueueRequest(userID string, req Request, maxQueriers in
}
return nil
default:
q.discardedQueries.WithLabelValues(userID, validation.RateLimited).Inc()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "rate limited" is misleading here. We only drop the request if queue is full, which can be for various reasons (slow queries, queriers crashed, no querier connected or all queriers are busy). I'd suggest to start without "reason" label.

Also I don't think we should use "validation" reasons (used for validating incoming samples) here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense.

pkg/frontend/v1/frontend.go Outdated Show resolved Hide resolved
@@ -100,7 +101,12 @@ func NewScheduler(cfg Config, limits Limits, log log.Logger, registerer promethe
Name: "cortex_query_scheduler_queue_length",
Help: "Number of queries in the queue.",
}, []string{"user"})
s.requestQueue = queue.NewRequestQueue(cfg.MaxOutstandingPerTenant, s.queueLength)

s.discardedQueries = promauto.With(registerer).NewCounterVec(prometheus.CounterOpts{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please cleanup this metric in cleanupMetricsForInactiveUser and add it to tests to verify that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>
Copy link
Contributor

@pstibrany pstibrany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you.

CHANGELOG.md Show resolved Hide resolved
Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>
Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>
@jtlisi jtlisi merged commit 03d4318 into cortexproject:master Mar 2, 2021
@jtlisi jtlisi deleted the 20210301_instrument_discarded_queries branch March 2, 2021 22:02
roystchiang pushed a commit to roystchiang/cortex that referenced this pull request Apr 6, 2022
…xproject#3894)

* feat: instrument the query layer to track rate-limited queries

Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>

* chore: update changelog

Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>

* fix: fix goimports linting error

Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>

* fix per PR comments

Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>

* rename discarded_queries --> discarded_requests

Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>

* fix lint

Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants