applying `extra_filters` might hit `maxUniqueTimeSeries` limit for label_values, labels, series API requests
#5055
For extra_filters and extra_label at /labels and /label/{}/values, the search.Max limit-exceeded error must be prevented. This is possible by scanning the (date,tag) -> metricIDs index and joining the `metricName` for each found `metricID`. These `metricName` values can then be filtered with tagFilters. #5055
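The scan-and-join idea from this commit message can be sketched roughly as follows. This is a simplified illustration, not the actual VictoriaMetrics code: the in-memory dictionaries `date_tag_index` and `metric_names`, the sample data, and the function `label_values` are all hypothetical stand-ins for the (date,tag) -> metricIDs index and the metricID -> metricName lookup.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical stand-in for the metricID -> metricName lookup.
# A metricName is modelled here as a dict of label name -> label value.
metric_names: Dict[int, Dict[str, str]] = {
    1: {"__name__": "http_requests_total", "datacenter": "dc1", "job": "api"},
    2: {"__name__": "http_requests_total", "datacenter": "dc2", "job": "api"},
    3: {"__name__": "cpu_usage", "datacenter": "dc1", "job": "node"},
}

# Hypothetical stand-in for the (date, tag) -> metricIDs index.
date_tag_index: Dict[Tuple[str, Tuple[str, str]], Set[int]] = {
    ("2023-09-25", ("datacenter", "dc1")): {1, 3},
    ("2023-09-25", ("datacenter", "dc2")): {2},
}


def label_values(index, names, date: str, extra_label: Tuple[str, str], label: str) -> List[str]:
    """Return sorted values of `label` among series matching `extra_label` on `date`."""
    key, value = extra_label
    # Step 1: scan the (date, tag) -> metricIDs index for the extra label.
    metric_ids = index.get((date, (key, value)), set())
    values = set()
    for metric_id in metric_ids:
        # Step 2: join the metricName for each found metricID.
        name = names[metric_id]
        # Step 3: filter the joined metricName with the tag filter.
        if name.get(key) != value:
            continue
        if label in name:
            values.add(name[label])
    return sorted(values)


print(label_values(date_tag_index, metric_names, "2023-09-25", ("datacenter", "dc1"), "job"))
# prints ['api', 'node']
```

In this sketch the cost grows with the number of metricIDs matching the extra label, which is the trade-off discussed later in the thread.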
…hen performing /api/v1/labels and /api/v1/label/.../values requests

This limit makes little sense for these APIs, since:

- These APIs frequently result in scanning of all the time series on the given time range. For example, if extra_filters={datacenter="some_dc"}.
- Users expect that these APIs shouldn't hit the -search.maxUniqueTimeseries limit, which is intended for limiting resource usage at /api/v1/query and /api/v1/query_range requests.

Also limit the concurrency for /api/v1/labels, /api/v1/label/.../values and /api/v1/series requests in order to limit the maximum memory usage and CPU usage for these APIs. This limit shouldn't affect typical use cases for these APIs:

- Grafana dashboard load when dashboard labels should be loaded
- Auto-suggestion list load when editing the query in Grafana or vmui

Updates #5055
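For illustration, a concurrency limit of the kind this commit message describes can be sketched with a semaphore. This is a minimal sketch under assumptions, not the VictoriaMetrics implementation: the class name `ConcurrencyLimiter`, its parameters, and the choice to reject a request after a wait timeout are all hypothetical.

```python
import threading


class ConcurrencyLimiter:
    """Allow at most `max_concurrent` handlers to run at the same time."""

    def __init__(self, max_concurrent: int, wait_timeout: float):
        self._slots = threading.BoundedSemaphore(max_concurrent)
        self._wait_timeout = wait_timeout

    def run(self, handler):
        # Wait for a free slot; give up if none frees up within the timeout.
        if not self._slots.acquire(timeout=self._wait_timeout):
            raise TimeoutError("too many concurrent label API requests")
        try:
            return handler()
        finally:
            # Always free the slot, even if the handler raised.
            self._slots.release()
```

A handler for /api/v1/labels would be wrapped as `limiter.run(lambda: handle_labels(request))`, where `handle_labels` is a hypothetical function. Note that a cheap request waiting behind expensive ones can hit the timeout in this design.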
FYI, the commit 5d66ee8 removes checking for the |
VictoriaMetrics doesn't use |
I have doubts about the current design for By default, those APIs use an index for searching results. It allows performing fast and efficient search operations. The current index has two versions:
It's possible to use two different join strategies:
The first approach has inconsistent performance. It may work really fast in some cases and incredibly slowly in others. The good case for it is a small churn rate for the match patterns of search requests. E.g. search for

The second approach has consistently slow performance, which depends on the number of filters and matched metric_ids. The other problem is that it isn't scalable: with more than 100M active series, requests will constantly time out.

But what's the reason behind this API enhancement? Mostly, it's a replacement for the current tenant implementation: instead of tenants, it's possible to use labels as a tenant ID.

It looks like the efficient way to implement that is to provide a custom index creation feature for tags. It would allow fast search operations with low resource usage. E.g. -customTagIndexLabels="env,team". It must create the following indexes for tags:

It makes request with extra_labels - It doesn't allow making "cross-tenant" requests, i.e. using regexps in extra_labels or extra_filters.
…PISeries options for fine-tuning CPU and RAM usage for /api/v1/series, /api/v1/labels and /api/v1/label/.../values

This commit restores the limits for these endpoints, which were removed in 5d66ee8, since it turned out that the missing limits result in high CPU usage, while the introduced concurrency limiter results in failed lightweight requests to these endpoints because of timeouts when heavyweight requests are executed.

Updates #5055
FYI, the v1.97.3 LTS release takes into account |
FYI, commits fab02fa and 0b7a23a add These commits will be included in the next release. |
There was a bug in |
FYI, VictoriaMetrics doesn't use
See these docs for more details. Closing the issue as addressed. |
Additional information, which could be useful for users, who encounter slow queries to |
…rching for matching time series at /api/v1/labels, /api/v1/label/.../values and /api/v1/status/tsdb This should improve query performance when match[], extra_filters[] or extra_label args are passed to these APIs Updates #5055
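As a reminder of how the args named in this commit message reach the API, here is a small sketch that builds a /api/v1/label/.../values URL. The helper `label_values_url` and the base URL `http://victoriametrics:8428` are assumptions made for illustration; only the arg names `match[]`, `extra_filters[]` and `extra_label` come from the thread.

```python
from urllib.parse import urlencode


def label_values_url(base_url, label, match=(), extra_filters=(), extra_labels=()):
    """Build a /api/v1/label/<label>/values URL with optional filter args."""
    params = []
    # Each arg may be repeated, so keep them as a list of (name, value) pairs.
    params += [("match[]", m) for m in match]
    params += [("extra_filters[]", f) for f in extra_filters]
    params += [("extra_label", kv) for kv in extra_labels]
    url = f"{base_url}/api/v1/label/{label}/values"
    # urlencode percent-encodes braces, quotes, brackets and "=" in the values.
    return f"{url}?{urlencode(params)}" if params else url


print(label_values_url(
    "http://victoriametrics:8428",
    "job",
    extra_filters=['{datacenter="some_dc"}'],
))
# prints http://victoriametrics:8428/api/v1/label/job/values?extra_filters%5B%5D=%7Bdatacenter%3D%22some_dc%22%7D
```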
…ers into indexSearch.searchMetricIDsInternal This makes the code less fragile - it is harder to skip the convertToCompositeTagFilterss() call now. While at it, call indexSearch.containsTimeRange() inside indexSearch.searchMetricIDsInternal() in order to quickly terminate search of time series in the old indexdb for new time ranges. Updates #5055 This is a follow-up for 2d31fd7
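The design choice in this commit message, putting the time range check inside the search function so that callers cannot forget it, can be illustrated with a toy model. Everything here is an assumption made for the sketch: the `IndexDB` class, its fields, and the interval-overlap semantics of `contains_time_range` are hypothetical and may differ from what `indexSearch.containsTimeRange()` actually does.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class IndexDB:
    """Hypothetical index generation covering the time range [min_ts, max_ts]."""
    min_ts: int
    max_ts: int
    metric_ids_by_tag: Dict[str, Set[int]] = field(default_factory=dict)
    lookups: int = 0  # counts real index scans, for illustration only

    def contains_time_range(self, start: int, end: int) -> bool:
        # Assumed semantics: the requested range overlaps the indexed range.
        return start <= self.max_ts and end >= self.min_ts

    def search_metric_ids(self, tag: str, start: int, end: int) -> Set[int]:
        # The check lives inside the search itself, so no caller can skip it.
        if not self.contains_time_range(start, end):
            return set()
        self.lookups += 1
        return set(self.metric_ids_by_tag.get(tag, set()))


def search(prev_db: IndexDB, curr_db: IndexDB, tag: str, start: int, end: int) -> Set[int]:
    """Search both index generations; the old one exits early for new ranges."""
    return prev_db.search_metric_ids(tag, start, end) | curr_db.search_metric_ids(tag, start, end)
```

With an old index covering timestamps 0..100 and a current one covering 101..200, a query for 150..180 never scans the old index in this model.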
FYI, the commits 2d31fd7 and d1d2771 improve performance for |
…n match[] contains metric name Updates VictoriaMetrics#2978 Updates VictoriaMetrics#5055
FYI, the performance of |
Is your feature request related to a problem? Please describe

Requests for /api/v1/label/name/values and other APIs are processed by a different code path if `extra_filters` is set.

Without `extra_filters`, VM simply selects the first N results from the index search. The `maxUniqueSeries` limit isn't involved.

With `extra_filters`, VM performs a metricIDs search first and uses the `maxUniqueSeries` limit during the search.

The problem is that a user can have a `/api/v1/label/name/values` API call executed without any issues, but once they add `extra_filters` the request can breach the complexity limit. This is counterintuitive, as it is expected that specifying additional filters always reduces the complexity of the query.

Describe the solution you'd like

It'd be great to implement a different approach.

It makes sense only if the limit is less than or equal to 1000.
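The two code paths described in the problem statement can be sketched as follows. This is a deliberately simplified model of the reported behavior, not the VictoriaMetrics code: the function names, the `LimitExceeded` exception and the in-memory data are hypothetical.

```python
class LimitExceeded(Exception):
    """Raised when a series search matches more series than allowed."""


def label_values_without_filters(index_values, limit):
    # Path 1: take the first `limit` values straight from the index.
    # No series are searched, so the unique series limit is never checked.
    return sorted(index_values)[:limit]


def label_values_with_filters(series, label, extra_filter, max_unique_series, limit):
    key, value = extra_filter
    # Path 2: search for matching series first ...
    matched = [s for s in series if s.get(key) == value]
    # ... and apply the unique series limit to that search.
    if len(matched) > max_unique_series:
        raise LimitExceeded(f"matched {len(matched)} series, limit is {max_unique_series}")
    values = sorted({s[label] for s in matched if label in s})
    return values[:limit]


series = [{"datacenter": "dc1", "instance": f"i{n}"} for n in range(5)]
all_instances = {s["instance"] for s in series}

print(label_values_without_filters(all_instances, 2))  # prints ['i0', 'i1']
try:
    label_values_with_filters(series, "instance", ("datacenter", "dc1"), 3, 2)
except LimitExceeded as err:
    print("failed:", err)  # prints failed: matched 5 series, limit is 3
```

The filtered request asks for the same two values as the unfiltered one, yet only the filtered one fails, which is the surprise this issue reports.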
Describe alternatives you've considered
No response
Additional information
No response