vmalert: long query execution in annotations blocks API #6079
Comments
Looks like vmalert is getting blocked here: VictoriaMetrics/app/vmalert/rule/alerting.go, lines 97 to 120 in 83216e9
When the /metrics endpoint is called, the metrics package invokes all Gauge functions to collect the data. In vmalert's case, those functions try to acquire the lock to read the data. At the same time, the same AlertingRule object could have already acquired the lock in the Exec function, where the lock is held while the object fields are updated. And one of these fields is
One solution here is to make Gauge metrics use atomic values instead of acquiring the lock. This should make metrics collection independent of rule evaluation.
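The atomic-value idea suggested above can be sketched roughly as follows. This is a minimal illustration, not the actual patch: the alertingRule type, its fields, and the exec and firingGauge methods are hypothetical stand-ins for vmalert's AlertingRule, and time.Sleep stands in for the slow template query. It assumes Go 1.19 or newer for atomic.Int64.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// alertingRule is a hypothetical, simplified stand-in for vmalert's AlertingRule.
type alertingRule struct {
	mu     sync.Mutex        // guards alerts during evaluation
	alerts map[string]string // illustrative state updated by exec

	// firing mirrors the number of alerts so that metric callbacks
	// can read it without taking mu.
	firing atomic.Int64
}

// exec simulates a rule evaluation that holds the lock while a slow
// annotation template query runs.
func (r *alertingRule) exec(slow time.Duration) {
	r.mu.Lock()
	defer r.mu.Unlock()
	time.Sleep(slow) // stands in for the long-running query template call
	r.alerts = map[string]string{"alert-1": "firing"}
	r.firing.Store(int64(len(r.alerts)))
}

// firingGauge is what a Gauge callback would call; it never touches mu.
func (r *alertingRule) firingGauge() float64 {
	return float64(r.firing.Load())
}

func main() {
	r := &alertingRule{}
	done := make(chan struct{})
	go func() {
		r.exec(200 * time.Millisecond)
		close(done)
	}()
	time.Sleep(20 * time.Millisecond) // give exec time to acquire the lock

	// The gauge is read while exec still holds mu.
	start := time.Now()
	v := r.firingGauge()
	fmt.Printf("gauge=%v, read took %v\n", v, time.Since(start))

	<-done
	fmt.Printf("gauge after exec=%v\n", r.firingGauge())
}
```

The trade-off in this sketch is that the gauge may briefly report the value from the previous evaluation, which is usually acceptable for metrics and keeps the read path free of the evaluation lock.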
@dmitryk-dk #6129 has been merged and will be included in the next release.
FYI, this bugfix has been included in the v1.97.4 LTS release. It will also be included in the upcoming latest non-LTS release.
vmalert won't block the /api/v1/rules, /api/v1/alerts and /metrics APIs during rules evaluation starting from the v1.101.0 release.
Describe the bug
If you are using the query template function in the alerting rule annotations and this query executes for a long time (let's say 1 minute), the other vmalert endpoints do not return a response.
To Reproduce
Create an alerting group like in the example
Prepare the nginx config
Run nginx
Run vmalert
and try to curl the /metrics endpoint
The request will get stuck: while the query from the template is being executed, curl will wait for the response.
I think it happens because of this mutex
VictoriaMetrics/app/vmalert/rule/alerting.go
Line 389 in 83216e9
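The suspected blocking can be illustrated in isolation with a small sketch. It is not vmalert code: the rule type, exec, readFiring and readBlocked are hypothetical names, and time.Sleep stands in for the slow query executed while templating annotations. The sketch only shows that a reader which needs the same lock cannot return while the evaluation holds it.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// rule is a hypothetical stand-in for an alerting rule with mutex-guarded state.
type rule struct {
	mu     sync.RWMutex
	firing int
}

// exec holds the write lock for the whole evaluation,
// including the slow templating step.
func (r *rule) exec(templating time.Duration) {
	r.mu.Lock()
	defer r.mu.Unlock()
	time.Sleep(templating) // stands in for the slow query in annotations
	r.firing++
}

// readFiring mimics a metrics handler that takes the lock to read state.
func (r *rule) readFiring() int {
	r.mu.RLock()
	defer r.mu.RUnlock()
	return r.firing
}

// readBlocked reports whether readFiring fails to return within timeout.
func readBlocked(r *rule, timeout time.Duration) bool {
	done := make(chan int, 1) // buffered so the reader goroutine can always finish
	go func() { done <- r.readFiring() }()
	select {
	case <-done:
		return false
	case <-time.After(timeout):
		return true
	}
}

func main() {
	r := &rule{}
	go r.exec(300 * time.Millisecond)
	time.Sleep(50 * time.Millisecond) // give exec time to acquire the lock

	fmt.Println("read blocked during exec:", readBlocked(r, 100*time.Millisecond))

	time.Sleep(400 * time.Millisecond) // wait for exec to finish
	fmt.Println("read blocked after exec:", readBlocked(r, 100*time.Millisecond))
}
```

With these timings, the first read is expected to time out because exec still holds the lock, and the second is expected to return promptly once the evaluation has finished.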
Version
vmalert v1.99.0
Logs
No response
Screenshots
No response
Used command-line flags
No response
Additional information
No response