Currently, silence metrics are collected at scrape time. When AlertManager is under heavy load, lock contention can occur and cause high scrape latency. One such scenario is when there are many aggregation groups and new silences are being added.
Would it be acceptable to collect the silence counts in the background instead of at scrape time? Doing so would reduce scrape latency by removing lock contention from the scrape path. Lock contention can still occur in the background goroutine.
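To make the proposal concrete, here is a minimal sketch of the idea, not the actual Alertmanager code. The names `silenceStore`, `countState`, and `cachedCounter` are hypothetical stand-ins: a background goroutine takes the lock and counts periodically, and the scrape path only reads a cached atomic value.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// silenceStore is a hypothetical stand-in for the real silence store.
// Each entry is a silence state: "active", "pending", or "expired".
type silenceStore struct {
	mtx    sync.RWMutex
	states []string
}

// countState takes the read lock and scans every silence, which is the
// work that currently happens on the scrape path.
func (s *silenceStore) countState(state string) int64 {
	s.mtx.RLock()
	defer s.mtx.RUnlock()
	var n int64
	for _, st := range s.states {
		if st == state {
			n++
		}
	}
	return n
}

// cachedCounter holds the most recently computed count.
// Reading it never touches the store's lock.
type cachedCounter struct {
	active atomic.Int64
}

// refresh recomputes the count once; it may block on the store's lock.
func (c *cachedCounter) refresh(s *silenceStore) {
	c.active.Store(s.countState("active"))
}

// run refreshes immediately, then once per interval until stop is closed.
func (c *cachedCounter) run(s *silenceStore, interval time.Duration, stop <-chan struct{}) {
	t := time.NewTicker(interval)
	defer t.Stop()
	for {
		c.refresh(s)
		select {
		case <-t.C:
		case <-stop:
			return
		}
	}
}

func main() {
	store := &silenceStore{states: []string{"active", "pending", "active", "expired"}}
	var c cachedCounter
	stop := make(chan struct{})
	go c.run(store, 10*time.Millisecond, stop)

	time.Sleep(50 * time.Millisecond)
	// A scrape would read the cached value here without taking the lock.
	fmt.Println("active silences (cached):", c.active.Load())
	close(stop)
}
```

The trade-off in this sketch is staleness: the reported value can lag the true count by up to one refresh interval, or longer if the background goroutine itself is stuck waiting for the lock.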
I'm more interested to see if this can be made faster instead of offloading it to a goroutine. There is a comment in CountState:
// This could probably be optimized.
Perhaps first look at how to make it faster? The lock still has to be acquired, so I would assume that under very heavy load you would just be scraping stale silence metrics.
I can investigate whether any improvements can be made. I just want to note that counting silences holds up the collection of other metrics too.
Profile captured during high latency in scraping
PR to collect silence counts in a separate goroutine