Series limit applied at vmagent, but churn rate is high #3660
Comments
Hello! Can confirm the issue.
There are two changes here:
1. Do not account for `sw.Config.NoStaleMarkers`. Otherwise, disabling staleness markers would also disable `seriesLimiter`.
2. Prevent sending staleness markers if the series limit has been exceeded. To send staleness markers we need to check which series disappeared between the current and previous scrapes. But when the series limit drops series, there is no easy way to calculate this anymore, so wrong markers could be sent to the remote storage. Moreover, each series rejected by `seriesLimiter` would then be counted as a new time series by the VM TSDB once it is received in the form of a stale marker.
See #3660
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Fix the following issues:
- Series limit wasn't applied when staleness tracking was disabled.
- Series limit didn't prevent sending staleness markers for new series exceeding the limit.
Updates #3660
Thanks to @hagen1778 for the initial attempt to fix the issue at #3665
@blesswinsamuel, thanks for filing the detailed bug report! The issue should be fixed in commit 289af65. This commit will be included in the next release. In the meantime you can build vmagent from this commit.
@valyala @hagen1778 Thanks for fixing this so fast! I built vmagent from commit a844b97, and it is working as expected. Thank you! One more thing - memory usage goes up a lot when running avalanche with the same configuration (15k series in total, which keep changing every minute). I can raise a separate issue to track this if you'd like. I have streamParse enabled, which, according to the docs, should help when targets export a big number of metrics. Unfortunately, I cleared the data in VictoriaMetrics before trying out the new vmagent, so I don't have the memory usage details from before the update.
vmagent can require more memory than usual if … To verify this, please capture the memory profile and attach it to the issue. Yes, creating a new issue for memory usage is preferable.
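For reference, a memory profile can be grabbed from vmagent's built-in pprof handler; a minimal example, assuming vmagent runs locally with its default `-httpListenAddr` of `:8429` (adjust the address otherwise):

```sh
# collect a heap profile from the running vmagent and save it for attaching to the issue
curl -s http://localhost:8429/debug/pprof/heap > vmagent-mem.pprof
```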
@hagen1778 thanks for your response. I created a new issue #3675 with more details about the memory spike. After doing some tests, it looks like this memory spike happens when the target exposes a large number of previously unseen series on every scrape.
…markers … (#5577)" This reverts commit cfec258. Reason for revert: the original code already doesn't store the last scrape response when stale markers are disabled. The scrapeWork.areIdenticalSeries() function always returns true if stale markers are disabled, which prevents storing the last response at scrapeWork.processScrapedData(). It looks like the reverted commit could also reintroduce issue #3660. Updates #5577
Describe the bug
When I use the series limit to limit the series ingested into VictoriaMetrics, the automatic metrics generated by vmagent show that new series are being dropped (`scrape_series_limit_samples_dropped`), which is expected. But the VictoriaMetrics Grafana dashboard shows a high churn rate, which suggests that even though vmagent reports the series as dropped, they are still ingested into VictoriaMetrics. I expect that samples reported by vmagent as dropped should not be ingested into VictoriaMetrics.

I used avalanche (https://github.com/prometheus-community/avalanche) to run this test.
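For context, the dropped-samples metric that vmagent generates per target is sent to the remote storage together with regular scraped data, so the drops can also be graphed there; a rough query, assuming the default job/instance labels:

```
sum(scrape_series_limit_samples_dropped) by (job, instance)
```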
To Reproduce
Start victoriametrics:
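The exact command wasn't captured above; a minimal sketch, assuming the single-node VictoriaMetrics release binary with default settings, so it listens on :8428 to match the -remoteWrite.url used for vmagent below:

```sh
# single-node VictoriaMetrics with default settings (HTTP on :8428); the data path is arbitrary here
./victoria-metrics-prod -storageDataPath=/tmp/victoria-metrics-data
```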
Start avalanche:
git clone https://github.com/prometheus-community/avalanche.git
cd avalanche
go run cmd/avalanche.go --metric-count=500 --series-count=30 --port=9101
Start vmagent with the -promscrape.seriesLimitPerTarget flag:
scrape-config.yaml
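The contents of the config file weren't captured above; a minimal sketch of what /tmp/scrape-config.yaml could look like for this test, assuming one static scrape job pointed at the avalanche target started earlier (the job name and scrape interval are placeholders):

```yaml
scrape_configs:
  # scrape the avalanche target started in the previous step
  - job_name: avalanche
    scrape_interval: 15s
    static_configs:
      - targets: ["localhost:9101"]
```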
./vmagent-prod -remoteWrite.url "http://localhost:8428/api/v1/write" -promscrape.config /tmp/scrape-config.yaml -promscrape.seriesLimitPerTarget 5000 -promscrape.streamParse=true
Version
Logs
Not relevant to this issue.
Screenshots
The automatic metrics generated by vmagent show that samples are being dropped (this is as expected because avalanche is generating completely new series every minute):
The churn rate panel in the VictoriaMetrics cluster Grafana dashboard shows a consistently high churn rate, and the number of new series over 24h keeps increasing:
Here, churn rate = 250 series/sec = 250x60 series/min = 15,000 series/min
The number of series generated by avalanche for every scrape (based on the above configuration) is 500×30 = 15,000 series. The default configuration of avalanche is to generate entirely new series every 60s. So, judging by the churn rate, none of the series are being dropped by vmagent.
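For comparison, the churn-rate figure can be cross-checked directly against VictoriaMetrics' new-series counter; a rough equivalent of the dashboard panel (paraphrased, not copied from the dashboard definition):

```
sum(rate(vm_new_timeseries_created_total[5m]))
```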
I expect the churn rate to be close to 0 here since all the new metrics emitted by avalanche should be dropped by vmagent.
Used command-line flags
Mentioned under the "To Reproduce" section.
Additional information
No response