Consider per-scrape sample limit as abuse/overload mitigation #2137
Comments
brian-brazil added the kind/enhancement label on Oct 30, 2016
beorn7 closed this in 3044828 on Jan 8, 2017
julianvmodesto referenced this issue on Oct 11, 2017: Consider setting sample_limit to mitigate overload #689 (open)
lock bot commented on Mar 24, 2019: This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
lock bot locked and limited conversation to collaborators on Mar 24, 2019
brian-brazil commented on Oct 30, 2016
Currently, if a client does something silly and starts returning a large number of samples, Prometheus will accept them all, and will ultimately fall over or get throttled if the throughput is too high.

While we're unlikely to ever fully solve this, and I think it would be unwise to go down the rabbit hole of trying to get it perfect, there are simple things that are practical to do.

I propose that we add a per-scrape_config option to limit the number of samples returned by individual targets in that scrape_config. This would not be relabelable, and on a target exceeding the limit none of the scraped samples would be ingested (to avoid weirdness from partial ingestion) and up would be set to 0.

This would be a very blunt tool, but it should catch the worst problems caused by a sudden increase in time series count. The disadvantage is that the user has to maintain the limit, but I think some level of additional administration is inevitable if a user chooses to use features in this area.
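A minimal sketch of how such a limit could look in a scrape_config, assuming the option ends up being called `sample_limit` (as in the referenced issue #689); the job name, target, and limit value here are only illustrative:

```yaml
scrape_configs:
  - job_name: 'node'
    # Hypothetical per-scrape limit: if this target returns more than
    # 10000 samples in a single scrape, none of the samples are ingested
    # and up is set to 0 for that scrape.
    sample_limit: 10000
    static_configs:
      - targets: ['localhost:9100']
```

Rejecting the whole scrape rather than ingesting a prefix of it keeps the behaviour simple and avoids the partial-ingestion weirdness described above, at the cost of losing all data from the misbehaving target until it is fixed or the limit is raised.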