Prometheus scrape manager gets stuck after deleting ServiceMonitor #4992
This is fixed by #4894 and will be available in the upcoming v2.6.0.
simonpasquier closed this on Dec 12, 2018
skbly7 commented Dec 12, 2018
Bug Report
What did you do?
Deleted a ServiceMonitor via
kubectl delete servicemonitors name-of-service-monitor
What did you expect to see?
After the deletion succeeds, the scrape manager component should trigger a successful reload of the configuration in Prometheus.
What did you see instead? Under which circumstances?
I tried to reproduce this multiple times, and it seems to occur about 9 times out of 10 when the number of ServiceMonitors is approximately 20 or higher.
This is followed by a complete deadlock of the scrape manager and of related Prometheus APIs like
/service-discovery
Environment
NOTE: This was reproducible for me with any 20 jobs of different types; they do not need to be like those shared below. The config below is from a testing environment I created by repeating the same config 20 times, and I was able to reproduce the issue there as well.
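A minimal sketch of how such a test set could be generated, in case it helps others reproduce this. It is not the reporter's actual config: the `websites-www-N` naming is taken from the example name in this report, while the `app` label, the `web` port name, and the helper name `render_service_monitors` are assumptions made for illustration.

```python
# Hypothetical helper: render N near-identical ServiceMonitor manifests.
# The label key ("app") and port name ("web") are illustrative assumptions,
# not values taken from the original report.
def render_service_monitors(count=20, prefix="websites-www"):
    docs = []
    for i in range(1, count + 1):
        name = f"{prefix}-{i}"
        docs.append(
            "apiVersion: monitoring.coreos.com/v1\n"
            "kind: ServiceMonitor\n"
            "metadata:\n"
            f"  name: {name}\n"
            "spec:\n"
            "  selector:\n"
            "    matchLabels:\n"
            f"      app: {name}\n"
            "  endpoints:\n"
            "  - port: web\n"
        )
    # Join the documents into one multi-document YAML stream.
    return "---\n".join(docs)


if __name__ == "__main__":
    print(render_service_monitors())
```

The output can be piped to `kubectl apply -f -`; deleting one of the generated objects with `kubectl delete servicemonitors websites-www-5` then matches the step described above.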
For example, after deleting a ServiceMonitor
websites-www-5