What did you do?
I'm running Prometheus with Istio as a sidecar for proxy/mesh functionality. We potentially had a memory leak in Istio, which is somewhat unrelated to Prometheus, but it did uncover a cascading issue when the sidecar approaches 100% memory usage / OOM.
This causes connectivity issues for 'anything': scrapes do not resolve, the thanos-sidecar is unable to talk to Prometheus, queries do not reach Prometheus, etc. Basically, both egress and ingress are gone. That is expected and "fine"; however, it coincides with an almost 2x increase in Prometheus memory usage, as seen in the following screenshot:
Now, I'm sorry I do not have profiles or any extra information, and I also don't find this a particularly exciting Prometheus 'bug' (if it is anything at all), but I figured I'd report it as it's somewhat unexpected to see.
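In case it happens again, a heap profile could help narrow this down. A minimal sketch for grabbing one from Prometheus's built-in pprof endpoint (the localhost:9090 address and output filename are assumptions; adjust to wherever Prometheus is reachable):

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
)

// Fetches a heap profile from Prometheus's built-in pprof endpoint and
// writes it to disk so it can be inspected later with `go tool pprof`.
func main() {
	// Assumed address; Prometheus serves /debug/pprof on its web port.
	resp, err := http.Get("http://localhost:9090/debug/pprof/heap")
	if err != nil {
		fmt.Fprintln(os.Stderr, "fetching heap profile:", err)
		os.Exit(1)
	}
	defer resp.Body.Close()

	out, err := os.Create("prometheus-heap.pprof")
	if err != nil {
		fmt.Fprintln(os.Stderr, "creating output file:", err)
		os.Exit(1)
	}
	defer out.Close()

	if _, err := io.Copy(out, resp.Body); err != nil {
		fmt.Fprintln(os.Stderr, "writing profile:", err)
		os.Exit(1)
	}
	fmt.Println("wrote prometheus-heap.pprof")
}
```

The resulting file can then be opened with `go tool pprof prometheus-heap.pprof` to see where the extra memory is held.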
What did you expect to see?
Prometheus being unable to scrape targets and having 'issues' scraping in general, but maintaining memory usage at 'normal' levels.
What did you see instead? Under which circumstances?
A close to 2x increase in memory when there are connectivity issues due to the sidecar proxy 'outage'.
I also want to mention that correlation is not causation; however, it happened more than once (across multiple clusters) and I did not dig very deep into this due to other priorities and the fact that we just need to fix the sidecar :)
System information
No response
Prometheus version
v2.48.1
Prometheus configuration file
No response
Alertmanager version
No response
Alertmanager configuration file
No response
Logs
Not much special to report, other than quite a few logs of the type:
msg="http: superfluous response.WriteHeader call from github.com/opentracing-contrib/go-stdlib/nethttp.(*statusCodeTracker).WriteHeader (status-code-tracker.go:19)"