What did you do?
I'm running Prometheus with Istio as a sidecar for proxy/mesh functionality. We potentially had a memory leak in Istio, which is somewhat unrelated to Prometheus, but it did uncover a cascading issue when the sidecar approaches 100% memory usage / OOM.
This causes connectivity issues for 'anything': scrapes do not resolve, the thanos-sidecar is unable to talk to Prometheus, queries do not reach Prometheus, etc. Basically, both egress and ingress are gone. That is expected and "fine"; however, it coincides with an almost 2x increase in Prometheus memory usage, as seen in the following screenshot:
Now, I'm sorry I do not have profiles or any extra information, and I also don't find this a particularly exciting Prometheus 'bug' (if it is anything at all), but I figured I'd report it as it's somewhat unexpected to see.
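In case it happens again, a heap profile could help narrow this down. A minimal sketch for grabbing one from Prometheus's built-in pprof endpoint (the localhost:9090 address and output filename are assumptions; adjust to wherever Prometheus is reachable):

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
)

// Fetches a heap profile from Prometheus's built-in pprof endpoint and
// writes it to disk so it can be inspected later with `go tool pprof`.
func main() {
	// Assumed address; Prometheus serves /debug/pprof on its web port.
	resp, err := http.Get("http://localhost:9090/debug/pprof/heap")
	if err != nil {
		fmt.Fprintln(os.Stderr, "fetching heap profile:", err)
		os.Exit(1)
	}
	defer resp.Body.Close()

	out, err := os.Create("prometheus-heap.pprof")
	if err != nil {
		fmt.Fprintln(os.Stderr, "creating output file:", err)
		os.Exit(1)
	}
	defer out.Close()

	if _, err := io.Copy(out, resp.Body); err != nil {
		fmt.Fprintln(os.Stderr, "writing profile:", err)
		os.Exit(1)
	}
	fmt.Println("wrote prometheus-heap.pprof")
}
```

The resulting file can then be opened with `go tool pprof prometheus-heap.pprof` to see where the extra memory is held.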
What did you expect to see?
Prometheus being unable to scrape targets and having 'issues' scraping in general, but maintaining memory usage at 'normal' levels.
What did you see instead? Under which circumstances?
A close to 2x increase in memory when there are connectivity issues due to the sidecar proxy 'outage'.
I also want to mention that correlation is not causation; however, it happened more than once (across multiple clusters) and I did not dig very deep into this due to other priorities and the fact that we just need to fix the sidecar :)
System information
No response
Prometheus version
v2.48.1
Prometheus configuration file
No response
Alertmanager version
No response
Alertmanager configuration file
No response
Logs
Not much special to report, other than quite a few logs of the type:
msg="http: superfluous response.WriteHeader call from github.com/opentracing-contrib/go-stdlib/nethttp.(*statusCodeTracker).WriteHeader (status-code-tracker.go:19)"