Circuit breaker for receiver/prometheus and exporter/prometheusremotewrite #26996
Labels: closed as inactive, enhancement, exporter/prometheusremotewrite, receiver/prometheus, Stale, waiting for author
Component(s)
cmd/otelcontribcol, exporter/prometheusremotewrite, receiver/prometheus
Is your feature request related to a problem? Please describe.
We use receiver/prometheus to scrape Prometheus metrics and exporter/prometheusremotewrite to write those metrics from otelcol to Prometheus. Sometimes the applications don't expose metrics to scrape, and sometimes Prometheus server or network issues cause the otel-collector queues to fill up until the collector gets OOM killed.
Describe the solution you'd like
I would like to propose a circuit breaker implemented at the receiver and exporter level. When an endpoint is unresponsive, the circuit breaker stops sending requests to it, which helps prevent cascading failures while the services are trying to recover. It would handle errors gracefully, reduce application downtime, and help the services recover more efficiently. State changes of the circuit breaker could also be used for error monitoring.
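To make the proposal concrete, here is a minimal sketch of the pattern in Go. It is not collector code: `Breaker`, `NewBreaker`, `ErrOpen`, `errUnreachable`, and the threshold and cooldown values are all hypothetical names and numbers chosen for illustration, and how such a breaker would hook into the receiver's scrape loop or the exporter's send path is left open.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// State is the circuit breaker state.
type State int

const (
	Closed   State = iota // calls pass through
	Open                  // calls are rejected without being attempted
	HalfOpen              // a trial call is allowed after the cooldown
)

// ErrOpen is returned when the breaker rejects a call.
var ErrOpen = errors.New("circuit breaker is open")

// errUnreachable stands in for a scrape or remote-write failure.
var errUnreachable = errors.New("endpoint unreachable")

// Breaker is a hypothetical type for illustration, not a collector API.
type Breaker struct {
	mu          sync.Mutex
	state       State
	failures    int           // consecutive failures seen so far
	maxFailures int           // failures needed to open the circuit
	cooldown    time.Duration // how long to stay open before a trial call
	openedAt    time.Time
}

func NewBreaker(maxFailures int, cooldown time.Duration) *Breaker {
	return &Breaker{maxFailures: maxFailures, cooldown: cooldown}
}

// State reports the current state, e.g. for exposing it as a metric.
func (b *Breaker) State() State {
	b.mu.Lock()
	defer b.mu.Unlock()
	return b.state
}

// Do runs op unless the circuit is open. Simplification: concurrent
// callers may all run a trial call while the breaker is half-open.
func (b *Breaker) Do(op func() error) error {
	b.mu.Lock()
	if b.state == Open {
		if time.Since(b.openedAt) < b.cooldown {
			b.mu.Unlock()
			return ErrOpen
		}
		b.state = HalfOpen
	}
	b.mu.Unlock()

	err := op()

	b.mu.Lock()
	defer b.mu.Unlock()
	if err == nil {
		b.state, b.failures = Closed, 0
		return nil
	}
	b.failures++
	// A failed trial call, or too many consecutive failures, opens the circuit.
	if b.state == HalfOpen || b.failures >= b.maxFailures {
		b.state, b.openedAt = Open, time.Now()
	}
	return err
}

func main() {
	b := NewBreaker(2, time.Hour)
	fail := func() error { return errUnreachable }
	for i := 0; i < 3; i++ {
		fmt.Println(b.Do(fail))
	}
}
```

In this sketch the first two calls reach the failing endpoint and the third is rejected with `ErrOpen` without being attempted, so nothing more is queued for that endpoint during the cooldown. The value returned by `State()` is the kind of signal that could back the error monitoring mentioned above.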
Describe alternatives you've considered
A retry option with back-off can work, but a circuit breaker would be much more efficient.
Additional context
No response