
Metric agent_component_controller_running_components when component fails #274

Open
sadovnikov opened this issue Jan 18, 2024 · 2 comments · May be fixed by grafana/agent#6385
Labels
bug (Something isn't working)

Comments

@sadovnikov

What's wrong?

When the loki.source.kubernetes_events component fails to load and exits, the value of the agent_component_controller_running_components metric does not change. All GA components are reported as healthy.

With the enclosed configuration, when loki.source.kubernetes_events fails to load its event source (logged as Source:EventSource{Component:,Host:,}), the agent still reports 6 "healthy" components - the same value as when every component loads successfully. The health_type label is always "healthy".
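
For illustration, here is a minimal sketch of the update one would expect, assuming the controller maintains the metric with a client_golang GaugeVec keyed by health_type. The names and mechanism are an assumption for illustration, not the agent's actual controller code:

    package main

    import "github.com/prometheus/client_golang/prometheus"

    // Assumed shape of the metric: a gauge partitioned by health_type.
    var runningComponents = prometheus.NewGaugeVec(
        prometheus.GaugeOpts{
            Name: "agent_component_controller_running_components",
            Help: "Current number of running components, partitioned by health.",
        },
        []string{"health_type"},
    )

    func main() {
        prometheus.MustRegister(runningComponents)

        // All six components from the enclosed config load successfully.
        runningComponents.WithLabelValues("healthy").Set(6)
        runningComponents.WithLabelValues("unhealthy").Set(0)

        // Expected update when loki.source.kubernetes_events fails to load.
        // Per this report, this never happens: the metric keeps reporting
        // 6 components with health_type="healthy".
        runningComponents.WithLabelValues("healthy").Dec()
        runningComponents.WithLabelValues("unhealthy").Inc()
    }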

The failure to load the component is reported only by this log line:

[screenshot of the error log line]

Steps to reproduce

The Kubernetes events source fails to load only occasionally, and only in overloaded clusters.
To reproduce the problem with a dev build, informerSyncTimeout could perhaps be set to a very low value, as sketched below.
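
A minimal sketch of why a very low timeout should force the failure on a slow cluster; waitForSync and the syncDone channel are hypothetical stand-ins for the component waiting on its Kubernetes informer cache sync, not the agent's actual code:

    package main

    import (
        "context"
        "fmt"
        "time"
    )

    // waitForSync stands in for the component blocking on its informer
    // cache sync; syncDone closes once the initial List/Watch completes.
    func waitForSync(ctx context.Context, syncDone <-chan struct{}) error {
        select {
        case <-syncDone:
            return nil
        case <-ctx.Done():
            return fmt.Errorf("timed out waiting for informer cache to sync: %w", ctx.Err())
        }
    }

    func main() {
        // Simulate an overloaded cluster: the initial sync takes two seconds...
        syncDone := make(chan struct{})
        go func() {
            time.Sleep(2 * time.Second)
            close(syncDone)
        }()

        // ...while the (hypothetical) sync timeout is far lower.
        informerSyncTimeout := 100 * time.Millisecond
        ctx, cancel := context.WithTimeout(context.Background(), informerSyncTimeout)
        defer cancel()

        if err := waitForSync(ctx, syncDone); err != nil {
            // This is where the component would exit; per this report,
            // agent_component_controller_running_components does not change.
            fmt.Println("component failed to load:", err)
        }
    }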

System information

No response

Software version

v0.37.2

Configuration

Full `config.river`


    logging {
      level  = "info"
      format = "json"
    }

    otelcol.receiver.otlp "otlp" {
      http {}

      output {
        metrics = [otelcol.exporter.prometheus.prom.input]
      }
    }

    otelcol.exporter.prometheus "prom" {
      forward_to  = [ prometheus.remote_write.default.receiver ]
    }

    prometheus.remote_write "default" {
      endpoint {
        url = "http://prometheus-server.monitoring-system.svc.cluster.local/api/v1/write"
      }
    }

    loki.write "obs" {
      external_labels = { cluster = "staging" }
      endpoint {
        url = "https://loki.platform-staging.internal.xxx.yyy/loki/api/v1/push"
      }
    }

    loki.relabel "drop_instance" {
      forward_to = [loki.write.obs.receiver]
      rule {
        action = "labeldrop"
        regex  = "instance"
      }
    }

    loki.source.kubernetes_events "default" {
      forward_to = [loki.relabel.drop_instance.receiver]
      job_name = "kubernetes_events"
      log_format = "json"
    }


Logs

No response

This issue has not had any activity in the past 30 days, so the needs-attention label has been added to it.
If the opened issue is a bug, check to see if a newer release fixed your issue. If it is no longer relevant, please feel free to close this issue.
The needs-attention label signals to maintainers that something has fallen through the cracks. No action is needed by you; your issue will be kept open and you do not have to respond to this comment. The label will be removed the next time this job runs if there is new activity.
Thank you for your contributions!

@rfratto (Member) commented Apr 11, 2024

Hi there 👋

On April 9, 2024, Grafana Labs announced Grafana Alloy, the spiritual successor to Grafana Agent and the final form of Grafana Agent flow mode. As a result, Grafana Agent has been deprecated and will only receive bug and security fixes until its end-of-life around November 1, 2025.

To make things easier for maintainers, we're in the process of migrating all issues tagged variant/flow to the Grafana Alloy repository to have a single home for tracking issues. This issue is likely something we'll want to address in both Grafana Alloy and Grafana Agent, so just because it's being moved doesn't mean we won't address the issue in Grafana Agent :)

rfratto transferred this issue from grafana/agent on Apr 11, 2024