Add metric_selector semantics to .logs.metrics_collected.kubernetes #401

Closed
ecerulm opened this issue Mar 9, 2022 · 6 comments

ecerulm (Contributor) commented Mar 9, 2022

The kubernetes configuration collects metrics like pod_memory_utilization for all pods in a cluster.

I have many pods that are ephemeral / short-lived.

I use Airflow on Kubernetes, so there are hundreds of those Airflow task pods per day, and currently the number of CloudWatch metrics that logs.metrics_collected.kubernetes creates is really huge. Most of those metrics are not even useful: they contain just one data point, because the pods don't live long enough to produce more than one. I currently have ~20000 metrics created this way. I would like to collect metrics only for my long-lived pods while excluding my short-lived pods (which I can identify by k8s namespace or k8s labels).

I think some kind of mechanism to filter / drop / exclude pods would be beneficial. In particular, the following use cases might be of interest to most users:

  • exclude all pods in a particular kubernetes namespace
  • exclude pods with certain k8s labels; for example, in my case I could exclude all pods with the label kubernetes_executor: true, or any pod where one of the following k8s labels is present: airflow_worker: xx, airflow_version: xx, dag_id: xxx, execution_date: xx, task_id: xx
  • or, in a more general way, it could be implemented following the same pattern as .logs.metrics_collected.prometheus.emf_processor.metric_declaration, with some source_labels (maybe mimicking the __meta_kubernetes_pod_* labels of kubernetes_sd_config); a reference sketch of that prometheus declaration is included after the example config below

Example of a hypothetical config that would exclude pods based on pod metadata:

{
  "logs": {
    "metrics_collected": {
      "kubernetes": {
        "cluster_name": "${cluster_name}",
        "metrics_collection_interval": 60,
        "emf_processor": {
            "metric_declaration": [
                {
                    "source_labels": ["__meta_kubernetes_pod_label_task_id"],
                    "label_matcher ": "^$" // representing that the label does not exist, maybe? 
                }
            ]
        }
      }
    },
    "force_flush_interval": 5
  }
}
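
For reference, the existing prometheus declaration that the config above mimics looks roughly like this (the field names come from the agent's prometheus support; the label values and metric name are just placeholders, so treat it as a sketch rather than a verified config):

{
  "logs": {
    "metrics_collected": {
      "prometheus": {
        "cluster_name": "${cluster_name}",
        "emf_processor": {
          "metric_declaration": [
            {
              "source_labels": ["Service", "Namespace"],
              "label_matcher": "^my-service;my-namespace$",
              "dimensions": [["Service", "Namespace", "ClusterName"]],
              "metric_selectors": ["^my_metric_name$"]
            }
          ]
        }
      }
    }
  }
}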
ecerulm (Contributor, Author) commented Mar 10, 2022

While I was digging in the source code trying to find a way to exclude my short-lived pods from CW metrics, I found that amazon-cloudwatch-agent looks for a kubernetes annotation called aws.amazon.com/cloudwatch-agent-ignore:

ignoreAnnotation = "aws.amazon.com/cloudwatch-agent-ignore"

I will try to use this k8s annotation, but I still think it would be good to have an ignore mechanism that the cloudwatch-agent administrator can control (the kubernetes annotation is controlled by whoever deploys the pod).

Nonetheless, I think the cloudwatch-agent-ignore annotation is not very well documented / visible (if you google for it you get 3 results, and none of them leads to anything useful). I don't know what can be done to improve that, but I hope at least that people who google for it will land here.
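
For anyone who does land here: applying the annotation to a pod would look something like the manifest below. This is only a minimal sketch: the pod name and image are placeholders, and I am assuming the agent expects the annotation value "true", which I have not verified against the source.

{
  "apiVersion": "v1",
  "kind": "Pod",
  "metadata": {
    "name": "airflow-task-example",
    "annotations": {
      "aws.amazon.com/cloudwatch-agent-ignore": "true"
    }
  },
  "spec": {
    "containers": [
      {
        "name": "task",
        "image": "your-airflow-task-image"
      }
    ]
  }
}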

CaiYueTing commented Mar 15, 2022

Agreed, @ecerulm, the cost is very high with a k8s cluster.
I think it would be friendlier for users if the metrics_collected config supported a filter/exclude/include option like the sample above.

github-actions (bot) commented

This issue was marked stale due to lack of activity.

github-actions bot added the Stale label on Jun 14, 2022
github-actions (bot) commented

Closing this because it has stalled. Feel free to reopen if this issue is still relevant, or to ping the collaborator who labeled it stalled if you have any questions.

sethAmazon pushed a commit to sethAmazon/amazon-cloudwatch-agent that referenced this issue Aug 21, 2023
ecerulm (Contributor, Author) commented Mar 13, 2024

@SaxyPandaBear, would it be possible to reopen this issue?

tgraupne commented Apr 8, 2024

I would like to see this issue reopened as well.
Especially with enhanced observability, reducing/filtering metrics seems essential to me.

Without it, the CloudWatch Observability EKS add-on creates a lot of useless and expensive metrics.
