[EKS] Limit the predefined Container Insights metrics that need to be ingested into CloudWatch #672
Comments
I am glad this was written; I was about to make the same suggestion. Running Container Insights is almost 40% of the cost of some of my clusters, so a cost-optimized implementation or a cost savings would make using this service much more affordable. The majority of the cost is custom metrics × number of nodes × number of Kubernetes resources running (pods, namespaces, services).

Selecting which metrics to turn off is a great idea. Possibly even better would be limiting the resource types you want insights on, such as turning off namespace or service metrics to focus on just pod and node metrics. Or simply give a bigger volume discount so we get to keep it all; the last CloudWatch price decrease was on Nov 21, 2016.

How it is currently priced: there is a predefined number of metrics reported for every cluster, node, pod, and service. Every cluster reports 24 metrics, every node reports 8, every pod reports 9, and every service reports 6. CloudWatch metrics are aggregated by pod, service, and namespace using their names, so increasing the count of running instances will not increase the count of CloudWatch metrics generated. All CloudWatch metrics are prorated on an hourly basis; this example assumes that data points are reported for the entire month. Monthly number of CloudWatch metrics per cluster
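The per-resource metric counts quoted above make the cost growth easy to estimate. A minimal back-of-the-envelope sketch (the $0.30-per-metric rate is an assumption for the first custom-metric pricing tier; check the current CloudWatch price list for your region):

```python
# Rough Container Insights custom-metric count, using the per-resource
# counts quoted in this thread: 24/cluster, 8/node, 9/pod, 6/service.
def monthly_metric_count(clusters, nodes, pods, services):
    """Total CloudWatch custom metrics generated per month."""
    return 24 * clusters + 8 * nodes + 9 * pods + 6 * services

def monthly_cost(metrics, price_per_metric=0.30):
    # price_per_metric is an assumed first-tier rate, not an official figure
    return metrics * price_per_metric

count = monthly_metric_count(clusters=1, nodes=10, pods=100, services=20)
print(count)                       # 1124 metrics for a modest cluster
print(f"${monthly_cost(count):.2f}")  # $337.20 at the assumed rate
```

Even a modest cluster with 100 pods lands in the thousand-metric range, which is why per-metric filtering matters so much here.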
+1
More than limiting the number of metrics collected, I wish AWS Container Insights metrics would become AWS-provided metrics, free of charge. The current pricing is very harsh and pushes people to seek alternative monitoring solutions in order to save on recurring costs.
Any update on this? Just being able to specify which metrics to push in the CloudWatch agent ConfigMap would be perfect.
This is a critical feature for my org, where we have 2K+ pods across dev and prod. We don't want to track metrics on all namespaces/pods, and at the same time not all metric tracking is needed. Having this feature will help with cost optimisation. Hoping to see it soon.
There are related issues and PRs in the cwagent repo.
Filtering which metrics to send is a bit more complex; I am not sure if we will implement it and include it in the next release.
As these custom metrics are created by following AWS documentation AND utilized by the AWS Container Insights dashboard, they should not fall under "custom metrics" but under AWS metrics, and they should not generate costs for customers in this indirect manner.
Just like @hhamalai stated above. One chooses EKS and CW because of the mantra AWS tends to repeat "Let AWS handle the heavy lifting". In this case it bites you badly. These costs are unforeseen and cause unnecessary wtf-moments when checking billing. |
Paste from aws/amazon-cloudwatch-agent#103 (comment): it's possible to do that with the OpenTelemetry collector: https://aws-otel.github.io/docs/getting-started/container-insights/eks-infra#advanced-usage
https://aws-otel.github.io/docs/getting-started/container-insights/eks-infra#configure-metrics-sent-by-cloudwatch-embedded-metric-format-exporter mentions how to remove pod network metrics, etc.
There is a new blog post on using ADOT to customize metrics: https://aws.amazon.com/blogs/containers/cost-savings-by-customizing-metrics-sent-by-container-insights-in-amazon-eks/
I installed ADOT and it worked. I spent quite a lot of time finding a way to filter by namespace, since that method was not covered in the AWS blog. Finally I found that it can be done with a filter processor snippet. I tried filtering by namespace (note: the filter processor expects lowercase `key`/`value` pairs under `resource_attributes`, and the exclude filter goes under `exclude:`):

```yaml
filter/include:
  metrics:
    include:
      match_type: regexp
      metric_names:
        - ^pod_.*
        - ^service_.*
        - namespace_number_of_running_pods
      resource_attributes:
        - key: Namespace
          value: (include-namespace-1|include-namespace-2)
filter/exclude:
  metrics:
    exclude:
      match_type: regexp
      resource_attributes:
        - key: Namespace
          value: (exclude-namespace-1|exclude-namespace-2)
```

And if you added
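To sanity-check which metric names the include regexes above would let through, here is a small standalone sketch (it only loosely mimics the collector's regexp matching and is not part of the collector itself; the metric names are illustrative Container Insights names):

```python
# Loose approximation of the filter processor's regexp metric_names match.
import re

include_patterns = [
    r"^pod_.*",
    r"^service_.*",
    r"namespace_number_of_running_pods",
]

def included(metric_name: str) -> bool:
    """True if any include pattern matches the metric name."""
    return any(re.search(p, metric_name) for p in include_patterns)

for name in ["pod_cpu_utilization", "service_number_of_running_pods",
             "node_cpu_utilization", "namespace_number_of_running_pods"]:
    print(name, included(name))   # node_* metrics are dropped by this list
```

Note that `node_*` metrics are not matched by this include list, so you would need to add a `^node_.*` pattern if you still want node-level metrics.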
Agreeing with the previous comments, the current pricing for monitoring EKS using CloudWatch is sky-high. Even with a small cluster (~30 pods), it costs more than the EC2 instances we launched. Any progress on supporting this feature, or at least lowering the pricing? We are actively looking for alternatives these days.
Is there any update on this issue? |
High cardinality should not be the default! @GreasyAvocado I believe the answer is to switch to ADOT, where you have much greater control over the filtering to include/exclude metrics. Dumping some links:

- Install overview: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/deploy-container-insights-EKS.html
- Install guide: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Container-Insights-EKS-otel.html
- Install tutorial, customising the metrics published (supports logs too, but you don't need those to check if your cluster is tanking due to resource utilization): https://aws-otel.github.io/docs/getting-started/container-insights/eks-infra
- Helm chart: https://github.com/aws-observability/aws-otel-helm-charts/tree/main/charts/adot-exporter-for-eks-on-ec2
That's indeed what I ended up doing. I actually went with https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/extension/observer/ecsobserver/README.md instead of ADOT, but it's not very different.
I believe this would still be a very useful feature with enhanced Observability. Are there any updates? |
Which service(s) is this request for?
EKS
Tell us about your request
CloudWatch Container Insights for EKS should have an option to exclude metrics that are not required, in order to save on custom-metric costs.
Configurations
Ability to select which metrics to exclude, such as pod_network_rx_bytes, pod_network_tx_bytes, etc.