Remove old ingress-rules metrics for prometheus scraping #11047

Open
@SilentEntity

Description

What happened:

After an Ingress rule is updated, the Ingress controller keeps exposing metrics for the old rules alongside the new ones. This increases cardinality and produces useless data for rules that no longer exist every time Prometheus scrapes the pod.

What you expected to happen:

Once rules are updated or removed, the metrics for the old rules should be removed as well. That would reduce cardinality and stop the controller from exposing useless data for removed or updated rules.
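To make the expected behavior concrete, here is a minimal sketch of pruning stale series when the rule set changes. It is a toy in-memory model, not the controller's actual collector code: `SeriesStore`, `RULE_LABELS`, and the choice of `namespace`, `ingress`, and `host` as the labels that identify a rule are all assumptions made for illustration.

```python
# Toy model of a labelled metric store (hypothetical, for illustration only).
# Assumption: a rule is identified by its namespace, ingress, and host labels.
RULE_LABELS = ("namespace", "ingress", "host")


def rule_of(labels):
    """Return the rule identity carried by one series' labels."""
    return tuple(labels.get(name) for name in RULE_LABELS)


class SeriesStore:
    def __init__(self):
        # (metric name, sorted label pairs) -> current value
        self._series = {}

    def inc(self, metric, labels, amount=1.0):
        key = (metric, tuple(sorted(labels.items())))
        self._series[key] = self._series.get(key, 0.0) + amount

    def prune(self, active_rules):
        """Delete every series whose rule is not in active_rules.

        Returns the number of series removed.
        """
        stale = [
            key for key in self._series
            if rule_of(dict(key[1])) not in active_rules
        ]
        for key in stale:
            del self._series[key]
        return len(stale)

    def cardinality(self):
        return len(self._series)


store = SeriesStore()
# 100 rules receive traffic, so 100 series exist.
for i in range(100):
    store.inc("requests", {
        "namespace": "default",
        "ingress": "demo",
        "host": f"app{i}.example.com",
        "status": "200",
    })

# The Ingress is then reduced to 10 rules; prune on that config change.
active = {("default", "demo", f"app{i}.example.com") for i in range(10)}
removed = store.prune(active)
print(removed, store.cardinality())  # prints "90 10"
```

The point of the sketch is only that pruning is driven by the current rule set on each configuration change, so cardinality follows the live configuration instead of growing until the pod restarts.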

NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version.):

Kubernetes version (use kubectl version): Not relevant

Environment:

  • Cloud provider or hardware configuration:

  • OS (e.g. from /etc/os-release): not relevant

  • Kernel (e.g. uname -a): not relevant

  • Install tools: EKS, AKS and bare metal

    • Please mention how/where was the cluster created like kubeadm/kops/minikube/kind etc.
  • Basic cluster related info:

    • kubectl version
    • kubectl get nodes -o wide
  • How was the ingress-nginx-controller installed:

    • If helm was used then please show output of helm ls -A | grep -i ingress
    • If helm was used then please show output of helm -n <ingresscontrollernamespace> get values <helmreleasename>
    • If helm was not used, then copy/paste the complete precise command used to install the controller, along with the flags and options used
    • if you have more than one instance of the ingress-nginx-controller installed in the same cluster, please provide details for all the instances

How to reproduce this issue:

Add 100 rules, then update them or reduce them to 10. The Ingress controller keeps exposing metrics for both the old and the new rules.

Increase in cardinality:

cat metrics | grep -v "#" |cut -d "{" -f1  | sort | uniq -c | sort -rn | head -n40
3048 nginx_ingress_controller_request_duration_seconds_bucket
2988 nginx_ingress_controller_response_duration_seconds_bucket
2988 nginx_ingress_controller_connect_duration_seconds_bucket
2820 nginx_ingress_controller_header_duration_seconds_bucket
2794 nginx_ingress_controller_response_size_bucket
2794 nginx_ingress_controller_request_size_bucket
2032 nginx_ingress_controller_bytes_sent_bucket
 254 nginx_ingress_controller_response_size_sum
 254 nginx_ingress_controller_response_size_count
 254 nginx_ingress_controller_requests
 254 nginx_ingress_controller_request_size_sum
 254 nginx_ingress_controller_request_size_count
 254 nginx_ingress_controller_request_duration_seconds_sum
 254 nginx_ingress_controller_request_duration_seconds_count
 254 nginx_ingress_controller_bytes_sent_sum
 254 nginx_ingress_controller_bytes_sent_count
 249 nginx_ingress_controller_response_duration_seconds_sum
 249 nginx_ingress_controller_response_duration_seconds_count
 249 nginx_ingress_controller_connect_duration_seconds_sum
 249 nginx_ingress_controller_connect_duration_seconds_count
 235 nginx_ingress_controller_header_duration_seconds_sum
 235 nginx_ingress_controller_header_duration_seconds_count
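A quick check on the numbers above shows why stale rules are expensive. Dividing each `_bucket` count by the matching `_count` count gives the number of bucket series per label set. The counts below are copied from the output above; the helper name `buckets_per_label_set` is hypothetical.

```python
# (bucket series, label sets taken from the matching _count line),
# copied from the scrape output above.
observed = {
    "request_duration_seconds": (3048, 254),
    "response_duration_seconds": (2988, 249),
    "connect_duration_seconds": (2988, 249),
    "header_duration_seconds": (2820, 235),
    "response_size": (2794, 254),
    "request_size": (2794, 254),
    "bytes_sent": (2032, 254),
}


def buckets_per_label_set(bucket_series, label_sets):
    """Return bucket series per label set; fail if the counts do not divide evenly."""
    quotient, remainder = divmod(bucket_series, label_sets)
    if remainder:
        raise ValueError("bucket count is not a multiple of the label-set count")
    return quotient


for name, (bucket_series, label_sets) in observed.items():
    print(name, buckets_per_label_set(bucket_series, label_sets))
# request_duration_seconds 12
# response_duration_seconds 12
# connect_duration_seconds 12
# header_duration_seconds 12
# response_size 11
# request_size 11
# bytes_sent 8
```

So every stale label set on a duration histogram costs 12 bucket series plus its `_sum` and `_count` series, which is why a few hundred leftover label sets turn into thousands of series per scrape.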

After you restart the pod:

cat metrics | grep -v "#" |cut -d "{" -f1  | sort | uniq -c | sort -rn | head -n40
 288 nginx_ingress_controller_response_duration_seconds_bucket
 288 nginx_ingress_controller_request_duration_seconds_bucket
 288 nginx_ingress_controller_header_duration_seconds_bucket
 288 nginx_ingress_controller_connect_duration_seconds_bucket
 264 nginx_ingress_controller_response_size_bucket
 264 nginx_ingress_controller_request_size_bucket
 192 nginx_ingress_controller_bytes_sent_bucket
  24 nginx_ingress_controller_response_size_sum
  24 nginx_ingress_controller_response_size_count
  24 nginx_ingress_controller_response_duration_seconds_sum
  24 nginx_ingress_controller_response_duration_seconds_count
  24 nginx_ingress_controller_requests
  24 nginx_ingress_controller_request_size_sum
  24 nginx_ingress_controller_request_size_count
  24 nginx_ingress_controller_request_duration_seconds_sum
  24 nginx_ingress_controller_request_duration_seconds_count
  24 nginx_ingress_controller_header_duration_seconds_sum
  24 nginx_ingress_controller_header_duration_seconds_count
  24 nginx_ingress_controller_connect_duration_seconds_sum
  24 nginx_ingress_controller_connect_duration_seconds_count
  24 nginx_ingress_controller_bytes_sent_sum
  24 nginx_ingress_controller_bytes_sent_count
  21 nginx_ingress_controller_ingress_upstream_latency_seconds
  19 nginx_ingress_controller_orphan_ingress
   7 nginx_ingress_controller_ingress_upstream_latency_seconds_sum
   7 nginx_ingress_controller_ingress_upstream_latency_seconds_count
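For anyone who wants to compare the two scrapes side by side instead of eyeballing two listings, here is a small sketch that does the same counting as the shell pipeline and lines up before and after per metric. The function names and the inline sample text are made up for the example; in practice the input would be the saved `metrics` output from before and after the restart.

```python
from collections import Counter


def series_per_metric(exposition_text):
    """Count series per metric name in Prometheus text exposition output."""
    counts = Counter()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The name ends at "{" when labels are present, else at the first space.
        name = line.split("{", 1)[0].split()[0]
        counts[name] += 1
    return counts


def compare(before, after):
    """Return (metric, series before, series after), largest 'before' first."""
    names = sorted(set(before) | set(after), key=lambda n: (-before[n], n))
    return [(n, before[n], after[n]) for n in names]


# Tiny made-up sample; real input would be the two saved scrapes.
before_text = """# TYPE demo_requests counter
demo_requests{host="old.example.com"} 5
demo_requests{host="new.example.com"} 7
demo_up 1
"""
after_text = """# TYPE demo_requests counter
demo_requests{host="new.example.com"} 2
demo_up 1
"""

for name, n_before, n_after in compare(
    series_per_metric(before_text), series_per_metric(after_text)
):
    print(f"{n_before:6d} {n_after:6d} {name}")
```

A metric that appears in only one scrape shows up with a count of 0 on the other side, which makes leftover series easy to spot.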

Anything else we need to know:

Labels: help wanted, needs-kind, needs-priority, needs-triage
