[BUG] rancher monitoring chart lacks Network Policy permission to collect metrics from GUI's ingress-nginx pods #45603
Labels: kind/bug, priority/2, regression, status/to-reproduce, team/observability&backup
Rancher Server Setup
Information about the Cluster
User Information: Admin
Reopening the issue from rancher/rke2#6000 per suggestion from @alexandreLamarre
Node(s) CPU architecture, OS, and Version:
Linux my_secret_hostname 5.14.0-427.16.1.el9_4.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Apr 26 18:16:09 EDT 2024 x86_64 x86_64 x86_64 GNU/Linux
RHEL 9:
Cluster Configuration:
Describe the bug:
Nginx ingress metrics cannot be collected anymore. Per the discussion in rancher/rke2#6000 I had assumed this was rke2 related, but it now seems to be `rancher-monitoring` helm chart related. The chart needs either an adjustment or an additional network policy that would allow Prometheus to collect the nginx pod metrics.

Steps To Reproduce:
The following RKE2 config is used to bootstrap the first control node:

The `rancher-monitoring` helm chart (incl. CRDs) is then installed - we currently use version `103.1.0+up45.31.1`, but it should not matter that much. A `ServiceMonitor` will be installed: `serviceMonitor/kube-system/rancher-monitoring-ingress-nginx`.

This target is DOWN, with `Get "http://ip:10254/metrics": context deadline exceeded`.
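For context, the scrape wiring the chart sets up is roughly: a `ServiceMonitor` in kube-system selects a `Service` (the `pushprox-ingress-nginx-client` object mentioned below), and Prometheus then scrapes the matching pod endpoints on port 10254. A minimal sketch of that wiring, where the selector labels are assumptions and only the names and port come from this report:

```yaml
# Sketch only - the matchLabels value is an illustrative assumption,
# not taken from the rancher-monitoring chart.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: rancher-monitoring-ingress-nginx
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: pushprox-ingress-nginx-client   # assumed label on the Service
  endpoints:
    - port: metrics      # resolves to 10254 on the ingress-nginx pods
      path: /metrics
```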
Once the RKE2 default networking policy, `default-network-ingress-policy`, is adjusted, it works again.

We are not using anything related to Calico's global deny or similar. All network policies are the defaults, not customized by us in any way.

Expected behavior:
Metrics about the `rke2-ingress-nginx-controller` pods can be collected and shown in Grafana.

Actual behavior:
Metrics cannot be collected because a Network Policy is missing for the `metrics` port of the nginx ingress pods. (This comes from a `Service` object called `pushprox-ingress-nginx-client` in the kube-system namespace.)
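One possible shape for the fix would be a chart-managed NetworkPolicy that permits ingress to the controller's metrics port. The sketch below is an assumption, not the chart's actual fix: the pod selector label and the monitoring namespace name are illustrative; only port 10254 comes from the failing scrape URL above.

```yaml
# Hypothetical NetworkPolicy sketch - labels and namespace names are
# assumptions for illustration, not taken from the chart.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-prometheus-ingress-nginx-metrics
  namespace: kube-system
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: rke2-ingress-nginx   # assumed controller label
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: cattle-monitoring-system  # assumed Prometheus namespace
      ports:
        - protocol: TCP
          port: 10254   # metrics port seen in the failing scrape URL
```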