[BUG] Prometheus Monitoring is showing "Page not found 404 Error" in cluster home #41562

Open
eroji opened this issue Apr 9, 2023 · 7 comments
Labels
kind/bug Issues that are defects reported by users or that we know have reached a real release

Comments


eroji commented Apr 9, 2023

Rancher Server Setup

  • Rancher version: 2.7.1
  • Installation option (Docker install/Helm Chart): Helm
    • If Helm Chart, Kubernetes Cluster and version (RKE1, RKE2, k3s, EKS, etc): RKE1
  • Proxy/Cert Details:

Information about the Cluster

  • Kubernetes version: 1.20.15
  • Cluster Type (Local/Downstream): Downstream
    • If downstream, what type of cluster? (Custom/Imported or specify provider for Hosted/Infrastructure Provider): Custom

User Information

  • What is the role of the user logged in? (Admin/Cluster Owner/Cluster Member/Project Owner/Project Member/Custom)
    • If custom, define the set of permissions: Admin

Describe the bug

Cluster home with Monitoring installed is showing "Page not found 404 Error".

To Reproduce

Install the Prometheus Monitoring app from Apps > Charts > Monitoring. After everything is installed, navigate to the Cluster home and scroll to the bottom, where the Cluster Metrics, Kubernetes Component Metrics and Etcd Metrics should now be displayed. Instead, each panel shows "Page not found 404 Error".

Result
No Grafana graphs are shown, only "Page not found 404 Error".

Expected Result

The respective Grafana graphs should be displayed

Screenshots

image

Additional context

@eroji eroji added the kind/bug Issues that are defects reported by users or that we know have reached a real release label Apr 9, 2023

jcox10 commented Apr 10, 2023

Same on Rancher 2.6.11, monitoring 100.2.0+up40.1.2


jcox10 commented Apr 10, 2023

Might be related to #43467


lasseoe commented Apr 11, 2023

Seeing the same issues on 2.7.1 after upgrading monitoring to 100.2.0+up40.1.2
The workaround in #43467 doesn't work for me. Clearing the browser cache doesn't help, and deleting the Rancher pod didn't make a difference either.

If I click on the Grafana link for, e.g., Etcd Metrics, the URL I end up with, which gives me a 404, is: https://rancher.bleh.blah:3000/api/v1/namespaces/cattle-monitoring-system/services/http:rancher-monitoring-grafana:80/proxy/d/rancher-etcd-nodes-1/rancher-etcd-nodes?orgId=1

However, if I go and view that particular dashboard within Grafana the URL is slightly different and works: https://rancher.bleh.blah:3000/k8s/clusters/local/api/v1/namespaces/cattle-monitoring-system/services/http:rancher-monitoring-grafana:80/proxy/d/rancher-etcd-nodes-1/rancher-etcd-nodes?orgId=1

Screenshot 2023-04-11 at 12 00 40

I'm on RKE2 1.24.12 and it's for the local cluster, no remote/managed clusters.


lasseoe commented Apr 11, 2023

Looks like this PR would fix it for me, rancher/prometheus-federator#65

@mantis-toboggan-md mantis-toboggan-md transferred this issue from rancher/rancher May 15, 2023
@mantis-toboggan-md mantis-toboggan-md transferred this issue from rancher/dashboard May 15, 2023
Contributor

nflynt commented May 15, 2023

Did some digging on this with @mantis-toboggan-md and figured out that the requirement for the /k8s/clusters/local prefix has changed between Monitoring chart versions. This was related to some troubleshooting around the underlying proxy and nginx configuration, which has since been fixed.

In short:

  • For the specific chart versions 100.2.0+up40.1.2 and 102.0.0+up40.1.2, links to Grafana should include the /k8s/clusters/local prefix.
  • For any other chart version, including newer versions, links to Grafana should not include the /k8s/clusters/local prefix.

Both the link on Monitoring -> Grafana, and the embedded iframe views on the Cluster tab, should follow the same logic regarding the prefix. Right now these two locations appear to be using different rules to construct the URL, usually resulting in one or the other being broken for a given chart version.
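
To make the rule above concrete, here is a minimal Python sketch of the prefix logic that both locations would need to share. It is not the dashboard's actual code: the function names, the `PREFIXED_CHART_VERSIONS` set, and the `cluster_id` parameter are hypothetical. Only the two chart versions, the `/k8s/clusters/local` prefix, and the Grafana proxy path come from this thread.

```python
# Chart versions that, per the comment above, need the cluster prefix.
PREFIXED_CHART_VERSIONS = {"100.2.0+up40.1.2", "102.0.0+up40.1.2"}

# Grafana service proxy path as it appears in the URLs quoted in this issue.
GRAFANA_PROXY_PATH = (
    "/api/v1/namespaces/cattle-monitoring-system"
    "/services/http:rancher-monitoring-grafana:80/proxy"
)


def grafana_base_path(chart_version: str, cluster_id: str = "local") -> str:
    """Return the Grafana proxy path for a Monitoring chart version.

    cluster_id defaults to "local" because that is the only prefix
    discussed in the thread; other values are an assumption.
    """
    if chart_version in PREFIXED_CHART_VERSIONS:
        return f"/k8s/clusters/{cluster_id}{GRAFANA_PROXY_PATH}"
    return GRAFANA_PROXY_PATH


def dashboard_url(origin: str, chart_version: str, dashboard_path: str) -> str:
    """Join origin, base path, and dashboard path into one URL."""
    base = grafana_base_path(chart_version)
    return f"{origin}{base}/{dashboard_path.lstrip('/')}"
```

If both the Monitoring -> Grafana link and the embedded iframes went through one helper like `grafana_base_path`, they could not disagree about the prefix for a given chart version.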


maxgio92 commented Oct 2, 2023

Hello everyone, sharing my experience: this happens when, for the "rancher-monitoring" App/Helm release (which is 99% the kube-prometheus-stack), the Helm values:

  • global.cattle.clusterId
  • global.cattle.clusterName

are not set.

Those values are needed to build the Nginx rewrite rules correctly, to proxy the Grafana Kubernetes Service in the downstream clusters through the downstream cluster's API server proxy via the Rancher proxy.

When managing clusters the GitOps way, Fleet provides integration with Rancher through cluster labels. Otherwise, you need to know the Rancher-generated cluster ID in advance and declare it in your downstream cluster's configuration.
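
A quick way to check for this condition is to inspect the release's values for the two keys. The sketch below is one possible approach, not part of Rancher or the chart: `missing_cattle_values` is a hypothetical helper, and it assumes the values have already been loaded into a Python dict (for example from the JSON output of `helm get values rancher-monitoring -n cattle-monitoring-system -o json`, where the release name and namespace are assumptions based on the names used in this thread).

```python
# The two Helm values named above, as dotted paths into the values dict.
REQUIRED_KEYS = ("global.cattle.clusterId", "global.cattle.clusterName")


def missing_cattle_values(values: dict) -> list[str]:
    """Return the required dotted keys that are absent or empty in values."""
    missing = []
    for dotted in REQUIRED_KEYS:
        node = values
        for part in dotted.split("."):
            # Walk one level down; stop if the path does not exist.
            node = node.get(part) if isinstance(node, dict) else None
            if node is None:
                break
        if not node:  # covers missing keys, None, and empty strings
            missing.append(dotted)
    return missing


if __name__ == "__main__":
    # Example values with only clusterName set (placeholder data).
    example = {"global": {"cattle": {"clusterName": "my-cluster"}}}
    print(missing_cattle_values(example))
```

An empty result means both values are set; anything else lists the keys to add before redeploying.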


AntoineNGUYEN31 commented Oct 4, 2023

@maxgio92 you are right. Updating the Helm values and redeploying all pods works like a charm.
For those who use fleet.yaml to deploy, here are the values to set:

global:
  cattle:
    clusterId: global.fleet.clusterLabels.management.cattle.io/cluster-name
    clusterName: global.fleet.clusterLabels.management.cattle.io/cluster-display-name
