Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[enhancement] cluster variable support #15

Closed
tesharp opened this issue Jul 7, 2022 · 29 comments · Fixed by #90
Closed

[enhancement] cluster variable support #15

tesharp opened this issue Jul 7, 2022 · 29 comments · Fixed by #90
Assignees
Labels
enhancement New feature or request released

Comments

@tesharp
Copy link

tesharp commented Jul 7, 2022

Thanks for very nice dashboards.

One thing missing is a variable "cluster" maybe. Having multiple clusters it is useful to limit scope to a single cluster. A multi-select variable accepting all and queries adding "cluster=~"$cluster".

@dotdc
Copy link
Owner

dotdc commented Jul 9, 2022

Hi @tesharp, you can achieve this by using the Datasource variable; just configure each of your clusters as datasources.

@dotdc dotdc self-assigned this Jul 9, 2022
@cebidhem
Copy link

Hi @dotdc , first of I really think this is great stuff. Came here from your Medium post, and I'm glad some of us show that FR people can do stuffs too :)

Anyway, even though you're right for Datasources, there is the use-case of people using a Prom federation, or who uses tools like Thanos (even Grafana Mimir but I haven't tried it) in order to achieve global monitoring of several clusters from the same point.
In the case of Thanos - for which I'm familiar with - we get all the data in the same "bucket", therefore our Grafanas usually point to 1 DS to viz a several clusters.

I was about to open an issue before finding this one, I'm keen to propose a PR adding a cluster var and let you decide whether or not you want to integrate it, wdyt ?

@dotdc
Copy link
Owner

dotdc commented Jul 12, 2022

Thank you @cebidhem!
I understand the need and also used the $cluster variable in the past.
The problem is that the cluster label is not available by default, I guess you have a custom rule that adds it.
If I add the cluster variable and label in my dashboards, they will not work without it, and will break the dashboards for most users.

I don't think you can create a PromQL query with an optional label, if any of you know a way of doing this, I'm really interested.
If you have other ideas to solve this, we can discuss it here.

@cebidhem
Copy link

It makes completely sense. I'll probably fork the project and add the variable in our project, at least for now.

@dotdc
Copy link
Owner

dotdc commented Jul 15, 2022

I'd be happy to include this feature if you make a tool or script that injects the necessary variables and labels to the dashboards.

This way, everyone will benefit from your work and it will also be easier for you to update.

@cebidhem
Copy link

We could use GitHub Actions to publish a set of new dashboards, maybe prefixed with multicluster-, with a cluster variable in the queries ?

That way, it would be possible to have both types of dashboards in the same repo, and even publish both version on Grafana.
But at the very least, they would be here to be consumed by the Grafana configuration as raw.

I've never really work with Actions as I'm mostly using GitLab, but I could give it a try.

I guess a quite simple working solution could be a shell script doing some sed.

@dotdc
Copy link
Owner

dotdc commented Jul 18, 2022

That was the idea, if you create a tool that can inject the labels & variables in the script, I will make a special release for them.

@dotdc dotdc added help wanted Extra attention is needed hacktoberfest Part of the Hacktoberfest labels Sep 28, 2022
@k1rk
Copy link
Contributor

k1rk commented Oct 3, 2022

@dotdc how about usage of jsonnet?
i know that https://github.com/grafana/grafonnet-lib doesn't support grafana 9. but i can try to build dashboard generation on bare jsonnet, without grafonnet

I can prepare changes if you are fine with the approach

@dotdc
Copy link
Owner

dotdc commented Oct 3, 2022

Hi @k1rk,

The only requirement I have is that the source must remain flat JSON Grafana dashboards.
If you can provide a script, in any language, that can generate a copy of the dashboards and inject the missing cluster label in them, it would be the perfect solution to me.

Let me know.

@dotdc dotdc removed the hacktoberfest Part of the Hacktoberfest label Nov 25, 2022
@dotdc dotdc changed the title Cluster variable [enhancement] cluster variable support Apr 28, 2023
@keith-e-munro
Copy link

Did anyone manage to script this? Would love to add the functionality for multi-cluster support.

Excellent dashboards btw, thanks!

@pingping95
Copy link

I have same issue, too..

@dotdc
Copy link
Owner

dotdc commented Jul 7, 2023

Hi @keith-e-munro,

I don't think anyone did it yet, but we can discuss the topic further if you're willing to do it.

I personally use one datasource per cluster, even when using Thanos, so I don't have to rely on the cluster variable.

@mihaico
Copy link

mihaico commented Aug 4, 2023

Thank you @cebidhem! I understand the need and also used the $cluster variable in the past. The problem is that the cluster label is not available by default, I guess you have a custom rule that adds it. If I add the cluster variable and label in my dashboards, they will not work without it, and will break the dashboards for most users.

I don't think you can create a PromQL query with an optional label, if any of you know a way of doing this, I'm really interested. If you have other ideas to solve this, we can discuss it here.

First of all, very nice set of dashboards 👏 .

Not sure if I have all the context, but non-existing labels should not break the query, at least not in recent versions of Prometheus e.g. > 2.45.0 (could not find whether this was a fix at some point); so adding a variable with default empty value and a label like cluster="$cluster" in the dashboard queries will work since the metric value is returned in case the label is not defined {cluster=""}

@dotdc
Copy link
Owner

dotdc commented Aug 4, 2023

This is really interesting @mihaico, just tested on 2.46.0 and it works!
I'm pretty sure it wasn't the case before, so I would be really interested to know when it changed, or if I just didn't see the elephant in the room 😅

We could pick this solution if the change is old enough, otherwise I would wait a little because 2.45.0 will probably be a bit too recent for most users.

@AndrisJrs
Copy link

AndrisJrs commented Sep 13, 2023

I am not sure whether this is related, but I am experiencing issue that dashboard picks up non-kubernetes nodes (in this case openwrt node-exporter). It also messes up metrics like CPU usage, averaging them together as it would be an k8s node.

It incorrectly shows other non-kubernetes node.
image

As I use kube-prometheus-stack I am adding the openwrt node-export with prom operator CRD. Could it be that it is picking it up as k8s node due it being defined as service endpoint?

kind: Service
apiVersion: v1
metadata:
  name: openwrt-prometheus-metrics
  labels:
    app: openwrt-prometheus-metrics
spec:
 type: ClusterIP
 ports:
 - name: metrics
   port: 9100
   targetPort: 9100
---
kind: Endpoints
apiVersion: v1
metadata:
 name: openwrt-prometheus-metrics
subsets:
 - addresses:
   - ip: 10.1.20.1
   ports: 
   - name: metrics
     port: 9100
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: openwrt-servicemonitor
spec:
  selector:
    matchLabels:
      app: openwrt-prometheus-metrics
  endpoints:
    - port: metrics
      interval: 30s
      path: /metrics
  namespaceSelector:
    any: true

@dotdc
Copy link
Owner

dotdc commented Sep 16, 2023

Hi @AndrisJrs,

There is no differentiating label for Kubernetes nodes here.
I would recommend running a dedicated Prometheus instance for your cluster.

In order to keep the issue on topic, please open a dedicated issue if you want to discuss it further.

@jkroepke
Copy link
Contributor

jkroepke commented Oct 14, 2023

image

@dotdc I retest this in Promtheus 2.0.0 (docker run --rm -ti -p9090:9090 prom/prometheus:v2.0.0) and cluster="" still works there. From my point of view, this should unlock the Feature Request

@dotdc
Copy link
Owner

dotdc commented Dec 15, 2023

Sorry for the delay, I know this has been asked for quite some time.

@jkroepke thank you for this input, I think it's safe enough to give it a shot.
I'll convert the global view and share it beginning of next week so you can test and let me know if it works on your various setups.

@dotdc
Copy link
Owner

dotdc commented Dec 18, 2023

Just added the cluster variable in k8s-views-global.json in this PR : #78

If anyone has time to try it, and let me know if it works on your side.

@jkroepke
Copy link
Contributor

I give some feedback on PR

@dotdc dotdc added enhancement New feature or request and removed help wanted Extra attention is needed labels Dec 18, 2023
@dotdc
Copy link
Owner

dotdc commented Dec 18, 2023

Thank you @jkroepke, just made the according change 👍
Let me know if you notice anything else.

@dotdc
Copy link
Owner

dotdc commented Dec 18, 2023

Just did the namespaces, nodes and pods views (#82).
I'll wait for more feedback before doing the same for the remaining dashboards.

@AndrisJrs
Copy link

AndrisJrs commented Dec 18, 2023

How to add additional label in Prometheus Operator? As "cluster" isn't default label.
image

@dotdc
Copy link
Owner

dotdc commented Dec 18, 2023

How to add additional label in Prometheus Operator? As "cluster" isn't default label. image

Hi @AndrisJrs, you can add this label using static_configs or relabel_configs, but only if you need it.
This addition was made for Thanos users (or similar) using a single datasource for multiple clusters (global queries).
All PromQL queries should work without this label, let me know if it's not the case on your setup.

@jkroepke
Copy link
Contributor

Hi @doc,

I looked into #82. Since multi-value is dropped, the query can be optimized to cluster="$cluster" instead cluster=~"$cluster". Not sure if it really has an impact.

@dotdc
Copy link
Owner

dotdc commented Dec 19, 2023

True, made the change in #84

@AndrisJrs
Copy link

Hi @dotdc, Thanks for the tip. It works fine without label. I wanted to see whether it will fix non-kubernetes node-exporters leaking into dashboard like I reported earlier in this thread.
Works perfectly after adding relabelings. Thank you 👍

@dotdc
Copy link
Owner

dotdc commented Jan 4, 2024

Just merged #90 with the missing dashboards. 🎉

Please open a new issue if you see any bugs related to this.

Thank you all for your ideas !

@dotdc
Copy link
Owner

dotdc commented Apr 25, 2024

🎉 This issue has been resolved in version 1.1.0 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

@dotdc dotdc added the released label Apr 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request released
Projects
None yet
Development

Successfully merging a pull request may close this issue.

10 participants