
[Bug] ClusterRole needs ability to view namespace in order to pick up dashboards in other namespace #1010

Closed
Yaytay opened this issue Apr 20, 2023 · 8 comments
Labels
bug Something isn't working needs triage Indicates an issue or PR lacks a `triage/foo` label and requires one. v4 Major version 4



Yaytay commented Apr 20, 2023

Describe the bug
ClusterRole needs ability to view namespace in order to pick up dashboards in other namespace

My aim is to have the operator search for dashboards in a subset of all namespaces, where that subset isn't known in advance.
I think the only way to do this (on v4, at least) is with dashboardNamespaceSelector and dashboardLabelSelectors, and they appear to require rights on the namespace resources.

Version
docker.io/bitnami/grafana-operator:4.10.0-debian-11-r5

To Reproduce

  1. Deploy grafana operator using bitnami and argocd with:
    targetRevision: "2.7.25"
    repoURL: https://charts.bitnami.com/bitnami
    helm:
      values: |
        operator:
          watchNamespace: '""'
          scanAllNamespaces: true          
        grafana:
          config:
            users:
              default_theme: dark
            server:
              root_url: https://grafana.localtest.me          
            security:
              admin_user: admin
              admin_password: T0p-Secret
          dashboardNamespaceSelector:
            matchLabels:
              partition: new-hope
          dashboardLabelSelectors:
          - matchLabels:
              app: grafana
  2. Create the namespace that will contain the dashboard:
apiVersion: v1
kind: Namespace
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"Namespace","metadata":{"annotations":{},"labels":{"app.kubernetes.io/instance":"new-hope-mysql","partition":"new-hope"},"name":"new-hope-mysql"}}
  creationTimestamp: "2023-04-20T13:37:56Z"
  labels:
    app.kubernetes.io/instance: new-hope-mysql
    kubernetes.io/metadata.name: new-hope-mysql
    partition: new-hope
  name: new-hope-mysql
  resourceVersion: "4460"
  uid: c1b17c04-3f7d-42c5-a3b1-55908e1e0b45
  3. Deploy the dashboard to the new namespace:
apiVersion: integreatly.org/v1alpha1
kind: GrafanaDashboard
metadata:
  labels:
    app: grafana
    app.kubernetes.io/instance: new-hope-mysql
  name: mysql-dashboard
  namespace: new-hope-mysql
spec:
  json: |
...

The operator logs will display:

W0420 14:10:54.864851       1 reflector.go:324] pkg/mod/k8s.io/client-go@v0.24.3/tools/cache/reflector.go:167: failed to list *v1.Namespace: namespaces is forbidden: User "system:serviceaccount:grafana:telemetry-grafana-operator" cannot list resource "namespaces" in API group "" at the cluster scope
E0420 14:10:54.864899       1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.24.3/tools/cache/reflector.go:167: Failed to watch *v1.Namespace: failed to list *v1.Namespace: namespaces is forbidden: User "system:serviceaccount:grafana:telemetry-grafana-operator" cannot list resource "namespaces" in API group "" at the cluster scope

And grafana never picks up the dashboard.

The bitnami chart has created a ClusterRole that matches the one documented in this repo:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app.kubernetes.io/instance: telemetry
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: grafana-operator
    helm.sh/chart: grafana-operator-2.7.25
  name: telemetry-grafana-operator
rules:
- apiGroups:
  - ""
  resources:
  - configmaps
  - persistentvolumeclaims
  - secrets
  - serviceaccounts
  - services
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
  - watch
- apiGroups:
  - ""
  resources:
  - events
  verbs:
  - create
  - get
  - list
  - patch
  - watch
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - apps
  - extensions
  resources:
  - deployments
  - deployments/finalizers
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
  - watch
- apiGroups:
  - integreatly.org
  resources:
  - grafanadashboards
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
  - watch
- apiGroups:
  - integreatly.org
  resources:
  - grafanadashboards/status
  verbs:
  - get
  - patch
  - update
- apiGroups:
  - integreatly.org
  resources:
  - grafanadatasources
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
  - watch
- apiGroups:
  - integreatly.org
  resources:
  - grafanadatasources/status
  verbs:
  - get
  - patch
  - update
- apiGroups:
  - integreatly.org
  resources:
  - grafanafolders
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
  - watch
- apiGroups:
  - integreatly.org
  resources:
  - grafanafolders/status
  verbs:
  - get
  - patch
  - update
- apiGroups:
  - integreatly.org
  resources:
  - grafananotificationchannels
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
  - watch
- apiGroups:
  - integreatly.org
  resources:
  - grafananotificationchannels/status
  verbs:
  - get
  - patch
  - update
- apiGroups:
  - integreatly.org
  resources:
  - grafanas
  - grafanas/finalizers
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
  - watch
- apiGroups:
  - integreatly.org
  resources:
  - grafanas/status
  verbs:
  - get
  - patch
  - update
- apiGroups:
  - networking.k8s.io
  resources:
  - ingresses
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
  - watch
- apiGroups:
  - route.openshift.io
  resources:
  - routes
  - routes/custom-host
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
  - watch
- nonResourceURLs:
  - /metrics
  verbs:
  - get

If I use "kubectl edit" to change the ClusterRole by inserting "- namespaces" under "- pods", the error stops appearing in the logs and my dashboard appears.
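For reference, the amended core-group rule from that edit would look like the following (a sketch; whether all three verbs are strictly necessary is discussed later in this thread):

```yaml
# Core API group rule from the ClusterRole, with namespaces added alongside pods.
# get/list/watch is what was tested here; later comments suggest list/watch may
# be sufficient on their own.
- apiGroups:
  - ""
  resources:
  - pods
  - namespaces
  verbs:
  - get
  - list
  - watch
```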

Expected behavior
The dashboard should appear in grafana.

Suspect component/Location where the bug might be occurring
Either I've screwed up my setup, or the v4 docs for ClusterRoles should be updated to reflect the need for namespaces access in this circumstance.

Runtime (please complete the following information):

  • OS: Windows 11, WSL
  • Grafana Operator Version: 4.10.0
  • Environment: Docker Desktop 4.18.0
  • Deployment type: deployed via argocd

Additional context
There was a PR for the bitnami chart that was blocked because the documentation in this repo says it isn't necessary.
I'd like to have a documented way to achieve my aims, whether that's by changing the ClusterRole or changing the operator config.

If you can confirm the rights required on the namespaces (I've tested with get, list & watch, but don't know that it needs all of them) and the circumstances under which this is required (I suspect it's when dashboardNamespaceSelector are used, but I don't know) I'd be happy to submit a PR for the docs.

@Yaytay Yaytay added bug Something isn't working needs triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Apr 20, 2023
@NissesSenap NissesSenap added the v4 Major version 4 label Apr 22, 2023
@NissesSenap
Collaborator

Hi @Yaytay, to be honest, I don't understand the issue.
I have never used the bitnami helm chart (we don't maintain it), so I have no idea what kind of differences they might have introduced.

But at my old company I used the scan-all function and we didn't add any list-namespace RBAC rules, since the operator only performs get on the grafanadashboard resource. As long as you have added the RBAC settings as described in https://github.com/grafana-operator/grafana-operator/tree/955daa6abb9b6efe85d921a783eeffd51853fae4/deploy/cluster_roles it should be okay.

In short, create an RBAC rule that gives the operator access to read grafanadashboard resources in your entire cluster.
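Such a cluster-wide read rule might look like this (a sketch derived from the ClusterRole pasted earlier in the thread, restricted to read verbs):

```yaml
# Read-only access to GrafanaDashboard resources across the cluster,
# when bound via a ClusterRoleBinding.
- apiGroups:
  - integreatly.org
  resources:
  - grafanadashboards
  verbs:
  - get
  - list
  - watch
```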


Yaytay commented Apr 29, 2023

Hi @NissesSenap ,
I think the problem is that I don't want to pick up dashboards from the whole cluster, just from namespaces matching the selector (via dashboardNamespaceSelector), and I think that requires namespace access in the RBAC.
Would it help if I worked out what the result of the dashboardNamespaceSelector value is for the operator deployment?


Yaytay commented Apr 30, 2023

This issue is discussing the same thing: #700
@pb82's comment was "We'll likely add those permissions back to the example cluster role", but I don't think that happened.

The helm chart simply sets the dashboardNamespaceSelector on the Grafana resource.


Yaytay commented May 10, 2023

Ignoring the bitnami helm chart, this is the Grafana resource that is created and that does not work without namespace access:

apiVersion: v1
items:
- apiVersion: integreatly.org/v1alpha1
  kind: Grafana
  metadata:
    creationTimestamp: "2023-05-10T10:37:21Z"
    generation: 1
    labels:
      app.kubernetes.io/instance: telemetry
      app.kubernetes.io/managed-by: Helm
      app.kubernetes.io/name: grafana-operator
      helm.sh/chart: grafana-operator-2.8.2
    name: telemetry-grafana-operator-grafana
    namespace: grafana
    resourceVersion: "9408"
    uid: bdb3230d-bbc1-46ea-a64b-fa86a4db5ecf
  spec:
    baseImage: docker.io/bitnami/grafana:9.5.1-debian-11-r4
    client:
      preferService: true
      timeout: 5
    config:
      alerting:
        enabled: false
      analytics:
        check_for_updates: false
        reporting_enabled: false
      log:
        level: warn
        mode: console
      security:
        admin_password: T0p-Secret
        admin_user: admin
        disable_gravatar: false
      server:
        root_url: https://grafana.localtest.me
      users:
        default_theme: dark
    dashboardLabelSelector:
    - matchLabels:
        app: grafana
    dashboardNamespaceSelector:
      matchLabels:
        partition: new-hope
    deployment:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - podAffinityTerm:
              labelSelector:
                matchLabels:
                  app.kubernetes.io/component: grafana
                  app.kubernetes.io/instance: telemetry
                  app.kubernetes.io/name: grafana-operator
              topologyKey: kubernetes.io/hostname
            weight: 1
      containerSecurityContext:
        allowPrivilegeEscalation: false
        privileged: false
        runAsGroup: 0
        runAsNonRoot: true
        runAsUser: 1001
      labels:
        app.kubernetes.io/instance: telemetry
        app.kubernetes.io/managed-by: Helm
        app.kubernetes.io/name: grafana-operator
        helm.sh/chart: grafana-operator-2.8.2
      replicas: 1
      securityContext:
        fsGroup: 1001
        runAsGroup: 0
        runAsNonRoot: true
        runAsUser: 1001
        supplementalGroups: []
      skipCreateAdminAccount: false
    ingress:
      enabled: true
      hostname: grafana.localtest.me
      ingressClassName: nginx
      path: /
      pathType: Prefix
      tlsEnabled: true
      tlsSecretName: grafana.local-tls
    jsonnet:
      libraryLabelSelector:
        matchLabels:
          app.kubernetes.io/instance: telemetry
    livenessProbeSpec:
      failureThreshold: 6
      initialDelaySeconds: 120
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 5
    readinessProbeSpec:
      failureThreshold: 6
      initialDelaySeconds: 30
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 5
    resources:
      limits: {}
      requests: {}
    service:
      labels:
        app.kubernetes.io/instance: telemetry
        app.kubernetes.io/managed-by: Helm
        app.kubernetes.io/name: grafana-operator
        helm.sh/chart: grafana-operator-2.8.2
      type: ClusterIP
  status:
    message: success
    phase: reconciling
    previousServiceName: grafana-service
kind: List
metadata:
  resourceVersion: ""

@NissesSenap
Collaborator

So I have managed to reproduce your issue.

I installed the operator by running kubectl apply -k deploy/manifests/v4.10.0/ from the v4 branch and then manually added --scan-all to the operator (too lazy to do it through kustomize during my test).

W0511 06:36:54.127339       1 reflector.go:324] pkg/mod/k8s.io/client-go@v0.24.3/tools/cache/reflector.go:167: failed to list *v1.Namespace: namespaces is forbidden: User "system:serviceaccount:system:controller-manager" cannot list resource "namespaces" in API group "" at the cluster scope
E0511 06:36:54.127398       1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.24.3/tools/cache/reflector.go:167: Failed to watch *v1.Namespace: failed to list *v1.Namespace: namespaces is forbidden: User "system:serviceaccount:system:controller-manager" cannot list resource "namespaces" in API group "" at the cluster scope

My YAML looks like this:

apiVersion: v1
kind: Namespace
metadata:
  creationTimestamp: "2023-05-11T06:06:48Z"
  labels:
    kubernetes.io/metadata.name: system
  name: system
---
apiVersion: integreatly.org/v1alpha1
kind: Grafana
metadata:
  labels:
    app.kubernetes.io/instance: telemetry
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: grafana-operator
    helm.sh/chart: grafana-operator-2.8.2
  name: telemetry-grafana-operator-grafana
  namespace: system
spec:
  baseImage: docker.io/bitnami/grafana:9.5.1-debian-11-r4
  client:
    preferService: true
    timeout: 5
  config:
    alerting:
      enabled: false
    analytics:
      check_for_updates: false
      reporting_enabled: false
    log:
      level: warn
      mode: console
    security:
      admin_password: admin
      admin_user: admin
      disable_gravatar: false
    server:
      root_url: https://grafana.localtest.me
    users:
      default_theme: dark
  dashboardLabelSelector:
    - matchLabels:
        app: grafana
  dashboardNamespaceSelector:
    matchLabels:
      partition: new-hope
  deployment:
    affinity:
      podAntiAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
          - podAffinityTerm:
              labelSelector:
                matchLabels:
                  app.kubernetes.io/component: grafana
                  app.kubernetes.io/instance: telemetry
                  app.kubernetes.io/name: grafana-operator
              topologyKey: kubernetes.io/hostname
            weight: 1
    containerSecurityContext:
      allowPrivilegeEscalation: false
      privileged: false
      runAsGroup: 0
      runAsNonRoot: true
      runAsUser: 1001
    labels:
      app.kubernetes.io/instance: telemetry
      app.kubernetes.io/managed-by: Helm
      app.kubernetes.io/name: grafana-operator
      helm.sh/chart: grafana-operator-2.8.2
    replicas: 1
    securityContext:
      fsGroup: 1001
      runAsGroup: 0
      runAsNonRoot: true
      runAsUser: 1001
      supplementalGroups: []
    skipCreateAdminAccount: false
  ingress:
    enabled: false
    hostname: grafana.localtest.me
    ingressClassName: nginx
    path: /
    pathType: Prefix
    tlsEnabled: true
    tlsSecretName: grafana.local-tls
  jsonnet:
    libraryLabelSelector:
      matchLabels:
        app.kubernetes.io/instance: telemetry
  resources:
    limits: {}
    requests: {}
  service:
    labels:
      app.kubernetes.io/instance: telemetry
      app.kubernetes.io/managed-by: Helm
      app.kubernetes.io/name: grafana-operator
      helm.sh/chart: grafana-operator-2.8.2
    type: ClusterIP
---
apiVersion: integreatly.org/v1alpha1
kind: GrafanaDashboard
metadata:
  name: grafana-dashboard-from-url
  namespace: hello
  labels:
    app: grafana
spec:
  url: https://raw.githubusercontent.com/integr8ly/grafana-operator/v4/deploy/examples/remote/grafana-dashboard.json
---
apiVersion: v1
kind: Namespace
metadata:
  creationTimestamp: "2023-05-11T06:25:34Z"
  labels:
    kubernetes.io/metadata.name: hello
    partition: new-hope
  name: hello

I will take a look at your PR when I get time.
Thanks, @Yaytay


Yaytay commented May 11, 2023

Over the weekend someone did some work on this for the bitnami chart:
bitnami/charts#16360

It looks like scanAllNamespaces and scanNamespaces need watch and list, but I think dashboardNamespaceSelector needs get as well.
I'm hoping to check that this morning; if so, my PR needs some work.


Yaytay commented May 11, 2023

It looks like I'm wrong and get is not required at all. Since get and list/watch expose the same data, though, it doesn't make much difference; it just comes down to precisely how the operator issues the request.
In other words, I think it's probably safest to grant get/list/watch, although right now only list/watch appears to be needed.
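Based on that conclusion, the additional rule for the ClusterRole would be something like the following (an untested sketch: list/watch were observed as required above, with get included for safety):

```yaml
# Allow the operator to enumerate and watch namespaces so that
# dashboardNamespaceSelector / scan-all can discover matching namespaces.
- apiGroups:
  - ""
  resources:
  - namespaces
  verbs:
  - get
  - list
  - watch
```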

@NissesSenap
Collaborator

Solved in #1041
