
For questions, doubts, guidances please use Discussions. Don't open a new Issue. #91

Closed
carlosedp opened this issue Aug 20, 2020 · 121 comments
Labels
question Further information is requested

Comments

@carlosedp
Owner

carlosedp commented Aug 20, 2020

Since I don't have the resources or time to address every question about the deployments, the Issues section is meant for reporting problems or improvements to the stack.

This issue is the place to add a comment if you have a question, and I or any community member can answer on a best-effort basis.

If you deployed the monitoring stack and some targets are not available or show no metrics in Grafana, make sure you didn't have iptables rules or a firewall enabled on your nodes before deploying Kubernetes.

If you don't want to receive further notifications, click "Unsubscribe" in the right bar, right above the participants list.
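
A quick way to check this on each node (a rough sketch; the exact firewall tooling varies per distro) is:

$ sudo iptables -S | grep -E 'DROP|REJECT'   # look for rules that could block node-to-node or ingress traffic
$ sudo ufw status                            # if ufw is installed, confirm it is inactive or allows the Kubernetes ports
$ sudo systemctl status firewalld            # same check for firewalld-based distros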

@carlosedp carlosedp added the question Further information is requested label Aug 20, 2020
@carlosedp carlosedp pinned this issue Aug 20, 2020
@YushchenkoAndrew

YushchenkoAndrew commented Aug 21, 2020

I faced an issue where I couldn't open the Grafana and Prometheus applications (link: https://grafana.192.168.0.106.nip.io).

 $ curl http://prometheus.192.168.0.106.nip.io
 curl: (7) Failed to connect to prometheus.192.168.0.106.nip.io port 80: Connection refused
 $ curl https://prometheus.192.168.0.106.nip.io
 curl: (7) Failed to connect to prometheus.192.168.0.106.nip.io port 443: Connection refused

In the browser I got the same issue: "Unable to connect".

I'm using k3s and I configured my master IP address.
192.168.0.106 is the local IP address of one of my worker nodes.

I managed to deploy all pods successfully, but I don't know how I'm supposed to connect to the applications.

 $ kubectl get ingress -n monitoring
 NAME                CLASS    HOSTS                               ADDRESS   PORTS     AGE
 alertmanager-main   <none>   alertmanager.192.168.0.106.nip.io             80, 443   54s
 grafana             <none>   grafana.192.168.0.106.nip.io                  80, 443   54s
 prometheus-k8s      <none>   prometheus.192.168.0.106.nip.io               80, 443   53s

 $ kubectl get pods -n monitoring
 NAME                                   READY   STATUS    RESTARTS   AGE
 prometheus-operator-6b8868d698-6xlvg   2/2     Running   0          14m
 arm-exporter-wmm6r                     2/2     Running   0          14m
 arm-exporter-67jpd                     2/2     Running   0          14m
 node-exporter-fbltt                    2/2     Running   0          14m
 alertmanager-main-0                    2/2     Running   0          14m
 arm-exporter-zhd5m                     2/2     Running   0          14m
 node-exporter-pzz6z                    2/2     Running   0          14m
 node-exporter-74fwt                    2/2     Running   0          14m
 grafana-7466bcc7c5-4hvpj               1/1     Running   0          14m
 kube-state-metrics-96bf99844-g9ssn     3/3     Running   0          14m
 prometheus-adapter-f78c4f4ff-kccbq     1/1     Running   0          14m
 prometheus-k8s-0                       3/3     Running   0          14m

Do you have any suggestions?

@carlosedp
Owner Author

You need to troubleshoot access to your K3s cluster's ingress, which bridges outside HTTP/HTTPS traffic to the pods.

Here is a reference: https://rancher.com/docs/k3s/latest/en/networking/

Have you deployed any application that serves HTTP (like NGINX or Apache) and been able to access it from your computer? Accessing Prometheus, Grafana, and Alertmanager works the same way.
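
For example, something along these lines (a rough sketch; the hello deployment name, image, and nip.io host are placeholders, and the Ingress apiVersion should match whatever your cluster version accepts):

$ kubectl create deployment hello --image=nginx
$ kubectl expose deployment hello --port=80
$ cat <<EOF | kubectl apply -f -
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: hello
spec:
  rules:
  - host: hello.192.168.0.106.nip.io
    http:
      paths:
      - backend:
          serviceName: hello
          servicePort: 80
EOF
$ curl http://hello.192.168.0.106.nip.io

If the curl to the test app also fails, the problem is in the ingress controller or the network, not in this stack.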

@YushchenkoAndrew

Yes, I created my own blog site in JS, but I didn't use an ingress; I configured an externalIP on the Service. So I will try to troubleshoot this issue.
Thanks for the reply!

@YushchenkoAndrew

I solved this issue. Thanks for the advice; in the end I just installed NGINX, configured it, and after that I was able to access Prometheus and Grafana.
Thanks a lot!

@johnfried

johnfried commented Sep 8, 2020

Love this project! I am unable to access prometheus.*.nip.io, although I can access both Grafana and Alertmanager. My ingress shows Prometheus and is set up correctly. The one odd thing is that when I look at all my pods in the monitoring namespace, I do not have prometheus-k8s (or something along those lines that I have seen in videos); the pods I have are the Prometheus Adapter and Operator. I re-ran make vendor and deployed again: same thing, and no errors anywhere. Also, prometheus-k8s does have a service, as I just checked. Does this make any sense? TIA

@exArax

exArax commented Sep 15, 2020

Is there a way to deploy the Grafana and Prometheus pods to the master node only? Sometimes they are deployed to workers.

@carlosedp
Owner Author

@exArax You need to set your master nodes as schedulable. Even then, Kubernetes can deploy the pods to other nodes; if you need to pin them to a specific set of nodes, you need pod affinity.
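
On kubeadm-based clusters that usually means removing the NoSchedule taint from the masters (a sketch; the taint key differs between Kubernetes versions, and k3s masters are schedulable by default):

$ kubectl describe node <master-node> | grep Taints
$ kubectl taint node <master-node> node-role.kubernetes.io/master:NoSchedule-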

@carlosedp
Owner Author

Love this project! I am unable to access prometheus.*.nip.io, although I can access both Grafana and Alertmanager. My ingress shows Prometheus and is set up correctly. The one odd thing is that when I look at all my pods in the monitoring namespace, I do not have prometheus-k8s (or something along those lines that I have seen in videos); the pods I have are the Prometheus Adapter and Operator. I re-ran make vendor and deployed again: same thing, and no errors anywhere. Also, prometheus-k8s does have a service, as I just checked. Does this make any sense? TIA

That doesn't make much sense, since the pods are created by the operator. Re-check your cluster and redeploy the stack.
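
Before redeploying, it may also help to check whether the operator actually created the Prometheus object and what its logs say (a sketch; resource and container names are assumptions based on this stack's defaults):

$ kubectl get prometheus -n monitoring
$ kubectl logs -n monitoring deploy/prometheus-operator -c prometheus-operator
$ kubectl get statefulset -n monitoring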

@johnfried

I redeployed and all is well, thank you

@exArax

exArax commented Sep 15, 2020

@exArax You need to set your master nodes as schedulable. Even then, Kubernetes can deploy the pods to other nodes; if you need to pin them to a specific set of nodes, you need pod affinity.

In the case of Grafana, I have to add the node affinity in the grafana-deployment.yaml that is inside the manifests folder, right?

@ClauNav

ClauNav commented Sep 23, 2020

Hello Carlos,
I have the same issue as YushchenkoAndrew.
I'm a noob at Kubernetes (I built this cluster to learn about it).
(screenshot attached)

The same issue on Alertmanager/Prometheus.

Could you please help me?

Thanks.

@carlosedp
Owner Author

@exArax You need to set your master nodes as schedulable. Even then, Kubernetes can deploy the pods to other nodes; if you need to pin them to a specific set of nodes, you need pod affinity.

In the case of Grafana, I have to add the node affinity in the grafana-deployment.yaml that is inside the manifests folder, right?

Yes, since the jsonnet code doesn't include pod affinity for this.
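
For a quick test, a nodeSelector patch achieves a similar pinning (a sketch; the hostname value is a placeholder, and editing manifests/grafana-deployment.yaml is what survives a redeploy):

$ kubectl patch deployment grafana -n monitoring --type merge \
    -p '{"spec":{"template":{"spec":{"nodeSelector":{"kubernetes.io/hostname":"<master-node>"}}}}}'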

@carlosedp
Owner Author

Hello Carlos,
I have the same issue as YushchenkoAndrew.
I'm a noob at Kubernetes (I built this cluster to learn about it).
(screenshot attached)

The same issue on Alertmanager/Prometheus.

Could you please help me?

Thanks.

You need to make sure your Kubernetes cluster has an Ingress controller and can expose the applications. Check this first with something like an NGINX pod with a simple Hello World web page.

@Nenad13

Nenad13 commented Sep 24, 2020

Hi Carlos,
Very cool project indeed. I am running Kubernetes on Ubuntu 20.04.1 (master) and a few Raspberry Pi 4s (nodes) with Raspbian on them. I installed Kubernetes with an Ansible playbook and it works fine.
I made all the changes in vars.jsonnet as you suggested. The problem is that after make deploy I am getting this error:

root@asus:~/cluster-monitoring# make deploy
echo "Deploying stack setup manifests..."
Deploying stack setup manifests...
kubectl apply -f ./manifests/setup/
The connection to the server localhost:8080 was refused - did you specify the right host or port?
make: *** [Makefile:37: deploy] Error 1

Do you have any suggestions?

This is the configuration:
kubectl config view

apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: DATA+OMITTED
    server: https://x.x.x.x:6443
  name: default
contexts:
- context:
    cluster: default
    user: default
  name: default
current-context: default
kind: Config
preferences: {}
users:
- name: default
  user:
    password: xxxxxxx
    username: xxxxxxxx

Thank you in advance!

@ClauNav

ClauNav commented Sep 25, 2020

Hello Carlos,
I have the same issue as YushchenkoAndrew.
I'm a noob at Kubernetes (I built this cluster to learn about it).
(screenshot attached)
The same issue on Alertmanager/Prometheus.
Could you please help me?
Thanks.

You need to make sure your Kubernetes cluster has an Ingress controller and can expose the applications. Check this first with something like an NGINX pod with a simple Hello World web page.

Hello Carlos, you're right!
Thanks for taking the time to reply to our newbie questions.

@riolaf05

riolaf05 commented Oct 2, 2020

Hello, I have some problems with the installation on K3s.

After the deploy operation, not all the services are installed:

(screenshot attached)

Also, I am getting this error from the prometheus-adapter container:

(screenshot attached)

Do you have any idea what I can do? Thank you.

@carlosedp
Owner Author

Hello again,

I want to add some authentication and authorization on prometheus.192.168.1.x.nip.io. Is there a way to do something like prometheus.io/docs/guides/tls-encryption on prometheus.192.168.1.x.nip.io?

You need an ingress controller that supports authentication. Look at the "Example external ingress with authentication" snippet in the jsonnet code. It works with Traefik but might need a couple of changes.

@carlosedp
Owner Author

Hello, I have some problems with the installation on K3s.

After the deploy operation, not all the services are installed:

(screenshot attached)

Also, I am getting this error from the prometheus-adapter container:

(screenshot attached)

Do you have any idea what I can do? Thank you.

Sorry, there are so many variables that it's hard to know. Start by deploying a test application, check your node IPs, and so on.

@robmit68

robmit68 commented Oct 7, 2020

Hi Carlos,
I have followed the Cluster Monitoring deployment step by step and it is running successfully. I am trying to use the Prometheus config generator within the node prometheus.192.168.XXX.XXX.nip.io to generate a Cisco SNMP scrape config, and I am not able to access the node via SSH.
How can I access the node to add scrapes/targets to the Prometheus k3s node?
I am a newbie in k3s and looking forward to your response.
Regards,

Robe

@carlosedp
Owner Author

Hi Carlos,
I have followed the Cluster Monitoring deployment step by step and it is running successfully. I am trying to use the Prometheus config generator within the node prometheus.192.168.XXX.XXX.nip.io to generate a Cisco SNMP scrape config, and I am not able to access the node via SSH.
How can I access the node to add scrapes/targets to the Prometheus k3s node?
I am a newbie in k3s and looking forward to your response.
Regards,

Robe

To collect metrics via SNMP you need the snmp_exporter. It's out of the scope of this stack, but take a look at another project I have here: https://github.com/carlosedp/ddwrt-monitoring. It's not on Kubernetes but I use it for SNMP.

@robmit68

robmit68 commented Oct 8, 2020

Thank you Carlos

@exArax

exArax commented Oct 8, 2020

Hello again,

I want to add some authentication on prometheus.192.168.1.x.nip.io. Is there a way to do something like https://prometheus.io/docs/guides/basic-auth/ or https://www.openshift.com/blog/adding-authentication-to-your-kubernetes-web-applications-with-keycloak on prometheus.192.168.1.x.nip.io? I do not know which file I have to edit to add authentication to Prometheus.

@carlosedp
Owner Author

As I mentioned before, the stack doesn't have anything built-in to provide authentication, but you could change the ingresses to use your ingress controller (Traefik, HAProxy, etc.) to add a layer of authentication.

Another option is similar to the post you linked to, but that would require adding the Keycloak sidecar to every pod.
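
As an illustration, with the Traefik 1.x controller that older k3s versions ship, basic auth can be layered onto an existing ingress roughly like this (a sketch; the annotation names are Traefik-specific and the secret and user names are placeholders):

$ htpasswd -cb auth admin <password>
$ kubectl create secret generic basic-auth --from-file=auth -n monitoring
$ kubectl annotate ingress prometheus-k8s -n monitoring \
    ingress.kubernetes.io/auth-type=basic \
    ingress.kubernetes.io/auth-secret=basic-auth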

@justinwagg

Firstly, thanks for all the work you put into this @carlosedp 👏🏻. Prometheus seems to be running into an error (panic: mmap: cannot allocate memory); have you run into this before? Deleting the pod fixes the issue, and I do have memory available. Also, what is the best way to add additional targets? Thanks again.

root@pi-master:/home/pi# kubectl version
Client Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.5+k3s1", GitCommit:"58ebdb2a2ec5318ca40649eb7bd31679cb679f71", GitTreeState:"clean", BuildDate:"2020-05-06T23:42:31Z", GoVersion:"go1.13.8", Compiler:"gc", Platform:"linux/arm"}
Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.5+k3s1", GitCommit:"58ebdb2a2ec5318ca40649eb7bd31679cb679f71", GitTreeState:"clean", BuildDate:"2020-05-06T23:42:31Z", GoVersion:"go1.13.8", Compiler:"gc", Platform:"linux/arm"}
root@pi-master:/home/pi#
root@pi-master:/home/pi# cat /etc/os-release
PRETTY_NAME="Raspbian GNU/Linux 10 (buster)"
NAME="Raspbian GNU/Linux"
VERSION_ID="10"
VERSION="10 (buster)"
VERSION_CODENAME=buster
ID=raspbian
ID_LIKE=debian
HOME_URL="http://www.raspbian.org/"
SUPPORT_URL="http://www.raspbian.org/RaspbianForums"
BUG_REPORT_URL="http://www.raspbian.org/RaspbianBugs"
root@pi-master:/home/pi#

@exArax

exArax commented Oct 9, 2020

@carlosedp To change the ingresses, do I have to edit only the ingress-XXXX.yaml files, or are there more files that I have to edit?

@jontg
Contributor

jontg commented Oct 9, 2020

Hey @carlosedp, I was wondering if you have any interest in seeing Loki ("Prometheus, but for logs") added to this stack? I was thinking of taking a stab at it this coming Monday.

@thomazBDRI

Hey @carlosedp, thanks a lot for this stack; I am using it in a few clusters that I have! One question though: how do I add a new job to Prometheus? I didn't find anything describing the jobs!

@urbaned121

Hey @carlosedp, thanks a lot for this stack; I am using it in a few clusters that I have! One question though: how do I add a new job to Prometheus? I didn't find anything describing the jobs!

I came here with the same question...
The prometheus-config-reloader pod has a directory /etc/prometheus/config where the prometheus.yaml.gz file is, but I have no idea how to update it to add a new job.
I cannot find a ConfigMap related to that file.
@carlosedp any advice? :)
Thanks!

@lauchokyip

lauchokyip commented Jun 16, 2021

@carlosedp The Kubernetes maintainers changed Ingress from extensions/v1beta1 to networking.k8s.io/v1.
A quick and dirty way is to open the ingress-*.yaml files and change networking.k8s.io/v1 to extensions/v1beta1.

However, after Kubernetes 1.22 is released, this method will fail.

For a long-term fix:

ingress-alertmanager.yaml

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: alertmanager-main
  namespace: monitoring
spec:
  tls:
  - hosts:
    - alertmanager.192.168.1.15.nip.io
  rules:
  - host: alertmanager.192.168.1.15.nip.io
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: alertmanager-main
            port:
              name: web

ingress-grafana.yaml

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: grafana
  namespace: monitoring
spec:
  tls:
  - hosts:
    - grafana.192.168.1.15.nip.io
  rules:
  - host: grafana.192.168.1.15.nip.io
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: grafana
            port:
              name: http

ingress-prometheus.yaml

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: prometheus-k8s
  namespace: monitoring
spec:
  tls:
  - hosts:
    - prometheus.192.168.1.15.nip.io
  rules:
  - host: prometheus.192.168.1.15.nip.io
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: prometheus-k8s
            port: 
              name: web

Make sure you change the hosts to match your own address.
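
After editing, re-applying the files and listing the ingresses should confirm the new API version is accepted (assuming the manifests live under manifests/ as in this repo):

$ kubectl apply -f manifests/ingress-alertmanager.yaml -f manifests/ingress-grafana.yaml -f manifests/ingress-prometheus.yaml
$ kubectl get ingress -n monitoring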

@hugobloem

Hi there,

I am learning Kubernetes, so I deployed a small cluster on some Raspberry Pis. However, I cannot reach my Grafana instance (grafana.192.168.1.100.nip.io). I updated the ingress files following the post above, but to no avail.

Does anyone have any suggestions on what to do?

Cheers!

@exArax

exArax commented Jun 21, 2021

Hi,

I configured K3s with MetalLB and for some reason the ingress no longer works. Is there a way to make prometheus-k8s-0 use the hostNetwork: true option? I have added it in the spec section of prometheus-prometheus.yaml but it doesn't seem to work.

@Fred0211

Fred0211 commented Jul 5, 2021

Hello all,
Thank you for making this project, especially for ARM users! I'm learning/running MicroK8s and have managed to get all nodes deployed and running. microk8s.kubectl get ingress --all-namespaces outputs that the hosts should be up and running.

(screenshot attached)

However, I'm not able to connect in the browser. I'm aware MicroK8s isn't officially supported, so I'm unsure if it is an issue with this version of Kubernetes. This has happened both with and without applying the fixes for the ingress-*.yaml files.

Thank you!

@pomcho555

Thank you for this amazing project.

I've followed the tutorial here:
https://kauri.io/#deploy-prometheus-and-grafana-to-monitor-a-kube/186a71b189864b9ebc4ef7c8a9f0a6b5/a

But I've found a fatal error while running make deploy.
I disabled the ingress in vars.jsonnet but I still get the same error:

error validating "manifests/ingress-alertmanager.yaml": error validating data: [ValidationError(Ingress.spec.rules[0].http.paths[0].backend): unknown field "serviceName" in io.k8s.api.networking.v1.IngressBackend, ValidationError(Ingress.spec.rules[0].http.paths[0].backend): unknown field "servicePort" in io.k8s.api.networking.v1.IngressBackend]; if you choose to ignore these errors, turn validation off with --validate=false
error validating "manifests/ingress-grafana.yaml": error validating data: [ValidationError(Ingress.spec.rules[0].http.paths[0].backend): unknown field "serviceName" in io.k8s.api.networking.v1.IngressBackend, ValidationError(Ingress.spec.rules[0].http.paths[0].backend): unknown field "servicePort" in io.k8s.api.networking.v1.IngressBackend]; if you choose to ignore these errors, turn validation off with --validate=false
error validating "manifests/ingress-prometheus.yaml": error validating data: [ValidationError(Ingress.spec.rules[0].http.paths[0].backend): unknown field "serviceName" in io.k8s.api.networking.v1.IngressBackend, ValidationError(Ingress.spec.rules[0].http.paths[0].backend): unknown field "servicePort" in io.k8s.api.networking.v1.IngressBackend]; if you choose to ignore these errors, turn validation off with --validate=false

I have K3s version v1.20.7+k3s1
Thanks

I had the same error on k3s version v1.21.2+k3s1 (5a67e8dc), go version go1.16.4.

There are 3 master nodes (on EC2 and a Jetson Nano, arm64) connected via the VPN network, and the rest are Raspberry Pi arm64 nodes.

$sudo kubectl get node
NAME               STATUS     ROLES                  AGE   VERSION
ip-xxx-xxx-xxx-xxx   Ready      control-plane,master   14d   v1.21.1+k3s1
ip-yyy-yyy-yyy-yyy    Ready      control-plane,master   14d   v1.21.1+k3s1
pi4-node2          Ready      <none>                 33m   v1.21.2+k3s1
jetson-master      Ready      control-plane,master   14d   v1.21.2+k3s1
pi4-node1          Ready      <none>                 37m   v1.21.2+k3s1

Thanks

@onedr0p

onedr0p commented Jul 12, 2021

I don't see how this can work with later versions of k3s, since they disabled metrics listening on any interface other than 127.0.0.1:

k3s-io/k3s#425
k3s-io/k3s@4808c4e
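
A common workaround (an assumption about the setup, not something this repo configures) is to start the k3s server with those components bound back to 0.0.0.0 so Prometheus can reach their metrics endpoints; the exact flags depend on the k3s version:

$ k3s server \
    --kube-controller-manager-arg=bind-address=0.0.0.0 \
    --kube-scheduler-arg=bind-address=0.0.0.0 \
    --kube-proxy-arg=metrics-bind-address=0.0.0.0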

@jevy

jevy commented Jul 13, 2021

All of my Prometheus exporters are working except for these two. I've tried turning the TLS settings on/off in vars.jsonnet with no change. Cluster: 5 RPi 4s. One exporter from the same IP works, but another one doesn't. Any idea what to check?
(screenshot attached)

@ToMe25
Contributor

ToMe25 commented Jul 13, 2021

I think those two were always down when I looked at those stats too; maybe they just don't work with k3s?
I couldn't find anything that breaks with them down, nor any missing stats, though, so maybe it's fine.

@exArax

exArax commented Jul 15, 2021

Hi,

I want to monitor MinIO metrics; which files do I have to edit to make Prometheus scrape data from it?

@carlosedp
Owner Author

Hi,

First of all, thank you for this great project.
I've successfully deployed the monitoring stack and now I would like to add another RPi to Prometheus scraping. This RPi is outside the k8s cluster and already has node_exporter installed.
How do I add this node to Prometheus?

This stack is built to monitor cluster nodes; newly added cluster nodes will be monitored automatically.
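
That works because node-exporter and arm-exporter run as DaemonSets, so a quick way to confirm a new cluster node is covered (a sketch) is:

$ kubectl get daemonset -n monitoring
$ kubectl get pods -n monitoring -o wide | grep node-exporter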

@carlosedp
Owner Author

Hi,
Everything works fine. Thanks a lot for this cool repo!

One question. Where can I add additionalScrapeConfigs?

Best,
Gregor

Check the modules directory where I have additional scrapers that can be enabled in vars.jsonnet.
https://github.com/carlosedp/cluster-monitoring/tree/master/modules
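
Roughly, the workflow is (a sketch; check vars.jsonnet itself for the exact flag names):

$ ls modules/                  # see the available add-on scrapers
$ $EDITOR vars.jsonnet         # enable the module you want
$ make vendor && make deploy   # regenerate the manifests and apply them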

@carlosedp
Owner Author

@DrumSergio @lauchokyip Yes, since the Ingress API changed, the old version was deprecated and becomes unavailable in 1.22. I'd welcome a PR :)

@carlosedp
Owner Author

@ToMe25 @jevy The endpoints changed a bit in the past versions; they might need adjustments in the k3s-overrides.jsonnet that creates the endpoints for them.

@braucktoon

Hi,

Nice work!

Is it possible to add the speedtest-exporter (https://docs.miguelndecarvalho.pt/projects/speedtest-exporter) to this deployment? It seems like I need to add the exporter to prometheus.yml, but this file is buried in the prometheus-k8s-0 pod and not exposed. Any guidance if someone has already done it?

Thanks

@jamessewell

Love the project! I'm wondering how I go about updating, though.

I want to get AlertmanagerConfig, which isn't available in the vendored versions, which are locked (I know you can use the secret).

Basically I just want to know if it's safe to bump versions and what I should be looking out for; I haven't had much luck so far.

@mrimp

mrimp commented Sep 23, 2021

arm-exporter-rhtfh                    0/2   ContainerCreating   0   27m
arm-exporter-h4kr7                    0/2   ContainerCreating   0   27m
arm-exporter-g5hsd                    0/2   ContainerCreating   0   27m
node-exporter-fdxwm                   2/2   Running             0   27m
arm-exporter-b69l4                    2/2   Running             0   27m
arm-exporter-hfbjk                    2/2   Running             0   27m
node-exporter-8rm9g                   2/2   Running             0   27m
prometheus-adapter-585b57857b-b54k9   1/1   Running             0   27m
node-exporter-ck5j9                   2/2   Running             0   27m
arm-exporter-mxhrb                    2/2   Running             0   27m
node-exporter-ct28t                   2/2   Running             0   27m
node-exporter-nh2cr                   2/2   Running             0   27m
node-exporter-7n5jc                   2/2   Running             0   27m
grafana-7bc4784744-q6xzp              1/1   Running             0   27m
prometheus-operator-67755f959-gp44r   1/2   CrashLoopBackOff    9   27m
kube-state-metrics-6cb6df5d4-sv9zv    2/3   CrashLoopBackOff    9   27m

This is on a Raspberry Pi 4 (ARM64); it's a reinstall and was working prior. Thanks.

@assapir

assapir commented Sep 23, 2021

Hey, I am not sure where my problem is.

I am unable to connect to any of the ingresses.
My DNS is a Pi-hole server, which yields no answer for grafana.192.168.1.103.nip.io, so I manually added it as a DNS record, and I think it now returns the right answer:

dig  +short prometheus.192.168.1.103.nip.io
grafana.192.168.1.103.nip.io.
192.168.1.103

but opening the browser on that address fails.
I don't think I have DNS rebinding protection enabled.

All pods seem to be running:

❯ kubectl -n monitoring get pods -o wide
NAME                                  READY   STATUS    RESTARTS   AGE   IP              NODE          NOMINATED NODE   READINESS GATES
node-exporter-hx9hs                   2/2     Running   2          49m   192.168.1.103   k3s-master    <none>           <none>
arm-exporter-r8xtd                    2/2     Running   2          50m   10.42.0.9       k3s-master    <none>           <none>
prometheus-operator-67755f959-vrwrk   2/2     Running   4          50m   10.42.1.12      k3s-node-01   <none>           <none>
arm-exporter-vnw7f                    2/2     Running   4          50m   10.42.1.11      k3s-node-01   <none>           <none>
prometheus-k8s-0                      3/3     Running   4          28m   10.42.1.13      k3s-node-01   <none>           <none>
node-exporter-dnrrb                   2/2     Running   4          49m   192.168.1.106   k3s-node-01   <none>           <none>
alertmanager-main-0                   2/2     Running   2          50m   10.42.3.7       k3s-node-03   <none>           <none>
node-exporter-fwk5c                   2/2     Running   2          49m   192.168.1.115   k3s-node-03   <none>           <none>
arm-exporter-9gqxz                    2/2     Running   2          50m   10.42.3.5       k3s-node-03   <none>           <none>
prometheus-adapter-585b57857b-xcp2j   1/1     Running   1          49m   10.42.3.6       k3s-node-03   <none>           <none>
arm-exporter-98npp                    2/2     Running   4          50m   10.42.2.10      k3s-node-02   <none>           <none>
kube-state-metrics-6cb6df5d4-przcx    3/3     Running   6          49m   10.42.2.8       k3s-node-02   <none>           <none>
node-exporter-p2w6r                   2/2     Running   4          49m   192.168.1.111   k3s-node-02   <none>           <none>
grafana-7bc4784744-68nzs              1/1     Running   2          49m   10.42.2.9       k3s-node-02   <none>           <none>

@mrimp

mrimp commented Oct 1, 2021

Hey, I am not sure where my problem is.

I am unable to connect to any of the ingresses. My DNS is a Pi-hole server, which yields no answer for grafana.192.168.1.103.nip.io, so I manually added it as a DNS record, and I think it now returns the right answer:

dig  +short prometheus.192.168.1.103.nip.io
grafana.192.168.1.103.nip.io.
192.168.1.103

but opening the browser on that address fails. I don't think I have DNS rebinding protection enabled.

All pods seem to be running:

❯ kubectl -n monitoring get pods -o wide
NAME                                  READY   STATUS    RESTARTS   AGE   IP              NODE          NOMINATED NODE   READINESS GATES
node-exporter-hx9hs                   2/2     Running   2          49m   192.168.1.103   k3s-master    <none>           <none>
arm-exporter-r8xtd                    2/2     Running   2          50m   10.42.0.9       k3s-master    <none>           <none>
prometheus-operator-67755f959-vrwrk   2/2     Running   4          50m   10.42.1.12      k3s-node-01   <none>           <none>
arm-exporter-vnw7f                    2/2     Running   4          50m   10.42.1.11      k3s-node-01   <none>           <none>
prometheus-k8s-0                      3/3     Running   4          28m   10.42.1.13      k3s-node-01   <none>           <none>
node-exporter-dnrrb                   2/2     Running   4          49m   192.168.1.106   k3s-node-01   <none>           <none>
alertmanager-main-0                   2/2     Running   2          50m   10.42.3.7       k3s-node-03   <none>           <none>
node-exporter-fwk5c                   2/2     Running   2          49m   192.168.1.115   k3s-node-03   <none>           <none>
arm-exporter-9gqxz                    2/2     Running   2          50m   10.42.3.5       k3s-node-03   <none>           <none>
prometheus-adapter-585b57857b-xcp2j   1/1     Running   1          49m   10.42.3.6       k3s-node-03   <none>           <none>
arm-exporter-98npp                    2/2     Running   4          50m   10.42.2.10      k3s-node-02   <none>           <none>
kube-state-metrics-6cb6df5d4-przcx    3/3     Running   6          49m   10.42.2.8       k3s-node-02   <none>           <none>
node-exporter-p2w6r                   2/2     Running   4          49m   192.168.1.111   k3s-node-02   <none>           <none>
grafana-7bc4784744-68nzs              1/1     Running   2          49m   10.42.2.9       k3s-node-02   <none>           <none>

Try kubectl get ingress -n monitoring.
If you're getting this error, it will never load:

Warning: extensions/v1beta1 Ingress is deprecated in v1.14+, unavailable in v1.22+; use networking.k8s.io/v1 Ingress

This still exists today.

@assapir

assapir commented Oct 4, 2021

How does one create a custom scraper? I have my own metrics endpoint and I want Prometheus to scrape it; how can I add that?

@Cian911
Contributor

Cian911 commented Oct 8, 2021

Hi,

Nice work!

Is it possible to add the speedtest-exporter (https://docs.miguelndecarvalho.pt/projects/speedtest-exporter) to this deployment? It seems like I need to add the exporter to prometheus.yml, but this file is buried in the prometheus-k8s-0 pod and not exposed. Any guidance if someone has already done it?

Thanks

I've created a new PR for just this here: #116 cc: @carlosedp

@Cian911
Contributor

Cian911 commented Oct 8, 2021

It seems that the ksonnet-lib project, which this project heavily utilises, is no longer active and has since been archived.

I came across a newer project which has been kept up to date, https://github.com/jsonnet-libs/k8s-libsonnet, and gave porting this project over to it a shot, but without much success (I'm not very familiar with jsonnet). By the looks of things it might be a bit of work. Wondering what your thoughts are, @carlosedp, and whether you might have the time to update?

@ToMe25
Contributor

ToMe25 commented Oct 12, 2021

@carlosedp I have locally updated all dependencies, once to the latest kube-prometheus version, and once to version 0.8, which is the last one that supports Kubernetes 1.20.
I have tweaked it until it runs and collects some data; however, a lot of data seems to be missing and many of the dashboards aren't working because of it.
Fixing this is unfortunately something I am unable to do.

Should I push this as a draft PR as a starting point for someone else to continue, or discard it since I couldn't get it to work?
Also, if I do push it, which of the two versions should I push?

@radicalgeek

@ToMe25 @jevy The endpoints changed a bit in the past versions; they might need adjustments in the k3s-overrides.jsonnet that creates the endpoints for them.

@carlosedp Any idea what the endpoints need to look like? It seems they now only respond when accessed over localhost/127.0.0.1 and not over the cluster IP for the manager (even locally). I tried to change the IP to 127.0.0.1 in the endpoint manifest for the kube-controller-manager but it won't allow it. Any idea how we can fix this?

@exArax

exArax commented Oct 22, 2021

Hi @carlosedp ,

I want to add to this stack a pod that is actually an exporter for KubeVirt. The procedure requires adding a deployment file and a service file for sure, but how would I configure Prometheus to scrape from kubevirt-prometheus-metrics? Which file should I edit?

Update: I found that I don't need an exporter. Based on https://kubevirt.io/user-guide/operations/component_monitoring/ I can integrate KubeVirt with the Prometheus Operator. The thing is that I am applying to KubeVirt the monitoring namespace and the service account of the Operator (prometheus-operator), and I can't see the metrics that start with kubevirt_vmi.

@ToMe25
Contributor

ToMe25 commented Oct 22, 2021

@ToMe25 @jevy The endpoints changed a bit in the past versions; they might need adjustments in the k3s-overrides.jsonnet that creates the endpoints for them.

@carlosedp Any idea what the endpoints need to look like? It seems they now only respond when accessed over localhost/127.0.0.1 and not over the cluster IP for the manager (even locally). I tried to change the IP to 127.0.0.1 in the endpoint manifest for the kube-controller-manager but it won't allow it. Any idea how we can fix this?

@radicalgeek I actually locally tweaked modules/k3s-overrides.jsonnet until the generated manifests looked perfect to me, and the result was this:
(screenshot attached)
So unfortunately I don't think I know enough about this issue to fix it.

I compared the generated manifests with those for kube-dns, and couldn't find any difference that could explain this.
This is my modified modules/k3s-overrides.jsonnet file in case you want to play around with it:

modules/k3s-overrides.jsonnet

local utils = import '../utils.libsonnet';
local vars = import '../vars.jsonnet';
local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
local service = k.core.v1.service;
local servicePort = k.core.v1.service.mixin.spec.portsType;

{
  prometheus+:: {
    kubeControllerManagerPrometheusDiscoveryService:
      service.new('kube-controller-manager-prometheus-discovery', { 'k8s-app': 'kube-controller-manager' }, servicePort.newNamed('http-metrics', 10252, 10252)) +
      service.mixin.metadata.withNamespace('kube-system') +
      service.mixin.metadata.withLabels({ 'k8s-app': 'kube-controller-manager' }) +
      service.mixin.spec.withClusterIp('None'),

    kubeSchedulerPrometheusDiscoveryService:
      service.new('kube-scheduler-prometheus-discovery', { 'k8s-app': 'kube-scheduler' }, servicePort.newNamed('http-metrics', 10251, 10251)) +
      service.mixin.metadata.withNamespace('kube-system') +
      service.mixin.metadata.withLabels({ 'k8s-app': 'kube-scheduler' }) +
      service.mixin.spec.withClusterIp('None'),
  },
}

I also tried modifying the endpoint manifests to reference the right k8s-app instead of removing them, but that didn't fix it either.

@exArax

exArax commented Oct 29, 2021

Hi again,

I am deploying a VM through KubeVirt, which has CirrOS inside with node_exporter. How can I add an endpoint for this exporter to Prometheus?

@carlosedp
Owner Author

Hi all, I've just enabled Discussions on the project so we can have a proper forum for talking about it.

Please head to https://github.com/carlosedp/cluster-monitoring/discussions.

I won't close this issue so it can be a pointer for new people.
Thanks everyone!

Repository owner locked as resolved and limited conversation to collaborators Nov 10, 2021
@carlosedp carlosedp changed the title Questions, doubts, guidances goes into a comment here. Don't open a new Issue. For questions, doubts, guidances please use Discussions. Don't open a new Issue. Nov 10, 2021

This issue was moved to a discussion.

You can continue the conversation there.
