
Monitoring Kubernetes with Prometheus from outside of the k8s cluster #4633

Open
mardicas opened this Issue Sep 19, 2018 · 10 comments

@mardicas

mardicas commented Sep 19, 2018

Proposal

The goal of this ticket is to understand how, or in what correct way, it would be possible to run Prometheus outside of the k8s cluster being monitored, and what kind of additional development this would require.

Background

It is a common practice not to run monitoring software on the stack that is being monitored.
This is important because during outages or problems with the cluster, Prometheus might not be working or accessible, leaving the administrator blind while solving issues.
There is also the case of having multiple clusters to monitor while wanting a centralized Prometheus setup.

Acceptable solutions

  1. Prometheus configured against the Kubernetes API, in a similar manner to how kubectl works (provide host, client-certificate-data and client-key-data).
  2. Run some sort of proxy inside the Kubernetes cluster that takes care of tokens, discovery and accessing the network inside the cluster. You then configure the central Prometheus against this proxy instead of the Kubernetes API, and it provides the metrics from the cluster.
  3. Provide instructions/documentation on how to use the current Prometheus kubernetes_sd_configs option to achieve a similar result.

#2430
At the end of that thread there are several users with this issue.

Having the Kubernetes internal network available on the monitoring server is not a desired solution because:

  1. Multiple clusters might use the same IP ranges - so routing becomes complicated.
  2. The monitoring server can be in another location or "zone" - so it might create latency issues for the entire network (depending on the solution used).
@mardicas


mardicas commented Sep 19, 2018

There is some discussion here: https://stackoverflow.com/questions/41845307/prometheus-cannot-export-metrics-from-connected-kubernetes-cluster/47643005
I am already using https://github.com/kubernetes/kube-state-metrics, but it does not provide cpu/memory usage of pods and so on.

@mardicas


mardicas commented Sep 20, 2018

In this issue there is a comment on how to get cAdvisor stats into Prometheus, which works from an external Prometheus:
giantswarm/prometheus#89 (comment)

If I combine this with kube-state-metrics, then I have what is needed.
Now the only thing that would be nice to add is how to get the node hostnames using kubernetes_sd_configs.

  1. Could it support client-key-data instead of token?
  2. Should the token be generated by a separate kubectl command?
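On question 2: on clusters of this era, the bearer token of a ServiceAccount can be read out with a separate kubectl command. A minimal sketch, assuming a ServiceAccount named "prometheus" in a "monitoring" namespace (both names are placeholders, not from this thread):

```shell
# Find the secret that holds the ServiceAccount's token
# ("monitoring" and "prometheus" are assumed names).
SECRET=$(kubectl -n monitoring get serviceaccount prometheus \
  -o jsonpath='{.secrets[0].name}')

# Decode the token and store it for use with Prometheus' bearer_token_file.
kubectl -n monitoring get secret "$SECRET" \
  -o jsonpath='{.data.token}' | base64 -d > prometheus.token
```

The resulting file can then be referenced from the scrape config instead of embedding the token inline.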

Current configuration is:
Installed https://github.com/kubernetes/kube-state-metrics on the cluster and exposed it on a NodePort service (our clusters are not reachable from outside; one could also use kubectl proxy from the Prometheus machine).

  - job_name: kubernetes-metrics
    static_configs:
      - targets: ['kube-master-1.example:8080']
      - targets: ['kube-master-1.internal:8080']
  - job_name: kubernetes-cadvisor
    metrics_path: "/metrics/cadvisor"
    static_configs:
      - targets: ['kube-master-1.example:10255']
      - targets: ['kube-master-1.internal:10255']
      - targets: ['kube-worker-1.example:10255']
      - targets: ['kube-worker-2.example:10255']
      - targets: ['kube-worker-1.internal:10255']
      - targets: ['kube-worker-2.internal:10255']

It would be nice to use kubernetes_sd_configs to get the cadvisor nodes.

@pulord


pulord commented Nov 1, 2018

> In this issue there is a comment of how to get cadvisor stats into Prometheus (giantswarm/kubernetes-prometheus#89 (comment)). [...] It would be nice to use kubernetes_sd_configs to get the cadvisor nodes.

But this solution can't use kubernetes_sd_configs, so there is no service discovery. If you add a new node to the cluster, you must configure a new target (using static_configs) again.

@FUSAKLA


FUSAKLA commented Nov 2, 2018

You can set up the k8s SD for nodes and, using relabeling, access the cAdvisor data via the Kubernetes API proxy.

- job_name: kubernetes-cadvisor
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: https
  kubernetes_sd_configs:
  - api_server: <URL to your k8s API>
    role: node
    tls_config:
      ca_file: ca.pem
      cert_file: cert.pem
      key_file: key.pem
      insecure_skip_verify: false
  tls_config:
    ca_file: ca.pem
    cert_file: cert.pem
    key_file: key.pem
    insecure_skip_verify: false
  relabel_configs:
  - separator: ;
    regex: __meta_kubernetes_node_label_(.+)
    replacement: $1
    action: labelmap
  - separator: ;
    regex: (.*)
    target_label: __address__
    replacement: <URL to your k8s API>
    action: replace
  - source_labels: [__meta_kubernetes_node_name]
    separator: ;
    regex: (.+)
    target_label: __metrics_path__
    replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
    action: replace
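Before wiring this into Prometheus, one quick way to sanity-check that the certificates and the API proxy path work is to fetch a single node's cAdvisor metrics by hand. A sketch, where <URL to your k8s API> and <node-name> are placeholders and the .pem paths match the config above:

```shell
# Fetch one node's cAdvisor metrics through the API server proxy,
# authenticating with the same client certificate Prometheus would use.
curl --cacert ca.pem --cert cert.pem --key key.pem \
  "<URL to your k8s API>/api/v1/nodes/<node-name>/proxy/metrics/cadvisor" \
  | head -n 5
```

If this returns Prometheus-format metrics, the same credentials should work in the tls_config blocks.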
@AttwellBrian


AttwellBrian commented Dec 14, 2018

@FUSAKLA how do you create these ca.pem, cert.pem and key.pem files? Are the first and second tls_configs the same?

@fanyanming2016


fanyanming2016 commented Jan 8, 2019

@AttwellBrian Are the first and second tls_configs the same? Yes, they are the same.
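On the first question (where the .pem files come from): if your kubeconfig embeds the credentials as base64 data, they can be extracted with kubectl. A sketch, where "my-cluster" and "my-user" are placeholder entry names from your own kubeconfig:

```shell
# Extract the CA and client credentials embedded in a kubeconfig.
# "my-cluster" and "my-user" must match the names in `kubectl config view`.
kubectl config view --raw \
  -o jsonpath='{.clusters[?(@.name=="my-cluster")].cluster.certificate-authority-data}' \
  | base64 -d > ca.pem
kubectl config view --raw \
  -o jsonpath='{.users[?(@.name=="my-user")].user.client-certificate-data}' \
  | base64 -d > cert.pem
kubectl config view --raw \
  -o jsonpath='{.users[?(@.name=="my-user")].user.client-key-data}' \
  | base64 -d > key.pem
```

If the kubeconfig references files via certificate-authority / client-certificate / client-key instead of the -data variants, you can copy those files directly.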

@jenciso


jenciso commented Mar 12, 2019

> You can set up k8s SD for nodes and using relabeling access the cAdvisor data via kubernetes API proxy. [...]

For me, this approach worked up to Kubernetes version 1.10, but with version 1.13 the kubelet no longer authorizes the request correctly. My kubelet log shows:

Forbidden (user=kubernetes, verb=get, resource=nodes, subresource=metrics)

I think it is a credential-forwarding problem. Probably I need to use --requestheader-username-headers=X-Remote-User in the request to the Kubernetes API (using the proxy approach). I am not clear on this problem; if anybody can help me, I would appreciate it.
Thanks!
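If that Forbidden error is an RBAC problem (the user=kubernetes identity from the error lacking rights on the nodes/metrics subresource), granting them explicitly may help. A sketch; the role and binding names here are made up, and the user name is taken from the error message above:

```shell
# Grant read access to the node metrics and proxy subresources.
# "node-metrics-reader" is an arbitrary name; adjust --user to your identity.
kubectl create clusterrole node-metrics-reader \
  --verb=get --resource=nodes/metrics --resource=nodes/proxy
kubectl create clusterrolebinding node-metrics-reader \
  --clusterrole=node-metrics-reader --user=kubernetes
```

You can verify the grant with `kubectl auth can-i get nodes/metrics --as=kubernetes` before retrying the scrape.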

@captn3m0


captn3m0 commented Mar 13, 2019

> You can set up k8s SD for nodes and using relabeling access the cAdvisor data via kubernetes API proxy.

If anyone here is doing this: how is the load on the API server as a result? I'd rather not have my control plane go down because some metrics were scraped too aggressively.

@RahulArora31


RahulArora31 commented Apr 4, 2019

I tried the solutions given in the comments above but failed to access it.
I am using the following config:

- job_name: kubernetes-cadvisor
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: https
  kubernetes_sd_configs:
    - api_server: https://64.102.188.250:6443
      role: node
      tls_config:
        ca_file: /etc/prometheus/secrets/cluster-external/ca.pem
        cert_file: /etc/prometheus/secrets/cluster-external/cert.pem
        key_file: /etc/prometheus/secrets/cluster-external/key.pem
        insecure_skip_verify: false
  tls_config:
    ca_file: /etc/prometheus/secrets/cluster-external/ca.pem
    cert_file: /etc/prometheus/secrets/cluster-external/cert.pem
    key_file: /etc/prometheus/secrets/cluster-external/key.pem
    insecure_skip_verify: false
  relabel_configs:
    - separator: ;
      regex: __meta_kubernetes_node_label_(.+)
      replacement: $1
      action: labelmap
    - separator: ;
      regex: (.*)
      target_label: __address__
      replacement: https://64.102.188.250:6443
      action: replace
    - source_labels: [__meta_kubernetes_node_name]
      separator: ;
      regex: (.+)
      target_label: __metrics_path__
      replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
      action: replace

Help is appreciated.

@RahulArora31


RahulArora31 commented Apr 4, 2019

> I tried the solutions given in the above comment but failed to access it. [...]

I used the keys given in the .kube config file to generate ca.pem, cert.pem and key.pem,
and since I am using a Helm chart, I used the secrets section, which by default creates the keys in the /etc/prometheus/secrets directory.
