Monitoring Kubernetes using Prometheus #1911

Closed
aktarali opened this Issue Aug 23, 2016 · 6 comments


aktarali commented Aug 23, 2016

Hi;

I am running a Prometheus instance external to the Kubernetes cluster, and I am trying to use this instance to scrape the cluster. Here is my prometheus.yaml file:

global:
  scrape_interval: "15s"     # By default, scrape targets every 15 seconds.
  evaluation_interval: "15s" # By default, evaluate rules every 15 seconds.

scrape_configs:
- job_name: "prometheus"
  scrape_interval: "15s"
  metrics_path: "/metrics"
  target_groups:
    - targets: ["localhost:9090"]
    - targets: ["192.168.1.31:9100"]
    - targets: ["node6.crmld.com:9110"]

#Kubernetes
- job_name: "K8s Cadvisor"
  scrape_interval: "15s"
  metrics_path: "/metrics"
  target_groups:
    - targets: ["http://node6.crml.com:4001"] #Cadvisor

- job_name: "K8s APIserver"
  scrape_interval: "15s"
  metrics_path: "/metrics"
  target_groups:
    - targets: ["http://node6.crml.com:8888"] #APIserver

- job_name: "K8s Kublet"
  kubernetes_sd_configs:
    - api_servers: 'http://node6.crml.com:8888'
  relabel_configs:
    - source_labels: ["__meta_kubernetes_role"]
      action: "keep"
      regex: "node"

I keep getting the following errors:

Aug 19 17:15:41 node1 prometheus: time="2016-08-19T17:15:41+01:00" level=error msg="Couldn't load configuration (-config.file=/opt/prometheus/prometheus.yml): \"http://node6.crmld.com:4001/metrics\" is not a valid hostname" file=main.go line=177
Aug 19 17:15:41 node1 prometheus: time="2016-08-19T17:15:41+01:00" level=error msg="Note: The configuration format has changed with version 0.14. Please see the documentation (http://prometheus.io/docs/operating/configuration/) and the provided configuration migration tool (https://github.com/prometheus/migrate)." file=main.go line=178

However, when I use my browser to go to http://node6.crmld.com:4001/metrics I can see all the metrics.

Can someone please help? It's probably something quite simple.

Thanks
Ali.

brian-brazil commented Aug 23, 2016

Per the error message, you have an invalid configuration. Remove the http:// from the targets.
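
As a rough sketch of what that change might look like for one of the jobs above: the target carries only host:port, while the protocol and path go in the separate `scheme` and `metrics_path` fields. The hostname is taken from the error message, and `target_groups` is kept because it is what the posted config uses (Prometheus 1.0 and later call this key `static_configs`). The hyphenated job name is an assumption borrowed from the later config in this thread.

```yaml
scrape_configs:
- job_name: "k8s-Cadvisor"
  scrape_interval: "15s"
  scheme: "http"            # protocol is set here, not inside the target
  metrics_path: "/metrics"  # path is set here, not inside the target
  target_groups:
    - targets: ["node6.crmld.com:4001"]  # host:port only, no http:// prefix
```

Prometheus builds the scrape URL from these three parts, which is why a full URL is rejected as "not a valid hostname".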

aktarali commented Aug 23, 2016

Brian, thank you for pointing that out. I was using the link below as a guide for the config:
https://github.com/prometheus/prometheus/blob/master/documentation/examples/prometheus-kubernetes.yml

Thanks
Ali.


aktarali commented Aug 24, 2016

Hi;

I can now see the output from the API. I have a further question, if you can assist, as I am new to Prometheus and come from a Nagios background.

On my external Prometheus server, I am using the example prometheus-kubernetes.yml file from the repo: https://github.com/prometheus/prometheus/blob/v1.0.1/documentation/examples/prometheus-kubernetes.yml

I am getting this error:
ERROR: Aug 24 11:12:42 node1 prometheus: time="2016-08-24T11:12:42+01:00" level=info msg="Loading configuration file /opt/prometheus/prometheus.yml" file=main.go line=173
Aug 24 11:12:42 node1 prometheus: time="2016-08-24T11:12:42+01:00" level=error msg="Couldn't load configuration (-config.file=/opt/prometheus/prometheus.yml): unknown relabel action "labelmap"" file=main.go line=177

Maybe I'm mixing and matching a config meant for a Prometheus running inside k8s with my one running externally?

PROMETHEUS.YML

# Global default settings.
global:
  scrape_interval: "15s"     # By default, scrape targets every 15 seconds.
  evaluation_interval: "15s" # By default, evaluate rules every 15 seconds.

scrape_configs:
- job_name: "mesos"
  scrape_interval: "15s"
  metrics_path: "/metrics"
  target_groups:
    - targets: ["localhost:9090"]
    - targets: ["192.168.1.31:9100"]
    - targets: ["192.168.1.35:9110"]

#Kubernetes
- job_name: "k8s-Cadvisor"
  scrape_interval: "15s"
  metrics_path: "/metrics"
  target_groups:
    - targets: ["192.168.1.35:4001"] #Cadvisor

- job_name: "k8s-APIserver"
  scrape_interval: "15s"
  metrics_path: "/metrics"
  target_groups:
    - targets: ["192.168.1.35:8888"] #APIserver

scrape_configs:
- job_name: 'kubernetes-cluster'

  # Default to scraping over https. If required, just disable this or change to
  # `http`.
  scheme: http

  kubernetes_sd_configs:
  - api_servers:
    - '192.168.1.35:8888'
    in_cluster: true

  relabel_configs:
  - source_labels: [__meta_kubernetes_role]
    action: keep
    regex: (?:apiserver|node)
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)
  - source_labels: [__meta_kubernetes_role]
    action: replace
    target_label: kubernetes_role

# Scrape config for service endpoints.
#
# The relabeling allows the actual service scrape endpoint to be configured
# via the following annotations:
#
# * `prometheus.io/scrape`: Only scrape services that have a value of `true`
# * `prometheus.io/scheme`: If the metrics endpoint is secured then you will need
# to set this to `https` & most likely set the `tls_config` of the scrape config.
# * `prometheus.io/path`: If the metrics path is not `/metrics` override this.
# * `prometheus.io/port`: If the metrics are exposed on a different port to the
# service then set this appropriately.
- job_name: 'kubernetes-service-endpoints'

  kubernetes_sd_configs:
  - api_servers:
    - '192.168.1.35:8888'
    in_cluster: true

  relabel_configs:
  - source_labels: [__meta_kubernetes_role, __meta_kubernetes_service_annotation_prometheus_io_scrape]
    action: keep
    regex: endpoint;true
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
    action: replace
    target_label: __scheme__
    regex: (https?)
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
    action: replace
    target_label: __metrics_path__
    regex: (.+)
  - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
    action: replace
    target_label: __address__
    regex: (.+)(?::\d+);(\d+)
    replacement: $1:$2
  - action: labelmap
    regex: __meta_kubernetes_service_label_(.+)
  - source_labels: [__meta_kubernetes_role]
    action: replace
    target_label: kubernetes_role
  - source_labels: [__meta_kubernetes_service_namespace]
    action: replace
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_service_name]
    action: replace
    target_label: kubernetes_name

# Example scrape config for probing services via the Blackbox Exporter.
#
# The relabeling allows the actual service scrape endpoint to be configured
# via the following annotations:
#
# * `prometheus.io/probe`: Only probe services that have a value of `true`
- job_name: 'kubernetes-services'

  metrics_path: /probe
  params:
    module: [http_2xx]

  kubernetes_sd_configs:
  - api_servers:
    - '192.168.1.35:8888'
    in_cluster: true

  relabel_configs:
  - source_labels: [__meta_kubernetes_role, __meta_kubernetes_service_annotation_prometheus_io_probe]
    action: keep
    regex: service;true
  - source_labels: [__address__]
    target_label: __param_target
  - target_label: __address__
    replacement: blackbox
  - source_labels: [__param_target]
    target_label: instance
  - action: labelmap
    regex: __meta_kubernetes_service_label_(.+)
  - source_labels: [__meta_kubernetes_role]
    target_label: kubernetes_role
  - source_labels: [__meta_kubernetes_service_namespace]
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_service_name]
    target_label: kubernetes_name

# Example scrape config for pods
#
# The relabeling allows the actual pod scrape endpoint to be configured via the 
# following annotations:
#
# * `prometheus.io/scrape`: Only scrape pods that have a value of `true`
# * `prometheus.io/port`: Scrape the pod on the indicated port instead of the default of `9102`.
- job_name: 'kubernetes-pods'

  kubernetes_sd_configs:
  - api_servers:
    - '192.168.1.35:8888'
    in_cluster: true

  relabel_configs:
  - source_labels: [__meta_kubernetes_role, __meta_kubernetes_pod_annotation_prometheus_io_scrape]
    action: keep
    regex: pod;true
  - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
    action: replace
    regex: (.+):(?:\d+);(\d+)
    replacement: ${1}:${2}
    target_label: __address__
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
  - source_labels: [__meta_kubernetes_pod_namespace]
    action: replace
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_pod_name]
    action: replace
    target_label: kubernetes_pod_name
brian-brazil commented Aug 24, 2016

Make sure you're running a recent version of Prometheus; labelmap was added quite some time ago.
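
For context, here is an annotated sketch of what the `labelmap` rule from the posted config does on a Prometheus version that supports it. The `zone` label and its value are hypothetical and only illustrate the renaming.

```yaml
relabel_configs:
  # labelmap matches the regex against label *names* (not values). For each
  # matching label, the value is copied to a new label whose name is given
  # by `replacement` (default: $1, the first capture group).
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)
  # Illustration (hypothetical label): a target discovered with
  #   __meta_kubernetes_node_label_zone="eu-west-1a"
  # would also carry zone="eu-west-1a" after relabeling.
```

A binary that predates this action will reject the config with the "unknown relabel action" error shown above, whatever the rest of the file contains.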

fabxc commented Nov 12, 2016

Closing as this has gone stale.

fabxc closed this Nov 12, 2016

lock bot commented Mar 24, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

lock bot locked and limited conversation to collaborators Mar 24, 2019
