KubeApi servers are in "state unknown" mostly #2543

Closed
cemo opened this Issue Mar 28, 2017 · 36 comments

cemo commented Mar 28, 2017

What did you do?

I am trying to get used to Prometheus with k8s, and I have successfully run some exporters. However, I have a problem with the API servers. The last scrape column almost always shows "never" for the API servers:

image

Very rarely, I also see some successful scrapes:

image

What did you expect to see?

Always UP

What did you see instead? Under which circumstances?

UNKNOWN

Environment

  • k8s 1.5.2
  • HA, 3 masters
  • CoreOS

  • Prometheus version:
    1.5.2

  • Prometheus configuration file:

apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus
  namespace: pro
data:
  alerts: ""
  rules: ""
  prometheus.yml: |-
    rule_files:
      - /etc/config/rules
      - /etc/config/alerts

    scrape_configs:
      - job_name: prometheus
        static_configs:
          - targets:
            - localhost:9090

      # A scrape configuration for running Prometheus on a Kubernetes cluster.
      # This uses separate scrape configs for cluster components (i.e. API server, node)
      # and services to allow each to use different authentication configs.
      #
      # Kubernetes labels will be added as Prometheus labels on metrics via the
      # `labelmap` relabeling action.

      # Scrape config for API servers.
      #
      # Kubernetes exposes API servers as endpoints to the default/kubernetes
      # service so this uses `endpoints` role and uses relabelling to only keep
      # the endpoints associated with the default/kubernetes service using the
      # default named port `https`. This works for single API server deployments as
      # well as HA API server deployments.
      - job_name: 'kubernetes-apiservers'

        kubernetes_sd_configs:
          - role: endpoints

        # Default to scraping over https. If required, just disable this or change to
        # `http`.
        scheme: https

        # This TLS & bearer token file config is used to connect to the actual scrape
        # endpoints for cluster components. This is separate to discovery auth
        # configuration because discovery & scraping are two separate concerns in
        # Prometheus. The discovery auth config is automatic if Prometheus runs inside
        # the cluster. Otherwise, more config options have to be provided within the
        # <kubernetes_sd_config>.
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
          # If your node certificates are self-signed or use a different CA to the
          # master CA, then disable certificate verification below. Note that
          # certificate verification is an integral part of a secure infrastructure
          # so this should only be disabled in a controlled environment. You can
          # disable certificate verification by uncommenting the line below.
          #
          insecure_skip_verify: true
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

        # Keep only the default/kubernetes service endpoints for the https port. This
        # will add targets for each API server which Kubernetes adds an endpoint to
        # the default/kubernetes service.
        relabel_configs:
          - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
            action: keep
            regex: default;kubernetes;https

      - job_name: 'kubernetes-nodes'

        # Default to scraping over https. If required, just disable this or change to
        # `http`.
        scheme: https

        # This TLS & bearer token file config is used to connect to the actual scrape
        # endpoints for cluster components. This is separate to discovery auth
        # configuration because discovery & scraping are two separate concerns in
        # Prometheus. The discovery auth config is automatic if Prometheus runs inside
        # the cluster. Otherwise, more config options have to be provided within the
        # <kubernetes_sd_config>.
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
          # If your node certificates are self-signed or use a different CA to the
          # master CA, then disable certificate verification below. Note that
          # certificate verification is an integral part of a secure infrastructure
          # so this should only be disabled in a controlled environment. You can
          # disable certificate verification by uncommenting the line below.
          #
          insecure_skip_verify: true
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

        kubernetes_sd_configs:
          - role: node

        relabel_configs:
          - action: labelmap
            regex: __meta_kubernetes_node_label_(.+)

      # Scrape config for service endpoints.
      #
      # The relabeling allows the actual service scrape endpoint to be configured
      # via the following annotations:
      #
      # * `prometheus.io/scrape`: Only scrape services that have a value of `true`
      # * `prometheus.io/scheme`: If the metrics endpoint is secured then you will need
      # to set this to `https` & most likely set the `tls_config` of the scrape config.
      # * `prometheus.io/path`: If the metrics path is not `/metrics` override this.
      # * `prometheus.io/port`: If the metrics are exposed on a different port to the
      # service then set this appropriately.
      - job_name: 'kubernetes-service-endpoints'

        kubernetes_sd_configs:
          - role: endpoints

        relabel_configs:
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
            action: replace
            target_label: __scheme__
            regex: (https?)
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
            action: replace
            target_label: __address__
            regex: (.+)(?::\d+);(\d+)
            replacement: $1:$2
          - action: labelmap
            regex: __meta_kubernetes_service_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            action: replace
            target_label: kubernetes_namespace
          - source_labels: [__meta_kubernetes_service_name]
            action: replace
            target_label: kubernetes_name

      # Example scrape config for probing services via the Blackbox Exporter.
      #
      # The relabeling allows the actual service scrape endpoint to be configured
      # via the following annotations:
      #
      # * `prometheus.io/probe`: Only probe services that have a value of `true`
      - job_name: 'kubernetes-services'

        metrics_path: /probe
        params:
          module: [http_2xx]

        kubernetes_sd_configs:
          - role: service

        relabel_configs:
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probe]
            action: keep
            regex: true
          - source_labels: [__address__]
            target_label: __param_target
          - target_label: __address__
            replacement: blackbox
          - source_labels: [__param_target]
            target_label: instance
          - action: labelmap
            regex: __meta_kubernetes_service_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            target_label: kubernetes_namespace
          - source_labels: [__meta_kubernetes_service_name]
            target_label: kubernetes_name

      # Example scrape config for pods
      #
      # The relabeling allows the actual pod scrape endpoint to be configured via the
      # following annotations:
      #
      # * `prometheus.io/scrape`: Only scrape pods that have a value of `true`
      # * `prometheus.io/path`: If the metrics path is not `/metrics` override this.
      # * `prometheus.io/port`: Scrape the pod on the indicated port instead of the default of `9102`.
      - job_name: 'kubernetes-pods'

        kubernetes_sd_configs:
          - role: pod

        relabel_configs:
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
            action: replace
            regex: (.+):(?:\d+);(\d+)
            replacement: ${1}:${2}
            target_label: __address__
          - action: labelmap
            regex: __meta_kubernetes_pod_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            action: replace
            target_label: kubernetes_namespace
          - source_labels: [__meta_kubernetes_pod_name]
            action: replace
            target_label: kubernetes_pod_name

Any reason why?
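
For anyone reading along, the `keep` and `replace` relabelling rules in the config above can be approximated with a small Python sketch. It assumes Prometheus's documented behaviour: source label values are joined with `;` and the regex must match the whole joined string. Python's `re` module stands in for the Go regex engine Prometheus actually uses, and the helper names and sample addresses are hypothetical, made up purely for illustration.

```python
import re

# Prometheus joins source label values with ";" by default and anchors
# the regex, so it has to match the entire joined string.
SEPARATOR = ";"

APISERVER_SOURCE_LABELS = [
    "__meta_kubernetes_namespace",
    "__meta_kubernetes_service_name",
    "__meta_kubernetes_endpoint_port_name",
]


def keep(labels, source_labels, regex):
    """Return True if a target with these labels survives a `keep` rule."""
    joined = SEPARATOR.join(labels.get(name, "") for name in source_labels)
    return re.fullmatch(regex, joined) is not None


def rewrite_address(address, port_annotation):
    """Mimic the `replace` rule that swaps in the annotated port."""
    joined = SEPARATOR.join([address, port_annotation])
    match = re.fullmatch(r"(.+):(?:\d+);(\d+)", joined)
    if match is None:
        # No match: the target label is left unchanged.
        return address
    return f"{match.group(1)}:{match.group(2)}"


# Hypothetical discovered targets, for illustration only.
apiserver = {
    "__meta_kubernetes_namespace": "default",
    "__meta_kubernetes_service_name": "kubernetes",
    "__meta_kubernetes_endpoint_port_name": "https",
}
other = dict(apiserver, __meta_kubernetes_service_name="kube-dns")

print(keep(apiserver, APISERVER_SOURCE_LABELS, "default;kubernetes;https"))
print(keep(other, APISERVER_SOURCE_LABELS, "default;kubernetes;https"))
print(rewrite_address("10.2.0.5:8080", "9102"))
```

Under these assumptions, only endpoints of the default/kubernetes service on the `https` port pass the `keep` rule, and an address is rewritten only when the port annotation is present.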

gouthamve commented Mar 28, 2017

Logs would help here. Do you see anything out of place there?

cemo commented Mar 29, 2017

The only thing I can get with my limited knowledge is:

kubectl logs prometheus-1138844003-g8s9t -c prometheus -n pro

time="2017-03-29T07:22:41Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.840042089s." source="persistence.go:639"
time="2017-03-29T07:27:41Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T07:27:44Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.89699028s." source="persistence.go:639"
time="2017-03-29T07:32:44Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T07:32:47Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.924940511s." source="persistence.go:639"
time="2017-03-29T07:37:47Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T07:37:50Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.909455319s." source="persistence.go:639"

Is there somewhere else I need to check for logging?

gouthamve commented Mar 29, 2017

Hmm, this could happen if the storage is throttled and stops ingesting metrics, but the logs would clearly mention that.

These logs look normal. Can you pastebin all the logs since startup? I am not sure how useful they will be, though, since the checkpoints complete too quickly for there to be any significant amount of metrics.

What is the current RAM usage of the server and did you put an upper limit on the RAM?
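
Since the routine checkpoint messages make up most of the log output, one way to spot anything unusual is to filter them out. The following is a rough Python sketch, not a Prometheus tool: it assumes every line uses the `key=value` / `key="quoted value"` layout seen in this thread, and the function names are hypothetical.

```python
import shlex


def parse_line(line):
    """Split one key=value log line into a dict of fields."""
    fields = {}
    # shlex keeps a quoted value (including its spaces) in a single token.
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def interesting(lines):
    """Return parsed lines that are not routine info-level checkpoint messages."""
    result = []
    for line in lines:
        fields = parse_line(line)
        if not fields:
            continue
        msg = fields.get("msg", "").lower()
        routine = "checkpointing in-memory metrics" in msg
        if fields.get("level") != "info" or not routine:
            result.append(fields)
    return result


# Sample lines copied from the logs in this thread.
sample = [
    'time="2017-03-29T07:27:41Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"',
    'time="2017-03-29T07:27:44Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.89699028s." source="persistence.go:639"',
    'time="2017-03-28T21:40:38Z" level=info msg="Completed maintenance sweep through 6656 in-memory fingerprints in 22m17.483572788s." source="storage.go:1202"',
]

for entry in interesting(sample):
    print(entry["time"], entry["level"], entry["msg"])
```

The same functions could be applied to the saved output of `kubectl logs`, read line by line, so that warnings, errors, and less frequent messages stand out from the checkpoint noise.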

cemo commented Mar 29, 2017

$ kubectl logs prometheus-1138844003-g8s9t -c prometheus -n pro
time="2017-03-28T21:18:10Z" level=info msg="Starting prometheus (version=1.5.2, branch=master, revision=bd1182d29f462c39544f94cc822830e1c64cf55b)" source="main.go:75"
time="2017-03-28T21:18:10Z" level=info msg="Build context (go=go1.7.5, user=root@1a01c5f68840, date=20170210-16:23:28)" source="main.go:76"
time="2017-03-28T21:18:10Z" level=info msg="Loading configuration file /etc/config/prometheus.yml" source="main.go:248"
time="2017-03-28T21:18:10Z" level=info msg="Loading series map and head chunks..." source="storage.go:373"
time="2017-03-28T21:18:10Z" level=info msg="0 series loaded." source="storage.go:378"
time="2017-03-28T21:18:10Z" level=info msg="Listening on :9090" source="web.go:259"
time="2017-03-28T21:18:10Z" level=info msg="Starting target manager..." source="targetmanager.go:61"
time="2017-03-28T21:18:11Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-03-28T21:18:11Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-03-28T21:18:11Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-03-28T21:18:11Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-03-28T21:18:11Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-03-28T21:23:10Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-28T21:23:12Z" level=info msg="Done checkpointing in-memory metrics and chunks in 1.947298058s." source="persistence.go:639"
time="2017-03-28T21:28:12Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-28T21:28:15Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.063432051s." source="persistence.go:639"
time="2017-03-28T21:33:15Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-28T21:33:17Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.020245362s." source="persistence.go:639"
time="2017-03-28T21:38:17Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-28T21:38:18Z" level=info msg="Done checkpointing in-memory metrics and chunks in 1.94417362s." source="persistence.go:639"
time="2017-03-28T21:40:38Z" level=info msg="Completed maintenance sweep through 6656 in-memory fingerprints in 22m17.483572788s." source="storage.go:1202"
time="2017-03-28T21:43:18Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-28T21:43:20Z" level=info msg="Done checkpointing in-memory metrics and chunks in 1.996253921s." source="persistence.go:639"
time="2017-03-28T21:48:20Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-28T21:48:22Z" level=info msg="Done checkpointing in-memory metrics and chunks in 1.945595287s." source="persistence.go:639"
time="2017-03-28T21:53:22Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-28T21:53:24Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.069031572s." source="persistence.go:639"
time="2017-03-28T21:58:24Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-28T21:58:27Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.132658576s." source="persistence.go:639"
time="2017-03-28T22:03:27Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-28T22:03:29Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.174847804s." source="persistence.go:639"
time="2017-03-28T22:08:29Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-28T22:08:31Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.194531367s." source="persistence.go:639"
time="2017-03-28T22:13:31Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-28T22:13:33Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.134557247s." source="persistence.go:639"
time="2017-03-28T22:18:33Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-28T22:18:35Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.281864405s." source="persistence.go:639"
time="2017-03-28T22:23:35Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-28T22:23:38Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.15006515s." source="persistence.go:639"
time="2017-03-28T22:28:38Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-28T22:28:40Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.121287436s." source="persistence.go:639"
time="2017-03-28T22:33:40Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-28T22:33:42Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.125898218s." source="persistence.go:639"
time="2017-03-28T22:38:42Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-28T22:38:44Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.103883954s." source="persistence.go:639"
time="2017-03-28T22:43:44Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-28T22:43:46Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.152552976s." source="persistence.go:639"
time="2017-03-28T22:48:46Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-28T22:48:48Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.180000399s." source="persistence.go:639"
time="2017-03-28T22:53:48Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-28T22:53:50Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.155602149s." source="persistence.go:639"
time="2017-03-28T22:58:50Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-28T22:58:53Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.171594613s." source="persistence.go:639"
time="2017-03-28T23:03:53Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-28T23:03:55Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.189984848s." source="persistence.go:639"
time="2017-03-28T23:08:55Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-28T23:08:57Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.168271777s." source="persistence.go:639"
time="2017-03-28T23:13:57Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-28T23:13:59Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.203818649s." source="persistence.go:639"
time="2017-03-28T23:18:59Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-28T23:19:01Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.16672311s." source="persistence.go:639"
time="2017-03-28T23:24:01Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-28T23:24:03Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.110026509s." source="persistence.go:639"
time="2017-03-28T23:29:03Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-28T23:29:06Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.203888767s." source="persistence.go:639"
time="2017-03-28T23:34:06Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-28T23:34:08Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.084834932s." source="persistence.go:639"
time="2017-03-28T23:39:08Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-28T23:39:10Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.077391799s." source="persistence.go:639"
time="2017-03-28T23:44:10Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-28T23:44:12Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.129028716s." source="persistence.go:639"
time="2017-03-28T23:49:12Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-28T23:49:14Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.103865145s." source="persistence.go:639"
time="2017-03-28T23:54:14Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-28T23:54:16Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.134820232s." source="persistence.go:639"
time="2017-03-28T23:59:16Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-28T23:59:18Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.074333729s." source="persistence.go:639"
time="2017-03-29T00:04:18Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T00:04:20Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.101754339s." source="persistence.go:639"
time="2017-03-29T00:09:20Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T00:09:22Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.098581245s." source="persistence.go:639"
time="2017-03-29T00:14:22Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T00:14:24Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.070203996s." source="persistence.go:639"
time="2017-03-29T00:19:24Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T00:19:27Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.173172991s." source="persistence.go:639"
time="2017-03-29T00:24:27Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T00:24:29Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.027086494s." source="persistence.go:639"
time="2017-03-29T00:29:29Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T00:29:31Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.112855648s." source="persistence.go:639"
time="2017-03-29T00:34:31Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T00:34:33Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.163021076s." source="persistence.go:639"
time="2017-03-29T00:39:33Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T00:39:35Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.154405532s." source="persistence.go:639"
time="2017-03-29T00:44:35Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T00:44:37Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.127644354s." source="persistence.go:639"
time="2017-03-29T00:49:37Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T00:49:39Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.178025615s." source="persistence.go:639"
time="2017-03-29T00:54:39Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T00:54:42Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.135971687s." source="persistence.go:639"
time="2017-03-29T00:59:42Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T00:59:44Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.106730189s." source="persistence.go:639"
time="2017-03-29T01:04:44Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T01:04:46Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.166004847s." source="persistence.go:639"
time="2017-03-29T01:09:46Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T01:09:48Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.129290341s." source="persistence.go:639"
time="2017-03-29T01:14:48Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T01:14:50Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.095793012s." source="persistence.go:639"
time="2017-03-29T01:19:50Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T01:19:52Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.077346125s." source="persistence.go:639"
time="2017-03-29T01:24:52Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T01:24:54Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.117504026s." source="persistence.go:639"
time="2017-03-29T01:29:54Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T01:29:56Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.080455932s." source="persistence.go:639"
time="2017-03-29T01:34:56Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T01:34:59Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.195372562s." source="persistence.go:639"
time="2017-03-29T01:39:59Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T01:40:01Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.155763163s." source="persistence.go:639"
time="2017-03-29T01:45:01Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T01:45:03Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.149020927s." source="persistence.go:639"
time="2017-03-29T01:50:03Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T01:50:05Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.142842887s." source="persistence.go:639"
time="2017-03-29T01:55:05Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T01:55:07Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.158683852s." source="persistence.go:639"
time="2017-03-29T02:00:07Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T02:00:09Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.223527515s." source="persistence.go:639"
time="2017-03-29T02:05:09Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T02:05:12Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.158940471s." source="persistence.go:639"
time="2017-03-29T02:10:12Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T02:10:14Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.167085849s." source="persistence.go:639"
time="2017-03-29T02:15:14Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T02:15:16Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.138890599s." source="persistence.go:639"
time="2017-03-29T02:20:16Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T02:20:18Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.284314197s." source="persistence.go:639"
time="2017-03-29T02:25:18Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T02:25:20Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.136264622s." source="persistence.go:639"
time="2017-03-29T02:30:20Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T02:30:22Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.178892934s." source="persistence.go:639"
time="2017-03-29T02:35:22Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T02:35:25Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.175179841s." source="persistence.go:639"
time="2017-03-29T02:40:25Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T02:40:27Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.224432194s." source="persistence.go:639"
time="2017-03-29T02:45:27Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T02:45:29Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.185282855s." source="persistence.go:639"
time="2017-03-29T02:50:29Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T02:50:31Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.168934895s." source="persistence.go:639"
time="2017-03-29T02:55:31Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T02:55:33Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.185452963s." source="persistence.go:639"
time="2017-03-29T03:00:33Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T03:00:36Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.162964093s." source="persistence.go:639"
time="2017-03-29T03:05:36Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T03:05:38Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.199589472s." source="persistence.go:639"
time="2017-03-29T03:10:38Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T03:10:40Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.213859147s." source="persistence.go:639"
time="2017-03-29T03:15:40Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T03:15:42Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.177868246s." source="persistence.go:639"
time="2017-03-29T03:20:42Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T03:20:44Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.167294s." source="persistence.go:639"
time="2017-03-29T03:25:44Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T03:25:47Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.222647328s." source="persistence.go:639"
time="2017-03-29T03:30:47Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T03:30:49Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.195261145s." source="persistence.go:639"
time="2017-03-29T03:35:49Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T03:35:51Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.082354731s." source="persistence.go:639"
time="2017-03-29T03:36:04Z" level=info msg="Completed maintenance sweep through 119749 in-memory fingerprints in 5h55m25.439373855s." source="storage.go:1202"
time="2017-03-29T03:40:51Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T03:40:53Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.108177475s." source="persistence.go:639"
time="2017-03-29T03:45:53Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T03:45:55Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.190564774s." source="persistence.go:639"
time="2017-03-29T03:50:55Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T03:50:57Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.147310391s." source="persistence.go:639"
time="2017-03-29T03:55:57Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T03:55:59Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.219360848s." source="persistence.go:639"
time="2017-03-29T04:00:59Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T04:01:02Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.188674078s." source="persistence.go:639"
time="2017-03-29T04:06:02Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T04:06:04Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.184976228s." source="persistence.go:639"
time="2017-03-29T04:11:04Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T04:11:06Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.195537847s." source="persistence.go:639"
time="2017-03-29T04:16:06Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T04:16:08Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.094670302s." source="persistence.go:639"
time="2017-03-29T04:21:08Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T04:21:10Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.150533958s." source="persistence.go:639"
time="2017-03-29T04:26:10Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T04:26:12Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.183708921s." source="persistence.go:639"
time="2017-03-29T04:31:12Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T04:31:15Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.106066384s." source="persistence.go:639"
time="2017-03-29T04:36:15Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T04:36:17Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.179812331s." source="persistence.go:639"
time="2017-03-29T04:41:17Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T04:41:19Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.16562684s." source="persistence.go:639"
time="2017-03-29T04:46:19Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T04:46:21Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.21904938s." source="persistence.go:639"
time="2017-03-29T04:51:21Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T04:51:23Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.186333912s." source="persistence.go:639"
time="2017-03-29T04:56:23Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T04:56:26Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.180617213s." source="persistence.go:639"
time="2017-03-29T05:01:26Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T05:01:28Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.20416286s." source="persistence.go:639"
time="2017-03-29T05:06:28Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T05:06:30Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.212323157s." source="persistence.go:639"
time="2017-03-29T05:11:30Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T05:11:32Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.167725328s." source="persistence.go:639"
time="2017-03-29T05:16:32Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T05:16:34Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.326858221s." source="persistence.go:639"
time="2017-03-29T05:21:34Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T05:21:37Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.366945423s." source="persistence.go:639"
time="2017-03-29T05:26:37Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T05:26:39Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.359808211s." source="persistence.go:639"
time="2017-03-29T05:31:39Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T05:31:41Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.292580367s." source="persistence.go:639"
time="2017-03-29T05:36:41Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T05:36:44Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.631082934s." source="persistence.go:639"
time="2017-03-29T05:41:44Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T05:41:47Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.629732183s." source="persistence.go:639"
time="2017-03-29T05:46:47Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T05:46:49Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.596692636s." source="persistence.go:639"
time="2017-03-29T05:51:49Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T05:51:52Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.631457165s." source="persistence.go:639"
time="2017-03-29T05:56:52Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T05:56:55Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.620552724s." source="persistence.go:639"
time="2017-03-29T06:01:55Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T06:01:57Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.611655011s." source="persistence.go:639"
time="2017-03-29T06:06:57Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T06:07:00Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.545262675s." source="persistence.go:639"
time="2017-03-29T06:12:00Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T06:12:02Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.687114177s." source="persistence.go:639"
time="2017-03-29T06:17:02Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T06:17:05Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.716099595s." source="persistence.go:639"
time="2017-03-29T06:22:05Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T06:22:08Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.692639675s." source="persistence.go:639"
time="2017-03-29T06:27:08Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T06:27:11Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.724250612s." source="persistence.go:639"
time="2017-03-29T06:32:11Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T06:32:13Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.662057912s." source="persistence.go:639"
time="2017-03-29T06:37:13Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T06:37:16Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.673414898s." source="persistence.go:639"
time="2017-03-29T06:42:16Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T06:42:19Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.666047009s." source="persistence.go:639"
time="2017-03-29T06:47:19Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T06:47:21Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.66447511s." source="persistence.go:639"
time="2017-03-29T06:52:21Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T06:52:24Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.723680017s." source="persistence.go:639"
time="2017-03-29T06:57:24Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T06:57:27Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.738173584s." source="persistence.go:639"
time="2017-03-29T07:02:27Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T07:02:29Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.722624622s." source="persistence.go:639"
time="2017-03-29T07:07:29Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T07:07:32Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.797205997s." source="persistence.go:639"
time="2017-03-29T07:12:32Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T07:12:35Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.840214069s." source="persistence.go:639"
time="2017-03-29T07:17:35Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T07:17:38Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.924379475s." source="persistence.go:639"
time="2017-03-29T07:22:38Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T07:22:41Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.840042089s." source="persistence.go:639"
time="2017-03-29T07:27:41Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T07:27:44Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.89699028s." source="persistence.go:639"
time="2017-03-29T07:32:44Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T07:32:47Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.924940511s." source="persistence.go:639"
time="2017-03-29T07:37:47Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T07:37:50Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.909455319s." source="persistence.go:639"
time="2017-03-29T07:42:50Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T07:42:52Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.885404859s." source="persistence.go:639"
time="2017-03-29T07:47:52Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T07:47:55Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.873869112s." source="persistence.go:639"
time="2017-03-29T07:52:55Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T07:52:58Z" level=info msg="Done checkpointing in-memory metrics and chunks in 2.995490786s." source="persistence.go:639"
time="2017-03-29T07:57:58Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T07:58:01Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.012474598s." source="persistence.go:639"
time="2017-03-29T08:03:01Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T08:03:04Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.032535229s." source="persistence.go:639"
time="2017-03-29T08:08:04Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T08:08:07Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.050124176s." source="persistence.go:639"
time="2017-03-29T08:13:07Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T08:13:11Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.276213166s." source="persistence.go:639"
time="2017-03-29T08:18:11Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T08:18:14Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.201229225s." source="persistence.go:639"
time="2017-03-29T08:23:14Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T08:23:17Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.139763001s." source="persistence.go:639"
time="2017-03-29T08:28:17Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T08:28:20Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.091064291s." source="persistence.go:639"
time="2017-03-29T08:33:20Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T08:33:23Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.116318214s." source="persistence.go:639"
time="2017-03-29T08:38:23Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T08:38:26Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.110167751s." source="persistence.go:639"
time="2017-03-29T08:43:26Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T08:43:29Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.181435106s." source="persistence.go:639"
time="2017-03-29T08:48:29Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T08:48:33Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.850886485s." source="persistence.go:639"
time="2017-03-29T08:53:33Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T08:53:37Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.295919792s." source="persistence.go:639"
time="2017-03-29T08:58:37Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T08:58:40Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.311073488s." source="persistence.go:639"
time="2017-03-29T09:03:40Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T09:03:43Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.225262855s." source="persistence.go:639"
time="2017-03-29T09:08:43Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T09:08:46Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.29216128s." source="persistence.go:639"
time="2017-03-29T09:13:46Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T09:13:50Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.166343399s." source="persistence.go:639"
time="2017-03-29T09:15:08Z" level=info msg="Completed maintenance sweep through 133379 in-memory fingerprints in 5h39m3.706636504s." source="storage.go:1202"
time="2017-03-29T09:18:50Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T09:18:53Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.228488829s." source="persistence.go:639"
time="2017-03-29T09:23:53Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T09:23:56Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.279304331s." source="persistence.go:639"
time="2017-03-29T09:28:56Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T09:28:59Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.251458568s." source="persistence.go:639"
time="2017-03-29T09:33:59Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T09:34:03Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.257042274s." source="persistence.go:639"
time="2017-03-29T09:39:03Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T09:39:06Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.270150151s." source="persistence.go:639"
time="2017-03-29T09:44:06Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T09:44:09Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.223586448s." source="persistence.go:639"
time="2017-03-29T09:49:09Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T09:49:12Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.234589126s." source="persistence.go:639"
time="2017-03-29T09:54:12Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T09:54:16Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.185793066s." source="persistence.go:639"
time="2017-03-29T09:59:16Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T09:59:19Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.216427687s." source="persistence.go:639"
time="2017-03-29T10:04:19Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T10:04:22Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.198059201s." source="persistence.go:639"
time="2017-03-29T10:09:22Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T10:09:25Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.171202543s." source="persistence.go:639"
time="2017-03-29T10:14:25Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T10:14:28Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.128792881s." source="persistence.go:639"
time="2017-03-29T10:19:28Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T10:19:31Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.206109056s." source="persistence.go:639"
time="2017-03-29T10:24:31Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T10:24:35Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.125848394s." source="persistence.go:639"
time="2017-03-29T10:29:35Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T10:29:38Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.194976318s." source="persistence.go:639"
time="2017-03-29T10:34:38Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T10:34:41Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.140438981s." source="persistence.go:639"
time="2017-03-29T10:39:41Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T10:39:44Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.076725432s." source="persistence.go:639"
time="2017-03-29T10:44:44Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T10:44:47Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.148846868s." source="persistence.go:639"
time="2017-03-29T10:49:47Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T10:49:50Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.163650173s." source="persistence.go:639"
time="2017-03-29T10:54:50Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T10:54:54Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.190767401s." source="persistence.go:639"
time="2017-03-29T10:59:54Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T10:59:57Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.175719138s." source="persistence.go:639"
time="2017-03-29T11:04:57Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T11:05:00Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.121028431s." source="persistence.go:639"
time="2017-03-29T11:10:00Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T11:10:03Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.382786484s." source="persistence.go:639"
time="2017-03-29T11:15:03Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T11:15:07Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.305608605s." source="persistence.go:639"
time="2017-03-29T11:20:07Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:612"
time="2017-03-29T11:20:10Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.250744641s." source="persistence.go:639"
@cemo

Author

cemo commented Mar 29, 2017

I have created a namespace in Kubernetes and limited it to 2 GB of RAM. Are the metrics on the Kubernetes API server enabled by default? Do you have any idea?

@brancz

Member

brancz commented Mar 30, 2017

Your logs actually look OK. Since you said it's an HA cluster with 3 masters, shouldn't there be 3 targets? Can you show us the output of

kubectl get endpoints/kubernetes -oyaml

And

kubectl -n kube-system get pods

I have created a namespace in Kubernetes and limited it to 2 GB of RAM. Are the metrics on the Kubernetes API server enabled by default? Do you have any idea?

Yes the metrics are exposed by default.

@cemo

Author

cemo commented Mar 30, 2017

kubectl get endpoints/kubernetes -oyaml

apiVersion: v1
kind: Endpoints
metadata:
  creationTimestamp: 2017-01-31T20:15:59Z
  name: kubernetes
  namespace: default
  resourceVersion: "13810738"
  selfLink: /api/v1/namespaces/default/endpoints/kubernetes
  uid: 14d0674e-e7f2-11e6-8cc5-02767a1572e9
subsets:
- addresses:
  - ip: 10.0.10.12
  ports:
  - name: https
    port: 443
    protocol: TCP

The ip value changes across repeated requests. This is surprising to me too; I was expecting 3 IPs at the same time.

Here is the relevant part of the pod list:

kube-apiserver-ip-10-0-10-10.eu-central-1.compute.internal            1/1       Running   0          28m
kube-apiserver-ip-10-0-10-11.eu-central-1.compute.internal            1/1       Running   0          23h
kube-apiserver-ip-10-0-10-12.eu-central-1.compute.internal            1/1       Running   0          25m

I had just added some cronjob-related parameters to the kube-apiserver, which is why the pods have only just started.

@brancz

Member

brancz commented Mar 30, 2017

I just learned that this is a known issue, where the apiservers are racing against each other and keep replacing the IP.

/cc @alexsomesan @s-urbaniak

@cemo

Author

cemo commented Mar 30, 2017

Do you think updating the k8s cluster to the latest 1.5.x would help?

@brancz

Member

brancz commented Mar 30, 2017

This issue seems to arise when the --apiserver-count flag is not set correctly. See kubernetes/kubernetes#22609.
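
For readers following along, the race on the `kubernetes` endpoints object can be illustrated with a small simulation. This is a deliberately simplified model, not the Kubernetes source: the names `reconcile` and `simulate` are hypothetical, and the model only assumes that each apiserver adds its own IP to the shared endpoint list and then trims the list to the number of apiservers it was told to expect (assumed here to be 1 when `--apiserver-count` is left unset).

```python
def reconcile(endpoint_ips, own_ip, apiserver_count):
    """Simplified model of one apiserver reconciling the shared endpoint list.

    Hypothetical helper for illustration; not taken from the Kubernetes code.
    """
    if apiserver_count < 1:
        raise ValueError("apiserver_count must be at least 1")
    ips = sorted(set(endpoint_ips))
    if own_ip not in ips:
        ips = sorted(ips + [own_ip])
    # Trim back to the expected count, always keeping our own address.
    while len(ips) > apiserver_count:
        victim = next(ip for ip in ips if ip != own_ip)
        ips.remove(victim)
    return ips


def simulate(servers, apiserver_count, rounds=3):
    """Let every server reconcile in turn and record the endpoint list each time."""
    endpoint_ips = []
    history = []
    for _ in range(rounds):
        for ip in servers:
            endpoint_ips = reconcile(endpoint_ips, ip, apiserver_count)
            history.append(list(endpoint_ips))
    return history


# The three master IPs implied by the pod names in this thread.
SERVERS = ["10.0.10.10", "10.0.10.11", "10.0.10.12"]

if __name__ == "__main__":
    print("count=1:", simulate(SERVERS, apiserver_count=1))
    print("count=3:", simulate(SERVERS, apiserver_count=3))
```

Under these assumptions, a count of 1 leaves exactly one IP in the list and that IP changes every time a different server reconciles, which matches the single, changing address seen in `kubectl get endpoints/kubernetes`. With a count of 3 the list settles on all three addresses.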

@cemo

Author

cemo commented Mar 30, 2017

So my issue is actually not related to the API endpoints being constantly updated. It seems I have another issue.

@brancz

Member

brancz commented Mar 31, 2017

Can you elaborate what you are seeing?

@cemo

Author

cemo commented Mar 31, 2017

Sorry for not being clear. I thought that if kubernetes/kubernetes#22609 is still open, many clusters must suffer from the same problem. But given that there is not much noise about this issue, the failure to scrape the kube API servers must be related to something else. Do you think my issue is related to kubernetes/kubernetes#22609?

@cemo

Author

cemo commented Apr 2, 2017

Gentle ping @brancz

@brancz

Member

brancz commented Apr 3, 2017

Can you provide the flags you use to start the apiserver? Then we should be able to find out quickly.

@cemo

Author

cemo commented Apr 3, 2017

spec:
        hostNetwork: true
        containers:
        - name: kube-apiserver
          image: ${ hyperkube }
          command:
          - /hyperkube
          - apiserver
          - --admission-control=LimitRanger
          - --admission-control=NamespaceExists
          - --admission-control=NamespaceLifecycle
          - --admission-control=ResourceQuota
          - --admission-control=SecurityContextDeny
          - --admission-control=ServiceAccount
          - --allow-privileged=true
          - --client-ca-file=/etc/kubernetes/ssl/ca.pem
          - --cloud-provider=aws
          - --etcd-servers=http://etcd.${ internal-tld }:2379
          - --insecure-bind-address=0.0.0.0
          - --secure-port=443
          - --service-account-key-file=/etc/kubernetes/ssl/k8s-apiserver-key.pem
          - --service-cluster-ip-range=${ service-cluster-ip-range }
          - --tls-cert-file=/etc/kubernetes/ssl/k8s-apiserver.pem
          - --tls-private-key-file=/etc/kubernetes/ssl/k8s-apiserver-key.pem
          - --v=2
          livenessProbe:
            httpGet:
              host: 127.0.0.1
              port: 8080
              path: /healthz
            initialDelaySeconds: 15
            timeoutSeconds: 15
          ports:
          - containerPort: 443
            hostPort: 443
            name: https
          - containerPort: 8080
            hostPort: 8080
            name: local
          volumeMounts:
          - mountPath: /etc/kubernetes/ssl
            name: ssl-certs-kubernetes
            readOnly: true
          - mountPath: /etc/ssl/certs
            name: ssl-certs-host
            readOnly: true
        volumes:
        - hostPath:
            path: /etc/kubernetes/ssl
          name: ssl-certs-kubernetes
        - hostPath:
            path: /usr/share/ca-certificates
          name: ssl-certs-host
@brancz

Member

brancz commented Apr 3, 2017

@alexsomesan @s-urbaniak can either of you comment on whether what we are seeing could be related to the --apiserver-count flag, and whether setting it could fix this issue?

@cemo

Author

cemo commented Apr 3, 2017

From my local box I can successfully fetch the metrics:

https://kubernetes.io/docs/concepts/cluster-administration/access-cluster/


curl  https://10.0.10.10/metrics --header "Authorization: Bearer $TOKEN" --insecure
apiserver_request_count{client="Go-http-client/1.1",code="200",contentType="application/json",resource="endpoints",verb="GET"} 14658
apiserver_request_count{client="Go-http-client/1.1",code="200",contentType="application/json",resource="endpoints",verb="LIST"} 1
apiserver_request_count{client="Go-http-client/1.1",code="200",contentType="application/json",resource="ingresses",verb="LIST"} 564
apiserver_request_count{client="Go-http-client/1.1",code="200",contentType="application/json",resource="services",verb="GET"} 14665
apiserver_request_count{client="Go-http-client/1.1",code="200",contentType="application/json",resource="services",verb="LIST"} 1
apiserver_request_count{client="Go-http-client/1.1",code="404",contentType="application/json",resource="services",verb="GET"} 1124
apiserver_request_count{client="hyperkube/v1.5.1+coreos.0 (linux/amd64) kubernetes/cc65f53",code="200",contentType="application/json",resource="clusterroles",verb="LIST"} 1
apiserver_request_count{client="hyperkube/v1.5.1+coreos.0 (linux/amd64) kubernetes/cc65f53",code="200",contentType="application/json",resource="endpoints",verb="GET"} 134
apiserver_request_count{client="hyperkube/v1.5.1+coreos.0 (linux/amd64) kubernetes/cc65f53",code="200",contentType="application/json",resource="endpoints",verb="PUT"} 133
apiserver_request_count{client="hyperkube/v1.5.1+coreos.0 (linux/amd64) kubernetes/cc65f53",code="200",contentType="application/json",resource="secrets",verb="LIST"} 1
......
etcd_request_latencies_summary{operation="list",type="*[]api.PodTemplate",quantile="0.5"} 340229
etcd_request_latencies_summary{operation="list",type="*[]api.PodTemplate",quantile="0.9"} 340229
etcd_request_latencies_summary{operation="list",type="*[]api.PodTemplate",quantile="0.99"} 340229
....
@cemo

Author

cemo commented Apr 3, 2017

I tried the latest master as well. It did not help.

@cemo

Author

cemo commented Apr 3, 2017

@brancz I think this is a bug on the Prometheus side. :( Is there a way to get more verbose debug output?

@cemo

Author

cemo commented Apr 3, 2017

I added the -log.level=debug parameter and saw only these messages:

time="2017-04-03T20:40:26Z" level=debug msg="endpoints update" kubernetes_sd=endpoint source="endpoints.go:77" tg="&config.TargetGroup{Targets:[]model.LabelSet{model.LabelSet{\"__address__\":\"10.0.10.11:443\", \"__meta_kubernetes_endpoint_port_name\":\"https\", \"__meta_kubernetes_endpoint_port_protocol\":\"TCP\", \"__meta_kubernetes_endpoint_ready\":\"true\"}}, Labels:model.LabelSet{\"__meta_kubernetes_service_name\":\"kubernetes\", \"__meta_kubernetes_service_label_component\":\"apiserver\", \"__meta_kubernetes_service_label_provider\":\"kubernetes\", \"__meta_kubernetes_namespace\":\"default\", \"__meta_kubernetes_endpoints_name\":\"kubernetes\"}, Source:\"endpoints/default/kubernetes\"}"
time="2017-04-03T20:40:28Z" level=debug msg="endpoints update" kubernetes_sd=endpoint source="endpoints.go:77" tg="&config.TargetGroup{Targets:[]model.LabelSet{model.LabelSet{\"__meta_kubernetes_endpoint_port_protocol\":\"TCP\", \"__meta_kubernetes_endpoint_ready\":\"true\", \"__address__\":\"10.0.10.12:443\", \"__meta_kubernetes_endpoint_port_name\":\"https\"}}, Labels:model.LabelSet{\"__meta_kubernetes_service_label_provider\":\"kubernetes\", \"__meta_kubernetes_namespace\":\"default\", \"__meta_kubernetes_endpoints_name\":\"kubernetes\", \"__meta_kubernetes_service_name\":\"kubernetes\", \"__meta_kubernetes_service_label_component\":\"apiserver\"}, Source:\"endpoints/default/kubernetes\"}"
time="2017-04-03T20:40:29Z" level=debug msg="endpoints update" kubernetes_sd=endpoint source="endpoints.go:77" tg="&config.TargetGroup{Targets:[]model.LabelSet{model.LabelSet{\"__meta_kubernetes_endpoint_ready\":\"true\", \"__address__\":\"10.0.10.10:443\", \"__meta_kubernetes_endpoint_port_name\":\"https\", \"__meta_kubernetes_endpoint_port_protocol\":\"TCP\"}}, Labels:model.LabelSet{\"__meta_kubernetes_endpoints_name\":\"kubernetes\", \"__meta_kubernetes_namespace\":\"default\", \"__meta_kubernetes_service_label_provider\":\"kubernetes\", \"__meta_kubernetes_service_name\":\"kubernetes\", \"__meta_kubernetes_service_label_component\":\"apiserver\"}, Source:\"endpoints/default/kubernetes\"}"
@cemo

Author

cemo commented Apr 4, 2017

I have also set api_server in kubernetes_sd_configs. Unfortunately, that did not help. If this is related to the racing apiservers issue, which is still open, then either there should be a workaround for it or it should be a problem for everyone else as well.

@brancz Don't you think this is a bug in Prometheus?

@cemo cemo changed the title KubeApi servers are in state unknown mostly KubeApi servers are in `state unknown` mostly Apr 4, 2017

@cemo cemo changed the title KubeApi servers are in `state unknown` mostly KubeApi servers are in "state unknown" mostly Apr 4, 2017

@brancz

Member

brancz commented Apr 4, 2017

Prometheus takes its targets from the Endpoints object, which in your setup only shows one IP at a time and changes constantly, so I do not think this is a Prometheus issue. When a target is added, it is in the unknown state until its first scrape. If the IPs are constantly removed from and re-added to the Endpoints object, each one looks like a new target to Prometheus and therefore shows as unknown, because it has not been scraped since it was discovered.
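
As an illustration of the point above, this is roughly what the default/kubernetes Endpoints object would contain at the moment of the first debug log line (20:40:26). It is a reconstruction from the labels in that log entry, not captured kubectl output, and is only a sketch of the shape of the object.

```yaml
# Reconstructed from the "endpoints update" debug log above; not real kubectl output.
apiVersion: v1
kind: Endpoints
metadata:
  name: kubernetes
  namespace: default
subsets:
  - addresses:
      - ip: 10.0.10.11        # the only address listed at 20:40:26
    ports:
      - name: https
        port: 443
        protocol: TCP
```

In the following log lines the single address becomes 10.0.10.12 (20:40:28) and then 10.0.10.10 (20:40:29), so each API server keeps being rediscovered as a new target.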

@cemo

Author

cemo commented Apr 4, 2017

Great explanation. Do you think a workaround can be provided? Is it possible to skip discovery and scrape the API servers statically, like this?

      - job_name: 'kubernetes-apiservers'

        static_configs:
          - targets:
            - kubernetes.default.svc.cluster.local

        scheme: https

        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
          insecure_skip_verify: false
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

        relabel_configs:
          - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
            action: keep
            regex: default;kubernetes;https

I changed the config to this, but it is not working either. I don't know where the configuration mistake is right now.

@cemo

Author

cemo commented Apr 4, 2017

      - job_name: 'kubernetes-apiservers'
        static_configs:
          - targets:
            - kubernetes.default.svc.cluster.local
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
          insecure_skip_verify: false
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

I removed relabel_configs (I am new to Prometheus and will check the consequences of this removal now).

This configuration is working now; the target shows OK.

@s-urbaniak

s-urbaniak commented Apr 4, 2017

Hey, we just tried it locally and can reproduce the issue. Without setting --apiserver-count we see the same behavior in the Prometheus web UI.

With --apiserver-count set on the api-server deployment/daemonset, we see the correct set of scrapes for the targets.

Having said that, --apiserver-count has issues in k8s upstream (see [1], [2]), so I suggest either:

  • set --apiserver-count and have a working k8s scrape, and even an indicator if one of the endpoints goes down (as indicated in [1]). If you use the --kubeconfig option on the kube-proxy and your api-server communication goes via an external LB, even [2] should not be an issue.
  • do not set --apiserver-count and do not scrape the api-server, which is unfortunate.

[1] kubernetes/kubernetes#22609
[2] kubernetes/kubernetes#18174
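
For readers following along, a rough sketch of the first option on a three-master cluster like the one in this report. The manifest layout, container name and binary invocation are assumptions for illustration (the reporter's cluster runs hyperkube on CoreOS, so the real command line will look different); only the --apiserver-count flag itself comes from the discussion above.

```yaml
# Hypothetical fragment of a kube-apiserver pod manifest.
# Only --apiserver-count is taken from this thread; the rest is illustrative.
spec:
  containers:
    - name: kube-apiserver        # assumed container name
      command:
        - kube-apiserver          # assumed binary; hyperkube-based installs invoke it differently
        - --apiserver-count=3     # expected to equal the number of API servers (3 masters in this report)
        # all other existing flags stay unchanged
```

The flag would need the same value on every API server instance, since each instance uses it when reconciling the default/kubernetes Endpoints object.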

@cemo

Author

cemo commented Apr 4, 2017

Thanks @s-urbaniak

  1. What about static scraping? Is there a downside to this approach?
  2. Please consider me completely new to Prometheus. I had to remove relabel_configs as well, and I have no idea what the implications of that removal are. Should I provide the relabel config in a different way?
@brancz

Member

brancz commented Apr 4, 2017

Static scraping is totally an option, if you know they won't change 🙂. But aside from that, there may be other problems if you don't set the --apiserver-count flag, namely in any other situation where an apiserver instance needs to know how many instances there are.

I wouldn't recommend removing the relabeling rules if you are basing your configuration on the example config and are not too familiar with Prometheus.
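
A hedged sketch of what a static job could look like for this setup. The three addresses are the ones that appear in the debug log earlier in this thread; listing them individually instead of the service name is an assumption, and it only works if the API server certificates are valid for those IPs. The relabeling rules from the example config belong with kubernetes_sd_configs: the `__meta_kubernetes_*` labels are only attached to targets that come from Kubernetes service discovery, so a `keep` rule on them matches nothing for static targets and drops them all, which would explain why the first static attempt above did not work.

```yaml
scrape_configs:
  - job_name: 'kubernetes-apiservers'
    scheme: https
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    static_configs:
      - targets:
          # Addresses taken from the debug log in this thread;
          # assumes the serving certificates cover these IPs.
          - 10.0.10.10:443
          - 10.0.10.11:443
          - 10.0.10.12:443
    # No keep rule on __meta_kubernetes_* labels here: static targets
    # do not carry those labels, so such a rule would drop every target.
```

With kubernetes_sd_configs, by contrast, the relabeling rules from the example config should stay in place.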

@gianrubio

gianrubio commented May 24, 2017

Today I had the same issue. I have 2 replicas (v1.6.2) running on different nodes.
One instance was running smoothly, but the other was stuck with all targets in the unknown state.

While reading this thread I accidentally restarted the apiserver. After that, the stuck Prometheus started scraping again. I guess this is not only related to apiserver-count.

P.S. I have just 1 apiserver replica.

Logs from when the instance was stuck:

time="2017-05-24T07:42:54Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:633"
time="2017-05-24T07:43:01Z" level=info msg="Done checkpointing in-memory metrics and chunks in 7.229965332s." source="persistence.go:665"
time="2017-05-24T07:43:07Z" level=info msg="Loading configuration file /etc/prometheus/config/prometheus.yaml" source="main.go:251"
time="2017-05-24T07:43:07Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:43:07Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:43:07Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:43:07Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:43:07Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:43:07Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:43:07Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:43:07Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:43:07Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:43:07Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:43:07Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:43:07Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:43:07Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:44:11Z" level=info msg="Loading configuration file /etc/prometheus/config/prometheus.yaml" source="main.go:251"
time="2017-05-24T07:44:11Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:44:11Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:44:11Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:44:11Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:44:11Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:44:11Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:44:11Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:44:11Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:44:11Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:44:11Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:44:11Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:44:11Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:44:11Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:45:13Z" level=info msg="Loading configuration file /etc/prometheus/config/prometheus.yaml" source="main.go:251"
time="2017-05-24T07:45:13Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:45:13Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:45:13Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:45:13Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:45:13Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:45:13Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:45:13Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:45:13Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:45:13Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:45:13Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:45:13Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:45:13Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
time="2017-05-24T07:45:13Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:94"
@krasi-georgiev

Member

krasi-georgiev commented Feb 17, 2018

@brian-brazil, @brancz reading the discussion, this seems to be a problem on the k8s side.
Maybe close it and reopen if anyone can replicate this against the latest master?

@brian-brazil

Member

brian-brazil commented Feb 17, 2018

We haven't had any recent reports, so that sounds sane.

@brancz

Member

brancz commented Feb 17, 2018

For anyone wondering, the correct way to solve this is to enable the “lease” Endpoints reconciler on your Kubernetes API server. Then everything will work in Prometheus as expected.

@brancz

Member

brancz commented Feb 17, 2018

The lease Endpoints reconciler is available in Kubernetes 1.9 as alpha, meaning it must be explicitly enabled.
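
A minimal sketch of enabling it, under stated assumptions: the manifest layout and container name are hypothetical, and the flag spelling shown is the one used by later kube-apiserver releases. This thread does not confirm the exact flag name in 1.9, so check `kube-apiserver --help` for the version in use.

```yaml
# Hypothetical fragment of a kube-apiserver pod manifest.
spec:
  containers:
    - name: kube-apiserver                   # assumed container name
      command:
        - kube-apiserver                     # assumed binary invocation
        - --endpoint-reconciler-type=lease   # flag name as in later releases; verify for your version
        # all other existing flags stay unchanged
```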

@kinghrothgar

kinghrothgar commented Mar 6, 2018

I am on GKE and thus cannot use features until they are beta. As far as I can tell the lease endpoints fix won't be beta until 1.11 which is quite a ways away from being released or hitting GKE. Are there no solutions for this except for this alpha feature?

@brancz

Member

brancz commented Mar 6, 2018

Unfortunately, unless you have a single master, this problem will persist for GKE users until the feature is available there as beta. I know the requirements for beta are being worked on, but they are not landing in 1.10; they're targeted for 1.11.

@lock

lock bot commented Mar 22, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators Mar 22, 2019
