
expand series set: series not found #3188

Closed
JorritSalverda opened this Issue Sep 18, 2017 · 5 comments


JorritSalverda commented Sep 18, 2017

Since we started testing the 2.0 beta we see more gap than graph in a lot of our Grafana graphs, making a CPU graph look like this:

20170918-prometheus-2-cpu-graph-with-gaps

The graph uses the following query:

sum(irate(container_cpu_usage_seconds_total{container_name="prometheus"}[30s])) by (container_name) 
/
sum(container_spec_cpu_shares{container_name="prometheus"} / 1024) by (container_name)

When graphing this in the Prometheus UI it shows the same gaps. Increasing the range for irate doesn't help.

20170918-prometheus-2-cpu-graph-with-gaps-in-prometheus

Both sum(irate(container_cpu_usage_seconds_total{container_name="prometheus"}[30s])) by (container_name) and sum(container_spec_cpu_shares{container_name="prometheus"} / 1024) by (container_name) exhibit gaps, although not necessarily at the same time, so the resulting query has even more gaps.
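
To picture why these gaps appear, here is a toy Python model of how irate() behaves (an illustrative sketch, not Prometheus code): irate uses the last two samples inside the range, so whenever fewer than two samples fall in the window the result is simply absent.

```python
def irate(samples, eval_time, window):
    """Toy model of PromQL irate(): rate from the last two samples
    inside [eval_time - window, eval_time]; with fewer than two
    samples in the window there is no result, i.e. a gap."""
    in_window = [(t, v) for t, v in samples if eval_time - window <= t <= eval_time]
    if len(in_window) < 2:
        return None  # fewer than two samples -> gap in the graph
    (t1, v1), (t2, v2) = in_window[-2:]
    return (v2 - v1) / (t2 - t1)  # ignores counter resets for simplicity

# Samples arriving every ~25s instead of the configured 10s:
samples = [(0, 10.0), (25, 12.5), (50, 15.0)]
print(irate(samples, eval_time=55, window=30))  # two samples in window -> a rate
print(irate(samples, eval_time=80, window=30))  # one sample in window -> None
```

With 25-second spacing, a 30s range only occasionally contains two samples, which matches the on/off pattern in the screenshots above.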

In our Prometheus server we use the following intervals:

global:
  scrape_interval: 10s
  scrape_timeout: 10s
  evaluation_interval: 10s

When counting the number of values recorded in 5 minutes using container_cpu_usage_seconds_total{container_name="prometheus"}[5m], it returns only about 13 samples. That's roughly one every 25 seconds, not every 10 seconds as configured.
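
The arithmetic behind that estimate, using the numbers observed above:

```python
# Effective sample spacing implied by ~13 samples in a 5-minute range.
range_seconds = 300
observed_samples = 13
configured_interval = 10  # scrape_interval from the config below

effective_interval = range_seconds / observed_samples
# ~23s per sample, in the same ballpark as the observed ~25s,
# versus the configured 10s.
print(round(effective_interval, 1))
```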

In Prometheus 1.7.1 and before these graphs looked fine. I'm not sure whether the actual interval was any different, or whether the duration for which the last value of a time series shows up in a query has decreased in 2.0.

Any ideas what causes this and how to fix it?

Environment

  • System information:

Linux 4.4.64+ x86_64

  • Prometheus version:
prometheus, version 2.0.0-beta.3 (branch: HEAD, revision: 066783b3991dd64729325fc4f880dfffb484a2c2)
  build user:       root@0cbc320660dc
  build date:       20170912-10:17:45
  go version:       go1.8.3
  • Prometheus configuration file:
global:
  scrape_interval: 10s
  scrape_timeout: 10s
  evaluation_interval: 10s
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      - alertmanager-0.alertmanager-headless:9093
      - alertmanager-1.alertmanager-headless:9093
      - alertmanager-2.alertmanager-headless:9093
    scheme: http
    timeout: 10s
rule_files:
- /prometheus-rules/alert.yaml
- /prometheus-rules/aggregation.yaml
scrape_configs:
- job_name: kubernetes-apiservers
  scrape_interval: 10s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: https
  kubernetes_sd_configs:
  - api_server: null
    role: endpoints
    namespaces:
      names: []
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    insecure_skip_verify: false
  relabel_configs:
  - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: default;kubernetes;https
    replacement: $1
    action: keep
- job_name: kubernetes-nodes
  scrape_interval: 10s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: https
  kubernetes_sd_configs:
  - api_server: null
    role: node
    namespaces:
      names: []
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    insecure_skip_verify: false
  relabel_configs:
  - separator: ;
    regex: __meta_kubernetes_node_label_(.+)
    replacement: $1
    action: labelmap
  - separator: ;
    regex: (.*)
    target_label: __address__
    replacement: kubernetes.default.svc:443
    action: replace
  - source_labels: [__meta_kubernetes_node_name]
    separator: ;
    regex: (.+)
    target_label: __metrics_path__
    replacement: /api/v1/nodes/${1}/proxy/metrics
    action: replace
- job_name: kubernetes-cadvisor
  scrape_interval: 10s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: https
  kubernetes_sd_configs:
  - api_server: null
    role: node
    namespaces:
      names: []
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    insecure_skip_verify: false
  relabel_configs:
  - separator: ;
    regex: __meta_kubernetes_node_label_(.+)
    replacement: $1
    action: labelmap
  - separator: ;
    regex: (.*)
    target_label: __address__
    replacement: kubernetes.default.svc:443
    action: replace
  - source_labels: [__meta_kubernetes_node_name]
    separator: ;
    regex: (.+)
    target_label: __metrics_path__
    replacement: /api/v1/nodes/${1}:4194/proxy/metrics
    action: replace
- job_name: kubernetes-service-endpoints
  scrape_interval: 10s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  kubernetes_sd_configs:
  - api_server: null
    role: endpoints
    namespaces:
      names: []
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
    separator: ;
    regex: "true"
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
    separator: ;
    regex: (https?)
    target_label: __scheme__
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
    separator: ;
    regex: (.+)
    target_label: __metrics_path__
    replacement: $1
    action: replace
  - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
    separator: ;
    regex: ([^:;]+);(\d+)
    target_label: __address__
    replacement: ${1}:${2}
    action: replace
  - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
    separator: ;
    regex: ([^:;]+):(\d+);(\d+)
    target_label: __address__
    replacement: ${1}:${3}
    action: replace
  - separator: ;
    regex: __meta_kubernetes_service_label_(.+)
    replacement: $1
    action: labelmap
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: kubernetes_namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: kubernetes_name
    replacement: $1
    action: replace
- job_name: kubernetes-services
  params:
    module:
    - http_2xx
  scrape_interval: 10s
  scrape_timeout: 10s
  metrics_path: /probe
  scheme: http
  kubernetes_sd_configs:
  - api_server: null
    role: service
    namespaces:
      names: []
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probe]
    separator: ;
    regex: "true"
    replacement: $1
    action: keep
  - source_labels: [__address__]
    separator: ;
    regex: (.*?):(:80|:443)
    replacement: $1
    action: keep
  - source_labels: [__address__]
    separator: ;
    regex: (.*?):80
    target_label: __param_target
    replacement: http://${1}
    action: replace
  - source_labels: [__address__]
    separator: ;
    regex: (.*?):443
    target_label: __param_target
    replacement: https://${1}
    action: replace
  - source_labels: [__param_target, __meta_kubernetes_service_annotation_prometheus_io_probe_path]
    separator: ;
    regex: (.*?);(.*?)
    target_label: __param_target
    replacement: ${1}${2}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: __address__
    replacement: blackbox-exporter
    action: replace
  - source_labels: [__param_target]
    separator: ;
    regex: (.*)
    target_label: instance
    replacement: $1
    action: replace
  - separator: ;
    regex: __meta_kubernetes_service_label_(.+)
    replacement: $1
    action: labelmap
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: kubernetes_namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: kubernetes_name
    replacement: $1
    action: replace
- job_name: kubernetes-pods
  scrape_interval: 10s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  kubernetes_sd_configs:
  - api_server: null
    role: pod
    namespaces:
      names: []
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
    separator: ;
    regex: "true"
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
    separator: ;
    regex: (.+)
    target_label: __metrics_path__
    replacement: $1
    action: replace
  - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
    separator: ;
    regex: ([^:;]+);(\d+)
    target_label: __address__
    replacement: ${1}:${2}
    action: replace
  - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
    separator: ;
    regex: ([^:;]+):(\d+);(\d+)
    target_label: __address__
    replacement: ${1}:${3}
    action: replace
  - separator: ;
    regex: __meta_kubernetes_pod_label_(.+)
    replacement: $1
    action: labelmap
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: kubernetes_namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: kubernetes_pod_name
    replacement: $1
    action: replace
- job_name: kubernetes-nginx-sidecar
  scrape_interval: 10s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  kubernetes_sd_configs:
  - api_server: null
    role: pod
    namespaces:
      names: []
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape_nginx_sidecar]
    separator: ;
    regex: "true"
    replacement: $1
    action: keep
  - source_labels: [__address__]
    separator: ;
    regex: (.*):(\d+)
    target_label: __address__
    replacement: ${1}:9101
    action: replace
  - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_scrape_nginx_sidecar_port]
    separator: ;
    regex: ([^:;]+):(\d+);(\d+)
    target_label: __address__
    replacement: ${1}:${3}
    action: replace
  - separator: ;
    regex: __meta_kubernetes_pod_label_(.+)
    replacement: $1
    action: labelmap
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: kubernetes_namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: kubernetes_pod_name
    replacement: $1
    action: replace
- job_name: gce-vms
  scrape_interval: 10s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  gce_sd_configs:
  - project: travix-production
    zone: europe-west1-c
    refresh_interval: 1m
    port: 9101
    tag_separator: ','
  relabel_configs:
  - source_labels: [__meta_gce_tags]
    separator: ;
    regex: .*,prometheus,.*
    replacement: $1
    action: keep
  - source_labels: [__meta_gce_network]
    separator: ;
    regex: .*/production
    replacement: $1
    action: keep
  - source_labels: [__meta_gce_instance_name]
    separator: ;
    regex: (.*)
    target_label: instance
    replacement: $1
    action: replace
  - source_labels: [__meta_gce_tags]
    separator: ;
    regex: .*,app-([^,]+),.*
    target_label: app
    replacement: ${1}
    action: replace
  - source_labels: [__meta_gce_tags]
    separator: ;
    regex: .*,team-([^,]+),.*
    target_label: team
    replacement: ${1}
    action: replace
  - source_labels: [__meta_gce_tags]
    separator: ;
    regex: .*,version-([^,]+),.*
    target_label: version
    replacement: ${1}
    action: replace
- job_name: gce-vms-win-metrics
  scrape_interval: 10s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  gce_sd_configs:
  - project: travix-production
    zone: europe-west1-c
    refresh_interval: 1m
    port: 9182
    tag_separator: ','
  relabel_configs:
  - source_labels: [__meta_gce_tags]
    separator: ;
    regex: .*,prometheus,.*
    replacement: $1
    action: keep
  - source_labels: [__meta_gce_network]
    separator: ;
    regex: .*/production
    replacement: $1
    action: keep
  - source_labels: [__meta_gce_instance_name]
    separator: ;
    regex: (.*)
    target_label: instance
    replacement: $1
    action: replace
  - source_labels: [__meta_gce_tags]
    separator: ;
    regex: .*,app-([^,]+),.*
    target_label: app
    replacement: ${1}
    action: replace
  - source_labels: [__meta_gce_tags]
    separator: ;
    regex: .*,team-([^,]+),.*
    target_label: team
    replacement: ${1}
    action: replace
  - source_labels: [__meta_gce_tags]
    separator: ;
    regex: .*,version-([^,]+),.*
    target_label: version
    replacement: ${1}
    action: replace
  • Logs:
time="2017-09-16T15:38:38Z" level=info msg="Starting prometheus (version=2.0.0-beta.3, branch=HEAD, revision=066783b3991dd64729325fc4f880dfffb484a2c2)" source="main.go:210"
time="2017-09-16T15:38:38Z" level=info msg="Build context (go=go1.8.3, user=root@0cbc320660dc, date=20170912-10:17:45)" source="main.go:211"
time="2017-09-16T15:38:38Z" level=info msg="Host details (Linux 4.4.64+ #1 SMP Thu Aug 17 02:01:54 PDT 2017 x86_64 prometheus-1 (none))" source="main.go:212"
time="2017-09-16T15:38:38Z" level=info msg="Starting tsdb" source="main.go:224"
time="2017-09-16T15:38:47Z" level=info msg="tsdb started" source="main.go:230"
time="2017-09-16T15:38:47Z" level=info msg="Loading configuration file /prometheus-config/prometheus.yml" source="main.go:363"
time="2017-09-16T15:38:47Z" level=info msg="Server is ready to receive requests." source="main.go:340"
time="2017-09-16T15:38:47Z" level=info msg="Starting target manager..." source="targetmanager.go:67"
time="2017-09-16T15:38:47Z" level=info msg="Listening on 0.0.0.0:9090" source="web.go:359"
time="2017-09-16T15:38:47Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:105"
time="2017-09-16T15:38:47Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:105"
time="2017-09-16T15:38:47Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:105"
time="2017-09-16T15:38:47Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:105"
time="2017-09-16T15:38:47Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:105"
time="2017-09-16T15:38:47Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:105"
time="2017-09-16T15:38:47Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:105"
ts=2017-09-16T17:00:00.121011859Z caller=compact.go:359 msg="compact blocks" count=1 mint=1505570400000 maxt=1505577600000
time="2017-09-16T17:00:03Z" level=error msg="expand series set: not found" source="engine.go:531"
time="2017-09-16T17:00:03Z" level=error msg="expand series set: series not found" source="engine.go:531"
time="2017-09-16T17:00:03Z" level=warning msg="Error while evaluating rule "alert: NginxErrorRateAboveThreshold\nexpr: sum(rate(nginx_http_requests_total{host!~\"^(?:^(?:.*svc)$)$\",host!~\"^(?:^(?:[0
-9.]+)$)$\",status=~\"^(?:^(?:5[0-9]+)$)$\"}[1m]))\n  BY (app, team, kubernetes_namespace) / sum(rate(nginx_http_requests_total{host!~\"^(?:^(?:.*svc)$)$\",host!~\"^(?:^(?:[0-9.]+)$)$\"}[1m]))\n  BY (
app, team, kubernetes_namespace) > 0.02\nfor: 5m\nlabels:\n  gcloud_project: verdant-current-104210\n  kubernetes_cluster: tooling\n  severity: page\nannotations:\n  description: Error rate of {{ $lab
els.app }}.{{ $labels.kubernetes_namespace }}.tooling.verdant-current-104210\n    is above threshold of 2% for more than 5 minutes.\n  summary: Error rate of {{ $labels.app }} is above threshold\n": s
eries not found" group=alert.rules source="manager.go:311"
time="2017-09-16T17:00:03Z" level=error msg="expand series set: series not found" source="engine.go:531"
time="2017-09-16T17:00:03Z" level=warning msg="Error while evaluating rule "alert: InstrumentedErrorRateAboveThreshold\nexpr: sum(rate(http_requests_total{code=~\"^(?:^(?:5[0-9]+)$)$\"}[1m])) BY (app,
 team,\n  kubernetes_namespace) / sum(irate(http_requests_total[1m])) BY (app, team, kubernetes_namespace)\n  > 0.02\nfor: 5m\nlabels:\n  gcloud_project: verdant-current-104210\n  kubernetes_cluster:
tooling\n  severity: page\nannotations:\n  description: Error rate of {{ $labels.app }}.{{ $labels.kubernetes_namespace }}.tooling.verdant-current-104210\n    is above threshold of 2% for more than 5
minutes.\n  summary: Error rate of {{ $labels.app }} is above threshold\n": series not found" group=alert.rules source="manager.go:311"
time="2017-09-16T17:00:03Z" level=error msg="expand series set: series not found" source="engine.go:520"
time="2017-09-16T17:00:03Z" level=warning msg="Error while evaluating rule "alert: InstanceDown\nexpr: up{app=~\"^(?:^(?:.*cockroach.*)$)$\"} == 0\nfor: 10m\nlabels:\n  gcloud_project: verdant-current
-104210\n  kubernetes_cluster: tooling\n  severity: page\nannotations:\n  description: '{{ $labels.kubernetes_pod_name }} for cluster {{ $labels.app }}.{{\n    $labels.kubernetes_namespace }}.tooling.
verdant-current-104210 has been down for\n    more than 5 minutes.'\n  summary: Node {{ $labels.kubernetes_pod_name }} down\n": series not found" group=alert.rules source="manager.go:311"
time="2017-09-16T17:00:03Z" level=error msg="expand series set: series not found" source="engine.go:520"
time="2017-09-16T17:00:03Z" level=warning msg="Error while evaluating rule "alert: InstanceDead\nexpr: up{app=~\"^(?:^(?:.*cockroach.*)$)$\"} == 0\nfor: 15m\nlabels:\n  gcloud_project: verdant-current
-104210\n  kubernetes_cluster: tooling\n  severity: page\nannotations:\n  description: '{{ $labels.kubernetes_pod_name }} for cluster {{ $labels.app }}.{{\n    $labels.kubernetes_namespace }}.tooling.
verdant-current-104210 has been down for\n    more than 15 minutes.'\n  summary: Node {{ $labels.kubernetes_pod_name }} dead\n": series not found" group=alert.rules source="manager.go:311"
time="2017-09-16T17:00:03Z" level=error msg="expand series set: series not found" source="engine.go:531"
time="2017-09-16T17:00:03Z" level=error msg="expand series set: series not found" source="engine.go:531"
time="2017-09-16T17:00:03Z" level=warning msg="Error while evaluating rule "alert: InstanceRestart\nexpr: resets(sys_uptime{app=~\"^(?:^(?:.*cockroach.*)$)$\"}[10m]) > 0 and resets(sys_uptime{app=~\"^
(?:^(?:.*cockroach.*)$)$\"}[10m])\n  < 5\nlabels:\n  gcloud_project: verdant-current-104210\n  kubernetes_cluster: tooling\n  severity: page\nannotations:\n  description: '{{ $labels.kubernetes_pod_na
me }} for cluster {{ $labels.app }}.{{\n    $labels.kubernetes_namespace }}.tooling.verdant-current-104210 restarted {{ $value\n    }} time(s) in 10m'\n  summary: Node {{ $labels.kubernetes_pod_name }
} restarted\n": series not found" group=alert.rules source="manager.go:311"
time="2017-09-16T17:00:03Z" level=error msg="expand series set: series not found" source="engine.go:531"
time="2017-09-16T17:00:03Z" level=warning msg="Error while evaluating rule "alert: InstanceFlapping\nexpr: resets(sys_uptime{app=~\"^(?:^(?:.*cockroach.*)$)$\"}[10m]) > 5\nlabels:\n  gcloud_project: v
erdant-current-104210\n  kubernetes_cluster: tooling\n  severity: page\nannotations:\n  description: '{{ $labels.kubernetes_pod_name }} for cluster {{ $labels.app }}.{{\n    $labels.kubernetes_namespa
ce }}.tooling.verdant-current-104210 restarted {{ $value\n    }} time(s) in 10m'\n  summary: Node {{ $labels.kubernetes_pod_name }} flapping\n": series not found" group=alert.rules source="manager.go:
311"
time="2017-09-16T17:00:03Z" level=error msg="expand series set: series not found" source="engine.go:520"
time="2017-09-16T17:00:03Z" level=warning msg="Error while evaluating rule "alert: VersionMismatch\nexpr: count(count_values(\"version\", build_timestamp{app=~\"^(?:^(?:.*cockroach.*)$)$\"})\n  BY (ta
g, cluster)) BY (cluster) > 1\nfor: 30m\nlabels:\n  gcloud_project: verdant-current-104210\n  kubernetes_cluster: tooling\n  severity: page\nannotations:\n  description: Cluster {{ $labels.app }}.{{ $
labels.kubernetes_namespace }}.tooling.verdant-current-104210\n    running {{ $value }} different versions\n  summary: Binary version mismatch on {{ $labels.app }}\n": series not found" group=alert.ru
les source="manager.go:311"
time="2017-09-16T17:00:03Z" level=error msg="expand series set: series not found" source="engine.go:520"
time="2017-09-16T17:00:03Z" level=warning msg="Error while evaluating rule "alert: StoreDiskLow\nexpr: capacity_available:ratio{app=~\"^(?:^(?:.*cockroach.*)$)$\"} < 0.15\nlabels:\n  gcloud_project: v
erdant-current-104210\n  kubernetes_cluster: tooling\n  severity: page\nannotations:\n  summary: Store {{ $labels.store }} on node {{ $labels.kubernetes_pod_name }} for\n    cluster {{ $labels.app }}.
{{ $labels.kubernetes_namespace }}.tooling.verdant-current-104210\n    at {{ $value }} available disk fraction\n": series not found" group=alert.rules source="manager.go:311"
ts=2017-09-16T17:00:03.026528428Z caller=head.go:261 msg="head GC completed" duration=293.435328ms
ts=2017-09-16T17:00:03.255821097Z caller=head.go:272 msg="WAL truncation completed" duration=229.196894ms
ts=2017-09-16T19:00:00.030720958Z caller=compact.go:359 msg="compact blocks" count=1 mint=1505577600000 maxt=1505584800000
ts=2017-09-16T19:00:03.43300958Z caller=head.go:261 msg="head GC completed" duration=175.421012ms
ts=2017-09-16T19:00:04.076700459Z caller=head.go:272 msg="WAL truncation completed" duration=643.603434ms
ts=2017-09-16T19:00:04.331017588Z caller=compact.go:359 msg="compact blocks" count=3 mint=1505563200000 maxt=1505584800000
ts=2017-09-16T21:00:00.030595634Z caller=compact.go:359 msg="compact blocks" count=1 mint=1505584800000 maxt=1505592000000
ts=2017-09-16T21:00:03.320412486Z caller=head.go:261 msg="head GC completed" duration=165.4321ms
ts=2017-09-16T21:00:03.966277448Z caller=head.go:272 msg="WAL truncation completed" duration=645.774123ms
ts=2017-09-16T23:00:00.03276241Z caller=compact.go:359 msg="compact blocks" count=1 mint=1505592000000 maxt=1505599200000
ts=2017-09-16T23:00:03.34271335Z caller=head.go:261 msg="head GC completed" duration=168.738919ms
ts=2017-09-16T23:00:03.937405137Z caller=head.go:272 msg="WAL truncation completed" duration=594.599442ms
ts=2017-09-17T01:00:00.126838864Z caller=compact.go:359 msg="compact blocks" count=1 mint=1505599200000 maxt=1505606400000
ts=2017-09-17T01:00:03.749528094Z caller=head.go:261 msg="head GC completed" duration=163.397408ms
ts=2017-09-17T01:00:04.349033427Z caller=head.go:272 msg="WAL truncation completed" duration=599.407625ms
ts=2017-09-17T01:00:04.619915736Z caller=compact.go:359 msg="compact blocks" count=3 mint=1505584800000 maxt=1505606400000
ts=2017-09-17T03:00:00.029769238Z caller=compact.go:359 msg="compact blocks" count=1 mint=1505606400000 maxt=1505613600000
ts=2017-09-17T03:00:04.147136212Z caller=head.go:261 msg="head GC completed" duration=206.656102ms
ts=2017-09-17T03:00:04.77993481Z caller=head.go:272 msg="WAL truncation completed" duration=632.716689ms
ts=2017-09-17T05:00:00.029738015Z caller=compact.go:359 msg="compact blocks" count=1 mint=1505613600000 maxt=1505620800000
ts=2017-09-17T05:00:03.267361526Z caller=head.go:261 msg="head GC completed" duration=177.414873ms
ts=2017-09-17T05:00:04.000056095Z caller=head.go:272 msg="WAL truncation completed" duration=732.596708ms
ts=2017-09-17T07:00:00.02964286Z caller=compact.go:359 msg="compact blocks" count=1 mint=1505620800000 maxt=1505628000000
ts=2017-09-17T07:00:03.371220878Z caller=head.go:261 msg="head GC completed" duration=164.606252ms
ts=2017-09-17T07:00:03.96838762Z caller=head.go:272 msg="WAL truncation completed" duration=597.087237ms
ts=2017-09-17T07:00:04.217034857Z caller=compact.go:359 msg="compact blocks" count=3 mint=1505606400000 maxt=1505628000000
ts=2017-09-17T07:00:09.298484068Z caller=compact.go:359 msg="compact blocks" count=3 mint=1505563200000 maxt=1505628000000
ts=2017-09-17T09:00:00.031140488Z caller=compact.go:359 msg="compact blocks" count=1 mint=1505628000000 maxt=1505635200000
ts=2017-09-17T09:00:03.461716643Z caller=head.go:261 msg="head GC completed" duration=166.111875ms
ts=2017-09-17T09:00:04.08450718Z caller=head.go:272 msg="WAL truncation completed" duration=622.705666ms
ts=2017-09-17T11:00:00.15389303Z caller=compact.go:359 msg="compact blocks" count=1 mint=1505635200000 maxt=1505642400000
ts=2017-09-17T11:00:03.838836928Z caller=head.go:261 msg="head GC completed" duration=163.985811ms
ts=2017-09-17T11:00:04.669778532Z caller=head.go:272 msg="WAL truncation completed" duration=830.850321ms
ts=2017-09-17T13:00:00.029945167Z caller=compact.go:359 msg="compact blocks" count=1 mint=1505642400000 maxt=1505649600000
ts=2017-09-17T13:00:03.497420989Z caller=head.go:261 msg="head GC completed" duration=159.868615ms
ts=2017-09-17T13:00:04.14198311Z caller=head.go:272 msg="WAL truncation completed" duration=644.489188ms
ts=2017-09-17T13:00:04.433436545Z caller=compact.go:359 msg="compact blocks" count=3 mint=1505628000000 maxt=1505649600000
ts=2017-09-17T15:00:00.030486721Z caller=compact.go:359 msg="compact blocks" count=1 mint=1505649600000 maxt=1505656800000
ts=2017-09-17T15:00:04.011534578Z caller=head.go:261 msg="head GC completed" duration=227.407467ms
ts=2017-09-17T15:00:04.663249092Z caller=head.go:272 msg="WAL truncation completed" duration=651.512606ms
ts=2017-09-17T17:00:00.031511684Z caller=compact.go:359 msg="compact blocks" count=1 mint=1505656800000 maxt=1505664000000
ts=2017-09-17T17:00:03.464768172Z caller=head.go:261 msg="head GC completed" duration=159.154863ms
ts=2017-09-17T17:00:04.098231645Z caller=head.go:272 msg="WAL truncation completed" duration=633.382339ms
ts=2017-09-17T19:00:00.028323555Z caller=compact.go:359 msg="compact blocks" count=1 mint=1505664000000 maxt=1505671200000
ts=2017-09-17T19:00:03.757406885Z caller=head.go:261 msg="head GC completed" duration=172.2514ms
ts=2017-09-17T19:00:04.645172322Z caller=head.go:272 msg="WAL truncation completed" duration=887.540964ms
ts=2017-09-17T19:00:04.873426811Z caller=compact.go:359 msg="compact blocks" count=3 mint=1505649600000 maxt=1505671200000
ts=2017-09-17T21:00:00.029698307Z caller=compact.go:359 msg="compact blocks" count=1 mint=1505671200000 maxt=1505678400000
ts=2017-09-17T21:00:03.506003887Z caller=head.go:261 msg="head GC completed" duration=174.821262ms
ts=2017-09-17T21:00:04.14394777Z caller=head.go:272 msg="WAL truncation completed" duration=637.724454ms
ts=2017-09-17T23:00:00.030436027Z caller=compact.go:359 msg="compact blocks" count=1 mint=1505678400000 maxt=1505685600000
ts=2017-09-17T23:00:03.410509831Z caller=head.go:261 msg="head GC completed" duration=183.421622ms
ts=2017-09-17T23:00:04.123474992Z caller=head.go:272 msg="WAL truncation completed" duration=712.739418ms
ts=2017-09-18T01:00:00.030073996Z caller=compact.go:359 msg="compact blocks" count=1 mint=1505685600000 maxt=1505692800000
ts=2017-09-18T01:00:03.470477189Z caller=head.go:261 msg="head GC completed" duration=165.40148ms
ts=2017-09-18T01:00:04.112053923Z caller=head.go:272 msg="WAL truncation completed" duration=641.501625ms
ts=2017-09-18T01:00:04.403707195Z caller=compact.go:359 msg="compact blocks" count=3 mint=1505671200000 maxt=1505692800000
ts=2017-09-18T01:00:09.22330825Z caller=compact.go:359 msg="compact blocks" count=3 mint=1505628000000 maxt=1505692800000
ts=2017-09-18T03:00:00.030643685Z caller=compact.go:359 msg="compact blocks" count=1 mint=1505692800000 maxt=1505700000000
ts=2017-09-18T03:00:04.057486296Z caller=head.go:261 msg="head GC completed" duration=235.75439ms
ts=2017-09-18T03:00:04.772960383Z caller=head.go:272 msg="WAL truncation completed" duration=715.370473ms
ts=2017-09-18T05:00:00.029338302Z caller=compact.go:359 msg="compact blocks" count=1 mint=1505700000000 maxt=1505707200000
ts=2017-09-18T05:00:03.874180984Z caller=head.go:261 msg="head GC completed" duration=151.417652ms
ts=2017-09-18T05:00:04.553170326Z caller=head.go:272 msg="WAL truncation completed" duration=678.912535ms
ts=2017-09-18T07:00:00.029878092Z caller=compact.go:359 msg="compact blocks" count=1 mint=1505707200000 maxt=1505714400000
ts=2017-09-18T07:00:03.723359141Z caller=head.go:261 msg="head GC completed" duration=187.705278ms
ts=2017-09-18T07:00:04.69453479Z caller=head.go:272 msg="WAL truncation completed" duration=971.050782ms
ts=2017-09-18T07:00:04.981057732Z caller=compact.go:359 msg="compact blocks" count=3 mint=1505692800000 maxt=1505714400000
ts=2017-09-18T09:00:00.029834514Z caller=compact.go:359 msg="compact blocks" count=1 mint=1505714400000 maxt=1505721600000
ts=2017-09-18T09:00:05.850008228Z caller=head.go:261 msg="head GC completed" duration=321.229482ms
ts=2017-09-18T09:00:08.401653152Z caller=head.go:272 msg="WAL truncation completed" duration=2.55156192s
ts=2017-09-18T11:00:00.035141156Z caller=compact.go:359 msg="compact blocks" count=1 mint=1505721600000 maxt=1505728800000
time="2017-09-18T11:00:07Z" level=error msg="expand series set: series not found" source="engine.go:531"
time="2017-09-18T11:00:07Z" level=error msg="expand series set: series not found" source="engine.go:531"
time="2017-09-18T11:00:07Z" level=error msg="expand series set: series not found" source="engine.go:531"
time="2017-09-18T11:00:07Z" level=error msg="expand series set: series not found" source="engine.go:531"
time="2017-09-18T11:00:07Z" level=error msg="expand series set: series not found" source="engine.go:531"
time="2017-09-18T11:00:07Z" level=error msg="expand series set: series not found" source="engine.go:531"
time="2017-09-18T11:00:07Z" level=error msg="expand series set: series not found" source="engine.go:520"
time="2017-09-18T11:00:07Z" level=error msg="expand series set: series not found" source="engine.go:520"
time="2017-09-18T11:00:07Z" level=error msg="expand series set: series not found" source="engine.go:520"
time="2017-09-18T11:00:07Z" level=error msg="expand series set: series not found" source="engine.go:531"
time="2017-09-18T11:00:07Z" level=error msg="expand series set: series not found" source="engine.go:520"
time="2017-09-18T11:00:07Z" level=error msg="expand series set: series not found" source="engine.go:520"
time="2017-09-18T11:00:07Z" level=error msg="expand series set: series not found" source="engine.go:520"
time="2017-09-18T11:00:07Z" level=error msg="expand series set: series not found" source="engine.go:520"
time="2017-09-18T11:00:07Z" level=error msg="expand series set: series not found" source="engine.go:520"
time="2017-09-18T11:00:07Z" level=error msg="expand series set: series not found" source="engine.go:520"
time="2017-09-18T11:00:07Z" level=error msg="expand series set: series not found" source="engine.go:520"
ts=2017-09-18T11:00:07.960065481Z caller=head.go:261 msg="head GC completed" duration=442.940484ms
ts=2017-09-18T11:00:10.222674613Z caller=head.go:272 msg="WAL truncation completed" duration=2.262414585s
ts=2017-09-18T13:00:00.030416342Z caller=compact.go:359 msg="compact blocks" count=1 mint=1505728800000 maxt=1505736000000
time="2017-09-18T13:00:07Z" level=error msg="expand series set: series not found" source="engine.go:531"
time="2017-09-18T13:00:07Z" level=error msg="expand series set: series not found" source="engine.go:531"
time="2017-09-18T13:00:07Z" level=error msg="expand series set: series not found" source="engine.go:531"
time="2017-09-18T13:00:07Z" level=error msg="expand series set: series not found" source="engine.go:531"
time="2017-09-18T13:00:07Z" level=error msg="expand series set: series not found" source="engine.go:531"
time="2017-09-18T13:00:07Z" level=error msg="expand series set: series not found" source="engine.go:531"
time="2017-09-18T13:00:07Z" level=error msg="expand series set: series not found" source="engine.go:531"
time="2017-09-18T13:00:07Z" level=error msg="expand series set: series not found" source="engine.go:520"
time="2017-09-18T13:00:07Z" level=error msg="expand series set: series not found" source="engine.go:520"
time="2017-09-18T13:00:07Z" level=error msg="expand series set: series not found" source="engine.go:520"
time="2017-09-18T13:00:07Z" level=error msg="expand series set: series not found" source="engine.go:531"
time="2017-09-18T13:00:07Z" level=error msg="expand series set: series not found" source="engine.go:520"
time="2017-09-18T13:00:07Z" level=error msg="expand series set: series not found" source="engine.go:520"
time="2017-09-18T13:00:07Z" level=error msg="expand series set: series not found" source="engine.go:520"
time="2017-09-18T13:00:07Z" level=error msg="expand series set: series not found" source="engine.go:520"
time="2017-09-18T13:00:07Z" level=error msg="expand series set: series not found" source="engine.go:520"
time="2017-09-18T13:00:07Z" level=error msg="expand series set: series not found" source="engine.go:520"
time="2017-09-18T13:00:07Z" level=error msg="expand series set: series not found" source="engine.go:520"
ts=2017-09-18T13:00:07.809911408Z caller=head.go:261 msg="head GC completed" duration=396.72313ms
ts=2017-09-18T13:00:09.837225459Z caller=head.go:272 msg="WAL truncation completed" duration=2.027237502s
ts=2017-09-18T13:00:10.328382626Z caller=compact.go:359 msg="compact blocks" count=3 mint=1505714400000 maxt=1505736000000

@JorritSalverda JorritSalverda changed the title Actual interval larger then defined in version 2 beta 3 Gaps in data in version 2 (beta 3) Sep 18, 2017

fabxc commented Sep 18, 2017

This is due to the new staleness handling. The actual issue here is likely that those are cAdvisor metrics, which are broken as of recent versions, see: google/cadvisor#1704

However, the 'series not found' errors are likely unrelated and a bit concerning.
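
For anyone landing here later, the staleness change can be pictured with a small Python sketch (an illustrative model of the behaviour, not Prometheus internals): in 1.x an instant-vector lookup kept returning a series' last value for up to the 5-minute lookback window, while 2.0 writes staleness markers when a series disappears from a target, so the value vanishes from query results immediately.

```python
LOOKBACK = 300  # default 5-minute lookback for instant-vector lookups

def instant_value(samples, eval_time, stale_marker=None):
    """Toy model: without a staleness marker (1.x-style) the last sample
    within the lookback window is reused; with a marker (2.0-style) the
    series drops out of query results as soon as it goes stale."""
    if stale_marker is not None and stale_marker <= eval_time:
        return None  # staleness marker before eval time -> immediate gap
    in_window = [v for t, v in samples if eval_time - LOOKBACK <= t <= eval_time]
    return in_window[-1] if in_window else None

samples = [(0, 1.0), (10, 2.0)]
print(instant_value(samples, 60))                   # 2.0: carried forward
print(instant_value(samples, 60, stale_marker=30))  # None: marked stale
```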

JorritSalverda commented Sep 18, 2017

I'm indeed using cAdvisor as bundled in Google Container Engine, so I'll wait for the fix to be included in a future release.

I think (at least part of) the 'series not found' errors are caused by alert rules that I deploy to all my Prometheus servers but that only apply to some of them, mostly rules for CockroachDB, which I run in only one of my clusters.

Do alert and aggregation rules that query non-existent time series indeed produce that kind of error?

@brian-brazil brian-brazil changed the title Gaps in data in version 2 (beta 3) expand series set: series not found Sep 28, 2017

fabxc commented Oct 18, 2017

@JorritSalverda @grobie could you report whether this is resolved in rc.1?

grobie commented Oct 18, 2017

Confirmed!

@fabxc fabxc closed this Oct 18, 2017

lock bot commented Mar 23, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators Mar 23, 2019
