missing data #2884

Closed
thomas-mangin opened this Issue Jun 28, 2017 · 5 comments

thomas-mangin commented Jun 28, 2017

What did you do?

Configured Prometheus 2.0.0-alpha.2 (and then alpha.3, which totally changed the command line 🙈) to store a small number of samples for a long duration. The command line used is:

/opt/prometheus/prometheus --config.file=/opt/prometheus/prometheus.yml --storage.tsdb.retention 14w --storage.tsdb.min-block-duration 15m --storage.tsdb.max-block-duration 1d --log.level=info

And the configuration is:

global:
  scrape_interval:     1m
  evaluation_interval: 1m
  scrape_timeout:      1m

  external_labels:
      monitor: 'servers'

rule_files:
  # - "first.rules"

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['0.0.0.0:9090']

  - job_name: 'my-job'
    scrape_interval: 2m
    dns_sd_configs:
    - names:
      - '_prometheus._tcp.domain.com'
    relabel_configs:
    - source_labels: ['__meta_dns_srv_name']
      regex:         '([^-.]+)-([^-.]+)-([^-.]+)\..*'
      target_label:  'city'
      replacement:   '$1'
    - source_labels: ['__meta_dns_srv_name']
      regex:         '([^-.]+)-([^-.]+)-([^-.]+)\..*'
      target_label:  'pop'
      replacement:   '$2'
    - source_labels: ['__meta_dns_srv_name']
      regex:         '([^-.]+)-([^-.]+)-([^-.]+)\..*'
      target_label:  'node'
      replacement:   '$3'
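
To illustrate what the three relabel rules above are meant to extract, here is a minimal sketch in Python rather than Prometheus's own Go. The regex is copied from the configuration; the sample names are hypothetical, and whether the source label actually carries a value of this shape is an assumption. Prometheus anchors relabel regexes at both ends, which `re.fullmatch` mimics.

```python
import re

# Regex copied verbatim from the relabel_configs above.
PATTERN = r'([^-.]+)-([^-.]+)-([^-.]+)\..*'


def relabel(srv_name: str) -> dict:
    """Mimic the three rules: city=$1, pop=$2, node=$3."""
    # Relabel regexes are fully anchored in Prometheus; fullmatch does the same.
    match = re.fullmatch(PATTERN, srv_name)
    if match is None:
        # No match: the replace action leaves the target labels unset.
        return {}
    return {
        "city": match.group(1),
        "pop": match.group(2),
        "node": match.group(3),
    }


# Hypothetical name in the <city>-<pop>-<node>.<domain> shape.
print(relabel("lon-pop1-node3.domain.com"))
# A name without hyphens does not match, so no labels are produced.
print(relabel("_prometheus._tcp.domain.com"))
```

If the label value does not have the `city-pop-node.` shape, none of the three labels is set, which is worth checking on the targets page.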

What did you expect to see?

Regular storage of data, with a datapoint collected every 2 minutes (or every 1 minute for Prometheus's own metrics).

What did you see instead? Under which circumstances?

Long periods of time when data is not collected (on/off/on/off/...), including Prometheus's own data (go_, prometheus_, etc.). Data is present or missing in blocks of several hours.
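
One way to quantify the on/off pattern described above is to look for holes in the sample timestamps. The following is a minimal sketch, assuming the timestamps (Unix seconds) have already been fetched, for example from the HTTP API's `query_range` endpoint. The `find_gaps` helper, the tolerance factor and the sample data are hypothetical and not part of Prometheus.

```python
from typing import List, Tuple


def find_gaps(timestamps: List[float], interval: float,
              tolerance: float = 1.5) -> List[Tuple[float, float]]:
    """Return (last_seen, next_seen) pairs where consecutive samples
    are further apart than tolerance * interval seconds."""
    ordered = sorted(timestamps)
    gaps = []
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev > tolerance * interval:
            gaps.append((prev, cur))
    return gaps


# Hypothetical data: one sample every 120 s, with a three-hour hole.
samples = [0, 120, 240, 240 + 3 * 3600, 240 + 3 * 3600 + 120]
print(find_gaps(samples, interval=120))  # [(240, 11040)]
```

The tolerance of 1.5 intervals is an arbitrary choice that ignores small scrape jitter while still flagging a single missed scrape.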

Environment

Plain Ubuntu, with Prometheus extracted into /opt/.

# lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 16.04.2 LTS
Release:	16.04
Codename:	xenial
  • System information:
# uname -srm
Linux 4.4.0-31-generic x86_64

# uname -a
Linux prometheus 4.4.0-31-generic #50-Ubuntu SMP Wed Jul 13 00:07:12 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
# df -h
Filesystem      Size  Used Avail Use% Mounted on
udev            7.9G     0  7.9G   0% /dev
tmpfs           1.6G  121M  1.5G   8% /run
/dev/vda1       969G  4.3G  915G   1% /
tmpfs           7.8G     0  7.8G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs           7.8G     0  7.8G   0% /sys/fs/cgroup
tmpfs           100K     0  100K   0% /run/lxcfs/controllers
tmpfs           1.6G     0  1.6G   0% /run/user/0
# free -g
              total        used        free      shared  buff/cache   available
Mem:             15           0          11           0           3          14
Swap:            15           0          15
  • Prometheus version:
prometheus, version 2.0.0-alpha.3 (branch: master, revision: 70f96b0ffb6567100ffc91f7c3fe4e57c8d9dedb)
  build user:       root@5630fb1ab539
  build date:       20170622-10:04:46
  go version:       go1.8.3
  • Logs:
Jun 28 09:00:44 prometheus prometheus[17326]: ts=2017-06-28T08:00:44.980597902Z caller=compact.go:235 msg="compact blocks" blocks=[01BKPSR223E8NGSGNMVVNMSP6Q]
Jun 28 09:00:46 prometheus prometheus[17326]: ts=2017-06-28T08:00:46.547653328Z caller=compact.go:235 msg="compact blocks" blocks="[01BKPSR23AFCETM0SG6HPXFSGG 01BKPTJKPN8JSSBHWE1PM6QERW 01BKPVEZXMS5EYSQ2SMPAFXZHY]"
Jun 28 09:15:14 prometheus prometheus[17326]: ts=2017-06-28T08:15:14.967744726Z caller=compact.go:235 msg="compact blocks" blocks=[01BKPTJKMS2VARG8XB3CBEBCCN]
Jun 28 09:30:44 prometheus prometheus[17326]: ts=2017-06-28T08:30:44.985301651Z caller=compact.go:235 msg="compact blocks" blocks=[01BKPVEZVZ0MQJ9A068KXA3HTR]
Jun 28 09:45:15 prometheus prometheus[17326]: ts=2017-06-28T08:45:15.578283489Z caller=compact.go:235 msg="compact blocks" blocks=[01BKPW9HFFBD5RKX09N18Q6S5H]
Jun 28 09:45:17 prometheus prometheus[17326]: ts=2017-06-28T08:45:17.5172516Z caller=compact.go:235 msg="compact blocks" blocks="[01BKPW9HGQNVBJRSMJN6P7C24A 01BKPX5XQS8H8109WS0QH4AYYR 01BKPY0FXTW5YPBXC1QC8SV594]"
Jun 28 10:00:44 prometheus prometheus[17326]: ts=2017-06-28T09:00:44.977916884Z caller=compact.go:235 msg="compact blocks" blocks=[01BKPX5XNYYZAWT93W2NE646XB]
Jun 28 10:04:19 prometheus prometheus[17326]: time="2017-06-28T10:04:19+01:00" level=warning msg="Received SIGTERM, exiting gracefully..." source="main.go:327"
Jun 28 10:04:19 prometheus prometheus[17326]: time="2017-06-28T10:04:19+01:00" level=info msg="See you next time!" source="main.go:334"
Jun 28 10:04:19 prometheus prometheus[17326]: time="2017-06-28T10:04:19+01:00" level=info msg="Stopping target manager..." source="targetmanager.go:81"
Jun 28 10:04:19 prometheus prometheus[17326]: time="2017-06-28T10:04:19+01:00" level=info msg="Stopping rule manager..." source="manager.go:454"
Jun 28 10:04:19 prometheus prometheus[17326]: time="2017-06-28T10:04:19+01:00" level=info msg="Rule manager stopped." source="manager.go:460"
Jun 28 10:04:19 prometheus prometheus[17326]: time="2017-06-28T10:04:19+01:00" level=info msg="Stopping notification handler..." source="notifier.go:471"
Jun 28 10:04:22 prometheus prometheus[13026]: time="2017-06-28T10:04:22+01:00" level=info msg="Starting prometheus (version=2.0.0-alpha.3, branch=master, revision=70f96b0ffb6567100ffc91f7c3fe4e57c8d9dedb)" source="main.go:196"
Jun 28 10:04:22 prometheus prometheus[13026]: time="2017-06-28T10:04:22+01:00" level=info msg="Build context (go=go1.8.3, user=root@5630fb1ab539, date=20170622-10:04:46)" source="main.go:197"
Jun 28 10:04:22 prometheus prometheus[13026]: time="2017-06-28T10:04:22+01:00" level=info msg="Host details (Linux 4.4.0-31-generic #50-Ubuntu SMP Wed Jul 13 00:07:12 UTC 2016 x86_64 prometheus (none))" source="main.go:198"
Jun 28 10:04:22 prometheus prometheus[13026]: time="2017-06-28T10:04:22+01:00" level=info msg="Starting tsdb" source="main.go:210"
Jun 28 10:04:22 prometheus prometheus[13026]: time="2017-06-28T10:04:22+01:00" level=info msg="tsdb started" source="main.go:216"
Jun 28 10:04:22 prometheus prometheus[13026]: time="2017-06-28T10:04:22+01:00" level=info msg="Loading configuration file /opt/prometheus/prometheus.yml" source="main.go:344"
Jun 28 10:04:22 prometheus prometheus[13026]: time="2017-06-28T10:04:22+01:00" level=info msg="Listening on 0.0.0.0:9090" source="web.go:259"
Jun 28 10:04:22 prometheus prometheus[13026]: time="2017-06-28T10:04:22+01:00" level=info msg="Starting target manager..." source="targetmanager.go:67"
Jun 28 10:15:06 prometheus prometheus[13026]: ts=2017-06-28T09:15:06.889135161Z caller=compact.go:235 msg="compact blocks" blocks=[01BKPY0FVWVPNQ0FAVTDMYTNF4]
Jun 28 10:30:36 prometheus prometheus[13026]: ts=2017-06-28T09:30:36.848152784Z caller=compact.go:235 msg="compact blocks" blocks=[01BKPYWVG04WQB027YHV8RGGRQ]
Jun 28 10:30:37 prometheus prometheus[13026]: ts=2017-06-28T09:30:37.218956526Z caller=compact.go:235 msg="compact blocks" blocks="[01BKPYWVHHQG9SVVPDJV5YEMAP 01BKPZQ589HN5C6YWN1140RT4M 01BKQ0KHDGQG1JE7EE6FBKBBKC]"
Jun 28 10:30:37 prometheus prometheus[13026]: ts=2017-06-28T09:30:37.669783753Z caller=compact.go:235 msg="compact blocks" blocks="[01BKPVF1EK2S2J5XK71QKP6ZQY 01BKPY0HTDMVFHJNBFG94QNN96 01BKQ0KHS27Q1KZE98DWKGQ4Q0]"
Jun 28 10:45:07 prometheus prometheus[13026]: ts=2017-06-28T09:45:07.531736233Z caller=compact.go:235 msg="compact blocks" blocks=[01BKPZQ56D26RFW83TV43YWGVD]
Jun 28 11:00:36 prometheus prometheus[13026]: ts=2017-06-28T10:00:36.882579978Z caller=compact.go:235 msg="compact blocks" blocks=[01BKQ0KHBYABVATFCP72E4DJ9B]
Jun 28 11:15:06 prometheus prometheus[13026]: ts=2017-06-28T10:15:06.893539624Z caller=compact.go:235 msg="compact blocks" blocks=[01BKQ1E3MH71VYZVAJS4GTD80T]
Jun 28 11:15:07 prometheus prometheus[13026]: ts=2017-06-28T10:15:07.303701059Z caller=compact.go:235 msg="compact blocks" blocks="[01BKQ1E3PB0V6QZQF8E6SQZ1CS 01BKQ2AF8J5DK2F1SE5X840578 01BKQ350WDBN1HS79DKKJA718Q]"
Jun 28 11:30:36 prometheus prometheus[13026]: ts=2017-06-28T10:30:36.847343542Z caller=compact.go:235 msg="compact blocks" blocks=[01BKQ2AF5X1SPAF8R34718Z7BC]
Jun 28 11:45:06 prometheus prometheus[13026]: ts=2017-06-28T10:45:06.895246313Z caller=compact.go:235 msg="compact blocks" blocks=[01BKQ350T470WZTNXWZ1EAQCPX]
Jun 28 12:00:36 prometheus prometheus[13026]: ts=2017-06-28T11:00:36.853000624Z caller=compact.go:235 msg="compact blocks" blocks=[01BKQ41CZYJNWRC448QXCYJ51A]
Jun 28 12:00:37 prometheus prometheus[13026]: ts=2017-06-28T11:00:37.432167673Z caller=compact.go:235 msg="compact blocks" blocks="[01BKQ41D1FSD560ATQZ97Y8KVY 01BKQ4VYPFQVF7GZMS4MA71GMX 01BKQ5RAVMGRGB3BAYA406JCPD]"
Jun 28 12:15:06 prometheus prometheus[13026]: ts=2017-06-28T11:15:06.856137447Z caller=compact.go:235 msg="compact blocks" blocks=[01BKQ4VYN1F1FA341BFMYDM9SN]
Jun 28 12:30:36 prometheus prometheus[13026]: ts=2017-06-28T11:30:36.843896527Z caller=compact.go:235 msg="compact blocks" blocks=[01BKQ5RASXQX01ZASBFGANSRNQ]
Jun 28 12:45:07 prometheus prometheus[13026]: ts=2017-06-28T11:45:07.085848071Z caller=compact.go:235 msg="compact blocks" blocks=[01BKQ6JWDCHDMJEKYXSVBJ54DX]
Jun 28 12:45:07 prometheus prometheus[13026]: ts=2017-06-28T11:45:07.499379374Z caller=compact.go:235 msg="compact blocks" blocks="[01BKQ6JWF847A5QAF8J5W1XM01 01BKQ7F8NB5Y2QSXVJKDK6QKF2 01BKQ89TGDY4J5KW5211AJXW9P]"
Jun 28 12:45:07 prometheus prometheus[13026]: ts=2017-06-28T11:45:07.925099835Z caller=compact.go:235 msg="compact blocks" blocks="[01BKQ351978G1YBYPK94YYHFPW 01BKQ5RBDRN99NX54EDWHF3XS0 01BKQ89TXB8W3X9GBCX273ZGN9]"
Jun 28 13:00:36 prometheus prometheus[13026]: ts=2017-06-28T12:00:36.841355359Z caller=compact.go:235 msg="compact blocks" blocks=[01BKQ7F8KZ2SQ7RQ981A1B9FBS]
Jun 28 13:15:07 prometheus prometheus[13026]: ts=2017-06-28T12:15:07.525356553Z caller=compact.go:235 msg="compact blocks" blocks=[01BKQ89TE18H0MRW5KQDC5N4QW]
Jun 28 13:30:36 prometheus prometheus[13026]: ts=2017-06-28T12:30:36.838954937Z caller=compact.go:235 msg="compact blocks" blocks=[01BKQ966DYXE38YVYVQQXN6JGZ]
Jun 28 13:30:37 prometheus prometheus[13026]: ts=2017-06-28T12:30:37.359462963Z caller=compact.go:235 msg="compact blocks" blocks="[01BKQ966F9JS9J1YTA9J9C2A3A 01BKQA0RR52N1935P8TZPY84AD 01BKQAX4966GQ2RVHR7SFFWZEN]"
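
To make the compaction pattern in these logs easier to read, the lines can be reduced to a timestamp and a list of block IDs. This is a minimal sketch that assumes the logfmt layout shown above (`ts=`, `caller=`, `msg="compact blocks"`, `blocks=`), and the `parse_compaction` helper is hypothetical. It only parses the lines and does not interpret what each compaction means.

```python
import re

# Matches both blocks=[ID] and blocks="[ID ID ID]" as seen in the logs above.
LINE_RE = re.compile(
    r'ts=(\S+) caller=\S+ msg="compact blocks" blocks="?\[([^\]]*)\]"?'
)


def parse_compaction(line):
    """Return (timestamp, [block IDs]) or None for non-compaction lines."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    return match.group(1), match.group(2).split()


# Two lines taken from the log excerpt above (syslog prefix omitted).
single = ('ts=2017-06-28T08:00:44.980597902Z caller=compact.go:235 '
          'msg="compact blocks" blocks=[01BKPSR223E8NGSGNMVVNMSP6Q]')
merged = ('ts=2017-06-28T08:00:46.547653328Z caller=compact.go:235 '
          'msg="compact blocks" blocks="[01BKPSR23AFCETM0SG6HPXFSGG '
          '01BKPTJKPN8JSSBHWE1PM6QERW 01BKPVEZXMS5EYSQ2SMPAFXZHY]"')

for line in (single, merged):
    ts, blocks = parse_compaction(line)
    print(ts, len(blocks))
```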

If this is not a PEBKAC, I am happy to provide more information, install a version compiled from master, organise supervised access to the server, or help in any other way.

Member

fabxc commented Jul 3, 2017

Thanks for reporting.
That's likely a querying issue, which was fixed in HEAD recently. We should do another release soon.

Author

thomas-mangin commented Jul 3, 2017

Thank you for looking into this issue. To make sure the next release does indeed address the problem, I have compiled and deployed today's HEAD. I will only be able to confirm whether it works in a few days, because Prometheus reported:

ERRO[0000] Error opening memory series storage: found existing files in storage path that do not look like storage files compatible with this version of Prometheus; please delete the files in the storage path or choose a different storage path  source="main.go:192"

I also noticed that 'w' and 'd' are no longer accepted as units for storage.local.retention, so I used 2400 hours, which is around 14 weeks, or roughly one quarter ...

root@prometheus:/opt/prometheus# /opt/prometheus/prometheus --config.file=/opt/prometheus/prometheus.yml --storage.local.retention 2000h --log.level=info
INFO[0000] Starting prometheus (version=1.7.0, branch=master, revision=43075d0215c94c1f57efeb725a7880e7ff16ae99)  source="main.go:88"
INFO[0000] Build context (go=go1.8.3, user=thomas@bgp, date=20170703-11:41:35)  source="main.go:89"
INFO[0000] Host details (Linux 4.4.0-31-generic #50-Ubuntu SMP Wed Jul 13 00:07:12 UTC 2016 x86_64 prometheus (none))  source="main.go:90"
INFO[0000] Loading configuration file /opt/prometheus/prometheus.yml  source="main.go:252"
ERRO[0000] Error opening memory series storage: found existing files in storage path that do not look like storage files compatible with this version of Prometheus; please delete the files in the storage path or choose a different storage path  source="main.go:192"
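
The hours-to-weeks conversion mentioned above can be checked with a short sketch. The helper names are hypothetical. The inputs are the 14w from the original command line, the 2400 hours quoted in the text, and the 2000h passed on the command line shown here.

```python
HOURS_PER_WEEK = 7 * 24  # 168


def weeks_to_hours(weeks: float) -> float:
    """Convert a retention given in weeks to hours."""
    return weeks * HOURS_PER_WEEK


def hours_to_weeks(hours: float) -> float:
    """Convert a retention given in hours to weeks."""
    return hours / HOURS_PER_WEEK


print(weeks_to_hours(14))               # 2352
print(round(hours_to_weeks(2400), 1))   # 14.3
print(round(hours_to_weeks(2000), 1))   # 11.9
```

So 14 weeks corresponds to 2352 hours, 2400 hours is slightly more than 14 weeks, and 2000 hours is just under 12 weeks.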

Member

fabxc commented Jul 3, 2017

I was referring to HEAD of the dev-2.0 branch. The error message comes from the old storage in master.
Support for w and d in the flags was only added in dev-2.0 as well.

Author

thomas-mangin commented Jul 3, 2017

Thank you, changed branch to dev-2.0 - I should have noticed the 1.7.0 in the line ... 🤦‍♂️

Starting prometheus (version=2.0.0-alpha.3, branch=dev-2.0, revision=3845dfb7150bc19e704386ff50db2e07db3d1942)

Still got the same error, "Error opening memory series storage: found existing files in storage path that do not look like storage files compatible with this version of Prometheus; ...", when I provided the alpha3 data file (or I did something wrong; it does not matter at this point).

lock bot commented Mar 23, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

lock bot locked and limited conversation to collaborators Mar 23, 2019
