Error compaction failed - prometheus ate disk and memory #4354

Closed
dmitriy-lukyanchikov opened this Issue Jul 5, 2018 · 3 comments

dmitriy-lukyanchikov commented Jul 5, 2018

Bug Report

I run Prometheus 2.3.1 and suddenly noticed an out-of-memory error. After a short investigation I found that the wal directory had grown to more than 63GB; usually it is less than 5GB. The log shows the error below. This line appears more than 200 times, recurring every 1-5 minutes for almost 24 hours:

component=tsdb msg="compaction failed" err="persist head block: write compaction: add series: symbol entry for \"/rootfs/run/docker/netns/30fc5eaffef7\" does not exist"

After I restarted the stopped Prometheus, compaction succeeded, and after a couple of hours Prometheus started working normally again.
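For anyone who wants to catch this kind of WAL growth earlier, here is a minimal sketch of a size check. It assumes the data directory from the `--storage.tsdb.path` flag (here `/prometheus`) contains a `wal` subdirectory, as described in this report. The function names and the threshold are hypothetical and are not part of Prometheus.

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of all regular files found under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                total += os.path.getsize(full)
            except OSError:
                # A file can disappear mid-walk (e.g. during WAL truncation).
                pass
    return total


def wal_too_large(data_dir, limit_bytes):
    """Return True if <data_dir>/wal is larger than limit_bytes.

    Assumption: the WAL lives in a "wal" subdirectory of the data dir.
    """
    return dir_size_bytes(os.path.join(data_dir, "wal")) > limit_bytes
```

With the numbers from this report, `wal_too_large("/prometheus", 5 * 1024**3)` would flag a WAL that has grown past its usual 5GB.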

What did you expect to see?
I expected not to see this error.
What did you see instead? Under which circumstances?

Environment
Dedicated server running Ubuntu 16.04.2

  • System information:

Linux 4.8.0-42-generic x86_64

  • Prometheus version:
prometheus, version 2.3.1 (branch: HEAD, revision: 188ca45bd85ce843071e768d855722a9d9dabe03)
  build user:       root@82ef94f1b8f7
  build date:       20180619-15:56:22
  go version:       go1.10.3
  • Prometheus configuration file:
    runtime flags
      - '--storage.tsdb.path=/prometheus'
      - '--storage.tsdb.retention=21d'
      - '--web.enable-admin-api'
      - '--storage.tsdb.max-block-duration=1d'
      - '--storage.tsdb.min-block-duration=2h'
  • Logs:
level=info ts=2018-07-04T09:01:34.889427586Z caller=head.go:348 component=tsdb msg="head GC completed" duration=13.182340309s
level=info ts=2018-07-04T09:01:54.363227435Z caller=head.go:357 component=tsdb msg="WAL truncation completed" duration=19.473733855s
level=info ts=2018-07-04T09:03:16.078175119Z caller=compact.go:347 component=tsdb msg="compact blocks" count=3 mint=1530662400000 maxt=1530684000000 ulid=01CHJ8DTZHM5HYAC100XDEWN1G sources="[01CHHKPHT3WSRWBRW0WCXDY4AD 01CHHTHBXM2BWG43PHG6RCP7BJ 01CHJ1D2XES1458XJCWHS527AV]"
level=error ts=2018-07-04T11:00:23.687882381Z caller=db.go:277 component=tsdb msg="compaction failed" err="persist head block: write compaction: add series: symbol entry for \"/rootfs/run/docker/netns/30fc5eaffef7\" does not exist"
level=error ts=2018-07-04T11:01:29.968016703Z caller=db.go:277 component=tsdb msg="compaction failed" err="persist head block: write compaction: add series: symbol entry for \"/rootfs/run/docker/netns/30fc5eaffef7\" does not exist"
level=error ts=2018-07-04T11:02:37.525995318Z caller=db.go:277 component=tsdb msg="compaction failed" err="persist head block: write compaction: add series: symbol entry for \"/rootfs/run/docker/netns/30fc5eaffef7\" does not exist"
level=error ts=2018-07-04T11:03:49.929893783Z caller=db.go:277 component=tsdb msg="compaction failed" err="persist head block: write compaction: add series: symbol entry for \"/rootfs/run/docker/netns/30fc5eaffef7\" does not exist"
level=error ts=2018-07-04T11:05:11.971452753Z caller=db.go:277 component=tsdb msg="compaction failed" err="persist head block: write compaction: add series: symbol entry for \"/rootfs/run/docker/netns/30fc5eaffef7\" does not exist"
level=error ts=2018-07-04T11:06:36.943008251Z caller=db.go:277 component=tsdb msg="compaction failed" err="persist head block: write compaction: add series: symbol entry for \"/rootfs/run/docker/netns/30fc5eaffef7\" does not exist"
level=error ts=2018-07-04T11:08:17.242492107Z caller=db.go:277 component=tsdb msg="compaction failed" err="persist head block: write compaction: add series: symbol entry for \"/rootfs/run/docker/netns/30fc5eaffef7\" does not exist"
level=error ts=2018-07-04T11:10:29.578448951Z caller=db.go:277 component=tsdb msg="compaction failed" err="persist head block: write compaction: add series: symbol entry for \"/rootfs/run/docker/netns/30fc5eaffef7\" does not exist"
level=error ts=2018-07-04T11:12:37.80025655Z caller=db.go:277 component=tsdb msg="compaction failed" err="persist head block: write compaction: add series: symbol entry for \"/rootfs/run/docker/netns/30fc5eaffef7\" does not exist"
level=error ts=2018-07-04T11:14:48.391295713Z caller=db.go:277 component=tsdb msg="compaction failed" err="persist head block: write compaction: add series: symbol entry for \"/rootfs/run/docker/netns/30fc5eaffef7\" does not exist"
level=error ts=2018-07-04T11:16:59.371132223Z caller=db.go:277 component=tsdb msg="compaction failed" err="persist head block: write compaction: add series: symbol entry for \"/rootfs/run/docker/netns/30fc5eaffef7\" does not exist"
level=error ts=2018-07-04T11:19:13.284698435Z caller=db.go:277 component=tsdb msg="compaction failed" err="persist head block: write compaction: add series: symbol entry for \"/rootfs/run/docker/netns/30fc5eaffef7\" does not exist"
level=error ts=2018-07-04T11:21:28.552850736Z caller=db.go:277 component=tsdb msg="compaction failed" err="persist head block: write compaction: add series: symbol entry for \"/rootfs/run/docker/netns/30fc5eaffef7\" does not exist"
level=error ts=2018-07-04T11:23:41.634100091Z caller=db.go:277 component=tsdb msg="compaction failed" err="persist head block: write compaction: add series: symbol entry for \"/rootfs/run/docker/netns/30fc5eaffef7\" does not exist"
level=error ts=2018-07-04T11:25:55.699553549Z caller=db.go:277 component=tsdb msg="compaction failed" err="persist head block: write compaction: add series: symbol entry for \"/rootfs/run/docker/netns/30fc5eaffef7\" does not exist"
level=error ts=2018-07-04T11:28:12.070165119Z caller=db.go:277 component=tsdb msg="compaction failed" err="persist head block: write compaction: add series: symbol entry for \"/rootfs/run/docker/netns/30fc5eaffef7\" does not exist"
level=error ts=2018-07-04T11:30:30.033660667Z caller=db.go:277 component=tsdb msg="compaction failed" err="persist head block: write compaction: add series: symbol entry for \"/rootfs/run/docker/netns/30fc5eaffef7\" does not exist"
level=error ts=2018-07-04T11:32:39.815838391Z caller=db.go:277 component=tsdb msg="compaction failed" err="persist head block: write compaction: add series: symbol entry for \"/rootfs/run/docker/netns/30fc5eaffef7\" does not exist"
level=error ts=2018-07-04T11:35:00.475075179Z caller=db.go:277 component=tsdb msg="compaction failed" err="persist head block: write compaction: add series: symbol entry for \"/rootfs/run/docker/netns/30fc5eaffef7\" does not exist"
level=error ts=2018-07-04T11:37:17.528911582Z caller=db.go:277 component=tsdb msg="compaction failed" err="persist head block: write compaction: add series: symbol entry for \"/rootfs/run/docker/netns/30fc5eaffef7\" does not exist"
level=error ts=2018-07-04T11:39:45.481329175Z caller=db.go:277 component=tsdb msg="compaction failed" err="persist head block: write compaction: add series: symbol entry for \"/rootfs/run/docker/netns/30fc5eaffef7\" does not exist"
level=error ts=2018-07-04T11:42:05.332082295Z caller=db.go:277 component=tsdb msg="compaction failed" err="persist head block: write compaction: add series: symbol entry for \"/rootfs/run/docker/netns/30fc5eaffef7\" does not exist"
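
To quantify how often the error repeats, a short script along these lines could count the "compaction failed" lines and measure the gaps between them. This is only a sketch: it assumes the logfmt-style layout shown in the lines above (`ts=...Z` and `msg="compaction failed"`), and the function names are made up for illustration.

```python
import re
from datetime import datetime

# Matches the ts= field; fractional seconds are dropped for simplicity.
TS_RE = re.compile(r"ts=(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.\d+)?Z")


def compaction_failures(lines):
    """Return timestamps (second resolution) of 'compaction failed' lines."""
    stamps = []
    for line in lines:
        if 'msg="compaction failed"' not in line:
            continue
        m = TS_RE.search(line)
        if m:
            stamps.append(datetime.strptime(m.group(1), "%Y-%m-%dT%H:%M:%S"))
    return stamps


def gaps_seconds(stamps):
    """Return the time in seconds between consecutive timestamps."""
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]
```

Feeding the log file to `compaction_failures` and then `gaps_seconds` gives the number of failures and the interval between retries.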

Member

simonpasquier commented Jul 6, 2018

Thanks for the report!
@krasi-georgiev probably another case for the tsdb scan tool?

Member

krasi-georgiev commented Jul 6, 2018

Haven't seen this one before.
It seems that persisting the in-memory block fails, which prevents the WAL files from being cleared.

@dmitriy-lukyanchikov do you think you can send me the WAL privately at kgeorgie at redhat.com?
64GB would be too much, but the WAL dir should include many files. Can you try to replicate with just one or a few files from that dir and send me those? Maybe just start moving files out of the dir until you no longer see the error, to find the offending file.
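
The "move files out until the error goes away" idea can be sketched as a simple reduction loop. Everything here is hypothetical: `still_fails` stands in for whatever check is used (for example, starting Prometheus against a copy of the directory and looking for the error), and the sketch assumes that the full set of files reproduces the error and that files can be dropped independently, which may not hold for WAL segments.

```python
def reduce_to_offending(files, still_fails):
    """Drop files one at a time, keeping only those needed to reproduce.

    files:       list of file names that together reproduce the error
    still_fails: callable taking a list of files and returning True if
                 the error still occurs with just those files (hypothetical)
    """
    remaining = list(files)
    for f in list(files):
        trial = [x for x in remaining if x != f]
        if still_fails(trial):
            # The error reproduces without f, so f is not needed.
            remaining = trial
    return remaining
```

Always run this kind of experiment on a copy of the data directory, never on the live one.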

Member

krasi-georgiev commented Nov 21, 2018

This is the same as #4757, which was caused by unbounded label cardinality from a misconfigured exporter.
Feel free to reopen if you think the cause of this one is different.
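
To check an exporter for this kind of problem, one option is to count the distinct values each label takes in its /metrics output. The sketch below assumes the Prometheus text exposition format and uses a simplified regex rather than a full parser. The metric and label names in the usage note are invented for illustration; the issue does not say which metric carried the `/rootfs/run/docker/netns/...` value.

```python
import re
from collections import defaultdict

# Simplified patterns for 'metric{label="value",...} sample' lines.
SAMPLE_RE = re.compile(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)\{(.*)\}\s")
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def label_cardinality(exposition_text):
    """Map (metric, label) to the number of distinct values seen."""
    values = defaultdict(set)
    for line in exposition_text.splitlines():
        if line.startswith("#"):
            continue  # skip HELP/TYPE comments
        m = SAMPLE_RE.match(line)
        if not m:
            continue  # sample without labels, or blank line
        metric, labels = m.groups()
        for name, value in LABEL_RE.findall(labels):
            values[(metric, name)].add(value)
    return {key: len(vals) for key, vals in values.items()}
```

A label such as a hypothetical `mountpoint` on `demo_fs_bytes` whose count keeps growing between scrapes (for example, one value per short-lived Docker network namespace) is a likely candidate for unbounded cardinality.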
