
Prometheus generates too many identical files during compaction #5137

Closed
eahydra opened this Issue Jan 25, 2019 · 8 comments

eahydra commented Jan 25, 2019

Proposal

Use case. Why is this important?
Prometheus generates too many identical files during compaction.

Bug Report

What did you do?
I got an alert that disk usage had reached 90% and found that the Prometheus TSDB was using too much space.
There are many identical files under different parent directories, and their meta.json files have the same content.

What did you expect to see?
Prometheus should not generate so many identical files.

What did you see instead? Under which circumstances?
Too many identical files after compaction.

Environment

  • System information:

    An internal build based on CentOS 7

  • Prometheus version:
    prometheus, version 2.5.0 (branch: release/20181129-11-33, revision: 67dc912)
    build user: admin@rs7h13559.et2sqa
    build date: 20181129-03:34:17
    go version: go1.11

  • Prometheus configuration file:

global:
  scrape_interval:     30s
  evaluation_interval: 30s

scrape_configs:
- job_name: 'inspector'
  scrape_interval: 30s
  scheme: http
  metrics_path: "/metrics"
  static_configs:
      - targets: ["target1:8070", "target2:8070"]

  • Logs:

The logs were lost.

codesome commented Jan 25, 2019

During compaction, Prometheus can take up to 2x the disk space currently being used, because compaction first creates the new blocks and then deletes the old ones.

Is your disk usage still high even after compaction has ended?
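
For a rough sense of that upper bound, here is a minimal sketch (not part of Prometheus; the TSDB path is illustrative) that sums the size of every block directory and prints the approximate worst-case peak during compaction:

```python
# Minimal sketch: sum the sizes of all TSDB block directories and estimate the
# worst-case peak during compaction, assuming the ~2x upper bound described
# above. The TSDB path is illustrative.
import os

def dir_size(path):
    """Total size in bytes of all regular files under `path`."""
    total = 0
    for root, _, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total

tsdb_path = "/data/prometheus"  # value of --storage.tsdb.path
blocks = [
    os.path.join(tsdb_path, d)
    for d in os.listdir(tsdb_path)
    if os.path.isdir(os.path.join(tsdb_path, d)) and d != "wal"
]

current = sum(dir_size(b) for b in blocks)
print(f"current block usage:                {current / 1024**3:.1f} GiB")
print(f"worst case during compaction (~2x): {2 * current / 1024**3:.1f} GiB")
```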

eahydra commented Jan 25, 2019

During compaction, Prometheus can take up to 2x the disk space currently being used, because compaction first creates the new blocks and then deletes the old ones.

Is your disk usage still high even after compaction has ended?

In order to ensure that the other services on this machine were not affected, I cleared the data first. Later I will try to reproduce this on a machine with more disk capacity.

eahydra commented Jan 28, 2019

@codesome Hi, I have encountered this problem again.

The TSDB directory looks like this:

drwxr-xr-x 3 root root 4096 Jan 26 17:00 01D24P6CN8MZGWH5M75ASVAAY5
drwxr-xr-x 3 root root 4096 Jan 26 23:00 01D25ASK8QG2F0RPA2QK05K1K4
drwxr-xr-x 3 root root 4096 Jan 27 05:00 01D25ZCJ4AG7EGY95XGQKJXVTG
drwxr-xr-x 3 root root 4096 Jan 27 05:00 01D25ZCRKWEJ2G9P0Z32DFD18F
drwxr-xr-x 3 root root 4096 Jan 27 05:00 01D25ZCWNNSAARFJ2DGCSAPHS1
drwxr-xr-x 3 root root 4096 Jan 27 07:00 01D26689CB6CTNMDD73EJA2PRS
drwxr-xr-x 3 root root 4096 Jan 27 07:01 01D266A7DTB30924ZESRQXYYE9
drwxr-xr-x 3 root root 4096 Jan 27 07:02 01D266C55JE7HVZP5F1EF8PGZS
drwxr-xr-x 3 root root 4096 Jan 27 07:03 01D266E2W5PDG23WZCDVSDRX3V
drwxr-xr-x 3 root root 4096 Jan 27 07:04 01D266G0J9ZDNDFK7CVWJH0775
drwxr-xr-x 3 root root 4096 Jan 27 07:05 01D266HY8TAMQDMK80ACNP1ETH
drwxr-xr-x 3 root root 4096 Jan 27 07:06 01D266KVXFCZVV86KZRH4SKF4V
drwxr-xr-x 3 root root 4096 Jan 27 07:07 01D266NSNX3JNAE8KPCRNSBX7S
drwxr-xr-x 3 root root 4096 Jan 27 07:08 01D266QQAQF58M4ABWJHY162CV
drwxr-xr-x 3 root root 4096 Jan 27 07:09 01D266SMYGZWYDYCSB1X140G78
drwxr-xr-x 3 root root 4096 Jan 27 07:10 01D266VJKQYTP5FX96CTJWM06M
....
....
drwxr-xr-x 3 root root 4096 Jan 28 16:45 01D29T5Q1ZHCK1JPEHKB3DM415
drwxr-xr-x 3 root root 4096 Jan 28 16:46 01D29T7NCC31YTZ8S3AKAWRP0R
drwxr-xr-x 3 root root 4096 Jan 28 16:48 01D29T9KKHWZP5B3Y4HCXBVPQE
drwxr-xr-x 3 root root 4096 Jan 28 16:49 01D29TBHQ8B5MH38JMJ1QCZXWS
drwxr-xr-x 3 root root 4096 Jan 28 16:50 01D29TDG6EPQHEEK9QS320KCYA
drwxr-xr-x 3 root root 4096 Jan 28 16:51 01D29TFEM4SZ6J054C3486QZ8D
drwxr-xr-x 3 root root 4096 Jan 28 16:52 01D29THCZSP88DYMBQ08REE5TN
drwxr-xr-x 3 root root 4096 Jan 28 16:53 01D29TKB7M7V7WTHTPKVV88CY1
drwxr-xr-x 3 root root 4096 Jan 28 16:54 01D29TN9DXXRBW2F7K0RX7JX23
drwxr-xr-x 3 root root 4096 Jan 28 16:55 01D29TQ7NAW5N9TYSQ5C4TBV9J
drwxr-xr-x 3 root root 4096 Jan 28 16:56 01D29TS5VS3JPYMSE19ZC0XJF4
drwxr-xr-x 3 root root 4096 Jan 28 16:57 01D29TV452KKF2ZD4J46N9R3VV
drwxr-xr-x 3 root root 4096 Jan 28 16:58 01D29TX2DBF4H60E5P76E3J2YW
-rw-r--r-- 1 root root    0 Dec 30 18:16 lock
drwxr-xr-x 3 root root 4096 Jan 28 16:49 wal
codesome commented Jan 28, 2019

It is difficult to figure out the problem by looking at just the directory names and their timestamps.

I don't think the meta files can be the same; that would mean overlapping blocks, which should be flagged by Prometheus.

Can you give some more info?

  • meta files in all the block directories.
  • The flags used while running Prometheus.
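
If it helps, here is a minimal sketch (not part of Prometheus; the path is illustrative, and the field names follow the TSDB block meta.json format) that reads every block's meta.json and reports overlapping time ranges, which identical meta files would imply:

```python
# Minimal sketch: read meta.json from every block directory and report blocks
# whose [minTime, maxTime) ranges overlap, which identical meta files would
# imply. The TSDB path is illustrative.
import json
import os

tsdb_path = "/data/prometheus"  # value of --storage.tsdb.path

metas = []
for d in sorted(os.listdir(tsdb_path)):
    meta_file = os.path.join(tsdb_path, d, "meta.json")
    if os.path.isfile(meta_file):
        with open(meta_file) as f:
            meta = json.load(f)
        metas.append((meta["minTime"], meta["maxTime"], meta.get("ulid", d)))

# Sort by minTime and flag any block that starts before the previous one ends.
metas.sort()
for (p_min, p_max, p_ulid), (c_min, c_max, c_ulid) in zip(metas, metas[1:]):
    if c_min < p_max:
        print(f"overlap: {p_ulid} [{p_min}, {p_max}) and {c_ulid} [{c_min}, {c_max})")
```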
eahydra commented Jan 28, 2019

  • meta files in all the block directories.

I have deleted these files.

  • The flags used while running Prometheus.
--config.file /data/prometheus.yml --web.listen-address=:10400 --storage.tsdb.path=/data/prometheus --storage.tsdb.retention=15d
simonpasquier commented Jan 28, 2019

It smells like #4392 and #5120.

simonpasquier commented Feb 1, 2019

Thanks for reporting! I'm closing it as this is a duplicate of #5135. The tsdb fix is prometheus/tsdb#512 (still under discussion).

eahydra commented Feb 1, 2019

Thanks for reporting! I'm closing it as this is a duplicate of #5135. The tsdb fix is prometheus/tsdb#512 (still under discussion).

Good job!
