Plan compaction: too many open files #3875

Closed
ryanash999 opened this Issue Feb 21, 2018 · 3 comments

ryanash999 commented Feb 21, 2018

What did you do?
Upgraded Prometheus from 1.8.2 to 2.1.0.

What did you expect to see?

We had remote read configured with both instances running locally. This migration plan worked fine in our lower environments, but once we reached our upper environments, which hold more data, we started seeing the error below on numerous servers.

What did you see instead? Under which circumstances?

We are seeing the 'compaction failed' error below, which reports 'plan compaction' and 'too many open files'.

Environment

  • System information:

    Linux 3.10.0-693.11.6.el7.x86_64 x86_64

  • Prometheus version:

    prometheus, version 2.1.0 (branch: HEAD, revision: 85f23d82a045d103ea7f3c89a91fba4a93e6367a)
      build user:       root@6e784304d3ff
      build date:       20180119-12:01:23
      go version:       go1.9.2
  • Prometheus configuration file:
---
global:
  scrape_interval: 30s
  evaluation_interval: 30s
rule_files:
- "/etc/prometheus/alert.rules"
scrape_configs:
- job_name: clients
  file_sd_configs:
  - files:
    - "/etc/prometheus/targets/*.json"
alerting:
  alert_relabel_configs: []
  alertmanagers: []
remote_read:
  - url: http://localhost:8080/api/v1/read
remote_write: []
  • Logs:
Feb 21 14:37:13 foo systemd: Started Prometheus Monitoring framework.
Feb 21 14:37:13 foo systemd: Starting Prometheus Monitoring framework...
Feb 21 14:37:13 foo prometheus: level=info ts=2018-02-21T14:37:13.198972901Z caller=main.go:225 msg="Starting Prometheus" version="(version=2.1.0, branch=HEAD, revision=85f23d82a045d103ea7f3c89a91fba4a93e6367a)"
Feb 21 14:37:13 foo prometheus: level=info ts=2018-02-21T14:37:13.199086548Z caller=main.go:226 build_context="(go=go1.9.2, user=root@6e784304d3ff, date=20180119-12:01:23)"
Feb 21 14:37:13 foo prometheus: level=info ts=2018-02-21T14:37:13.199122061Z caller=main.go:227 host_details="(Linux 3.10.0-693.11.6.el7.x86_64 #1 SMP Thu Dec 28 14:23:39 EST 2017 x86_64 foo (none))"
Feb 21 14:37:13 foo prometheus: level=info ts=2018-02-21T14:37:13.199146388Z caller=main.go:228 fd_limits="(soft=1024, hard=4096)"
Feb 21 14:37:13 foo prometheus: level=info ts=2018-02-21T14:37:13.203382128Z caller=web.go:383 component=web msg="Start listening for connections" address=0.0.0.0:9090
Feb 21 14:37:13 foo prometheus: level=info ts=2018-02-21T14:37:13.203382182Z caller=main.go:499 msg="Starting TSDB ..."
Feb 21 14:37:22 foo prometheus: level=info ts=2018-02-21T14:37:22.58439072Z caller=main.go:509 msg="TSDB started"
Feb 21 14:37:22 foo prometheus: level=info ts=2018-02-21T14:37:22.585601734Z caller=main.go:585 msg="Loading configuration file" filename=/etc/prometheus/prometheus.yaml
Feb 21 14:37:22 foo prometheus: level=info ts=2018-02-21T14:37:22.586627333Z caller=main.go:486 msg="Server is ready to receive web requests."
Feb 21 14:37:22 foo prometheus: level=info ts=2018-02-21T14:37:22.586711774Z caller=manager.go:59 component="scrape manager" msg="Starting scrape manager..."
Feb 21 14:38:22 foo prometheus: level=error ts=2018-02-21T14:38:22.58487739Z caller=db.go:265 component=tsdb msg="compaction failed" err="plan compaction: open /var/prometheus/v2_data_dir: too many open files"
Feb 21 14:39:24 foo prometheus: level=error ts=2018-02-21T14:39:24.585928979Z caller=db.go:265 component=tsdb msg="compaction failed" err="plan compaction: open /var/prometheus/: too many open files"
  • Additional information:
# df -h /var/prometheus
Filesystem                      Size  Used Avail Use% Mounted on
/dev/mapper/vg01-lv_prometheus  299G  161G  139G  54% /var/prometheus

We have 35d retention.

I have tried setting '--storage.tsdb.max-block-duration=1d'.
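
The startup log above reports fd_limits="(soft=1024, hard=4096)", so one way to check whether the process is actually hitting that ceiling is to compare the number of descriptors it holds with its soft limit. The following is a minimal sketch, assuming Linux with /proc mounted. `fd_usage` is a hypothetical helper name, not part of Prometheus, and the example inspects the current shell, so the Prometheus PID would need to be passed instead.

```shell
# Report open file descriptors versus the soft limit for one process.
# Usage: fd_usage <pid>   (hypothetical helper, Linux /proc only)
fd_usage() {
  fdu_pid="$1"
  # Each entry under /proc/<pid>/fd is one open descriptor.
  fdu_open=$(ls "/proc/${fdu_pid}/fd" 2>/dev/null | wc -l | tr -d ' ')
  # The "Max open files" row lists: soft limit, hard limit, unit.
  fdu_soft=$(awk '/^Max open files/ {print $4}' "/proc/${fdu_pid}/limits")
  echo "pid=${fdu_pid} open_fds=${fdu_open} soft_limit=${fdu_soft}"
}

# Example: inspect the current shell; replace $$ with the Prometheus PID.
fd_usage "$$"
```

If open_fds sits close to soft_limit around the time the compaction errors are logged, the limit is the likely cause. Reading another user's /proc/<pid>/fd generally requires root or the same user as the process.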

brian-brazil (Member) commented Feb 21, 2018

You need to increase your file ulimit.

It makes more sense to ask questions like this on the prometheus-users mailing list rather than in a GitHub issue. On the mailing list, more people are available to potentially respond to your question, and the whole community can benefit from the answers provided.
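
Since the logs in the report show Prometheus being started by systemd, raising the file limit would most likely be done on the service unit, because `ulimit -n` in an interactive shell does not affect services that systemd starts. The sketch below writes a drop-in that sets `LimitNOFILE`. The unit name `prometheus.service`, the value 65536, and the helper name `write_nofile_dropin` are all assumptions for illustration and do not come from this thread.

```shell
# Write a systemd drop-in that sets LimitNOFILE for a unit.
# Arguments: unit name, new limit, optional root directory
# (defaults to /etc/systemd/system).
write_nofile_dropin() {
  dropin_unit="$1"
  dropin_limit="$2"
  dropin_root="${3:-/etc/systemd/system}"
  dropin_dir="${dropin_root}/${dropin_unit}.d"
  mkdir -p "$dropin_dir" || return 1
  # A drop-in only needs the section header and the overridden setting.
  printf '[Service]\nLimitNOFILE=%s\n' "$dropin_limit" \
    > "${dropin_dir}/limits.conf" || return 1
  echo "${dropin_dir}/limits.conf"
}

# Example (as root; unit name and value are assumptions):
#   write_nofile_dropin prometheus.service 65536
#   systemctl daemon-reload
#   systemctl restart prometheus.service
```

After the restart, the fd_limits entry in the Prometheus startup log should show the new soft and hard values instead of 1024 and 4096.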

krasi-georgiev (Member) commented Feb 21, 2018

One more good entry for the FAQ :)
https://github.com/prometheus/prometheus/wiki/FAQ#error-logs-for-too-many-files-open

@brian-brazil I haven't decided what to do with the FAQ page yet. The wiki is so convenient to edit that I will use it for now.

Also, it seems the FAQ on the website and the FAQ for GitHub issues target different audiences.

lock bot commented Mar 22, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

lock bot locked and limited conversation to collaborators Mar 22, 2019
