Too many open sockets and fds on reload #3873
Can you share the complete Prometheus logs after restarting? There should be a line showing the actual fd limits.
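The limit that actually applies to a running process can also be read straight from /proc, independently of what the logs report. A minimal sketch, assuming a Linux host; `fd_limits` is a made-up helper name, not part of Prometheus or any standard tool:

```shell
#!/bin/sh
# Print the soft and hard "Max open files" limits of a running process.
# fd_limits is a hypothetical helper; pass the PID as the first argument.
fd_limits() {
    pid="$1"
    # /proc/<pid>/limits has one row per limit: name, soft value, hard value, units.
    awk '/^Max open files/ { print "soft=" $4, "hard=" $5 }' "/proc/$pid/limits"
}

# Example: show the limits of the current shell.
fd_limits "$$"
```

To check Prometheus itself, pass its PID instead of `$$`. For a service started by systemd, the limit comes from the unit's `LimitNOFILE=` setting rather than from a `ulimit` set in an interactive shell, so the two values can differ.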
@simonpasquier damn, you're right, it only says
Thanks @simonpasquier!
andreasnuesslein closed this Feb 21, 2018
lock bot commented Mar 22, 2019
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
lock bot locked and limited conversation to collaborators Mar 22, 2019
andreasnuesslein commented Feb 21, 2018 • edited
What did you do?
Reload, instead of restart, Prometheus (systemctl reload prometheus).
What did you see instead? Under which circumstances?
For days I've been having this issue where I constantly got:
err="write /var/app/prometheus/data/wal/006383: file already closed"
and
msg="Error sending alert" err="Post http://localhost:9093/api/v1/alerts: dial tcp 127.0.0.1:9093: socket: too many open files"
and
Get XX/metrics: dial tcp: lookup XX on 8.8.8.8:53: dial udp 8.8.8.8:53: socket: too many open files
It would run smoothly for a few hours and then bam, a gazillion of those errors.
I've been looking through the issues here on GitHub and on Google Groups and tried a few of the fixes mentioned there.
I think I finally found the problem: for some reason reloading instead of restarting Prometheus seems to completely mess up the sockets and file descriptors.
Example:
I noticed this: #3446
but I already tuned the ulimit to insanely high values and that's not it. My ls /proc/<>/fd | wc was usually around 800 anyway.
Environment
System information:
Linux 4.4.0-112-generic x86_64
Prometheus version:
prometheus, version 2.2.0-rc.0 (branch: HEAD, revision: 1fe05d4)
build user: root@f7abb25edc70
build date: 20180213-11:40:47
go version: go1.9.2
Alertmanager version:
alertmanager, version 0.14.0 (branch: HEAD, revision: 30af4d051b37ce817ea7e35b56c57a0e2ec9dbb0)
build user: root@37b6a49ebba9
build date: 20180213-08:16:42
go version: go1.9.2
Obviously I'm not reloading prometheus anymore now :)
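The ls /proc/<>/fd | wc count mentioned above lumps files and sockets together. A rough sketch, assuming a Linux /proc layout, that splits a process's open descriptors by type; `fd_breakdown` is a hypothetical helper name used only for illustration:

```shell
#!/bin/sh
# Count the open file descriptors of a process, separating sockets from the rest.
# fd_breakdown is a hypothetical helper; pass the PID as the first argument.
fd_breakdown() {
    pid="$1"
    total=0
    sockets=0
    for fd in "/proc/$pid/fd"/*; do
        # Each entry is a symlink such as "socket:[12345]" or a file path.
        # Skip entries that vanished or cannot be read.
        target="$(readlink "$fd")" || continue
        total=$((total + 1))
        case "$target" in
            socket:*) sockets=$((sockets + 1)) ;;
        esac
    done
    echo "total=$total sockets=$sockets other=$((total - sockets))"
}

# Example: inspect the current shell.
fd_breakdown "$$"
```

Running it against the Prometheus PID before and after systemctl reload prometheus would show whether the socket count grows across reloads. Reading another user's /proc/<pid>/fd typically requires root or the same user.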