getting "too many open files error" in splunk #14

Open
kyaparla opened this issue Jul 24, 2019 · 1 comment

@kyaparla commented Jul 24, 2019

I see the following error in Splunk, even after raising the file descriptor limit to a much higher number.

```
http: Accept error: accept tcp 0.0.0.0:8098: accept4: too many open files; retrying in 20ms.
```

Prometheus data in Splunk is not continuous, which I think is due to the above problem: there are several gaps, and data only shows up at some intervals.

@ltmon (Collaborator) commented Jul 29, 2019

It would probably take a lot of connections to exceed the number of available fds on the system. Usually we would only see this if connections were never being released after processing, which shouldn't be the case.
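
One thing worth confirming up front is that the raised limit actually applies to the process holding the listener: limits set with ulimit or limits.conf only affect new sessions, and a daemon started by systemd keeps whatever LimitNOFILE its unit gives it. A quick check along these lines (the splunkd process name is an assumption here; point it at whatever actually runs the modular input):

```shell
# Show the effective fd limit of the running process (process name assumed --
# substitute whatever owns the :8098 listener).
pid=$(pgrep -o splunkd)
grep "Max open files" /proc/$pid/limits
```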

How many Prometheus systems are using this as a remote write target? You could try adjusting maxClients up or down and see if that helps.
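
To get a feel for how many clients are connected at once, something like this gives a snapshot of every TCP socket on the listener port (8098 taken from the error message above):

```shell
# All TCP sockets on the remote-write listener port, with their states.
ss -tan '( sport = :8098 )'
```

A pile of ESTABLISHED sockets suggests connections are being held open (or simply a lot of concurrent writers); a pile of TIME_WAIT sockets ties into the sysctl questions below.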

Otherwise, can you let me know whether you have set values for net.ipv4.tcp_tw_recycle and net.ipv4.tcp_tw_reuse in your sysctl.conf?
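
For reference, the current values can be read straight from sysctl (note that tcp_tw_recycle was removed in Linux 4.12, so the second key won't exist on newer kernels):

```shell
# Current TIME_WAIT reuse/recycle settings.
sysctl net.ipv4.tcp_tw_reuse net.ipv4.tcp_tw_recycle
```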

When you are in this state, does netstat show a lot of TIME_WAIT connections?
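
Something like this gives a quick count (netstat as mentioned, though ss reports the same thing):

```shell
# Rough count of sockets stuck in TIME_WAIT; run it while the accept errors
# are happening.
netstat -ant | grep -c TIME_WAIT
```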
