getting "too many open files error" in splunk #14

Open
kyaparla opened this issue Jul 24, 2019 · 1 comment

@kyaparla commented Jul 24, 2019

I see the following error in Splunk, even after raising the file descriptor limit to a much higher number.

```
http: Accept error: accept tcp 0.0.0.0:8098: accept4: too many open files; retrying in 20ms.
```

Prometheus data in Splunk is not continuous, which I think is due to the above problem: there are several gaps, and data only shows up at some intervals.

@ltmon (Collaborator) commented Jul 29, 2019

It would probably take a lot of connections to exceed the number of available fds on the system. Usually we would only see this if connections were never being released after processing, which shouldn't be the case.
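
One thing worth confirming up front is that the raised limit actually applies to the process holding the listener: limits set with ulimit or limits.conf only affect new sessions, and a daemon started by systemd keeps whatever LimitNOFILE its unit gives it. A quick check along these lines (the splunkd process name is an assumption here; point it at whatever actually runs the modular input):

```shell
# Show the effective fd limit of the running process (process name assumed --
# substitute whatever owns the :8098 listener).
pid=$(pgrep -o splunkd)
grep "Max open files" /proc/$pid/limits
```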

How many Prometheus systems are using this as a remote write target? You could try adjusting maxClients up or down and see if that helps.
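
To get a feel for how many clients are connected at once, something like this gives a snapshot of every TCP socket on the listener port (8098 taken from the error message above):

```shell
# All TCP sockets on the remote-write listener port, with their states.
ss -tan '( sport = :8098 )'
```

A pile of ESTABLISHED sockets suggests connections are being held open (or simply a lot of concurrent writers); a pile of TIME_WAIT sockets ties into the sysctl questions below.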

Otherwise, can you let me know whether you have set values for net.ipv4.tcp_tw_recycle and net.ipv4.tcp_tw_reuse in your sysctl.conf?
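
For reference, the current values can be read straight from sysctl (note that tcp_tw_recycle was removed in Linux 4.12, so the second key won't exist on newer kernels):

```shell
# Current TIME_WAIT reuse/recycle settings.
sysctl net.ipv4.tcp_tw_reuse net.ipv4.tcp_tw_recycle
```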

When you are in this state, does netstat show a lot of TIME_WAIT connections?
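
Something like this gives a quick count (netstat as mentioned, though ss reports the same thing):

```shell
# Rough count of sockets stuck in TIME_WAIT; run it while the accept errors
# are happening.
netstat -ant | grep -c TIME_WAIT
```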
