http: Accept error: accept tcp [::]:9115: accept4: too many open files #288

Closed
alvaroaleman opened this issue Jan 21, 2018 · 11 comments

@alvaroaleman

Host operating system: output of uname -a

Linux 3.10.0-693.el7.x86_64 #1 SMP Tue Aug 22 21:09:27 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

blackbox_exporter version: output of blackbox_exporter -version

blackbox_exporter, version 0.10.0 (branch: HEAD, revision: 75681e3c4f051e24269a132e8ec19517118e1586)
  build user:       root@0334aae810a6
  build date:       20171009-14:25:49
  go version:       go1.9.1

What is the blackbox.yml module config.

modules:
  http_2xx_v4:
    prober: http
    http:
      preferred_ip_protocol: "ip4"
  http_2xx_v6:
    prober: http
    http:
      preferred_ip_protocol: "ip6"
  http_2xx_ba_v4:
    prober: http
    http:
      preferred_ip_protocol: "ip4"
      basic_auth:
        username: 'someuser'
        password: 'somepass'
  http_2xx_ba_v6:
    prober: http
    http:
      preferred_ip_protocol: "ip6"
      basic_auth:
        username: 'someuser'
        password: 'somepass'
  http_post_2xx:
    prober: http
    http:
      method: POST
  tcp_connect:
    prober: tcp
  pop3s_banner:
    prober: tcp
    tcp:
      query_response:
      - expect: "^+OK"
      tls: true
      tls_config:
        insecure_skip_verify: false
  ssh_banner:
    prober: tcp
    tcp:
      query_response:
      - expect: "^SSH-2.0-"
  icmp:
    prober: icmp

What is the prometheus.yml scrape config.

    scrape_configs:
    - job_name: 'blackbox_v4'
      metrics_path: /probe
      params:
        module:
          - http_2xx_v4
      static_configs:
        - targets: 
            - https://list
            - https://of
            - https://hosts
      relabel_configs:
        - source_labels: [__address__]
          target_label: __param_target
        - source_labels: [__param_target]
          target_label: instance
        - target_label: __address__
          replacement: 'blackbox_exporter_host:9115'
    - job_name: 'blackbox_v6'
      metrics_path: /probe
      params:
        module:
          - http_2xx_v6
      static_configs:
        - targets:
            - https://some
            - https://hosts
      relabel_configs:
        - source_labels: [__address__]
          target_label: __param_target
        - source_labels: [__param_target]
          target_label: instance
        - target_label: __address__
          replacement: 'blackbox_exporter_host:9115'
    - job_name: 'blackbox_ba_v4'
      metrics_path: /probe
      params:
        module:
          - http_2xx_ba_v4
      static_configs:
        - targets:
          - https://some
          - https://hosts
      relabel_configs:
        - source_labels: [__address__]
          target_label: __param_target
        - source_labels: [__param_target]
          target_label: instance
        - target_label: __address__
          replacement: 'blackbox_exporter_host:9115'
    - job_name: 'blackbox_ba_v6'
      metrics_path: /probe
      params:
        module:
          - http_2xx_ba_v6
      static_configs:
        - targets:
          - https://some
          - https://hosts
      relabel_configs:
        - source_labels: [__address__]
          target_label: __param_target
        - source_labels: [__param_target]
          target_label: instance
        - target_label: __address__
          replacement: 'blackbox_exporter_host:9115'
    - job_name: 'blackbox_ssh_v6'
      metrics_path: /probe
      params:
        module:
          - ssh_banner
      static_configs:
        - targets:
            - 'ahost:2222'
      relabel_configs:
        - source_labels: [__address__]
          target_label: __param_target
        - source_labels: [__param_target]
          target_label: instance
        - target_label: __address__
          replacement: 'blackbox_exporter_host:9115'

What did you do that produced an error?

Run the blackbox_exporter for about 50 days

What did you expect to see?

No errors

What did you see instead?

http: Accept error: accept tcp [::]:9115: accept4: too many open files; retrying in 1s

Restarting the blackbox_exporter fixes the issue, however that is not a real solution. Maybe noteworthy is that my Prometheus scrapes the blackbox_exporter via an IPv6 address.

@brian-brazil
Contributor

Do you know which probe is causing this?

@alvaroaleman
Author

Unfortunately not. As a wild guess, I'd say the ssh prober is the most likely cause since it is the most exotic one; if one of the others were responsible, someone else would probably have hit the issue already.

@brian-brazil
Contributor

The process_open_fds metric will help spot an FD leak.
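
For anyone following along, here is a minimal sketch of pulling that value out of a metrics page. It assumes the exporter serves the standard process metrics in the Prometheus text format; the `read_gauge` helper and the sample numbers are made up for illustration and are not taken from this issue.

```python
from typing import Optional


def read_gauge(metrics_text: str, name: str) -> Optional[float]:
    """Return the value of the unlabelled sample called `name`, or None if absent."""
    for line in metrics_text.splitlines():
        line = line.strip()
        # Skip blank lines and the "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) >= 2 and parts[0] == name:
            return float(parts[1])
    return None


# Hypothetical excerpt of a metrics page; the values are illustrative only.
SAMPLE = """\
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 987
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1024
"""

open_fds = read_gauge(SAMPLE, "process_open_fds")
max_fds = read_gauge(SAMPLE, "process_max_fds")
print(f"{open_fds:.0f} of {max_fds:.0f} descriptors in use")
```

If process_open_fds keeps climbing toward process_max_fds across scrapes instead of levelling off, that points to a leak rather than normal load.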

@alvaroaleman
Author

I am not sure I get your point. process_open_fds will tell me that there is a leak, but not which codepath/probe in the blackbox_exporter caused it.

@brian-brazil
Contributor

If you only send one type of probe to a blackbox exporter at a time, you could help narrow it down.
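
One way to do that without touching the Prometheus config is to hit the /probe endpoint by hand, one module at a time. The sketch below only builds the URL, using the same `module` and `target` parameters the scrape config above sets; the `probe_url` helper is hypothetical, and the host, module and target are the placeholders from the report.

```python
from urllib.parse import urlencode


def probe_url(exporter: str, module: str, target: str) -> str:
    """Build a /probe URL for a single module and target."""
    # urlencode percent-encodes the target so its own "://" survives as a value.
    query = urlencode({"module": module, "target": target})
    return f"http://{exporter}/probe?{query}"


# Placeholder host, module and target copied from the configs in this issue.
url = probe_url("blackbox_exporter_host:9115", "http_2xx_ba_v4", "https://some")
print(url)
```

Requesting that URL repeatedly for a single module while watching process_open_fds should show whether that module alone makes the count grow.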

@alvaroaleman
Author

So after I hit the max fd limit another two times, I took the time to investigate. The fd leak seems to be caused by this job: if I leave it on as the only job that uses the blackbox_exporter, the leak happens; if I turn it off and keep all other jobs running, it does not:

          #    - job_name: 'blackbox_ba_v4'
          #      metrics_path: /probe
          #      params:
          #        module:
          #          - http_2xx_ba_v4
          #      static_configs:
          #        - targets:
          #          - https://a.com
          #          - https://b.com
          #          - https://c.com
          #          - https://d.com
          #      relabel_configs:
          #        - source_labels: [__address__]
          #          target_label: __param_target
          #        - source_labels: [__param_target]
          #          target_label: instance
          #        - target_label: __address__
          #          replacement: '192.168.0.41:9115'

The module config looks like this:

  http_2xx_ba_v4:
    prober: http
    http:
      preferred_ip_protocol: "ip4"
      basic_auth:
        username: 'a_user'
        password: 'a_pass'

Other stuff that may or may not be of relevance:

  • Prometheus only talks via IPv6 to the blackbox_exporter, the 192.168.0.41:9115 above is a haproxy that does ipv4 to ipv6 translation
  • There is another job that is identical to the above one except it uses ipv6, it causes no fd leak
  • There is another job that is identical to the above one except it uses no basic_auth, it causes no fd leak

@brian-brazil
Contributor

That narrows things down a good bit. Does it seem to go with failed probes, successful probes, or both?

@alvaroaleman
Author

It only happens with successful probes; if they fail, open_fds stays constant.

@brian-brazil
Contributor

I can't reproduce this. What are the fds which are being leaked?
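
A rough sketch of how one might answer that on Linux: read the symlinks under /proc/<pid>/fd and count them by kind. The function names are invented for this example, and the grouping is a simplification (anything that is not a socket, pipe or anon_inode is counted as "file").

```python
import os
from collections import Counter
from typing import Iterable, List


def classify(target: str) -> str:
    """Map a /proc/<pid>/fd link target to a coarse kind."""
    for prefix, kind in (("socket:", "socket"), ("pipe:", "pipe"),
                         ("anon_inode:", "anon_inode")):
        if target.startswith(prefix):
            return kind
    return "file"


def summarize(targets: Iterable[str]) -> Counter:
    """Count link targets per kind."""
    return Counter(classify(t) for t in targets)


def read_fd_targets(pid: int) -> List[str]:
    """Linux only: return the link target of every fd the process holds."""
    fd_dir = f"/proc/{pid}/fd"
    targets = []
    for name in os.listdir(fd_dir):
        try:
            targets.append(os.readlink(os.path.join(fd_dir, name)))
        except OSError:
            # The fd was closed between listdir() and readlink(); skip it.
            continue
    return targets


# Illustrative input only, not output captured from the reporter's machine.
example = ["socket:[1001]", "socket:[1002]", "pipe:[77]", "/dev/null"]
print(summarize(example))
```

On the affected host, `summarize(read_fd_targets(pid))` with the exporter's pid would show whether sockets are what is piling up; reading another user's process typically needs root.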

@alvaroaleman
Author

So, I couldn't reproduce this locally either, but I can reproduce it reliably on the actual setup.

I found out that this only happens when using that ipv4-to-ipv6 translating haproxy.

What resolved the issue was upgrading the haproxy from 1.8.0 to 1.8.4 - I guess it had a bug that caused it to not close connections under some circumstances which got triggered by using the http_2xx_ba_v4 module.

Thanks a lot for your help!

@Bischoff

We just reproduced it with the postgres exporter and saw that the ulimit was 1024 while the exporter had about one thousand sockets open.

hope that helps
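
To put those two numbers side by side, a small sketch: `fd_usage` is a made-up helper, and 1000 stands in for "about one thousand". Note that `resource.getrlimit` reports the limit of the Python process that runs it, not of the exporter; for a running exporter on Linux, the limit is listed in /proc/<pid>/limits.

```python
import resource


def fd_usage(open_fds: int, soft_limit: int) -> float:
    """Fraction of the soft open-files limit that is currently in use."""
    if soft_limit <= 0:
        raise ValueError("soft_limit must be positive")
    return open_fds / soft_limit


# Limits of *this* Python process (Unix only), shown for comparison.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"this process: soft={soft} hard={hard}")

# Roughly the situation described above: ~1000 sockets against a limit of 1024.
print(f"usage: {fd_usage(1000, 1024):.1%}")
```

With so little headroom, accept() starts failing with "too many open files" as soon as a few more connections arrive.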
