Fluentd on K8s stopped flushing logs to Elastic #732

Closed
nanorobocop opened this issue Mar 16, 2020 · 5 comments
nanorobocop commented Mar 16, 2020


Problem

We have Fluentd running as a DaemonSet (using fluentd-kubernetes-daemonset).
At some point, almost all Fluentd instances stop flushing their queues.

[screenshot]

I checked related issues like #525, rancher/rancher#25819, and #600, but no luck.

Elasticsearch is behind a load balancer, so we set the following settings:

      reload_connections false
      reconnect_on_error true
      reload_on_failure false

No error or warning messages appear in the fluentd logs.

I checked lsof: the buffer and queue files are open in the ruby process:

root@fluentd-5m67p:/home/fluent# ps auxf
USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root       6811  0.0  0.0   9744  2528 ?        Ss   08:53   0:00 bash
root       6818  0.0  0.0  13244  1784 ?        R+   08:53   0:00  \_ ps auxf
root          1  0.0  0.0   8288   168 ?        Ss   Mar12   0:07 tini -- /fluentd/entrypoint.sh
root          7  0.0  0.0 147120 59736 ?        Sl   Mar12   1:34 ruby /fluentd/vendor/bundle/ruby/2.6.0/bin/fluentd -c /fluentd/etc/fluentd.conf -p /fluentd/plugins --gemfile /fluentd/Gemfile -r /fluentd
root         19  0.2  0.1 391980 92724 ?        Sl   Mar12  16:08  \_ /usr/local/bin/ruby -Eascii-8bit:ascii-8bit /fluentd/vendor/bundle/ruby/2.6.0/bin/fluentd -c /fluentd/etc/fluentd.conf -p /fluentd/plu
root@fluentd-5m67p:/home/fluent# lsof | grep /var/log/fluentd | head
ruby      19                root    6r      REG              253,2      702 260204793 /var/log/fluentd-es.buffer/buffer.q5a0ed281c815ce1d996b701324ef2f34.log
ruby      19                root   16u      REG              253,2      415   1041535 /var/log/fluentd-rancher-rke.pos
ruby      19                root   20r      REG              253,2  3725271 260126777 /var/log/fluentd-es.buffer/buffer.q5a0ed2a2be44d342cc4ace0849b87681.log
ruby      19                root   21r      REG              253,2      185 260204768 /var/log/fluentd-es.buffer/buffer.q5a0ed2a2be44d342cc4ace0849b87681.log.meta
ruby      19                root   22u      REG              253,2     4956 260204774 /var/log/fluentd-es.buffer/buffer.b5a0ed2a2be7851b8fed18bb5745d3821.log
ruby      19                root   24u      REG              253,2     1581   1041536 /var/log/fluentd-k8s.pos
ruby      19                root   29r      REG              253,2     9154 260204777 /var/log/fluentd-es.buffer/buffer.q5a0ed260efbd77103296f04691540f76.log
ruby      19                root   30r      REG              253,2      188 260204782 /var/log/fluentd-es.buffer/buffer.q5a0ed260efbd77103296f04691540f76.log.meta
ruby      19                root   32r      REG              253,2    18877 260204786 /var/log/fluentd-es.buffer/buffer.q5a0ed263ccf18519cdc1081e7bdf2d8b.log
ruby      19                root   33r      REG              253,2      183 260204787 /var/log/fluentd-es.buffer/buffer.q5a0ed263ccf18519cdc1081e7bdf2d8b.log.meta

The queue keeps growing (there are many more files on other instances):

root@fluentd-5m67p:/home/fluent# ls -laht /var/log/fluentd-es.buffer/
total 22M
-rw-r--r--   1 root root 1.8M Mar 16 09:11 buffer.b5a0f3a0610eeb354b96a33d8f58bb903.log
-rw-r--r--   1 root root  185 Mar 16 09:11 buffer.b5a0f3a0610eeb354b96a33d8f58bb903.log.meta
-rw-r--r--   1 root root 841K Mar 16 09:11 buffer.b5a0ed2a3b19cc6bd31a6def5ad27d677.log
-rw-r--r--   1 root root  190 Mar 16 09:11 buffer.b5a0ed2a3b19cc6bd31a6def5ad27d677.log.meta
-rw-r--r--   1 root root 8.4M Mar 16 09:11 buffer.b5a0ed281c85b0bff32ec4dd6f6a91e06.log
-rw-r--r--   1 root root  222 Mar 16 09:11 buffer.b5a0ed281c85b0bff32ec4dd6f6a91e06.log.meta
-rw-r--r--   1 root root 183K Mar 16 09:11 buffer.b5a0ed2e4f76f4d08299d3adb6cedeb29.log
-rw-r--r--   1 root root   83 Mar 16 09:11 buffer.b5a0ed2e4f76f4d08299d3adb6cedeb29.log.meta
-rw-r--r--   1 root root 1.9M Mar 16 09:10 buffer.b5a0ed2d32a2502615f507638c8a42711.log
-rw-r--r--   1 root root  208 Mar 16 09:10 buffer.b5a0ed2d32a2502615f507638c8a42711.log.meta
-rw-r--r--   1 root root  17K Mar 16 08:29 buffer.b5a0ede80392ff3bddf11cf8e709a3259.log
-rw-r--r--   1 root root  205 Mar 16 08:29 buffer.b5a0ede80392ff3bddf11cf8e709a3259.log.meta
-rw-r--r--   1 root root 4.9K Mar 16 08:12 buffer.b5a0ed2a2be7851b8fed18bb5745d3821.log
-rw-r--r--   1 root root   82 Mar 16 08:12 buffer.b5a0ed2a2be7851b8fed18bb5745d3821.log.meta
-rw-r--r--   1 root root  128 Mar 16 08:12 buffer.b5a0f4612f014c24904ea00e1314342c3.log
-rw-r--r--   1 root root   81 Mar 16 08:12 buffer.b5a0f4612f014c24904ea00e1314342c3.log.meta
drwxr-xr-x   2 root root 4.0K Mar 16 08:12 .
drwxr-xr-x. 26 root root 4.0K Mar 16 08:04 ..
-rw-r--r--   1 root root 3.6M Mar 16 07:18 buffer.q5a0f064c4c6191fe52377e86b93ac5c1.log
-rw-r--r--   1 root root  185 Mar 16 07:18 buffer.q5a0f064c4c6191fe52377e86b93ac5c1.log.meta
-rw-r--r--   1 root root 3.6M Mar 16 03:27 buffer.q5a0ed2a2be44d342cc4ace0849b87681.log
-rw-r--r--   1 root root  185 Mar 16 03:27 buffer.q5a0ed2a2be44d342cc4ace0849b87681.log.meta
-rw-r--r--   1 root root   81 Mar 15 23:36 buffer.q5a0ed281c815ce1d996b701324ef2f34.log.meta
-rw-r--r--   1 root root  702 Mar 15 23:36 buffer.q5a0ed281c815ce1d996b701324ef2f34.log
-rw-r--r--   1 root root  183 Mar 15 23:35 buffer.q5a0ed263ccf18519cdc1081e7bdf2d8b.log.meta
-rw-r--r--   1 root root  188 Mar 15 23:35 buffer.q5a0ed260efbd77103296f04691540f76.log.meta
-rw-r--r--   1 root root  19K Mar 15 23:35 buffer.q5a0ed263ccf18519cdc1081e7bdf2d8b.log
-rw-r--r--   1 root root 9.0K Mar 15 23:35 buffer.q5a0ed260efbd77103296f04691540f76.log
-rw-r--r--   1 root root  206 Mar 15 23:27 buffer.q5a0ed096f5d6c33c5cb94a686e5a54c4.log.meta
-rw-r--r--   1 root root 6.8K Mar 15 23:26 buffer.q5a0ed096f5d6c33c5cb94a686e5a54c4.log

There are established connections to the destination host 10.32.150.166:12000, but no traffic:

root@fluentd-5m67p:/home/fluent# netstat -ntp
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 10.42.28.8:60316        10.32.150.166:12000     ESTABLISHED 19/ruby
tcp        0      0 10.42.28.8:8080         10.42.12.198:45760      TIME_WAIT   -
tcp        0      0 10.42.28.8:33176        10.32.150.166:12000     ESTABLISHED 19/ruby

Steps to replicate

root@fluentd-5m67p:/fluentd# bundle install --binstubs
[DEPRECATED] The --binstubs option will be removed in favor of `bundle binstubs`
Using rake 13.0.1
Using public_suffix 4.0.3
Using addressable 2.7.0
Using bundler 2.1.2
Using concurrent-ruby 1.1.6
Using cool.io 1.6.0
Using unf_ext 0.0.7.6
Using unf 0.1.4
Using domain_name 0.5.20190701
Using multi_json 1.14.1
Using elasticsearch-api 6.8.1
Using multipart-post 2.1.1
Using faraday 0.17.3
Using elasticsearch-transport 6.8.1
Using elasticsearch 6.8.1
Using excon 0.72.0
Using ffi 1.12.2
Using ffi-compiler 1.0.1
Using http_parser.rb 0.6.0
Using msgpack 1.3.3
Using sigdump 0.2.4
Using serverengine 2.2.1
Using strptime 0.2.3
Using tzinfo 2.0.1
Using tzinfo-data 1.2019.3
Using yajl-ruby 1.4.1
Using fluentd 1.9.2
Using fluent-config-regexp-type 1.0.0
Using fluent-plugin-concat 2.4.0
Using fluent-plugin-detect-exceptions 0.0.13
Using fluent-plugin-elasticsearch 4.0.4
Using fluent-plugin-grok-parser 2.6.1
Using fluent-plugin-json-in-json-2 1.0.2
Using http-cookie 1.0.3
Using http-form_data 2.2.0
Using http-parser 1.2.1
Using http 4.3.0
Using recursive-open-struct 1.1.0
Using http-accept 1.7.0
Using mime-types-data 3.2019.1009
Using mime-types 3.3.1
Using netrc 0.11.0
Using rest-client 2.1.0
Using kubeclient 4.6.0
Using lru_redux 1.1.0
Using fluent-plugin-kubernetes_metadata_filter 2.3.0
Using fluent-plugin-multi-format-parser 1.0.0
Using quantile 0.2.1
Using prometheus-client 0.9.0
Using fluent-plugin-prometheus 1.6.1
Using fluent-plugin-record-modifier 2.0.1
Using fluent-plugin-rewrite-tag-filter 2.2.0
Using systemd-journal 1.3.3
Using fluent-plugin-systemd 1.0.2
Using oj 3.8.1
Bundle complete! 15 Gemfile dependencies, 55 gems now installed.
Bundled gems are installed into `./vendor/bundle`

fluentd.conf

<source>
  @type prometheus
  bind 0.0.0.0
  port "#{ENV['FLUENTD_PROMETHEUS_PORT']}"
</source>

<source>
  @type prometheus_monitor
  <labels>
    cluster "#{ENV['CLUSTER_NAME']}"
    node "#{ENV['NODE_NAME']}"
    deployment es
  </labels>
</source>

<source>
  @type prometheus_output_monitor
  <labels>
    cluster "#{ENV['CLUSTER_NAME']}"
    node "#{ENV['NODE_NAME']}"
    deployment es
  </labels>
</source>

<source>
  @type prometheus_tail_monitor
  <labels>
    cluster "#{ENV['CLUSTER_NAME']}"
    node "#{ENV['NODE_NAME']}"
    deployment es
  </labels>
</source>


# logs of containers maintained by k8s
<source>
  @type tail
  read_from_head true
  path /var/log/containers/*.log
  pos_file /var/log/fluentd-k8s.pos
  time_format %Y-%m-%dT%H:%M:%S.%L
  tag k8s.*
  format json
</source>


# logs of containers managed by RKE
<source>
  @type tail
  read_from_head true
  path /var/lib/rancher/rke/log/*.log
  pos_file /var/log/fluentd-rancher-rke.pos
  time_format %Y-%m-%dT%H:%M:%S.%L
  tag rancher-rke.*
  format json
</source>

<filter k8s.**>
  @type kubernetes_metadata
  kubernetes_url https://kubernetes.default.svc:443
  ca_file /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  bearer_token_file /var/run/secrets/kubernetes.io/serviceaccount/token
</filter>

<filter **>
  @type record_transformer
  <record>
    projectName es
    projectVersion "1.0.0"
    logType ${tag_parts[0]}
    file ${tag}
  </record>
</filter>

<match **>
  @type elasticsearch
  host es.example.com
  port 12000
  user user
  password ***
  scheme https
  ssl_verify false
  ssl_version TLSv1_2
  logstash_format true
  reload_connections false
  reconnect_on_error true
  reload_on_failure false
  <buffer>
    @type file
    path /var/log/fluentd-es.buffer
    flush_at_shutdown true
    flush_mode interval
    flush_interval 60s
    flush_thread_count 2
    chunk_limit_size 10MB
    chunk_limit_records 10000
  </buffer>
  flatten_hashes true
  flatten_hashes_separator _
  @log_level debug
</match>

Expected Behavior or What you need to ask

...

Using Fluentd and ES plugin versions

  • OS version: CentOS Linux release 7.7.1908 (Core)
  • Kubernetes: 1.10.3-rancher1.1+218721003ebdda
  • Fluentd:
    • fluent/fluentd-kubernetes-daemonset:v1.9.2-debian-elasticsearch6-1.0
    • fluentd 1.9.2
  • ES plugin 4.0.4
    • paste boot log of fluentd or td-agent
w022571811645m:~ JP25060$ kubectl -n efk logs   fluentd-5m67p   | head -n 500
2020-03-12 10:10:23 +0000 [info]: parsing config file is succeeded path="/fluentd/etc/fluentd.conf"
2020-03-12 10:10:23 +0000 [info]: gem 'fluent-plugin-concat' version '2.4.0'
2020-03-12 10:10:23 +0000 [info]: gem 'fluent-plugin-detect-exceptions' version '0.0.13'
2020-03-12 10:10:23 +0000 [info]: gem 'fluent-plugin-elasticsearch' version '4.0.4'
2020-03-12 10:10:23 +0000 [info]: gem 'fluent-plugin-grok-parser' version '2.6.1'
2020-03-12 10:10:23 +0000 [info]: gem 'fluent-plugin-json-in-json-2' version '1.0.2'
2020-03-12 10:10:23 +0000 [info]: gem 'fluent-plugin-kubernetes_metadata_filter' version '2.3.0'
2020-03-12 10:10:23 +0000 [info]: gem 'fluent-plugin-multi-format-parser' version '1.0.0'
2020-03-12 10:10:23 +0000 [info]: gem 'fluent-plugin-prometheus' version '1.6.1'
2020-03-12 10:10:23 +0000 [info]: gem 'fluent-plugin-record-modifier' version '2.0.1'
2020-03-12 10:10:23 +0000 [info]: gem 'fluent-plugin-rewrite-tag-filter' version '2.2.0'
2020-03-12 10:10:23 +0000 [info]: gem 'fluent-plugin-systemd' version '1.0.2'
2020-03-12 10:10:23 +0000 [info]: gem 'fluentd' version '1.9.2'
2020-03-12 10:10:23 +0000 [debug]: 'host es.example.com' is tested built-in placeholder(s) but there is no valid placeholder(s). error: Parameter 'host: es.example.com' doesn't have tag placeholder
2020-03-12 10:10:23 +0000 [debug]: 'index_name fluentd' is tested built-in placeholder(s) but there is no valid placeholder(s). error: Parameter 'index_name: fluentd' doesn't have tag placeholder
2020-03-12 10:10:23 +0000 [debug]: 'template_name ' is tested built-in placeholder(s) but there is no valid placeholder(s). error: Parameter 'template_name: ' doesn't have tag placeholder
2020-03-12 10:10:23 +0000 [debug]: 'logstash_prefix logstash' is tested built-in placeholder(s) but there is no valid placeholder(s). error: Parameter 'logstash_prefix: logstash' doesn't have tag placeholder
2020-03-12 10:10:23 +0000 [debug]: 'deflector_alias ' is tested built-in placeholder(s) but there is no valid placeholder(s). error: Parameter 'deflector_alias: ' doesn't have tag placeholder
2020-03-12 10:10:23 +0000 [debug]: 'application_name default' is tested built-in placeholder(s) but there is no valid placeholder(s). error: Parameter 'application_name: default' doesn't have tag placeholder
2020-03-12 10:10:23 +0000 [debug]: Need substitution: false
2020-03-12 10:10:23 +0000 [debug]: 'host_placeholder es.example.com' is tested built-in placeholder(s) but there is no valid placeholder(s). error: Parameter 'host_placeholder: es.example.com' doesn't have tag placeholder
2020-03-12 10:10:23 +0000 [warn]: define <match fluent.**> to capture fluentd logs in top level is deprecated. Use <label @FLUENT_LOG> instead
2020-03-12 10:10:23 +0000 [info]: using configuration file: <ROOT>
  <source>
    @type prometheus
    bind "0.0.0.0"
    port 8080
  </source>
  <source>
    @type prometheus_monitor
    <labels>
      cluster local
      node worker007
      deployment es
    </labels>
  </source>
  <source>
    @type prometheus_output_monitor
    <labels>
      cluster local
      node worker007
      deployment es
    </labels>
  </source>
  <source>
    @type prometheus_tail_monitor
    <labels>
      cluster local
      node worker007
      deployment es
    </labels>
  </source>
  <source>
    @type tail
    read_from_head true
    path "/var/log/containers/*.log"
    pos_file "/var/log/fluentd-k8s.pos"
    time_format %Y-%m-%dT%H:%M:%S.%L
    tag "k8s.*"
    format json
    <parse>
      time_format %Y-%m-%dT%H:%M:%S.%L
      @type json
      time_type string
    </parse>
  </source>
  <source>
    @type tail
    read_from_head true
    path "/var/lib/rancher/rke/log/*.log"
    pos_file "/var/log/fluentd-rancher-rke.pos"
    time_format %Y-%m-%dT%H:%M:%S.%L
    tag "rancher-rke.*"
    format json
    <parse>
      time_format %Y-%m-%dT%H:%M:%S.%L
      @type json
      time_type string
    </parse>
  </source>
  <filter k8s.**>
    @type kubernetes_metadata
    kubernetes_url "https://kubernetes.default.svc:443"
    ca_file "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"
    bearer_token_file "/var/run/secrets/kubernetes.io/serviceaccount/token"
  </filter>
  <filter **>
    @type record_transformer
    <record>
      projectName es
      projectVersion 1.0.0
      logType ${tag_parts[0]}
      file ${tag}
    </record>
  </filter>
  <match **>
    @type elasticsearch
    host "es.example.com"
    port 12000
    user "user"
    password xxxxxx
    scheme https
    ssl_verify false
    ssl_version TLSv1_2
    logstash_format true
    reload_connections false
    reconnect_on_error true
    reload_on_failure false
    flatten_hashes true
    flatten_hashes_separator "_"
    @log_level "debug"
    <buffer>
      @type "file"
      path "/var/log/fluentd-es.buffer"
      flush_at_shutdown true
      flush_mode interval
      flush_interval 60s
      flush_thread_count 2
      chunk_limit_size 10MB
      chunk_limit_records 10000
    </buffer>
  </match>
</ROOT>
2020-03-12 10:10:23 +0000 [info]: starting fluentd-1.9.2 pid=7 ruby="2.6.5"
2020-03-12 10:10:23 +0000 [info]: spawn command to main:  cmdline=["/usr/local/bin/ruby", "-Eascii-8bit:ascii-8bit", "/fluentd/vendor/bundle/ruby/2.6.0/bin/fluentd", "-c", "/fluentd/etc/fluentd.conf", "-p", "/fluentd/plugins", "--gemfile", "/fluentd/Gemfile", "-r", "/fluentd/vendor/bundle/ruby/2.6.0/gems/fluent-plugin-elasticsearch-4.0.4/lib/fluent/plugin/elasticsearch_simple_sniffer.rb", "--under-supervisor"]
2020-03-12 10:10:24 +0000 [info]: adding filter pattern="k8s.**" type="kubernetes_metadata"
2020-03-12 10:10:25 +0000 [info]: adding filter pattern="**" type="record_transformer"
2020-03-12 10:10:25 +0000 [info]: adding match pattern="**" type="elasticsearch"
2020-03-12 10:10:25 +0000 [debug]: #0 'host es.example.com' is tested built-in placeholder(s) but there is no valid placeholder(s). error: Parameter 'host: es.example.com' doesn't have tag placeholder
2020-03-12 10:10:25 +0000 [debug]: #0 'index_name fluentd' is tested built-in placeholder(s) but there is no valid placeholder(s). error: Parameter 'index_name: fluentd' doesn't have tag placeholder
2020-03-12 10:10:25 +0000 [debug]: #0 'template_name ' is tested built-in placeholder(s) but there is no valid placeholder(s). error: Parameter 'template_name: ' doesn't have tag placeholder
2020-03-12 10:10:25 +0000 [debug]: #0 'logstash_prefix logstash' is tested built-in placeholder(s) but there is no valid placeholder(s). error: Parameter 'logstash_prefix: logstash' doesn't have tag placeholder
2020-03-12 10:10:25 +0000 [debug]: #0 'deflector_alias ' is tested built-in placeholder(s) but there is no valid placeholder(s). error: Parameter 'deflector_alias: ' doesn't have tag placeholder
2020-03-12 10:10:25 +0000 [debug]: #0 'application_name default' is tested built-in placeholder(s) but there is no valid placeholder(s). error: Parameter 'application_name: default' doesn't have tag placeholder
2020-03-12 10:10:25 +0000 [debug]: #0 Need substitution: false
2020-03-12 10:10:25 +0000 [debug]: #0 'host_placeholder es.example.com' is tested built-in placeholder(s) but there is no valid placeholder(s). error: Parameter 'host_placeholder: es.example.com' doesn't have tag placeholder
2020-03-12 10:10:25 +0000 [info]: #0 Detected ES 6.x: ES 7.x will only accept `_doc` in type_name.
2020-03-12 10:10:25 +0000 [info]: adding source type="prometheus"
2020-03-12 10:10:25 +0000 [info]: adding source type="prometheus_monitor"
2020-03-12 10:10:25 +0000 [info]: adding source type="prometheus_output_monitor"
2020-03-12 10:10:25 +0000 [info]: adding source type="prometheus_tail_monitor"
2020-03-12 10:10:25 +0000 [info]: adding source type="tail"
2020-03-12 10:10:25 +0000 [info]: adding source type="tail"
2020-03-12 10:10:25 +0000 [warn]: #0 define <match fluent.**> to capture fluentd logs in top level is deprecated. Use <label @FLUENT_LOG> instead
2020-03-12 10:10:25 +0000 [info]: #0 starting fluentd worker pid=19 ppid=7 worker=0
2020-03-12 10:10:25 +0000 [debug]: #0 buffer started instance=69827688780720 stage_size=0 queue_size=0
2020-03-12 10:10:25 +0000 [debug]: #0 flush_thread actually running
2020-03-12 10:10:25 +0000 [debug]: #0 flush_thread actually running
2020-03-12 10:10:25 +0000 [debug]: #0 enqueue_thread actually running
2020-03-12 10:10:25 +0000 [info]: #0 following tail of /var/lib/rancher/rke/log/nginx-proxy_a6783e971fdd9e463a46e23309659b0accffd579eef65e0b1e2c20dd6cdf4c43.log
2020-03-12 10:10:25 +0000 [info]: #0 following tail of /var/lib/rancher/rke/log/kubelet_cc46029a026f355d90a144eaaf5d06515cf57fdb248cec32a622b7fc5da11e31.log
2020-03-12 10:10:25 +0000 [debug]: #0 Created new chunk chunk_id="5a0a58fc7a3f0721e875ea7672d05741" metadata=#<struct Fluent::Plugin::Buffer::Metadata timekey=nil, tag="rancher-rke.var.lib.rancher.rke.log.kubelet_cc46029a026f355d90a144eaaf5d06515cf57fdb248cec32a622b7fc5da11e31.log", variables=nil>
  • paste result of fluent-gem list, td-agent-gem list or your Gemfile.lock
  • ES version: 6.3.2
@cosmo0920 (Collaborator):

Could you use the Simple Sniffer class to stop the Elasticsearch client from firing the sniffer request?
https://github.com/uken/fluent-plugin-elasticsearch#sniffer-class-name
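
For reference, a minimal sketch of how that looks in the <match> section, per the linked README; the sniffer_class_name parameter and the ElasticsearchSimpleSniffer class come from the plugin, and the -r path below is the one already visible in this issue's boot log:

<match **>
  @type elasticsearch
  # ... existing host/port/TLS settings ...
  # Replace the default sniffer so the client does not probe the cluster
  sniffer_class_name "Fluent::Plugin::ElasticsearchSimpleSniffer"
</match>

The sniffer file also has to be preloaded when fluentd starts, e.g.:

fluentd -c /fluentd/etc/fluentd.conf \
  -r /fluentd/vendor/bundle/ruby/2.6.0/gems/fluent-plugin-elasticsearch-4.0.4/lib/fluent/plugin/elasticsearch_simple_sniffer.rb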

@nanorobocop (Author):

@cosmo0920 Thanks for your reply! Isn't using ElasticsearchSimpleSniffer equivalent to disabling reload functionality via reload_connections false and reload_on_failure false?

cosmo0920 (Collaborator) commented Mar 16, 2020

Yes. If you don't use it, the Elasticsearch client fires a sniffer request every 10000 bulk requests. This is unrelated to reload_connections false and reload_on_failure false, and cannot be prevented by those parameters.

ukinau (Contributor) commented Mar 22, 2020

Elasticsearch client fires sniffer request in every 10000 bulk request

This comes from this line, and the Elasticsearch client seems to trigger it only when reload_connections is true:
https://github.com/elastic/elasticsearch-ruby/blob/master/elasticsearch-transport/lib/elasticsearch/transport/transport/base.rb#L77

The other place where the Elasticsearch client may trigger a reload via the sniffer class is when it faces an Unreachable exception, but there it checks the reload_on_failure value. That is why, when reload_on_failure false is specified, the host list reload is not triggered even if the client faces an Unreachable exception:
https://github.com/elastic/elasticsearch-ruby/blob/master/elasticsearch-transport/lib/elasticsearch/transport/transport/base.rb#L299-L301

The sniffer class is not evaluated unless a host reload (https://github.com/elastic/elasticsearch-ruby/blob/f877b6cb7fe905b1492120213947dc60afbdacc8/elasticsearch-transport/lib/elasticsearch/transport/transport/base.rb#L85) is triggered, and those are the only two places I can find that may trigger one. That is why, if reload_connections false and reload_on_failure false are configured, the sniffer class does not matter, I think.
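
To make the two triggers concrete, here is a minimal standalone Ruby model of that logic (a paraphrase of the linked base.rb, not the real client; the TransportModel name is illustrative):

# Standalone model of the two host-reload triggers in elasticsearch-transport.
class TransportModel
  RELOAD_AFTER = 10_000  # requests between periodic sniffs (client default)

  def initialize(reload_connections:, reload_on_failure:)
    @reload_connections = reload_connections
    @reload_on_failure  = reload_on_failure
    @counter = 0
  end

  def perform_request
    @counter += 1
    # Trigger 1 (base.rb#L77): periodic reload, only if reload_connections is true.
    reload_connections! if @reload_connections && (@counter % RELOAD_AFTER).zero?
    yield  # stands in for the actual HTTP call
  rescue StandardError
    # Trigger 2 (base.rb#L299-L301): reload after an unreachable host,
    # only if reload_on_failure is true.
    reload_connections! if @reload_on_failure
    raise
  end

  def reload_connections!
    # The sniffer class is consulted only here.
    puts "sniffing hosts..."
  end
end

With both flags false, reload_connections! is unreachable, which matches the conclusion above.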

Actually, I investigated this issue and found the root cause, so could you check it out?

cosmo0920 (Collaborator) commented Apr 22, 2020

#733 is merged. This should be closed.
