
td-agent memory usage gradually creeping upwards #1414

Closed
hjet opened this issue Jan 10, 2017 · 18 comments

hjet commented Jan 10, 2017

See also #1384 (seems similar):

- fluentd or td-agent version.
fluentd-0.14.10

- Environment information, e.g. OS.
Debian GNU/Linux 8.6 (jessie)
3.16.0-4-amd64

- Your configuration
(please excuse the possibly poor or nonstandard config; I inherited it from another developer and am new to fluentd)

<source>
  @type monitor_agent
  bind 0.0.0.0
  port 65000
</source>
<source>
  @type syslog
  port 65001
  tag system
</source>
<source>
  @type tail
  read_from_head true
  read_lines_limit 250
  refresh_interval 20
  pos_file /var/log/td-agent/tmp/stderr.log.pos
  path /tmp/mesos/slaves/*/frameworks/*/executors/*.*/runs/latest/stderr
  exclude_path ["/tmp/mesos/slaves/*/frameworks/*/executors/*job-scheduler*.*/runs/latest/stderr"]
  tag mesos.*
  format multiline_grok
  multiline_start_regexp /^[^\s]/
  custom_pattern_path /etc/td-agent/custom_patterns
  <grok>
    pattern %{GREEDYDATA:logger_info}%{LEVEL}%{GREEDYDATA:log_message}
  </grok>
  <grok>
    pattern %{GREEDYDATA:message}
  </grok>
</source>
<source>
  @type tail
  read_from_head true
  read_lines_limit 250
  refresh_interval 20
  pos_file /var/log/td-agent/tmp/stdout.log.pos
  path /tmp/mesos/slaves/*/frameworks/*/executors/*.*/runs/latest/stdout
  exclude_path ["/tmp/mesos/slaves/*/frameworks/*/executors/*job-scheduler*.*/runs/latest/stdout"]
  tag mesos.*
  format multiline_grok
  multiline_start_regexp /^[^\s]/
  custom_pattern_path /etc/td-agent/custom_patterns
  <grok>
    pattern %{GREEDYDATA:logger_info}%{LEVEL}%{GREEDYDATA:log_message}
  </grok>
  <grok>
    pattern %{GREEDYDATA:message}
  </grok>
</source>
<source>
  @type tail
  read_from_head true
  read_lines_limit 250
  refresh_interval 20
  path /tmp/mesos/slaves/*/frameworks/*/executors/*.apps/runs/*/logs/*.log
  pos_file /var/log/td-agent/tmp/apps.stdout.log.pos
  tag apps.*
  format multiline_grok
  multiline_start_regexp /^[^\s]/
  custom_pattern_path /etc/td-agent/custom_patterns
  <grok>
    pattern %{GREEDYDATA:logger_info}%{LEVEL}%{GREEDYDATA:log_message}
  </grok>
  <grok>
    pattern %{GREEDYDATA:message}
  </grok>
</source>
<source>
  @type tail
  read_from_head true
  read_lines_limit 250
  refresh_interval 20
  pos_file /var/log/td-agent/tmp/job-scheduler.stderr.log.pos
  path /tmp/mesos/slaves/*/frameworks/*/executors/*job-scheduler*.*/runs/latest/stderr
  tag mesos.*
  format multiline_grok
  multiline_start_regexp /^[^\s]/
  custom_pattern_path /etc/td-agent/custom_patterns
  <grok>
    pattern %{GREEDYDATA:logger_info}%{LEVEL}%{GREEDYDATA:log_message}
  </grok>
  <grok>
    pattern %{GREEDYDATA:message}
  </grok>
</source>
<source>
  @type tail
  read_from_head true
  read_lines_limit 250
  refresh_interval 20
  pos_file /var/log/td-agent/tmp/job-scheduler.stdout.log.pos
  path /tmp/mesos/slaves/*/frameworks/*/executors/*job-scheduler*.*/runs/latest/stdout
  tag mesos.*
  format multiline_grok
  multiline_start_regexp /^[^\s]/
  custom_pattern_path /etc/td-agent/custom_patterns
  <grok>
   ----- OMITTED -------
  </grok>
  <grok>
   ----- OMITTED -------
  </grok>
  <grok>
    pattern %{GREEDYDATA:logger_info}%{LEVEL}%{GREEDYDATA:log_message}
  </grok>
  <grok>
    pattern %{GREEDYDATA:message}
  </grok>
</source>
# Job-Scheduler: Launchers  e.g. framework ct:1473451200000:0:bfe92d4217_launcher:
<source>
  @type tail
  read_from_head true
  read_lines_limit 250
  refresh_interval 20
  pos_file /var/log/td-agent/tmp/job-scheduler_launchers.stderr.log.pos
  path /tmp/mesos/slaves/*/frameworks/*/executors/ct:*/runs/latest/stderr
  tag launcher_scheduler.*
  format multiline_grok
  multiline_start_regexp /^[^\s]/
  custom_pattern_path /etc/td-agent/custom_patterns
  <grok>
    pattern %{GREEDYDATA:logger_info}%{LEVEL}%{GREEDYDATA:log_message}
  </grok>
  <grok>
    pattern %{GREEDYDATA:message}
  </grok>
</source>
<source>
  @type tail
  read_from_head true
  read_lines_limit 250
  refresh_interval 20
  pos_file /var/log/td-agent/tmp/job-scheduler_launchers.stdout.log.pos
  path /tmp/mesos/slaves/*/frameworks/*/executors/ct:*/runs/latest/stdout
  tag launcher_scheduler.*
  format multiline_grok
  multiline_start_regexp /^[^\s]/
  custom_pattern_path /etc/td-agent/custom_patterns
  <grok>
   ------ OMITTED -------
  </grok>
  <grok>
   ------ OMITTED -------
  </grok>
  <grok>
    pattern %{GREEDYDATA:logger_info}%{LEVEL}%{GREEDYDATA:log_message}
  </grok>
  <grok>
    pattern %{GREEDYDATA:message}
  </grok>
</source>
<filter **>
  @type elasticsearch_timestamp_check
</filter>
<filter>
  @type record_transformer
  <record>
    hostname ${hostname}
  </record>
</filter>
<filter mesos.**>
  @type record_transformer
  <record>
    site-id ${tag_parts[8]}
  </record>
</filter>
<filter mesos.**>
  @type record_transformer
  <record>
    task ${tag_parts[9]}
  </record>
</filter>
<filter apps.**>
  @type record_transformer
  enable_ruby true
  <record>
    site-id ${tag_parts[8]}
  </record>
  <record>
    app ${tag_parts[13].split('_')[1]}
  </record>
</filter>
<match *.**>
  @type secure_forward
  self_hostname ${hostname}
  shared_key xxxxxxxxx
  ca_cert_path /etc/td-agent/fluentd-ssl/ca_cert.pem
  secure yes
  enable_strict_verification yes
  num_threads 2
  <server>
    host xxxxxxx
    port xxxxxxx
  </server>
</match>

- Your problem explanation. If you have error logs, include them as well.

We have td-agent running on several Mesos agents, tailing various log files, etc. (from the config you can see that we use the * wildcard in path). It is usually tailing a large number of files that are frequently created and then deleted (but never rotated).

On some agents, td-agent memory usage mysteriously grows over the course of several days, to the point where it consumes a ridiculous amount of memory and needs to be killed. Sending a SIGTERM via a service restart usually works but takes some time (~10 minutes), and memory usage returns to normal after the restart. I can recreate the problem (it is currently occurring), so let me know what further diagnostic information to provide and I will be happy to help. Also, once again, please forgive the poor configuration; as I said earlier, I am inheriting this from someone else.

I am including logs and diagnostic information for both a low mem usage (normal operation) td-agent and a high mem usage (faulty operation) td-agent for ease of comparison:

low mem td-agent.log:
https://gist.github.com/hjet/329ff5abe38efbf5b68c55328a6925a1

high mem td-agent.log:
https://gist.github.com/hjet/cf39a32146edffbf61b651dff482e6e5

monitor_agent (for both):
https://gist.github.com/hjet/769e581918b69a3031657a7bfbf7dfb3

sigdumps (low mem usage):
https://gist.github.com/hjet/6b46311466d2487592db3f9feb1e0279

sigdumps (high mem usage):
https://gist.github.com/hjet/1fbead9a120fc5efb05fdc66cefee8c6

strace (low mem usage):
https://gist.github.com/hjet/b6c969196ea1a6488559c84aeb442175

strace (high mem usage):
https://gist.github.com/hjet/866c17ab9e8891dcb00c446f60ca3c28

perf (low mem usage):
[screenshot]

perf (high mem usage):
[screenshot]

pid2line.rb (low mem usage):
[screenshot]

pid2line.rb (high mem usage):
[screenshot]

Please let me know what other information I can provide to help!

Also, I'm not sure whether the error="no one nodes with valid ssl session" is part of the problem (flushing the buffer did not alleviate memory pressure), so some insight there would be appreciated as well (I am planning to fix that issue at the same time).

Thank you!


repeatedly commented Jan 11, 2017

Also not sure if the error="no one nodes with valid ssl session" is part of the problem

Does it mean out_secure_forward can't flush its buffer to the destination, and that this is what causes the growing memory usage?
Or is the buffer length low while memory usage is high?

And how about dentry cache?


hjet commented Jan 11, 2017

Does it mean out_secure_forward can't flush its buffer to the destination, and that this is what causes the growing memory usage?

I have confirmed that force flushing the buffer (sending SIGUSR1) doesn't reduce memory usage.
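
For reference, sending SIGUSR1 looks roughly like this (the pid file path assumes a stock td-agent install and may differ on your host):

  kill -USR1 "$(cat /var/run/td-agent/td-agent.pid)"   # asks fluentd to flush all buffered events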

Or is the buffer length low while memory usage is high?

Can you clarify what you mean by this? This is precisely the problem. Memory usage increases gradually over the course of several days to the point of consuming most of the memory on the server.

And how about dentry cache?

How do I check this?

Thanks for your quick response!


repeatedly commented Jan 11, 2017

I have confirmed that force flushing the buffer (sending SIGUSR1) doesn't reduce memory usage.

When did you flush the buffer?
If secure_forward has error="no one nodes with valid ssl session" errors, force flushing doesn't work because there is no valid destination.

Can you clarify what you mean by this?

Your secure_forward setting uses a memory buffer. That means that when secure_forward can't flush its buffer chunks, memory usage keeps growing, unlike with a file buffer.
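
For illustration only, a minimal sketch of what switching that match to a file buffer could look like, using the same v0.12-style buffer parameter names as the rest of this config (the buffer_path below is just a placeholder):

<match *.**>
  @type secure_forward
  # ... existing secure_forward settings unchanged ...
  buffer_type file
  buffer_path /var/log/td-agent/buffer/secure_forward  # placeholder; any writable path
  buffer_chunk_limit 8m
  buffer_queue_limit 64
</match>

With a file buffer, unflushed chunks accumulate on disk instead of in the Ruby heap.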

How do I check this?

Googling it will be faster than my comment ;) I forget the actual commands.
If an application touches a lot of files, the dentry cache also grows. I'm not sure whether this is part of the problem.
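
For reference, a couple of generic checks (run as root; not specific to fluentd):

  grep -i slab /proc/meminfo            # total slab memory; dentry/inode caches live here
  slabtop -o | grep -E 'dentry|inode'   # per-cache breakdown, if slabtop is installed
  echo 2 > /proc/sys/vm/drop_caches     # optionally drop dentry/inode caches to compare

Note that slab memory is kernel memory and does not show up in the td-agent process RSS.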


hjet commented Jan 11, 2017

I have just force flushed the buffer:

[screenshot]

No error was thrown. Memory usage is at ~2.17 GB.

monitor_agent shows the following:

      {
            "buffer_queue_length": 0,
            "buffer_total_queued_size": 411256,
            "config": {
                "@type": "secure_forward",
                "ca_cert_path": "/etc/td-agent/fluentd-ssl/ca_cert.pem",
                "enable_strict_verification": "yes",
                "num_threads": "2",
                "secure": "yes",
                "self_hostname": "xxxxx",
                "shared_key": "xxxxxx"
            },
            "output_plugin": true,
            "plugin_category": "output",
            "plugin_id": "object:3ff5798316f8",
            "retry": {},
            "retry_count": 118,
            "type": "secure_forward"
        },

The "buffer_total_queued_size": 411256, confirms your theory I believe. I guess there's no error thrown because it's using the exponential backoff retry mechanism (and has failed many subsequent times)?

So that then prompts another question: what is usually the cause of error="no one nodes with valid ssl session" and how do I diagnose/fix this? Some of the logs clearly seem to be shipped correctly (lots of retry succeeded. chunk_id="....." messages) – but then sometimes it fails.
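
As a generic first check of the TLS side (host and port are the masked values from the <server> section above; the CA path is the one from the config):

  openssl s_client -connect <host>:<port> -CAfile /etc/td-agent/fluentd-ssl/ca_cert.pem </dev/null
  # a healthy handshake ends with "Verify return code: 0 (ok)"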

I guess this would probably be more of an issue for https://github.com/tagomoris/fluent-plugin-secure-forward

I just want to confirm that the root cause of the issue is secure_forward not being able to find a valid destination to flush the chunks to, and nothing else.

Thanks again!

repeatedly commented:

So I guess that prompts another question, what is usually the cause of error="no one nodes with valid ssl session"? How do I fix this?

This is a very hard question. It would be better to check the secure-forward plugin's issues.


hjet commented Jan 11, 2017

Ok great, thanks again.

hjet closed this as completed Jan 11, 2017
repeatedly commented:

BTW, if you want to reduce Ruby's memory usage itself, setting the RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR environment variable to 0.9 may help.
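
A sketch of one way to set that for td-agent on Debian, assuming the init script sources /etc/default/td-agent (verify for your install):

  # as root
  echo 'export RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR=0.9' >> /etc/default/td-agent
  service td-agent restart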


hjet commented Jan 11, 2017

Thanks!

hjet reopened this Feb 3, 2017

hjet commented Feb 3, 2017

This is recurring, and I have disabled secure-forward (now using the elasticsearch output plugin with no SSL). Furthermore, the logs no longer contain any error="no one nodes with valid ssl session" messages, nor any other connection-related messages. Everything seems normal, except for a [info]: flushing all buffer forcedly roughly ~9 hours after starting td-agent.

I am restarting with verbose logging; the issue is reproducible and will almost certainly recur, so please let me know how to proceed and which logs and diagnostic info to provide. If possible, can we move this to a more private channel, or can I email you the full logs? Some may contain sensitive information, and I would like to provide a full set of logs instead of the excerpts I was able to post above.

Thanks again.


hjet commented Feb 3, 2017

The new match section looks as follows:

<match *.**>
  @type elasticsearch
  host xxxxxx
  port 9200
  include_tag_key true
  tag_key @log_name
  logstash_format true
  buffer_type memory
  buffer_chunk_limit 64m
  buffer_queue_limit 175
  flush_interval 20
  disable_retry_limit false
  retry_limit 15
  retry_wait 2
  request_timeout 30
  reload_connections false
</match>

and from the logs:

<match *.**>
    @type elasticsearch
    host "xxxxxx"
    port 9200
    include_tag_key true
    tag_key "@log_name"
    logstash_format true
    buffer_type "memory"
    buffer_chunk_limit 64m
    buffer_queue_limit 175
    flush_interval 20
    disable_retry_limit false
    retry_limit 15
    retry_wait 2
    request_timeout 30
    reload_connections false
    <buffer tag>
      flush_mode interval
      retry_type exponential_backoff
      @type memory
      flush_interval 20
      retry_forever false
      retry_max_times 15
      chunk_limit_size 64m
      queue_length_limit 175
    </buffer>
    <inject>
      tag_key @log_name
    </inject>
  </match>
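
(Aside: assuming standard v0.12 buffer semantics, this memory buffer can queue up to buffer_chunk_limit × buffer_queue_limit = 64 MB × 175 ≈ 11 GB before overflowing, so a multi-gigabyte RSS can be explained by the buffer alone if flushes fall behind.)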


hjet commented Feb 3, 2017

Also from the logs, the only interesting messages are as follows:

2017-02-02 21:21:28 +0000 [info]: following tail of /tmp/mesos/slaves/27ffe398-bec5-46e4-b644-d2132dbdd54b-S31/frameworks/51d8d228-78f8-4201-af98-e275f9898cbb-4800/executors/ct:1486011874000:0:fa06769a7b_launcher:/runs/latest/stdout
2017-02-02 21:21:28 +0000 [info]: following tail of /tmp/mesos/slaves/27ffe398-bec5-46e4-b644-d2132dbdd54b-S31/frameworks/51d8d228-78f8-4201-af98-e275f9898cbb-4800/executors/ct:1486013027000:0:e9fe5590b4_launcher:/runs/latest/stdout
2017-02-02 21:21:28 +0000 [info]: following tail of /tmp/mesos/slaves/27ffe398-bec5-46e4-b644-d2132dbdd54b-S31/frameworks/51d8d228-78f8-4201-af98-e275f9898cbb-4800/executors/ct:1486014299000:0:0fbd3568d2_launcher:/runs/latest/stdout
2017-02-02 21:21:28 +0000 [info]: following tail of /tmp/mesos/slaves/27ffe398-bec5-46e4-b644-d2132dbdd54b-S31/frameworks/51d8d228-78f8-4201-af98-e275f9898cbb-4800/executors/ct:1486014376000:0:89625e870d_launcher:/runs/latest/stdout
2017-02-02 21:21:28 +0000 [info]: following tail of /tmp/mesos/slaves/27ffe398-bec5-46e4-b644-d2132dbdd54b-S31/frameworks/51d8d228-78f8-4201-af98-e275f9898cbb-4800/executors/ct:1486018554000:0:b5a3cab503_launcher:/runs/latest/stdout
2017-02-02 21:21:28 +0000 [info]: following tail of /tmp/mesos/slaves/27ffe398-bec5-46e4-b644-d2132dbdd54b-S31/frameworks/51d8d228-78f8-4201-af98-e275f9898cbb-4800/executors/ct:1486021314000:0:80cb94ee7c_launcher:/runs/latest/stdout
2017-02-02 21:21:28 +0000 [info]: following tail of /tmp/mesos/slaves/27ffe398-bec5-46e4-b644-d2132dbdd54b-S31/frameworks/51d8d228-78f8-4201-af98-e275f9898cbb-4800/executors/ct:1486022453000:0:afb0546592_launcher:/runs/latest/stdout
2017-02-02 21:21:28 +0000 [info]: following tail of /tmp/mesos/slaves/27ffe398-bec5-46e4-b644-d2132dbdd54b-S31/frameworks/51d8d228-78f8-4201-af98-e275f9898cbb-4800/executors/ct:1486024200000:0:99cbcd2263_launcher:/runs/latest/stdout
2017-02-02 21:21:28 +0000 [info]: following tail of /tmp/mesos/slaves/27ffe398-bec5-46e4-b644-d2132dbdd54b-S31/frameworks/51d8d228-78f8-4201-af98-e275f9898cbb-4800/executors/ct:1486024822000:0:2232e104dd_launcher:/runs/latest/stdout
2017-02-02 21:21:28 +0000 [info]: following tail of /tmp/mesos/slaves/27ffe398-bec5-46e4-b644-d2132dbdd54b-S31/frameworks/51d8d228-78f8-4201-af98-e275f9898cbb-4800/executors/ct:1486027491000:0:58f22e2cea_launcher:/runs/latest/stdout

etc...

and


2017-02-02 21:21:29 +0000 [info]: listening syslog socket on 0.0.0.0:65001 with udp
2017-02-02 21:21:29 +0000 [info]: fluentd worker is now running
2017-02-02 21:21:29 +0000 [info]: disable filter chain optimization because [Fluent::Plugin::RecordTransformerFilter] uses `#filter_stream` method.
2017-02-02 21:21:43 +0000 [info]: disable filter chain optimization because [Fluent::Plugin::RecordTransformerFilter] uses `#filter_stream` method.
2017-02-02 21:21:46 +0000 [info]: disable filter chain optimization because [Fluent::Plugin::RecordTransformerFilter] uses `#filter_stream` method.
2017-02-02 21:21:46 +0000 [info]: disable filter chain optimization because [Fluent::Plugin::RecordTransformerFilter] uses `#filter_stream` method.
2017-02-02 21:21:49 +0000 [info]: Connection opened to Elasticsearch cluster => {:host=>"xxxxx", :port=>9200, :scheme=>"http"}
2017-02-02 21:22:01 +0000 [info]: disable filter chain optimization because [Fluent::Plugin::RecordTransformerFilter] uses `#filter_stream` method.
2017-02-02 21:26:44 +0000 [info]: detected rotation of /tmp/mesos/slaves/27ffe398-bec5-46e4-b644-d2132dbdd54b-S31/frameworks/27ffe398-bec5-46e4-b644-d2132dbdd54b-93820/executors/fa06769a7b.metadata/runs/latest/stderr; waiting 5 seconds
2017-02-02 21:26:44 +0000 [info]: detected rotation of /tmp/mesos/slaves/27ffe398-bec5-46e4-b644-d2132dbdd54b-S31/frameworks/27ffe398-bec5-46e4-b644-d2132dbdd54b-93820/executors/fa06769a7b.metadata/runs/latest/stdout; waiting 5 seconds
2017-02-02 21:29:43 +0000 [info]: detected rotation of /tmp/mesos/slaves/27ffe398-bec5-46e4-b644-d2132dbdd54b-S31/frameworks/51d8d228-78f8-4201-af98-e275f9898cbb-4800/executors/ct:1486011874000:0:fa06769a7b_launcher:/runs/latest/stdout; waiting 5 seconds
2017-02-02 21:29:43 +0000 [info]: detected rotation of /tmp/mesos/slaves/27ffe398-bec5-46e4-b644-d2132dbdd54b-S31/

and then a

2017-02-03 06:25:02 +0000 [info]: force flushing buffered events

(notice the time of the message)

and then more of the above sorts of messages (no errors or anything).


hjet commented Feb 5, 2017

Memory usage is still growing. Here is an error that came up in the logs:

2017-02-05 01:18:50 +0000 [info]: detected rotation of /tmp/mesos/slaves/27ffe398-bec5-46e4-b644-d2132dbdd54b-S31/frameworks/51d8d228-78f8-4201-af98-e275f9898cbb-4800/executors/ct:1486204744000:0:e2f8698e84_launcher:/runs/latest/stdout; waiting 5 seconds
2017-02-05 01:18:50 +0000 [info]: following tail of /tmp/mesos/slaves/27ffe398-bec5-46e4-b644-d2132dbdd54b-S31/frameworks/27ffe398-bec5-46e4-b644-d2132dbdd54b-98010/executors/3bd13b6d67.parquet-cds-customer/runs/latest/stderr
2017-02-05 01:18:50 +0000 [info]: disable filter chain optimization because [Fluent::Plugin::RecordTransformerFilter, Fluent::Plugin::RecordTransformerFilter] uses `#filter_stream` method.
2017-02-05 01:18:50 +0000 [info]: following tail of /tmp/mesos/slaves/27ffe398-bec5-46e4-b644-d2132dbdd54b-S31/frameworks/27ffe398-bec5-46e4-b644-d2132dbdd54b-98010/executors/3bd13b6d67.analysis-campaigns/runs/latest/stderr
2017-02-05 01:18:50 +0000 [info]: disable filter chain optimization because [Fluent::Plugin::RecordTransformerFilter, Fluent::Plugin::RecordTransformerFilter] uses `#filter_stream` method.
2017-02-05 01:18:55 +0000 [error]: closed stream
  2017-02-05 01:18:55 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.14.11/lib/fluent/plugin/in_tail.rb:585:in `readpartial'
  2017-02-05 01:18:55 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.14.11/lib/fluent/plugin/in_tail.rb:585:in `on_notify'
  2017-02-05 01:18:55 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.14.11/lib/fluent/plugin/in_tail.rb:455:in `detach'
  2017-02-05 01:18:55 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.14.11/lib/fluent/plugin/in_tail.rb:283:in `detach_watcher'
  2017-02-05 01:18:55 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.14.11/lib/fluent/plugin/in_tail.rb:293:in `block in detach_watcher_after_rotate_wait'
  2017-02-05 01:18:55 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.14.11/lib/fluent/plugin_helper/timer.rb:77:in `call'
  2017-02-05 01:18:55 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.14.11/lib/fluent/plugin_helper/timer.rb:77:in `on_timer'
  2017-02-05 01:18:55 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/cool.io-1.4.5/lib/cool.io/loop.rb:88:in `run_once'
  2017-02-05 01:18:55 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/cool.io-1.4.5/lib/cool.io/loop.rb:88:in `run'
  2017-02-05 01:18:55 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.14.11/lib/fluent/plugin_helper/event_loop.rb:77:in `block in start'
  2017-02-05 01:18:55 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.14.11/lib/fluent/plugin_helper/thread.rb:66:in `block in thread_create'
2017-02-05 01:18:55 +0000 [error]: closed stream
  2017-02-05 01:18:55 +0000 [error]: suppressed same stacktrace
2017-02-05 01:18:55 +0000 [info]: disable filter chain optimization because [Fluent::Plugin::RecordTransformerFilter, Fluent::Plugin::RecordTransformerFilter] uses `#filter_stream` method.
2017-02-05 01:23:46 +0000 [info]: detected rotation of /tmp/mesos/slaves/27ffe398-bec5-46e4-b644-d2132dbdd54b-S31/frameworks/27ffe398-bec5-46e4-b644-d2132dbdd54b-97178/executors/3ed3380e4c.events/runs/latest/stdout; waiting 5 seconds
2017-02-05 01:23:46 +0000 [info]: detected rotation of /tmp/mesos/slaves/27ffe398-bec5-46e4-b644-d2132dbdd54b-S31/frameworks/27ffe398-bec5-46e4-b644-d2132dbdd54b-97178/executors/3ed3380e4c.events/runs/latest/stderr; waiting 5 seconds


hjet commented Feb 6, 2017

@tagomoris may be related to #1434?


hjet commented Feb 6, 2017

Also, a curl http://localhost:65000/api/plugins.json takes around 10-15 seconds...
Memory usage is currently at ~32% (~22 GB).

repeatedly commented:

#1467 may fix this problem.
Could you try this patch?


hjet commented Feb 24, 2017

I can and will try it soon. Sorry for the delay; I have been very busy lately.

Does that same issue exist in 0.12?

repeatedly commented:

Does that same issue exist in 0.12?

I'm not sure, but I haven't received such reports from users.

repeatedly commented:

Closed. If you have the same problem with the latest fluentd v0.14, please reopen it.
