
Tweak memory limits for out_kafka example #18

Merged · 4 commits · Jan 30, 2018

Conversation

@solsson (Contributor) commented Jan 22, 2018

WIP based on discussion in #16

@StevenACoffman commented Jan 24, 2018

I figured out one problem I was having with the tight memory and CPU limits:

From @leahnp on May 17, 2017 at 00:06:

Add a memory limit to the deployment yaml. Test the special case: in long-running clusters with lots of pre-existing logs, the initial workload after deploying Fluent Bit is very heavy before it evens out. If the pod hits the memory limit during this initial processing, it will be continually killed and re-created.

Copied from original issue: samsung-cnct/kraken-logging-fluent-bit-daemonset#5

Moved to samsung-cnct/chart-fluent-bit#9
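
(For context, the limit being discussed is a plain resources block on the Fluent Bit container of the daemonset. The sketch below is illustrative only; the image tag and all numbers are assumptions, not the values from this PR.)

containers:
- name: fluent-bit
  image: fluent/fluent-bit:0.12
  resources:
    requests:
      cpu: 5m
      memory: 10Mi
    limits:
      cpu: 50m
      # Too low a value here means the pod is OOMKilled and restarted
      # repeatedly while it works through the pre-existing log backlog.
      memory: 100Mi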

in particular at pod start on nodes with unprocessed logs
@solsson (Contributor, Author) commented Jan 26, 2018

@edsiper I think @StevenACoffman's observation above is interesting input on the lack of a "check" in out_kafka. Spikes in memory use at pod start are impractical. Can log processing be halted when the kafka buffers hit a size limit? Would it be possible to expose the output buffer size as a Prometheus metric?

In addition I've found a tentative explanation for the log messages Receive failed: Disconnected. See Yolean/kubernetes-kafka#132 (comment). They can be ignored, as they are not an issue with fluent-bit. But, as with request.required.acks, I would like to be able to set librdkafka properties.
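
(For readers landing here later, the kind of configuration asked for would look roughly like the sketch below, assuming a Fluent Bit version whose kafka output forwards rdkafka.-prefixed keys to librdkafka. The broker address, buffer limit and acks value are illustrative assumptions, not settings from this repo. Note that Mem_Buf_Limit only bounds the input side; it does not cap the kafka output buffer.)

[INPUT]
    Name           tail
    Path           /var/log/containers/*.log
    # Pauses the tail input when its in-memory buffer exceeds the limit,
    # which bounds the startup spike from pre-existing logs (input side only).
    Mem_Buf_Limit  5MB

[OUTPUT]
    Name     kafka
    Match    *
    Brokers  kafka-0.broker.kafka.svc.cluster.local:9092
    Topics   ops.kube-logs-fluentbit.stream.json.001
    # librdkafka property passed through verbatim, if this plugin version
    # supports the rdkafka. prefix
    rdkafka.request.required.acks  1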

@solsson (Contributor, Author) commented Mar 1, 2018

I had a case now with 0.9 and the current memory limit (60Mi) where one pod went crashlooping. Raising the limit to 100Mi didn't help, but 200Mi did. The crashes happened too quickly after start for me to get any meaningful metrics out of them. Now that everything is up and running again, Prometheus shows no memory use value above 50Mi.

At the info log level there was nothing out of the ordinary in the pod logs. At debug level, the last lines before the container exited were:

[2018/03/01 20:00:20] [debug] [out_kafka] enqueued message (1171 bytes) for topic 'ops.kube-logs-fluentbit.stream.json.001'
[2018/03/01 20:00:20] [debug] [out_kafka] message delivered (1133 bytes, partition 0)
[2018/03/01 20:00:20] [debug] [in_tail] file=/var/log/containers/integrations-59c6f5bd46-9n8pb_essity_integrations-32fdb2298f4d670dd1e1b5d0b2f0ae7745bd2a25456c7e07ab5e43be82424522.log event
[2018/03/01 20:00:20] [debug] [input tail.0] [mem buf] size = 2137648
[2018/03/01 20:00:20] [debug] [in_tail] file=/var/log/containers/logs-fluentbit-5d55d88694-d7fwf_test-kafka_testcase-d079b114d53aad4f0f894437c1b53a494afa752fc733c22d19838287d8b13c2d.log read=32693 lines=20

Unfortunately I think the log file got truncated before I managed to pull it out from the node, because I see no entries in it from the time of the crash.

I've restored the 60Mi limit. Let's see if this happens again. It's the only unexpected pod restart I've had since this PR was merged.
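
(One way to check what Prometheus actually recorded for the container, assuming cAdvisor metrics are scraped; the label name is an assumption, since older cAdvisor versions export container_name while newer ones use container.)

# Peak Fluent Bit container memory over the last hour, as recorded by Prometheus.
max_over_time(container_memory_usage_bytes{container_name="fluent-bit"}[1h])

Spikes that fall between scrape intervals won't show up in such a query, which fits the observation above that the crashes were too fast to catch.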
