Fluentd have big Recv-Queue #3911

zvlb · 2022-10-06T07:38:26Z

Describe the bug

I have a K8s cluster where I deploy:

- FluentBit DaemonSet
- Fluentd StatefulSet

(I'm using Logging-operator to deploy them.)

FluentBit sends all logs to Fluentd. Fluentd processes the logs and sends them all to Elastic.
In my installation, I have 50 Fluentd pods.
Periodically I see the following in the FluentBit logs:

[2022/10/01 06:02:46] [error] [upstream] connection #1158 to fluentd:24240 timed out after 10 seconds
[2022/10/01 06:02:46] [error] [output:forward:forward.0] no upstream connections available
[2022/10/01 06:02:46] [ warn] [engine] failed to flush chunk '1-1664603612.450502668.flb', retry in 8 seconds: task_id=658, input=tail.0 > output=forward.0 (out_id=0)

When I check Fluentd, I see a big Recv-Q:

$ netstat -ntpl
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.1:24444         0.0.0.0:*               LISTEN      7/ruby
tcp        0      0 0.0.0.0:24231           0.0.0.0:*               LISTEN      166916/ruby
tcp     1022      0 0.0.0.0:24240           0.0.0.0:*               LISTEN      166916/ruby
tcp        0      0 :::9533                 :::*                    LISTEN      -

Sometimes Fluentd also stops listening on port 24240.
How can I fix this?
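
For a socket in the LISTEN state on Linux, Recv-Q is generally understood to report connections waiting in the accept queue rather than unread bytes, so a value like 1022 suggests that the process is not accepting connections as fast as they arrive. As a rough sketch, the check above can be automated by parsing the `netstat -ntpl` output. The function name `backlogged_listeners` and its `threshold` parameter are hypothetical helpers, not part of Fluentd or netstat.

```python
def backlogged_listeners(netstat_output: str, threshold: int = 0):
    """Return (local_address, recv_q) for LISTEN sockets whose Recv-Q exceeds threshold.

    Hypothetical helper: assumes the column layout printed by `netstat -ntpl`
    (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State, ...).
    """
    results = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip the header row and anything that is not a TCP socket line.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        # Only listening sockets are of interest here.
        if fields[5] != "LISTEN":
            continue
        try:
            recv_q = int(fields[1])
        except ValueError:
            continue
        if recv_q > threshold:
            results.append((fields[3], recv_q))
    return results


# Sample input copied from the netstat output in this report.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.1:24444         0.0.0.0:*               LISTEN      7/ruby
tcp        0      0 0.0.0.0:24231           0.0.0.0:*               LISTEN      166916/ruby
tcp     1022      0 0.0.0.0:24240           0.0.0.0:*               LISTEN      166916/ruby
tcp        0      0 :::9533                 :::*                    LISTEN      -
"""

print(backlogged_listeners(SAMPLE))  # [('0.0.0.0:24240', 1022)]
```

Running this periodically against each Fluentd pod would show whether the backlog is constant or only appears during bursts.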

To Reproduce

Install Logging-Operator with many Flows (more than 3000).

Expected behavior

Everything works without a Recv-Q backlog and without errors in FluentBit.

Your Environment

- Fluentd version: 1.14.6
- Operating system: Alpine Linux v3.14
- Kernel version: 5.4.0-105-generic

Your Configuration

I have a very big fluentd.conf (more than 10 MB).

Your Error Log

[2022/10/01 06:02:46] [error] [upstream] connection #1158 to fluentd:24240 timed out after 10 seconds
[2022/10/01 06:02:46] [error] [output:forward:forward.0] no upstream connections available
[2022/10/01 06:02:46] [ warn] [engine] failed to flush chunk '1-1664603612.450502668.flb', retry in 8 seconds: task_id=658, input=tail.0 > output=forward.0 (out_id=0)

Additional context

_No response_


fujimotos · 2022-10-21T09:40:09Z

> When I check Fluentd, I see a big Recv-Q:

This is very probably a deployment issue: Fluentd is simply overloaded.

You need to distribute the load by adding more instances, or control the
incoming flow so that Fluentd can keep up with the data volume.
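
As a back-of-the-envelope illustration of the first option, the sketch below estimates how many instances a given load would need. The function `instances_needed`, the `headroom` parameter, and every number in the usage line are hypothetical placeholders. Real per-instance throughput has to be measured for the actual configuration.

```python
import math


def instances_needed(incoming_events_per_sec: float,
                     per_instance_events_per_sec: float,
                     headroom: float = 0.25) -> int:
    """Estimate the instance count so each instance runs below full capacity.

    Hypothetical helper: `headroom` is the fraction of each instance's
    measured capacity kept in reserve for bursts.
    """
    if per_instance_events_per_sec <= 0:
        raise ValueError("per-instance capacity must be positive")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    usable = per_instance_events_per_sec * (1 - headroom)
    # Always run at least one instance, even for a negligible load.
    return max(1, math.ceil(incoming_events_per_sec / usable))


# Placeholder numbers only: 400,000 events/s in, 5,000 events/s per instance.
print(instances_needed(400_000, 5_000))  # 107
```

If the result is far above the current instance count, adding instances is the likelier fix. If it is close, throttling or buffering on the sender side may be enough.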

fujimotos closed this as completed Oct 21, 2022
