fluent-bit fails to forward logs with 'no upstream connections available to <endpoint>' #7086

@makas45

Bug Report

Describe the bug
We forward logs to Splunk, and from time to time the Fluent Bit pods report the errors below.
Fluent Bit pods:
[2023/03/28 12:37:55] [error] [net] TCP connection failed: splunk-fluentd.monitoring.svc.cluster.local:24240 (Connection refused)
[2023/03/28 12:37:55] [error] [output:forward:forward.0] no upstream connections available
[2023/03/28 12:37:55] [ warn] [engine] failed to flush chunk '1-1680007074.805347894.flb', retry in 7 seconds: task_id=0, input=tail.0 > output=forward.0 (out_id=0)
[2023/03/28 12:37:55] [error] [net] TCP connection failed: splunk-fluentd.monitoring.svc.cluster.local:24240 (Connection refused)
[2023/03/28 12:37:55] [error] [output:forward:forward.0] no upstream connections available
[2023/03/28 12:37:55] [ warn] [engine] failed to flush chunk '1-1680007074.856843682.flb', retry in 6 seconds: task_id=2, input=tail.0 > output=forward.0 (out_id=0)
[2023/03/28 12:37:56] [error] [net] TCP connection failed: splunk-fluentd.monitoring.svc.cluster.local:24240 (Connection refused)
[2023/03/28 12:37:56] [error] [output:forward:forward.0] no upstream connections available
[2023/03/28 12:37:56] [error] [net] TCP connection failed: splunk-fluentd.monitoring.svc.cluster.local:24240 (Connection refused)
[2023/03/28 12:37:56] [error] [output:forward:forward.0] no upstream connections available

The fluentd pods also report errors:

2023-03-29 09:52:08 +0000 [warn]: #0 [flow:outputflowname] failed to flush the buffer. retry_times=5 next_retry_time=2023-03-29 09:52:40 +0000 chunk="5f806e32832e16de5cdf89bf8714d9e6" error_class=RuntimeError error="Server error (502) for POST https://splunkendpoint/services/collector, response: <title>502 Server Error</title> Error: Server Error. The server encountered a temporary error and could not complete your request. Please try again in 30 seconds."

2023-03-29 09:52:25 +0000 [warn]: #0 [flow:outputflowname] failed to flush the buffer. retry_times=0 next_retry_time=2023-03-29 09:52:26 +0000 chunk="5f806eaa0c2dec2aa728034f9cc4b3d0" error_class=RuntimeError error="Server error (502) for POST https://splunkendpoint/services/collector, response: <title>502 Server Error</title> Error: Server Error. The server encountered a temporary error and could not complete your request. Please try again in 30 seconds."

2023-03-29 09:52:26 +0000 [warn]: #0 [flow:outputflowname] failed to flush the buffer. retry_times=6 next_retry_time=2023-03-29 09:53:30 +0000 chunk="5f806e1c8fce315d338f5e689b8a0e03" error_class=RuntimeError error="Server error (502) for POST https://splunkendpoint/services/collector, response: <title>502 Server Error</title> Error: Server Error. The server encountered a temporary error and could not complete your request. Please try again in 30 seconds."
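On the fluentd side, the repeated 502s from the Splunk HEC endpoint mean every flush of the affected chunks fails until the endpoint recovers. As a hedged sketch (assuming the aggregator uses fluent-plugin-splunk-hec, as the logging-operator SplunkHec output does; the hostname, token source, and size limits below are placeholders, not values from this issue), a file buffer with bounded exponential backoff keeps chunks on disk and avoids tight retry loops while the endpoint is down:

```
# Sketch only: buffer/retry tuning for a splunk_hec output.
# hec_host/hec_token are placeholders; all limits are illustrative.
<match **>
  @type splunk_hec
  hec_host splunkendpoint
  hec_port 443
  hec_token "#{ENV['SPLUNK_HEC_TOKEN']}"
  <buffer>
    @type file
    path /buffers/splunk
    retry_type exponential_backoff
    retry_max_interval 60   # cap the backoff so recovery is picked up quickly
    retry_timeout 1h        # give transient 502s time to clear before discarding
    chunk_limit_size 5MB
    total_limit_size 2GB
  </buffer>
</match>
```

With a file buffer, chunks that hit the 502 are retained across fluentd restarts instead of being lost from memory.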

Fluent Bit configuration:

[SERVICE]
Flush 1
Grace 5
Daemon Off
Log_Level warning
Parsers_File parsers.conf
Coro_Stack_Size 24576
HTTP_Server On
HTTP_Listen 0.0.0.0
HTTP_Port 2020
storage.path /buffers

[INPUT]
Name tail
DB /tail-db/tail-containers-state.db
DB.locking true
Exclude_Path kube-system,cnrm-system,monitoring,bats-test,management-system,argocd,managed-operators,configconnector-operator-system
Mem_Buf_Limit 128MB
Parser cri
Path /var/log/containers/*.log
Refresh_Interval 5
Skip_Long_Lines On
Tag kubernetes.*

[FILTER]
Name kubernetes
Buffer_Size 0
Kube_CA_File /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
Kube_Tag_Prefix kubernetes.var.log.containers
Kube_Token_File /var/run/secrets/kubernetes.io/serviceaccount/token
Kube_URL https://kubernetes.default.svc:443
Match kubernetes.*
Merge_Log On
Use_Kubelet Off

[OUTPUT]
Name forward
Match *
Host splunk-fluentd.monitoring.svc.cluster.local
Port 24240

net.keepalive on
net.keepalive_idle_timeout 30
net.keepalive_max_recycle 100
Retry_Limit 50
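With this configuration the forward output retries in memory only: storage.path is set in [SERVICE], but no storage.type is set on the input, so chunks that cannot be flushed while splunk-fluentd is refusing connections are dropped once Retry_Limit 50 is exhausted. A minimal sketch of enabling filesystem buffering so chunks survive the outage, reusing the existing /buffers path (the 1G limit is illustrative, not a recommendation):

```
# Sketch only: filesystem-backed buffering for the existing pipeline.
[SERVICE]
    storage.path /buffers
    storage.sync normal

[INPUT]
    Name tail
    storage.type filesystem

[OUTPUT]
    Name forward
    Match *
    Host splunk-fluentd.monitoring.svc.cluster.local
    Port 24240
    storage.total_limit_size 1G
```

This does not fix the "Connection refused" itself (the upstream service has no listening endpoint at that moment), but it bounds data loss while the upstream recovers.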

To Reproduce

  • Rubular link if applicable:
  • Example log message if applicable:
{"log":"YOUR LOG MESSAGE HERE","stream":"stdout","time":"2018-06-11T14:37:30.681701731Z"}
  • Steps to reproduce the problem:

Expected behavior
We would like to identify and fix the errors affecting these pods.

Screenshots

Your Environment
All environments are affected.
Version used:
- logging-operator 3.17.9 (chart repository: https://kubernetes-charts.banzaicloud.com)
- fluent/fluent-bit:1.9.5
- fluentd:v1.14.6-alpine-5

  • Configuration:
  • Environment name and version (e.g. Kubernetes? What version?):
  • Server type and version:
  • Operating System and version:
  • Filters and plugins:

Additional context
