Default S3 Setup Documentation Results in Broken Config for EKS #4654

Closed
eyespies opened this issue Jan 21, 2022 · 5 comments

Comments

@eyespies

Bug Report

Describe the bug
Using the latest Fluent Bit Helm chart, I deployed Fluent Bit 1.8.11 to an EKS cluster and it ran great. I then added the S3 output and noticed that the container went into a CrashLoopBackOff state.

To Reproduce

  • Example log message if applicable:
[2022/01/21 20:13:27] [ info] [output:s3:s3.2] Using upload size 250000000 bytes
[2022/01/21 20:13:27] [debug] [aws_credentials] Initialized Env Provider in standard chain
[2022/01/21 20:13:27] [debug] [aws_credentials] Initialized AWS Profile Provider in standard chain
[2022/01/21 20:13:27] [debug] [aws_credentials] Not initializing EKS provider because AWS_ROLE_ARN was not set
[2022/01/21 20:13:27] [debug] [aws_credentials] Not initializing ECS Provider because AWS_CONTAINER_CREDENTIALS_RELATIVE_URI is not set
[2022/01/21 20:13:27] [debug] [aws_credentials] Initialized EC2 Provider in standard chain
[2022/01/21 20:13:27] [debug] [aws_credentials] Sync called on the EC2 provider
[2022/01/21 20:13:27] [debug] [aws_credentials] Init called on the env provider
[2022/01/21 20:13:27] [debug] [aws_credentials] Init called on the profile provider
[2022/01/21 20:13:27] [debug] [aws_credentials] Reading shared config file.
[2022/01/21 20:13:27] [debug] [aws_credentials] Shared config file /root/.aws/config does not exist
[2022/01/21 20:13:27] [debug] [aws_credentials] Reading shared credentials file.
[2022/01/21 20:13:27] [debug] [aws_credentials] Shared credentials file /root/.aws/credentials does not exist
[2022/01/21 20:13:27] [debug] [aws_credentials] Init called on the EC2 IMDS provider
[2022/01/21 20:13:27] [debug] [aws_credentials] requesting credentials from EC2 IMDS
[2022/01/21 20:13:27] [debug] [http_client] not using http_proxy for header
[2022/01/21 20:13:27] [debug] [http_client] server 169.254.169.254:80 will close connection #50
[2022/01/21 20:13:27] [debug] [aws_client] (null): http_do=0, HTTP Status: 401
[2022/01/21 20:13:27] [debug] [http_client] not using http_proxy for header
  • Steps to reproduce the problem:
    • launch an EKS cluster
    • create an S3 bucket
    • launch Fluent Bit in EKS; it should work fine using the default config
    • add the S3 output configuration (see the sketch after this list) but do NOT add the AWS filter configuration
    • the container will enter a restart loop because the Fluent Bit HTTP server never starts on port 2020
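
For reference, the S3 output block that triggers this looks roughly as follows; a minimal sketch with a hypothetical bucket name and region (only Name, Match, bucket, and region are strictly needed, the sizes are just consistent with the logs below):

    [OUTPUT]
        Name             s3
        Match            *
        # hypothetical values; substitute your own bucket and region
        bucket           my-eks-log-archive
        region           us-east-1
        # consistent with the "Using upload size 250000000 bytes" log line
        total_file_size  250M
        upload_timeout   10m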

Expected behavior

The HTTP server should start successfully - OR - an error of some kind should be output in the logs.

Screenshots

Your Environment

  • Version used: EKS 1.19 / Fluent Bit 1.8.11 / Helm chart fluent-bit-0.19.17
  • Configuration: Default until S3 output is added
  • Environment name and version (e.g. Kubernetes? What version?): AWS / EKS / see above
  • Server type and version: EC2 instances of varying sizes
  • Operating System and version: Amazon Linux
  • Filters and plugins: S3 / Kubernetes

Additional context

I was trying to push logs to S3 for archiving, but the solution wouldn't work.

Caveat: This may well be a documentation issue, but it does seem like some code updates would make this more user-friendly.

After reading several GitHub issues, I found a curl command for using IMDSv2. I tried the command inside an Elasticsearch pod in the same cluster and it hung (the default fluent-bit containers don't have /bin/sh or /bin/bash, so I couldn't test from them). I had not previously made IMDS tokens required on our instances; I had left them Optional.
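
For context, the curl command in question is (as best I can reconstruct it) the standard IMDSv2 token flow documented by AWS:

    # Request an IMDSv2 session token; with a hop limit of 1 this PUT
    # hangs when issued from inside a pod's network namespace
    TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" \
        -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
    # Use the token to read instance metadata / role credentials
    curl -s -H "X-aws-ec2-metadata-token: $TOKEN" \
        "http://169.254.169.254/latest/meta-data/iam/security-credentials/"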

I then found commentary stating that, for IMDSv2 with Fluent Bit 1.8.8, the fix was to change the instance metadata hop count. I am on 1.8.11 so I didn't think this was required (somewhere I saw a comment saying the issue was fixed in 1.8.9), but I changed the setting anyway and the pods suddenly started working. I also learned about the AWS filter from a GitHub issue and read the docs there.
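
For anyone else hitting this, the "hop count" setting is the instance metadata HttpPutResponseHopLimit. A sketch of the change using the AWS CLI against a hypothetical instance ID (for new nodes it belongs in the launch template's MetadataOptions instead):

    # Allow IMDSv2 responses to survive the extra network hop from a pod
    aws ec2 modify-instance-metadata-options \
        --instance-id i-0123456789abcdef0 \
        --http-endpoint enabled \
        --http-put-response-hop-limit 2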

I noticed that the AWS filter's default is IMDSv2, which seems to require a token, but with IMDS tokens set to Optional on the instances, the aforementioned curl command just HANGS when the hop count is 1.

So either add a timeout (and an error message) to the IMDSv2 request, plus a corresponding initialDelaySeconds on both the livenessProbe and the readinessProbe in the Helm chart, and/or update the S3 documentation to call out the caveats about IMDS version settings and link to the AWS filter documentation.
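
On the Helm side, the grace period could be set with something like the following, assuming the chart exposes livenessProbe/readinessProbe overrides in values.yaml (worth verifying against chart version 0.19.17):

    # hedged sketch: give Fluent Bit extra time before probes fire
    helm upgrade --install log-reader fluent/fluent-bit \
        --namespace logging \
        --set livenessProbe.initialDelaySeconds=30 \
        --set readinessProbe.initialDelaySeconds=30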

If I can help somehow, let me know.

@eyespies (Author)

My cluster just scaled out, and I have not updated the launch template to set the hop count, but I did add the AWS filter with IMDSv1 as the setting. So for the logs below:

IMDS tokens are set to Optional
IMDS hop count is 1
The AWS filter config is:

    [FILTER]
        Name aws
        Match *
        imds_version v1
        az true
        ec2_instance_id true
        ec2_instance_type true
        private_ip true
        ami_id false
        account_id true
        hostname true
        vpc_id true

Unfortunately the containers on the new instance still hang on startup, but they have a different log output pattern:

❯ klf -n logging log-reader-fluent-bit-gkvw4
Fluent Bit v1.8.11
* Copyright (C) 2019-2021 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2022/01/21 21:30:59] [ info] Configuration:
[2022/01/21 21:30:59] [ info]  flush time     | 1.000000 seconds
[2022/01/21 21:30:59] [ info]  grace          | 5 seconds
[2022/01/21 21:30:59] [ info]  daemon         | 0
[2022/01/21 21:30:59] [ info] ___________
[2022/01/21 21:30:59] [ info]  inputs:
[2022/01/21 21:30:59] [ info]      tail
[2022/01/21 21:30:59] [ info]      systemd
[2022/01/21 21:30:59] [ info] ___________
[2022/01/21 21:30:59] [ info]  filters:
[2022/01/21 21:30:59] [ info]      kubernetes.0
[2022/01/21 21:30:59] [ info]      record_modifier.1
[2022/01/21 21:30:59] [ info]      aws.2
[2022/01/21 21:30:59] [ info] ___________
[2022/01/21 21:30:59] [ info]  outputs:
[2022/01/21 21:30:59] [ info]      forward.0
[2022/01/21 21:30:59] [ info]      forward.1
[2022/01/21 21:30:59] [ info]      s3.2
[2022/01/21 21:30:59] [ info]      s3.3
[2022/01/21 21:30:59] [ info] ___________
[2022/01/21 21:30:59] [ info]  collectors:
[2022/01/21 21:30:59] [ info] [engine] started (pid=1)
[2022/01/21 21:30:59] [debug] [engine] coroutine stack size: 24576 bytes (24.0K)
[2022/01/21 21:30:59] [debug] [storage] [cio stream] new stream registered: tail.0
[2022/01/21 21:30:59] [debug] [storage] [cio stream] new stream registered: systemd.1
[2022/01/21 21:30:59] [ info] [storage] version=1.1.5, initializing...
[2022/01/21 21:30:59] [ info] [storage] in-memory
[2022/01/21 21:30:59] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
[2022/01/21 21:30:59] [ info] [cmetrics] version=0.2.2
[2022/01/21 21:30:59] [ info] [input:tail:tail.0] multiline core started
[2022/01/21 21:30:59] [debug] [input:tail:tail.0] flb_tail_fs_inotify_init() initializing inotify tail input
[2022/01/21 21:30:59] [debug] [input:tail:tail.0] inotify watch fd=24
[2022/01/21 21:30:59] [debug] [input:tail:tail.0] scanning path /var/log/containers/*.log
[2022/01/21 21:30:59] [debug] [input:tail:tail.0] inode=105915169 with offset=1218 appended as /var/log/containers/xxxxxxx
[2022/01/21 21:30:59] [debug] [input:tail:tail.0] scan_glob add(): /var/log/containers/xxxxxxxx
[2022/01/21 21:30:59] [debug] [input:tail:tail.0] inode=11549333 with offset=5184 appended as /var/log/containers/xxxxxxxx
<snip>
[2022/01/21 21:30:59] [debug] [input:tail:tail.0] 18 new files found on path '/var/log/containers/*.log'
[2022/01/21 21:30:59] [debug] [input:systemd:systemd.1] add filter: _SYSTEMD_UNIT=kubelet.service (or)
[2022/01/21 21:30:59] [debug] [input:systemd:systemd.1] jump to the end of journal and skip 1 last entries
[2022/01/21 21:30:59] [debug] [input:systemd:systemd.1] sd_journal library may truncate values to sd_journal_get_data_threshold() bytes: 65536
[2022/01/21 21:30:59] [ info] [filter:kubernetes:kubernetes.0] https=1 host=kubernetes.default.svc port=443
[2022/01/21 21:30:59] [ info] [filter:kubernetes:kubernetes.0] local POD info OK
[2022/01/21 21:30:59] [ info] [filter:kubernetes:kubernetes.0] testing connectivity with API server...
[2022/01/21 21:30:59] [debug] [filter:kubernetes:kubernetes.0] Send out request to API Server for pods information
[2022/01/21 21:30:59] [debug] [http_client] not using http_proxy for header
[2022/01/21 21:30:59] [debug] [http_client] server kubernetes.default.svc:443 will close connection #46
[2022/01/21 21:30:59] [debug] [filter:kubernetes:kubernetes.0] Request (ns=logging, pod=log-reader-fluent-bit-gkvw4) http_do=0, HTTP Status: 200
[2022/01/21 21:30:59] [ info] [filter:kubernetes:kubernetes.0] connectivity OK
[2022/01/21 21:30:59] [debug] [http_client] not using http_proxy for header
[2022/01/21 21:30:59] [debug] [filter:aws:aws.2] Using IMDSv1
[2022/01/21 21:30:59] [debug] [http_client] server 169.254.169.254:80 will close connection #46
[2022/01/21 21:30:59] [debug] [filter:aws:aws.2] IMDS metadata request http_do=0, HTTP Status: 200
[2022/01/21 21:30:59] [debug] [http_client] not using http_proxy for header
[2022/01/21 21:30:59] [debug] [filter:aws:aws.2] Using IMDSv1
[2022/01/21 21:30:59] [debug] [http_client] server 169.254.169.254:80 will close connection #46
[2022/01/21 21:30:59] [debug] [filter:aws:aws.2] IMDS metadata request http_do=0, HTTP Status: 200
[2022/01/21 21:30:59] [debug] [http_client] not using http_proxy for header
[2022/01/21 21:30:59] [debug] [filter:aws:aws.2] Using IMDSv1
[2022/01/21 21:30:59] [debug] [http_client] server 169.254.169.254:80 will close connection #46
[2022/01/21 21:30:59] [debug] [filter:aws:aws.2] IMDS metadata request http_do=0, HTTP Status: 200
[2022/01/21 21:30:59] [debug] [http_client] not using http_proxy for header
[2022/01/21 21:30:59] [debug] [filter:aws:aws.2] Using IMDSv1
[2022/01/21 21:30:59] [debug] [http_client] server 169.254.169.254:80 will close connection #46
[2022/01/21 21:30:59] [debug] [filter:aws:aws.2] IMDS metadata request http_do=0, HTTP Status: 200
[2022/01/21 21:30:59] [debug] [http_client] not using http_proxy for header
[2022/01/21 21:30:59] [debug] [filter:aws:aws.2] Using IMDSv1
[2022/01/21 21:30:59] [debug] [http_client] server 169.254.169.254:80 will close connection #46
[2022/01/21 21:30:59] [debug] [filter:aws:aws.2] IMDS metadata request http_do=0, HTTP Status: 200
[2022/01/21 21:30:59] [debug] [http_client] not using http_proxy for header
[2022/01/21 21:30:59] [debug] [filter:aws:aws.2] Using IMDSv1
[2022/01/21 21:30:59] [debug] [http_client] server 169.254.169.254:80 will close connection #46
[2022/01/21 21:30:59] [debug] [filter:aws:aws.2] IMDS metadata request http_do=0, HTTP Status: 200
[2022/01/21 21:30:59] [debug] [http_client] not using http_proxy for header
[2022/01/21 21:30:59] [debug] [filter:aws:aws.2] Using IMDSv1
[2022/01/21 21:30:59] [debug] [http_client] server 169.254.169.254:80 will close connection #46
[2022/01/21 21:30:59] [debug] [filter:aws:aws.2] IMDS metadata request http_do=0, HTTP Status: 200
[2022/01/21 21:30:59] [debug] [http_client] not using http_proxy for header
[2022/01/21 21:30:59] [debug] [filter:aws:aws.2] Using IMDSv1
[2022/01/21 21:30:59] [debug] [http_client] server 169.254.169.254:80 will close connection #46
[2022/01/21 21:30:59] [debug] [filter:aws:aws.2] IMDS metadata request http_do=0, HTTP Status: 200
[2022/01/21 21:30:59] [debug] [forward:forward.0] created event channels: read=46 write=48
[2022/01/21 21:30:59] [debug] [forward:forward.1] created event channels: read=49 write=50
[2022/01/21 21:30:59] [debug] [s3:s3.2] created event channels: read=51 write=52
[2022/01/21 21:30:59] [ info] [fstore] created root path /tmp/fluent-bit/s3/<snip>
[2022/01/21 21:30:59] [debug] [fstore] [cio scan] opening path /tmp/fluent-bit/s3/<snip>
[2022/01/21 21:30:59] [debug] [fstore] created stream path /tmp/fluent-bit/s3/<snip>
[2022/01/21 21:30:59] [debug] [fstore] [cio stream] new stream registered: 2022-01-21T21:30:59
[2022/01/21 21:30:59] [debug] [fstore] created stream path /tmp/fluent-bit/s3/<snip>
[2022/01/21 21:30:59] [debug] [fstore] [cio stream] new stream registered: multipart_upload_metadata
[2022/01/21 21:30:59] [ info] [output:s3:s3.2] Using upload size 250000000 bytes
[2022/01/21 21:30:59] [debug] [aws_credentials] Initialized Env Provider in standard chain
[2022/01/21 21:30:59] [debug] [aws_credentials] Initialized AWS Profile Provider in standard chain
[2022/01/21 21:30:59] [debug] [aws_credentials] Not initializing EKS provider because AWS_ROLE_ARN was not set
[2022/01/21 21:30:59] [debug] [aws_credentials] Not initializing ECS Provider because AWS_CONTAINER_CREDENTIALS_RELATIVE_URI is not set
[2022/01/21 21:30:59] [debug] [aws_credentials] Initialized EC2 Provider in standard chain
[2022/01/21 21:30:59] [debug] [aws_credentials] Sync called on the STS provider
[2022/01/21 21:30:59] [debug] [aws_credentials] Sync called on the EC2 provider
[2022/01/21 21:30:59] [debug] [aws_credentials] Init called on the STS provider
[2022/01/21 21:30:59] [debug] [aws_credentials] Init called on the env provider
[2022/01/21 21:30:59] [debug] [aws_credentials] Init called on the profile provider
[2022/01/21 21:30:59] [debug] [aws_credentials] Reading shared config file.
[2022/01/21 21:30:59] [debug] [aws_credentials] Shared config file /root/.aws/config does not exist
[2022/01/21 21:30:59] [debug] [aws_credentials] Reading shared credentials file.
[2022/01/21 21:30:59] [debug] [aws_credentials] Shared credentials file /root/.aws/credentials does not exist
[2022/01/21 21:30:59] [debug] [aws_credentials] Init called on the EC2 IMDS provider
[2022/01/21 21:30:59] [debug] [aws_credentials] requesting credentials from EC2 IMDS
[2022/01/21 21:30:59] [debug] [http_client] not using http_proxy for header
[2022/01/21 21:30:59] [debug] [http_client] server 169.254.169.254:80 will close connection #53
[2022/01/21 21:30:59] [debug] [aws_client] (null): http_do=0, HTTP Status: 401
[2022/01/21 21:30:59] [debug] [http_client] not using http_proxy for header
[2022/01/21 21:32:09] [engine] caught signal (SIGTERM)
[2022/01/21 21:32:09] [ info] [input] pausing tail.0
[2022/01/21 21:32:09] [ info] [input] pausing systemd.1
<snip>
[2022/01/21 21:32:09] [debug] [input:tail:tail.0] inode=113259156 removing file name /var/log/containers/xxxxx
[2022/01/21 21:32:09] [debug] [input:tail:tail.0] inode=32508021 removing file name /var/log/containers/xxxxx

@eyespies (Author)

One other comment: I used the exact same S3 [OUTPUT] config, with no AWS filter, in the same cluster running Fluent Bit 1.8.1 (via the New Relic install) and it works completely fine.

@Throckmortra

Also seeing this in EKS. I've downgraded Fluent Bit significantly to avoid it.

@github-actions (bot)

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

@github-actions github-actions bot added the Stale label May 17, 2022
@github-actions (bot)

This issue was closed because it has been stalled for 5 days with no activity.
