[1.7 dev] AWS Plugins can not connect to AWS services #2927

Closed
PettitWesley opened this issue Jan 9, 2021 · 12 comments
Assignees
Labels
AWS Issues with AWS plugins or experienced by users running on AWS bug fixed

Comments

@PettitWesley
Contributor

Master only, not released

Requests to S3 do not seem to work:

[2021/01/08 23:15:01] [error] [aws_client] connection initialization error
[2021/01/08 23:15:01] [error] [output:s3:s3.0] PutObject request failed

Which comes from here: https://github.com/fluent/fluent-bit/blob/master/src/aws/flb_aws_util.c#L273

I see the same errors in all AWS plugins:

[2021/01/09 00:30:55] [error] [aws_client] connection initialization error
[2021/01/09 00:30:55] [error] [output:kinesis_firehose:kinesis_firehose.2] Failed to send log records to fluentd-service1
[2021/01/09 00:30:55] [error] [output:kinesis_firehose:kinesis_firehose.2] Failed to send log records
[2021/01/09 00:30:55] [error] [output:kinesis_firehose:kinesis_firehose.2] Failed to send records
[2021/01/09 00:30:55] [error] [aws_client] connection initialization error
@PettitWesley PettitWesley changed the title from "[1.7 dev]" to "[1.7 dev] AWS Plugins can not connect to AWS services" on Jan 9, 2021
@edsiper
Member

edsiper commented Jan 16, 2021

@PettitWesley can you try a fresh build from Git master (with the latest changes)?

@edsiper edsiper self-assigned this Jan 16, 2021
@edsiper edsiper added the waiting-for-user Waiting for more information, tests or requested changes label Jan 16, 2021
@PettitWesley
Contributor Author

As of yesterday, we saw the same issue.

CC @zhonghui12

@PettitWesley PettitWesley added AWS Issues with AWS plugins or experienced by users running on AWS and removed waiting-for-user Waiting for more information, tests or requested changes labels Jan 21, 2021
@zhonghui12
Contributor

Hi @edsiper, we still face the same issue:

[error] [aws_client] connection initialization error.

@edsiper
Member

edsiper commented Jan 26, 2021

I am trying to reproduce this locally.

@PettitWesley @zhonghui12

Could you please tell me how to set up authentication properly for the following config snippet?

[SERVICE]
    flush     1
    log_level info

[INPUT]
    name    cpu
    
[OUTPUT]
    Name                  s3
    Match                 *
    role_arn              arn:aws:s3:us-east-2:000xx00:accesspoint/00xx00
    bucket                test2927
    total_file_size       1M
    upload_timeout        1m
    use_put_object        On

@edsiper
Member

edsiper commented Jan 27, 2021

The original issue is detailed in #2973 and fixed by #2974.

@zerthimon

zerthimon commented Feb 15, 2021

Having this issue with fluent/fluent-bit:1.7 image

REPOSITORY                                         TAG       IMAGE ID       CREATED       SIZE
fluent/fluent-bit                                  1.7       fbb436b23d79   9 hours ago   78.4MB
[2021/02/15 11:32:14] [debug] [es:es.0] created event channels: read=43 write=44
[2021/02/15 11:32:14] [debug] [out_es] Enabled AWS Auth
[2021/02/15 11:32:14] [debug] [aws_credentials] Initialized Env Provider in standard chain
[2021/02/15 11:32:14] [debug] [aws_credentials] Initialized AWS Profile Provider in standard chain
[2021/02/15 11:32:14] [debug] [aws_credentials] Initialized EKS Provider in standard chain
[2021/02/15 11:32:14] [debug] [aws_credentials] Not initializing ECS Provider because AWS_CONTAINER_CREDENTIALS_RELATIVE_URI is not set
[2021/02/15 11:32:14] [debug] [aws_credentials] Initialized EC2 Provider in standard chain
[2021/02/15 11:32:14] [debug] [aws_credentials] Sync called on the EKS provider
[2021/02/15 11:32:14] [debug] [aws_credentials] Sync called on the EC2 provider
[2021/02/15 11:32:14] [debug] [aws_credentials] Init called on the env provider
[2021/02/15 11:32:14] [debug] [aws_credentials] Init called on the profile provider
[2021/02/15 11:32:14] [debug] [aws_credentials] Reading shared credentials file..
[2021/02/15 11:32:14] [debug] [aws_credentials] Could not read shared credentials file /root/.aws/credentials
[2021/02/15 11:32:14] [debug] [aws_credentials] Init called on the EKS provider
[2021/02/15 11:32:14] [debug] [aws_credentials] Calling STS..
[2021/02/15 11:32:14] [debug] [upstream] connection #45 failed to sts.us-east-1.amazonaws.com:443
[2021/02/15 11:32:14] [debug] [aws_client] connection initialization error
[2021/02/15 11:32:14] [debug] [aws_credentials] STS assume role request failed

Works with 1.6.10

[2021/02/15 11:44:51] [debug] [out_es] Enabled AWS Auth
[2021/02/15 11:44:51] [debug] [aws_credentials] Initialized Env Provider in standard chain
[2021/02/15 11:44:51] [debug] [aws_credentials] Initialized AWS Profile Provider in standard chain
[2021/02/15 11:44:51] [debug] [aws_credentials] Initialized EKS Provider in standard chain
[2021/02/15 11:44:51] [debug] [aws_credentials] Not initializing ECS Provider because AWS_CONTAINER_CREDENTIALS_RELATIVE_URI is not set
[2021/02/15 11:44:51] [debug] [aws_credentials] Initialized EC2 Provider in standard chain
[2021/02/15 11:44:51] [debug] [aws_credentials] Sync called on the EKS provider
[2021/02/15 11:44:51] [debug] [aws_credentials] Sync called on the EC2 provider
[2021/02/15 11:44:51] [debug] [aws_credentials] Init called on the env provider
[2021/02/15 11:44:51] [debug] [aws_credentials] Init called on the profile provider
[2021/02/15 11:44:51] [debug] [aws_credentials] Reading shared credentials file..
[2021/02/15 11:44:51] [debug] [aws_credentials] Could not read shared credentials file /root/.aws/credentials
[2021/02/15 11:44:51] [debug] [aws_credentials] Init called on the EKS provider
[2021/02/15 11:44:51] [debug] [aws_credentials] Calling STS..
[2021/02/15 11:44:51] [debug] [http_client] not using http_proxy for header
[2021/02/15 11:44:51] [debug] [http_client] header=GET /?Version=2011-06-15&Action=AssumeRoleWithWebIdentity&RoleSessionName=[REDACTED] HTTP/1.1
Host: sts.us-east-1.amazonaws.com
Content-Length: 0
User-Agent: aws-fluent-bit-plugin


[2021/02/15 11:44:51] [debug] [upstream] KA connection #46 to sts.us-east-1.amazonaws.com:443 is now available
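
When comparing the two debug logs above, the line that differs is the failed upstream connection to STS. A minimal sketch for pulling such lines out of a longer log, assuming the message format shown in the 1.7 output above (the function name `failed_upstreams` is hypothetical and not part of Fluent Bit):

```python
import re

# Matches debug lines of the form seen in the 1.7 log above, e.g.
# "[upstream] connection #45 failed to sts.us-east-1.amazonaws.com:443".
# Assumption: other Fluent Bit versions may word this message differently.
FAILED_CONN = re.compile(
    r"\[upstream\] connection #(?P<id>\d+) failed to (?P<host>[^:\s]+):(?P<port>\d+)"
)


def failed_upstreams(log_text):
    """Return (connection id, host, port) for every failed upstream connection."""
    return [
        (int(m.group("id")), m.group("host"), int(m.group("port")))
        for m in FAILED_CONN.finditer(log_text)
    ]


if __name__ == "__main__":
    sample = (
        "[2021/02/15 11:32:14] [debug] [aws_credentials] Calling STS..\n"
        "[2021/02/15 11:32:14] [debug] [upstream] connection #45 failed to "
        "sts.us-east-1.amazonaws.com:443\n"
        "[2021/02/15 11:32:14] [debug] [aws_client] connection initialization error\n"
    )
    for conn_id, host, port in failed_upstreams(sample):
        print(f"connection #{conn_id} could not reach {host}:{port}")
```

This only locates the failing endpoint; it does not explain why the connection failed.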

@TDanielsHL

TDanielsHL commented Apr 22, 2021

Documentation on credentials, authentication, and authorization seems to be very light or missing altogether:
https://docs.fluentbit.io/manual/pipeline/outputs/s3

I'm having issues similar to those described above using IRSA on an EKS cluster. The pods are annotated as Amazon expects, so that the environment is provisioned with AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE, but those don't seem to be picked up by the plugin without additional configuration. Those values seem to be meant for the AWS CLI as well.

Using Helm Charts to install Fluent-Bit 1.7.2 distroless.

Edit: Scanning the code base does show that those environment variables are within consideration; I'm investigating if it's something tied to permissions & restrictions I'm unaware of in the AWS Account. But documentation would still be helpful!

@PettitWesley
Contributor Author

@TDanielsHL There's no documentation because the AWS Fluent Bit plugins are supposed to support IAM Roles for Service Accounts and all other standard methods for retrieving AWS credentials. It should work in any setup where a tool using one of the standard AWS SDKs would work. The code in Fluent Bit is not a standard AWS SDK; it's custom, but it's meant to be identical in behavior.

Maybe we could add a note in the docs that states this and lists the standard order of resolution for credential sources. I wish the official AWS documentation had a nice explainer on all the standard credential sources and how each one works. Then we could just link to that.

If you think you've found a bug here, give us more details.
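
As a rough illustration of the resolution order mentioned above, here is a minimal sketch that reports which credential sources look configured in a given environment. The order (env, profile, EKS, ECS, EC2) is taken from the debug logs earlier in this thread; the function name `candidate_sources` is hypothetical, and this is not Fluent Bit's actual implementation:

```python
import os


def candidate_sources(env, credentials_file_exists):
    """List credential sources that appear configured, in the order the
    debug logs in this thread show them being initialized.

    Assumption: this mirrors only the log output, not Fluent Bit's code.
    """
    sources = []
    # Static keys supplied through environment variables.
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        sources.append("env")
    # Shared credentials file (the logs show /root/.aws/credentials).
    if credentials_file_exists:
        sources.append("profile")
    # IAM Roles for Service Accounts on EKS (web identity token).
    if env.get("AWS_ROLE_ARN") and env.get("AWS_WEB_IDENTITY_TOKEN_FILE"):
        sources.append("eks")
    # ECS is skipped in the logs when this variable is not set.
    if env.get("AWS_CONTAINER_CREDENTIALS_RELATIVE_URI"):
        sources.append("ecs")
    # EC2 instance metadata is treated here as the final fallback.
    sources.append("ec2")
    return sources


if __name__ == "__main__":
    creds_file = os.path.expanduser("~/.aws/credentials")
    print(candidate_sources(os.environ, os.path.isfile(creds_file)))
```

A source appearing in this list only means its inputs are present; the request to STS or the metadata endpoint can still fail, as the logs above show.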

@TDanielsHL

@PettitWesley Thank you for the response. After digging in and getting debug logging working, I don't think it's a bug in the plugin. We did have to dig to find documentation about credentials and how they are sourced. If things work out of the box (and they can), all's well, but if they don't, it's helpful to have that documentation handy.

I did see the order of operations for credential sourcing, and we believe our IRSA problems are related to the AWS account rather than the plugin. Still, not to belabor the point, documentation on how the plugin ties into these systems when things don't work out of the box would be nice to have.

@michaelm-88

@TDanielsHL @PettitWesley

Are you able to help me with this one, please?

[2021/06/29 13:05:25] [ info] [output:cloudwatch_logs:cloudwatch_logs.0] Creating log group "/aws/containerinsights/dev/dataplane"
[2021/06/29 13:05:25] [debug] [http_client] not using http_proxy for header
[2021/06/29 13:05:25] [debug] [aws_credentials] Requesting credentials from the EKS provider..
[2021/06/29 13:05:25] [debug] [http_client] header=POST / HTTP/1.1
Host: logs.us-east-1.amazonaws.com
Content-Length: 66
User-Agent: aws-fluent-bit-plugin
Content-Type: application/x-amz-json-1.1
X-Amz-Target: Logs_20140328.CreateLogGroup
x-amz-date: 20210629T130525Z
x-amz-security-token: I

[2021/06/29 13:05:25] [debug] [http_client] server logs.us-east-1.amazonaws.com:443 will close connection #62
[2021/06/29 13:05:25] [debug] [aws_client] logs.us-east-1.amazonaws.com: http_do=0, HTTP Status: 400
[2021/06/29 13:05:25] [debug] [output:cloudwatch_logs:cloudwatch_logs.0] CreateLogGroup http status=400
[2021/06/29 13:05:25] [error] [output:cloudwatch_logs:cloudwatch_logs.0] CreateLogGroup API responded with error='SerializationException'
[2021/06/29 13:05:25] [error] [output:cloudwatch_logs:cloudwatch_logs.0] Failed to create log group
[2021/06/29 13:05:25] [debug] [socket] could not validate socket status for #62 (don't worry)
[2021/06/29 13:05:25] [debug] [out coro] cb_destroy coro_id=12
[2021/06/29 13:05:25] [debug] [retry] new retry created for task_id=8 attempts=1

@PettitWesley
Contributor Author

@michaelm-88 I think it's the quotes in your group name.
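
One way to check a group name before putting it in the config is sketched below. It assumes the character set and length limit documented for the CloudWatch Logs CreateLogGroup API (1 to 512 characters from a-z, A-Z, 0-9, `_`, `-`, `/`, `.`, `#`). The helper name `check_log_group_name` is hypothetical, and whether the quotes are actually what caused the SerializationException above is not confirmed in this thread:

```python
import re

# Allowed characters for a log group name, per the CreateLogGroup API reference.
ALLOWED_CHAR = re.compile(r"[.\-_/#A-Za-z0-9]")


def check_log_group_name(name):
    """Return a list of problems with a log group name (empty if none found)."""
    problems = []
    if not 1 <= len(name) <= 512:
        problems.append("length must be between 1 and 512 characters")
    # Collect each distinct character that falls outside the allowed set.
    bad = sorted({ch for ch in name if not ALLOWED_CHAR.fullmatch(ch)})
    if bad:
        problems.append("unsupported characters: " + " ".join(repr(c) for c in bad))
    return problems


if __name__ == "__main__":
    # The second name carries literal quote characters, as in the log above.
    for candidate in ("/aws/containerinsights/dev/dataplane",
                      '"/aws/containerinsights/dev/dataplane"'):
        print(candidate, "->", check_log_group_name(candidate) or "ok")
```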

@bgarcial

bgarcial commented Mar 27, 2024

Hi @PettitWesley, @michaelm-88 @TDanielsHL
I ran into a similar issue when deploying Fluent Bit from a Helm chart configured to use IAM roles for service accounts, in my case to export logs to the AWS OpenSearch service. I also got: STS assume role request failed.

In my case this happens because the Fluent Bit pods look for the token at the path /var/run/secrets/eks.amazonaws.com/serviceaccount/aws-iam-token, but the injected environment variable is AWS_WEB_IDENTITY_TOKEN_FILE : /var/run/secrets/eks.amazonaws.com/serviceaccount/token, so the token is not found and the role cannot be assumed. I've described the problem here

This also happens when using the aws-for-fluent-bit Helm chart.
I have been checking similar issues in those repos and in the community, but it is not clear how to solve it.
You also mention a possible bug that was fixed in Fluent Bit 1.6.2, but it still seems to appear in version 2.2.2, which I am using.
Do you have any idea how to overcome this issue?
I will appreciate your inputs!
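
A mismatch like the one described above can be checked from inside the pod by confirming that the path in AWS_WEB_IDENTITY_TOKEN_FILE actually exists. A minimal sketch, assuming the two IRSA environment variables named in this thread (the helper `check_irsa_env` is hypothetical, not part of Fluent Bit or any AWS tool):

```python
import os


def check_irsa_env(env, file_exists=os.path.isfile):
    """Return a list of problems with the IRSA-related environment variables.

    `file_exists` is injectable so the check can be exercised without a
    real token file; by default it looks at the local filesystem.
    """
    problems = []
    role_arn = env.get("AWS_ROLE_ARN")
    token_file = env.get("AWS_WEB_IDENTITY_TOKEN_FILE")
    if not role_arn:
        problems.append("AWS_ROLE_ARN is not set")
    if not token_file:
        problems.append("AWS_WEB_IDENTITY_TOKEN_FILE is not set")
    elif not file_exists(token_file):
        # The variable points at a path that is not mounted in the pod.
        problems.append(f"token file does not exist: {token_file}")
    return problems


if __name__ == "__main__":
    for line in check_irsa_env(os.environ) or ["IRSA variables look consistent"]:
        print(line)
```

If the variable and the mounted file agree but the plugin still reads a different path, that points at the plugin or chart configuration rather than the pod environment.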
