Retry AWS credential load errors #9291

Open
jszwedko opened this issue Sep 21, 2021 · 2 comments
Labels
provider: aws · Anything `aws` service provider related
type: bug · A code related bug.

Comments

jszwedko (Member) commented Sep 21, 2021

A user reported that Vector was failing to start because it couldn't load AWS credentials in the aws_s3 sink. They run a proxy sidecar for these credential requests, and Vector was starting up before the sidecar did. It seems reasonable to retry credential loading indefinitely.

Sep 21 17:16:19.191 ERROR vector::topology::builder: msg="Healthcheck: Failed Reason." error=Failed creating AWS credentials. Errors: [CredentialsError { message: "Error during dispatch: error trying to connect: tcp connect error: Connection refused (os error 111)" }, CredentialsError { message: "Couldn't find AWS credentials in environment, credentials file, or IAM role." }] component_kind="sink" component_type="aws_s3" component_id=s3_systemd component_name=s3_systemd
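
For reference, a minimal sketch of what indefinite retry with capped exponential backoff around credential loading could look like is below. It is illustrative only: the function names, error handling, and the simulated sidecar in `main` are hypothetical stand-ins, not Vector's actual internals or the aws-sdk API.

```rust
// Cargo.toml (assumed): tokio = { version = "1", features = ["full"] }
use std::time::Duration;

/// Retry an async credential-loading operation indefinitely with capped
/// exponential backoff. `load` stands in for whatever call actually fetches
/// credentials (e.g. asking the sidecar proxy); it is a hypothetical hook,
/// not Vector's real credential provider.
async fn retry_credentials_indefinitely<F, Fut, T, E>(mut load: F) -> T
where
    F: FnMut() -> Fut,
    Fut: std::future::Future<Output = Result<T, E>>,
    E: std::fmt::Display,
{
    let mut delay = Duration::from_secs(1);
    let max_delay = Duration::from_secs(30);
    loop {
        match load().await {
            Ok(creds) => return creds,
            Err(err) => {
                eprintln!("failed to load AWS credentials, retrying in {delay:?}: {err}");
                tokio::time::sleep(delay).await;
                // Double the delay up to a cap so a slow sidecar isn't hammered.
                delay = (delay * 2).min(max_delay);
            }
        }
    }
}

#[tokio::main]
async fn main() {
    // Simulate a sidecar that only becomes reachable on the third attempt.
    let mut attempts = 0;
    let creds = retry_credentials_indefinitely(|| {
        attempts += 1;
        let ready = attempts >= 3;
        async move {
            if ready {
                Ok("fake-credentials".to_string())
            } else {
                Err("tcp connect error: Connection refused (os error 111)")
            }
        }
    })
    .await;
    println!("loaded: {creds}");
}
```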
jszwedko added the type: bug and provider: aws labels on Sep 21, 2021
aladdin-atypon commented

We have the same issue, but this time with SQS:

sink{component_id=sqs component_kind="sink" component_type=aws_sqs component_name=sqs}:request{request_id=74063}: vector::sinks::util::retries: Non-retriable error; dropping the request. error=Failed creating AWS credentials. Errors: [CredentialsError { message: "environment variable not found" }, CredentialsError { message: "Couldn't find AWS credentials in environment, credentials file, or IAM role." }]

This happens even though access to SQS is granted through AWS IAM.

joseluisjimenez1 commented Apr 4, 2024

We have a similar issue with the Elasticsearch sink connected to AWS OpenSearch.

We notice that, from time to time when log volume spikes, the credential provider fails and Vector drops messages even though end-to-end acknowledgements are enabled, breaking the guaranteed-delivery contract.

With debug logs enabled, I noticed that credentials were being loaded every single second. Is that normal behaviour? Maybe we can specify a longer credential expiration time when using the aws-sdk?
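
For context, credential providers are typically wrapped in a cache so that credentials are only re-fetched when they are close to expiring rather than on every request. Below is a minimal sketch of that idea, assuming a generic async fetch hook; the types and the 600-second TTL are hypothetical, not aws-config's or Vector's actual implementation.

```rust
// Cargo.toml (assumed): tokio = { version = "1", features = ["full"] }
use std::time::{Duration, Instant};

/// Hypothetical in-memory credentials cache: re-fetches via `fetch` only when
/// the cached value is older than `ttl`, instead of on every request.
struct CachedCredentials<T> {
    value: Option<(T, Instant)>,
    ttl: Duration,
}

impl<T: Clone> CachedCredentials<T> {
    fn new(ttl: Duration) -> Self {
        Self { value: None, ttl }
    }

    async fn get<F, Fut, E>(&mut self, mut fetch: F) -> Result<T, E>
    where
        F: FnMut() -> Fut,
        Fut: std::future::Future<Output = Result<T, E>>,
    {
        // Reuse the cached credentials while they are still fresh.
        if let Some((creds, fetched_at)) = &self.value {
            if fetched_at.elapsed() < self.ttl {
                return Ok(creds.clone());
            }
        }
        // Otherwise hit the provider (ECS endpoint, IMDS, sidecar, ...) once.
        let creds = fetch().await?;
        self.value = Some((creds.clone(), Instant::now()));
        Ok(creds)
    }
}

#[tokio::main]
async fn main() {
    let mut cache = CachedCredentials::new(Duration::from_secs(600));
    // The first call fetches; later calls within the TTL reuse the cached value.
    let creds: Result<String, &str> = cache
        .get(|| async { Ok("fake-credentials".to_string()) })
        .await;
    println!("{creds:?}");
}
```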

Notes:

  • Vector version 0.37.0
  • Running on AWS ECS Fargate using the EcsContainer credential provider.
  • Reading from a Kafka source.
  • Tested with the imds auth configuration, increasing timeouts and max_attempts to try to avoid this error, but the behaviour is the same:
    auth:
      strategy: "aws"
      load_timeout_secs: 1200
      imds:
        max_attempts: 60
        connect_timeout_seconds: 10
        read_timeout_seconds: 10

To avoid duplicating too much here, I created a support question on Discord: discord thread

Example of logs:

| timestamp | message |
|---|---|
| 1711549083590 | 2024-03-27T14:18:03.590844Z  WARN sink{component_kind="sink" component_id=opensearch component_type=elasticsearch}:request{request_id=4418}: aws_config::meta::credentials::chain: provider failed to provide credentials provider=EcsContainer error=unexpected credentials error: dispatch failure: other: connection closed before message completed (Unhandled(Unhandled { source: DispatchFailure(DispatchFailure { source: ConnectorError { kind: Other(Some(TransientError)), source: hyper::Error(IncompleteMessage), connection: Unknown } }) })) |
| 1711549083591 | 2024-03-27T14:18:03.591030Z ERROR sink{component_kind="sink" component_id=opensearch component_type=elasticsearch}:request{request_id=4418}: vector::internal_events::common: Failed to build request. error=unexpected credentials error error_type="encoder_failed" stage="processing" internal_log_rate_limit=true |
| 1711549083591 | 2024-03-27T14:18:03.591146Z  WARN sink{component_kind="sink" component_id=opensearch component_type=elasticsearch}:request{request_id=4418}: vector::sinks::util::adaptive_concurrency::controller: Unhandled error response. error=unexpected credentials error internal_log_rate_limit=true |
| 1711549083591 | 2024-03-27T14:18:03.591218Z ERROR sink{component_kind="sink" component_id=opensearch component_type=elasticsearch}:request{request_id=4418}: vector::sinks::util::retries: Unexpected error type; dropping the request. error=unexpected credentials error internal_log_rate_limit=true |
| 1711549083591 | 2024-03-27T14:18:03.591354Z ERROR sink{component_kind="sink" component_id=opensearch component_type=elasticsearch}:request{request_id=4418}: vector_common::internal_event::service: Service call failed. No retries or retries exhausted. error=Some(Unhandled(Unhandled { source: DispatchFailure(DispatchFailure { source: ConnectorError { kind: Other(Some(TransientError)), source: hyper::Error(IncompleteMessage), connection: Unknown } }) })) request_id=4418 error_type="request_failed" stage="sending" internal_log_rate_limit=true |
| 1711549083591 | 2024-03-27T14:18:03.591722Z ERROR sink{component_kind="sink" component_id=opensearch component_type=elasticsearch}:request{request_id=4418}: vector_common::internal_event::component_events_dropped: Events dropped intentional=false count=1007 reason="Service call failed. No retries or retries exhausted." internal_log_rate_limit=true |
