Aggressive Retry Behavior with Non-Existent SQS Queue for S3 Source #20428
Labels
domain: reliability
Anything related to Vector's reliability
source: aws_s3
Anything `aws_s3` source related
type: bug
A code related bug.
A note for the community
Problem
When the queue for an
aws_s3
source is deleted, the source retries at an extremely high frequency seemingly without any backoff. This has separately exceeded the limits of an AWS VPC DNS resolver with lookups and once that potential issue was addressed, it instead exceeded the account-level AWS SQS API limits that cannot be increased.The retry frequency ends up sufficient to elevate the service's CPU triggering upward scaling, which just further increases the volume until the upper autoscaling limit is hit.
We would expect the behavior to instead be similar to when the queue is just empty or if a special case is needed, we would expect there to be a randomized, exponential backoff, with an eventual retry limit.
Configuration
Version
0.29.1
Debug Output
No response
Example Data
No response
Additional Context
No response
References
No response
The text was updated successfully, but these errors were encountered: