Description
We have a bizarre issue. Our app is hosted on ECS. It pulls messages from an SQS queue and sends them to our Kafka cluster. We usually have 20 tasks running. From time to time, a container becomes idle: CPU usage drops close to 0 and the SQS consume rate is 0, but the container keeps running without any exception. We even enabled DEBUG logging for the AWS SDK, and it does not print any error. The last log line was written at the moment CPU dropped to 0.
[2021-01-18 07:16:44,558] [aws-java-sdk-NettyEventLoop-0-6] DEBUG software.amazon.awssdk.request - Received successful response: 200
You can check the screenshot below for the most recent occurrence. This container stopped consuming messages at 7:16 AM UTC.
Describe the bug
Our app is a "reactive streaming" app built with the fs2 streaming framework (see the docs at https://fs2.io/). It uses "software.amazon.awssdk.services.sqs.SqsAsyncClient" to continuously poll messages from the SQS queue. The polled messages become the source of the stream. The stream processes each message and sends it to a Kafka topic. It is therefore a never-ending process.
Our issue is that no exception is raised in our app. If the SDK threw an exception, our app could catch it and terminate the stream, which would cause the ECS task to restart. However, the AWS SDK does not throw an exception in this case, and it does not emit any useful debug logging either.
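As a possible stopgap on our side, a call that never completes could be turned into an exception by putting a deadline on the returned future. The sketch below is only an illustration using the JDK's `CompletableFuture.orTimeout` (available since Java 9); `StalledCallGuard` and `withDeadline` are hypothetical names, and the never-completing future stands in for the `CompletableFuture` returned by `SqsAsyncClient.receiveMessage`. We have not confirmed that a stuck future is the actual cause of the stall.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;

public class StalledCallGuard {

    // Hypothetical helper: completes the future exceptionally with a
    // TimeoutException if it has not completed within the given time.
    // Note that orTimeout acts on the passed-in future itself and returns it.
    static <T> CompletableFuture<T> withDeadline(CompletableFuture<T> call,
                                                 long timeout, TimeUnit unit) {
        return call.orTimeout(timeout, unit);
    }

    public static void main(String[] args) {
        // Stand-in for an SDK call that silently never completes (assumption).
        CompletableFuture<String> stalled = new CompletableFuture<>();
        try {
            withDeadline(stalled, 200, TimeUnit.MILLISECONDS).join();
            System.out.println("call completed normally");
        } catch (CompletionException e) {
            // join() wraps the TimeoutException in a CompletionException.
            System.out.println("stalled call surfaced as: "
                    + e.getCause().getClass().getSimpleName());
        }
    }
}
```

With long polling, the deadline would need to be comfortably longer than the configured wait time (SQS allows up to 20 seconds), otherwise healthy polls would be failed as well. Once the future fails, the stream can treat it like any other error and terminate, which lets ECS restart the task.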
Expected Behavior
If SqsAsyncClient stops consuming messages, the SDK should throw an exception with a detailed explanation.
Current Behavior
Even with DEBUG logging enabled for the AWS SDK, no error is printed. The last log line was written at the moment CPU dropped to 0.
[2021-01-18 07:16:44,558] [aws-java-sdk-NettyEventLoop-0-6] DEBUG software.amazon.awssdk.request - Received successful response: 200
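Since neither an exception nor a log line signals the stall, one way to detect it from inside the container would be an application-level liveness check that tracks when the last poll response arrived. The following is a minimal sketch under that assumption; `PollWatchdog`, `recordProgress`, and `isStalled` are hypothetical names and not part of the AWS SDK or fs2.

```java
import java.time.Duration;
import java.util.concurrent.atomic.AtomicLong;

public class PollWatchdog {
    private final AtomicLong lastProgressNanos;
    private final long maxSilenceNanos;

    PollWatchdog(Duration maxSilence) {
        this.maxSilenceNanos = maxSilence.toNanos();
        this.lastProgressNanos = new AtomicLong(System.nanoTime());
    }

    // Call on every poll response, including empty ones, so that an idle
    // queue is not mistaken for a stalled client.
    void recordProgress() {
        lastProgressNanos.set(System.nanoTime());
    }

    // True when no poll response has been recorded within the allowed window.
    boolean isStalled() {
        return System.nanoTime() - lastProgressNanos.get() > maxSilenceNanos;
    }

    public static void main(String[] args) throws InterruptedException {
        PollWatchdog watchdog = new PollWatchdog(Duration.ofMillis(100));
        watchdog.recordProgress();
        System.out.println("stalled right after a poll: " + watchdog.isStalled());
        Thread.sleep(300); // simulate the client going silent
        System.out.println("stalled after silence: " + watchdog.isStalled());
    }
}
```

A scheduled task or a container health check could call `isStalled()` and exit the process when it returns true, so that ECS replaces the task. This would only work around the symptom; it does not explain why the client stops polling.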
Your Environment
- AWS Java SDK version used: 2.15.59
- JDK version used: openjdk:11-jre-slim
- Operating System and version: Linux