App stops consuming from SQS and becomes idle without any exception #2250

@realradical

Description

We have a bizarre issue. Our app is hosted on ECS. It pulls messages from an SQS queue and sends them to our Kafka cluster. We usually have 20 tasks running. From time to time, a container becomes idle: CPU usage is close to 0 and the SQS consume rate is 0, but the container keeps running without any exception. We even enabled DEBUG logging for the AWS SDK, and it does not print any error. The last log line was written at the moment CPU usage dropped to 0.

[2021-01-18 07:16:44,558] [aws-java-sdk-NettyEventLoop-0-6] DEBUG software.amazon.awssdk.request - Received successful response: 200

The screenshot below shows the most recent occurrence. This container stopped consuming messages at 07:16 UTC.
image

Describe the bug

Our app is a "reactive streaming" app built with the fs2 streaming framework (see the docs at https://fs2.io/). It uses "software.amazon.awssdk.services.sqs.SqsAsyncClient" to continuously poll messages from an SQS queue. The polled messages become the source of the stream. The stream processes each message and sends it to a Kafka topic. It is therefore a process that never stops.

Our issue is that no exception ever reaches our app. If one did, our app would catch it and terminate the stream, which would cause the ECS task to restart. However, the AWS SDK does not throw an exception in this case, and it does not even emit useful debug logging.
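One possible way to turn a silent hang into an exception the stream can react to is to put a deadline on each poll's future. The sketch below is only an illustration under assumptions: `PollGuard` and `withDeadline` are hypothetical names, the poll is modelled as a plain `CompletableFuture` rather than a real `SqsAsyncClient` call, and the timeout value is arbitrary. It relies on `CompletableFuture.orTimeout`, which is available on the JDK 11 runtime mentioned in this report. Note that timing out the future does not by itself guarantee the underlying request is aborted.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical helper, not part of the AWS SDK.
public class PollGuard {

    /**
     * Starts one asynchronous poll and fails its future with a
     * java.util.concurrent.TimeoutException if it has not completed
     * within timeoutMillis.
     */
    public static <T> CompletableFuture<T> withDeadline(
            Supplier<CompletableFuture<T>> poll, long timeoutMillis) {
        return poll.get().orTimeout(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) {
        // Stand-in for a poll that never completes (assumption for the demo).
        Supplier<CompletableFuture<String>> stuckPoll = () -> new CompletableFuture<String>();
        try {
            withDeadline(stuckPoll, 200).join();
            System.out.println("poll completed");
        } catch (CompletionException e) {
            // join() wraps the TimeoutException in a CompletionException.
            System.out.println("poll failed: " + e.getCause().getClass().getSimpleName());
        }
    }
}
```

In an app like ours, the supplier would wrap the actual receive call, and the deadline would need to be longer than the long-poll wait time so that normal empty polls are not treated as failures.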

Expected Behavior

If SqsAsyncClient stops consuming messages, the SDK should throw an exception with a detailed explanation.

Current Behavior

Even with DEBUG logging enabled for the AWS SDK, no error is printed. The last log line was written at the moment CPU usage dropped to 0:
[2021-01-18 07:16:44,558] [aws-java-sdk-NettyEventLoop-0-6] DEBUG software.amazon.awssdk.request - Received successful response: 200
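Because the container stays up while doing nothing, an application-level stall check might help as a stopgap. The following is a rough sketch, not something from the SDK: `StallDetector`, its methods, and the thresholds are all hypothetical. It simply remembers when the last poll response arrived and reports a stall once too much time has passed. Timestamps are passed in explicitly to keep the example deterministic.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper, not part of the AWS SDK.
public class StallDetector {
    private final long maxIdleMillis;
    private final AtomicLong lastProgressMillis;

    public StallDetector(long maxIdleMillis, long nowMillis) {
        this.maxIdleMillis = maxIdleMillis;
        this.lastProgressMillis = new AtomicLong(nowMillis);
    }

    /** Call whenever a poll response arrives, including an empty one. */
    public void recordProgress(long nowMillis) {
        lastProgressMillis.set(nowMillis);
    }

    /** True when no progress has been recorded for longer than maxIdleMillis. */
    public boolean isStalled(long nowMillis) {
        return nowMillis - lastProgressMillis.get() > maxIdleMillis;
    }

    public static void main(String[] args) {
        // Assumed threshold of 60 seconds, with a clock starting at 0.
        StallDetector detector = new StallDetector(60_000, 0);
        detector.recordProgress(10_000);
        System.out.println(detector.isStalled(30_000));  // 20 s idle
        System.out.println(detector.isStalled(100_000)); // 90 s idle
    }
}
```

A scheduled task could call `isStalled(System.currentTimeMillis())` periodically and exit the process when it returns true, so that ECS restarts the task. This only works around the symptom and does not explain why the client stops polling.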

Your Environment

  • AWS Java SDK version used: 2.15.59
  • JDK version used: openjdk:11-jre-slim
  • Operating System and version: linux

Labels

  • bug: This issue is a bug.
  • closed-for-staleness
  • response-requested: Waiting on additional info and feedback. Will move to "closing-soon" in 10 days.
