Description
When the SQS consumer encounters an error, the consumer loop exits permanently and never restarts. The service appears healthy (container running, HTTP responding) but stops processing all messages from the queue. This causes silent failures where new messages accumulate but are never processed until someone manually restarts the service.
We experienced this when a message with malformed data caused an unmarshal error (`json: cannot unmarshal string into Go struct field PublishedEvent.data of type map[string]interface{}`). The handler attempted to Nack the message, and the Nack failed because the `sqs:ChangeMessageVisibility` IAM permission was missing. The consumer then exited silently, and the service stopped processing messages until it was manually restarted.
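For anyone reproducing the first half of this, the sketch below shows the kind of payload that triggers such an unmarshal error. The `PublishedEvent` type here is a hypothetical stand-in that models only the `data` field named in the error message, and `decodeEvent` is an illustrative helper; neither is copied from Outpost.

```go
package main

import (
	"encoding/json"
)

// PublishedEvent is a hypothetical stand-in for the real type. Only the
// field named in the error message is modeled.
type PublishedEvent struct {
	Data map[string]interface{} `json:"data"`
}

// decodeEvent (illustrative helper) returns whatever error json.Unmarshal
// reports for the given payload, or nil if the payload decodes cleanly.
func decodeEvent(payload []byte) error {
	var ev PublishedEvent
	return json.Unmarshal(payload, &ev)
}
```

Calling `decodeEvent([]byte(`{"data":"not-an-object"}`))` returns a non-nil error because `data` is a JSON string rather than an object, while `{"data":{"k":1}}` decodes without error.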
To Reproduce
- Deploy Outpost with AWS SQS but without the `sqs:ChangeMessageVisibility` permission
- Message arrives that causes any handler error (malformed data, processing failure, etc.)
- Handler attempts to Nack the failed message
- GoCloud driver tries to call `ChangeMessageVisibilityBatch`
- AWS returns 403 permission denied
- Error bubbles up from consumer.Run() to startPublishMQConsumer (api/api.go:253)
- Consumer goroutine logs error and exits permanently
- Service continues running, appears healthy
- All subsequent messages accumulate unprocessed with no alerts or visible indication