
Failed messages that are close to the max body limit are retried endlessly #991

Closed
danielmarbach opened this issue Mar 25, 2023 · 0 comments · Fixed by #1002

Comments


danielmarbach commented Mar 25, 2023

Describe the bug

Description

This is a tricky situation, and we might not be able to improve it fully. It happens when we receive a message that is close to the ASQ message size limit of 64 KB. When we receive such a large message and either send it to the audit queue or send it to the error queue (due to an exception), we enrich the message with more headers. Once that happens, the message exceeds the body size limit and is then retried indefinitely. Because there is no delay and the message also cannot be handled by the delayed delivery infrastructure, it ends up churning through significant resources of the endpoint, which might even cause the endpoint to stall completely.
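To make the failure mode concrete, here is a minimal sketch (assuming a hypothetical JSON envelope; the transport's actual wire format and the exact header names are assumptions for illustration) of how error-path header enrichment pushes a near-limit message over 64 KB:

```python
import json

MAX_MESSAGE_BYTES = 64 * 1024  # hard limit on a single Azure Storage queue message

def wrapped_size(body: str, headers: dict) -> int:
    # The transport wraps headers and body into one envelope; the exact
    # wire format here is illustrative only.
    return len(json.dumps({"Headers": headers, "Body": body}).encode("utf-8"))

headers = {"NServiceBus.MessageId": "a1b2", "NServiceBus.EnclosedMessageTypes": "MyMessage"}
body = "x" * (MAX_MESSAGE_BYTES - wrapped_size("", headers))  # fills right up to the limit
assert wrapped_size(body, headers) <= MAX_MESSAGE_BYTES       # original message fits

# Failure handling enriches the message with extra headers (exception details,
# failed queue, and so on), pushing the same body over the limit:
headers["NServiceBus.ExceptionInfo.StackTrace"] = "at Handler.Handle(...) " * 50
assert wrapped_size(body, headers) > MAX_MESSAGE_BYTES        # now it can never be sent
```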

We already have solutions like the data bus, but the problem is that once such a message is in the queue, things get problematic: you have to manually receive messages from the queue until you reach the poison message, delete it with the corresponding pop receipt, and requeue all the other messages.
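For reference, that manual cleanup looks roughly like this (a sketch using the azure-storage-queue Python SDK; `is_poison` is a placeholder for whatever check identifies the stuck message, and re-enqueuing loses ordering and dequeue counts):

```python
from azure.storage.queue import QueueClient

def remove_poison_message(conn_str: str, queue_name: str, is_poison) -> None:
    queue = QueueClient.from_connection_string(conn_str, queue_name)
    for msg in queue.receive_messages():
        if is_poison(msg):
            # Deleting requires the pop receipt obtained when receiving the message.
            queue.delete_message(msg.id, msg.pop_receipt)
            return
        # Not the poison message: re-enqueue a copy at the back of the queue,
        # then delete the received instance.
        queue.send_message(msg.content)
        queue.delete_message(msg.id, msg.pop_receipt)
```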

Expected behavior

The message is sent to the correct queue without being indefinitely retried.

Actual behavior

Because the message is decorated with additional headers, thereby pushing it over the message size limit, it gets retried indefinitely, consuming significant resources.

Versions

12.0.0, 11.0.0, 10.0.4

Steps to reproduce

  • Send a very large message (very close to the ASQ message size limit of 64 KB)
  • Throw an exception so that the message is decorated with additional headers and sent to the error queue
  • Notice that the message never reaches the error queue but gets stuck in a loop of indefinite retries

Relevant log output

None

Additional information

Describe the suggested solution

Azure Storage queue messages have a size limit of 64 KB. When a message that is very close to this limit is received and then sent to the audit queue or error queue, it is enriched with more headers. This pushes the message past the Azure Storage queue size limit, and the message ends up being retried indefinitely. The suggested solution: when sending such a message to the error queue fails, unwrap it and rewrap it with minimal headers - FailedQ and ExceptionType. The FailedQ header is required by ServiceControl; without it, the message will end up as a failed error import. The ExceptionType header is highly desirable because it allows grouping: the failure group view by exception type is the default failed message view in ServicePulse, which is the common way for users to interact with failed messages. If this attempt also fails, the message is sent to the error queue without any headers.
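A sketch of that fallback chain (this is not the code from #1002; `dispatch`, `MessageTooLargeError`, and the fully qualified header names are assumptions for illustration):

```python
class MessageTooLargeError(Exception):
    """Assumed to be raised by dispatch when the wrapped message exceeds 64 KB."""

def send_to_error_queue(body, headers, failed_queue, exc, dispatch):
    attempts = [
        headers,  # 1. full header set, including the error enrichment
        {         # 2. minimal set: FailedQ is required by ServiceControl,
                  #    ExceptionType enables grouping in ServicePulse
            "NServiceBus.FailedQ": failed_queue,
            "NServiceBus.ExceptionInfo.ExceptionType": type(exc).__name__,
        },
        {},       # 3. last resort: no headers at all
    ]
    for candidate in attempts:
        try:
            dispatch(body, candidate)
            return
        except MessageTooLargeError:
            continue  # try the next, smaller envelope
    raise MessageTooLargeError("the body alone exceeds the queue message limit")
```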

Describe alternatives you've considered

Backported to

@soujay soujay added the Feature label Jun 1, 2023
@soujay soujay added this to the 12.0.1 milestone Jun 1, 2023
@soujay soujay added Bug and removed Feature labels Jun 6, 2023
@soujay soujay changed the title Improve handling of messages that are close to the max body limit Prevent indefinite retries when messages fail that are close to the max body limit Jun 6, 2023
@soujay soujay changed the title Prevent indefinite retries when messages fail that are close to the max body limit Failed messages that are close to the max body limit are retried endlessly Jun 7, 2023