Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix Sqs fifo message group invisibility after triggering move to dlq #10859

Merged
merged 4 commits into from
May 23, 2024

Conversation

baermat
Copy link
Member

@baermat baermat commented May 21, 2024

Motivation

Users raised the issue that fifo queues become unusable after messages were moved to a DLQ.
After a fifo queue's message's receive count exceeds the maximum and is moved to a DLQ, its message group remains invisible. This PR addresses this issue.
fixes #10107

Changes

  • add method to allow the provider to call the missing visibility changes to the message group after moving the message

I am not happy with how this change is implemented, but I am unsure what the best way is.
The root cause is that we simply never addressed the message group visibility when a dlq move is triggered, so obviously we need to address that. There are two main ways to do this that I see:

  • Provider calls the necessary code in the "move to dlq" block directly. That's what's happening now
    • Right now it contains an ugly isinstance(queue,FifoQueue) check, and the provider shouldn't care what kind of message it handles, the point of the different classes is to abstract that away.
    • We could also leave the current method as is and simply call queue._on_remove_message, to get rid of the Fifo check. But this method seems to actually be "queue._on_delete_message" from a program-logic point of view and is expected to be called during deletion. Moving a message to a dlq is very similar, but not quite the same, and this approach would result in e.g. a re-heapifying for a standard queue every time (that's the reason why this solution was discarded).
  • Since moving the message to a dlq is a put operation, we could do a "if dlq move, unlock source queue message group" during the put to the dlq. This options also has the advantage that we can add this message group visibility related change to the place in the code where the other checks of this kind are. However we would need the source queue object for that:
    • We could pass the source queue object on the put operation as optional parameter. This means changing the signature of put for standard_queues too though, as well as passing it all the way down from the provider, and that doesn't seem that clean either.
    • We already pass the DeadLetterQueueSourceArn with the message, and could get the queue from said arn. This way, we would have the change exactly where it belongs (in my opinion) without touching other code from a program-logic point of view. However, the queues are not context aware, and we would need that to access the proper queue from the store. This comes with some medium sized refactoring and is therefore quite the architectural change, which was the reason why this was discarded.
      Sorry for the wall of text, if something is unclear please ask ahead. The reason for it is that message group visibility has been quite the back and forth with patching so far, so I'd like to make extra sure that we do this the right way.

@baermat baermat changed the title Sqs fifo message grp fix Sqs fifo message group invisibility after triggering move to dlq May 21, 2024
@baermat baermat added the semver: patch Non-breaking changes which can be included in patch releases label May 21, 2024
@baermat baermat marked this pull request as ready for review May 21, 2024 16:37
@baermat baermat requested a review from thrau as a code owner May 21, 2024 16:37
Copy link

LocalStack Community integration with Pro

    2 files      2 suites   1h 39m 56s ⏱️
2 981 tests 2 676 ✅ 305 💤 0 ❌
2 983 runs  2 676 ✅ 307 💤 0 ❌

Results for commit 2c06c4b.

Copy link
Member

@thrau thrau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! nice catch 👌

Comment on lines +1236 to +1238
if isinstance(queue, FifoQueue):
message_group = queue.get_message_group(standard_message.message_group_id)
queue.update_message_group_visibility(message_group)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i really regret not designing this in a way that we can easily do DLQ processing in the SQS abstraction, and have to do this on API level. well 🤷

@thrau thrau merged commit bd1a23a into master May 23, 2024
67 of 69 checks passed
@thrau thrau deleted the sqs-fifo-messageGrpID branch May 23, 2024 14:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
semver: patch Non-breaking changes which can be included in patch releases
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bug: can't receive message from fifo queue with the same MessageGroupId
2 participants