Poison messages handling #1040

Open
saguiitay opened this issue Feb 11, 2024 · 1 comment

@saguiitay
Contributor

As a knock-on effect of issue #1039, I've noticed that the message is being retried indefinitely.
I see the following traces in my logs:

PoisonMessageDetected: d8bdcf4d-ef40-4501-a99b-6e72309644ee: Message [TaskScheduled#1] with ID 6128efd4-8904-494d-9a2d-d2d8d01edbe7 has been dequeued 79 times and is now considered poison: {"Account":"...","TaskHub":"Services","...":"TaskScheduled","TaskEventId":1,"MessageId":"6128efd4-8904-494d-9a2d-d2d8d01edbe7","InstanceId":"d8bdcf4d-ef40-4501-a99b-6e72309644ee","ExecutionId":"b27db20915cf4e5db6cad3589dde88df","PartitionId":"...-workitems","DequeueCount":79}
AbandoningMessage: d8bdcf4d-ef40-4501-a99b-6e72309644ee: Abandoning [TaskScheduled#1] message back to ...-workitems and setting a visibility delay of 600ms: {"Account":"...","TaskHub":"...","EventType":"TaskScheduled","TaskEventId":1,"MessageId":"6128efd4-8904-494d-9a2d-d2d8d01edbe7","InstanceId":"d8bdcf4d-ef40-4501-a99b-6e72309644ee","ExecutionId":"b27db20915cf4e5db6cad3589dde88df","PartitionId":"services-workitems","SequenceNumber":208,"PopReceipt":"AgAAAAMAAAAAAAAAxVkNR+xc2gE=","VisibilityTimeoutSeconds":600}

Side note: the message says "setting a visibility delay of 600ms", while the code actually delays for 600 seconds, not milliseconds (note "VisibilityTimeoutSeconds":600 in the same trace).
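
One way to avoid this kind of unit mismatch is to derive both the number and the unit suffix in the trace text from the same duration value. A minimal Python sketch follows; `format_abandon_message` and its parameters are hypothetical names for illustration, not the library's actual logging code:

```python
from datetime import timedelta


def format_abandon_message(event: str, queue: str, delay: timedelta) -> str:
    """Build the trace text with the delay expressed in whole seconds."""
    # Convert once, and write the unit next to the value it was derived from.
    seconds = int(delay.total_seconds())
    return (
        f"Abandoning [{event}] message back to {queue} "
        f"and setting a visibility delay of {seconds}s"
    )


print(format_abandon_message("TaskScheduled#1", "services-workitems", timedelta(seconds=600)))
```

With a 600-second delay this reports `600s`, which matches the `VisibilityTimeoutSeconds` value in the trace.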

Notice the DequeueCount value of 79 (I've seen values much higher). Perhaps there should be a setting that controls the maximum number of dequeue attempts, after which the message is simply discarded?

@cgillum @davidmrdavid

@cgillum
Collaborator

cgillum commented Feb 29, 2024

I think it makes sense to expose this as a setting. We let the message keep retrying by default to avoid data loss and to give users a chance to fix the root cause, but it makes sense to allow overriding this behavior.
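
To make the idea discussed in this thread concrete, here is a rough Python sketch of an opt-in limit. The names `PoisonMessageSettings`, `max_dequeue_count`, and `next_action` are assumptions made up for illustration; they are not existing settings or APIs of this project. The default of `None` keeps the behavior described above, where the message is abandoned and retried without limit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PoisonMessageSettings:
    # Hypothetical setting: None means "no limit", i.e. keep retrying forever.
    max_dequeue_count: Optional[int] = None


def next_action(dequeue_count: int, settings: PoisonMessageSettings) -> str:
    """Return "abandon" to put the message back on the queue, or "discard" to drop it."""
    limit = settings.max_dequeue_count
    if limit is not None and dequeue_count >= limit:
        return "discard"
    return "abandon"


# Default settings: the message with DequeueCount 79 is abandoned and retried.
print(next_action(79, PoisonMessageSettings()))
# With an explicit limit of 50, the same message would be discarded.
print(next_action(79, PoisonMessageSettings(max_dequeue_count=50)))
```

Keeping the limit optional means nobody loses messages unless they explicitly opt in.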
