-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Delete message after saving message body to S3 #43
Conversation
SQS messages should be available until recoverability is ensured. We have to care about the visibility timeout. 1. Receive a message from MessageQueue 2. Dispatch the message to MessageHandler::XXX 3. Create a record for execution 4. Save the message body to S3 (if necessary) 5. Delete the message 6. Start execution I think it takes less than 30s (default visibility timeout) between 1 and 5.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK, looks good.
rescue ActiveRecord::RecordNotUnique => e | ||
@message_queue.delete_message(@message) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the following flow,
- worker1:
Barbeque::JobExecution.create
=> success - worker2:
Barbeque::JobExecution.create
for the same message => fail - worker1: immediate shutdown before
Barbeque::ExecutionLog.save_message
- worker2: deletes message in this line
In this case, message is deleted before saving message to S3. Is this intentional? If we delete this message here, we also need store message to S3 before deletion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't consider much about duplication case.
When a message is received by multiple consumers, does the received messages have the same message_id and receipt_handle...? That is, can we omit DeleteMessage when DuplicatedExecution happens?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's likely that duplicated messages have the same message id because message id is is already created on enqueueing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is, can we omit DeleteMessage when DuplicatedExecution happens?
I think so in normal (successful dequeue worker does not shut down unexpectedly) situation.
But if it shuts down unexpectedly before DeleteMessage, the message would be unnecessarily dequeued multiple times. So it might be better to delete the message here checking it's already stored to S3 (and store it if it's not).
It would be very rare and not so important to handle it, but at least we should avoid deleting message not stored to S3.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realized I had misunderstood about the duplication after reading some SQS documents. I skipped DeleteMessage
a397414
lib/barbeque/message_queue.rb
Outdated
# XXX: #receive_messages returns one message at most because it calls | ||
# receive_message API without max_number_of_messages option. | ||
unless valid_messages.empty? | ||
extend_visibility_timeout(valid_messages) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you extend VisibilityTimeout here instead of changing SQS queue's VisibilityTimeout to 60?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When ReceiveMessage returns several messages, Barbeque::MessageQueue#dequeue returns one message and holds other messages in memory. While the returned message has reasonable visibility timeout (SQS queue's visibility timeout), the held messages don't because they are processed in the next barbeque-worker loop.
I think visibility timeout of the held messages should be extended in each loop.
(However, I noticed that valid_messages.size
never becomes more than one 😇 )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Considering such problem and failure possibility on timeout extension, I think receiving multiple messages is not a good idea. So I prefer explicitly specifying MaxNumberOfMessages to 1 (as default though) and using queue's timeout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. I did so 504af15
454121c
to
504af15
Compare
if message.valid? | ||
return message | ||
else | ||
delete_message(message) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While it's originally my problem (sorry), it would be better to log or notify this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Invalid messages are already logged and notified with ExceptionHandler.
https://github.com/cookpad/barbeque/blob/v0.7.0/lib/barbeque/message.rb
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see 🙏
lib/barbeque/message_queue.rb
Outdated
client.delete_message( | ||
queue_url: @job_queue.queue_url, | ||
receipt_handle: message.receipt_handle, | ||
def extend_visibility_timeout(messages) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unused method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops, thanks 5cdb950
lib/barbeque/message_queue.rb
Outdated
@@ -4,6 +4,9 @@ | |||
|
|||
module Barbeque | |||
class MessageQueue | |||
class ExtendVisibilityError < StandardError |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unused constant
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great improvement 👍
SQS messages should be available until recoverability is ensured.
We have to care about the visibility timeout.
I think it takes less than 30s (default visibility timeout) between 1
and 5.
#39
@cookpad/dev-infra @k0kubun please review