Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Delete message after saving message body to S3 #43

Merged
merged 4 commits into from
Jul 21, 2017

Conversation

eagletmt
Copy link
Member

SQS messages should be available until recoverability is ensured.
We have to care about the visibility timeout.

  1. Receive a message from MessageQueue
  2. Dispatch the message to MessageHandler::XXX
  3. Create a record for execution
  4. Save the message body to S3 (if necessary)
  5. Delete the message
  6. Start execution

I think it takes less than 30s (default visibility timeout) between 1
and 5.


#39
@cookpad/dev-infra @k0kubun please review

SQS messages should be available until recoverability is ensured.
We have to care about the visibility timeout.

1. Receive a message from MessageQueue
2. Dispatch the message to MessageHandler::XXX
3. Create a record for execution
4. Save the message body to S3 (if necessary)
5. Delete the message
6. Start execution

I think it takes less than 30s (default visibility timeout) between 1
and 5.
Copy link

@taiki45 taiki45 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK, looks good.

rescue ActiveRecord::RecordNotUnique => e
@message_queue.delete_message(@message)
Copy link
Collaborator

@k0kubun k0kubun Jul 20, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the following flow,

  • worker1: Barbeque::JobExecution.create => success
  • worker2: Barbeque::JobExecution.create for the same message => fail
  • worker1: immediate shutdown before Barbeque::ExecutionLog.save_message
  • worker2: deletes message in this line

In this case, message is deleted before saving message to S3. Is this intentional? If we delete this message here, we also need store message to S3 before deletion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't consider much about duplication case.
When a message is received by multiple consumers, does the received messages have the same message_id and receipt_handle...? That is, can we omit DeleteMessage when DuplicatedExecution happens?

Copy link
Collaborator

@k0kubun k0kubun Jul 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's likely that duplicated messages have the same message id because message id is is already created on enqueueing.

Copy link
Collaborator

@k0kubun k0kubun Jul 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is, can we omit DeleteMessage when DuplicatedExecution happens?

I think so in normal (successful dequeue worker does not shut down unexpectedly) situation.

But if it shuts down unexpectedly before DeleteMessage, the message would be unnecessarily dequeued multiple times. So it might be better to delete the message here checking it's already stored to S3 (and store it if it's not).

It would be very rare and not so important to handle it, but at least we should avoid deleting message not stored to S3.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized I had misunderstood about the duplication after reading some SQS documents. I skipped DeleteMessage
a397414

# XXX: #receive_messages returns one message at most because it calls
# receive_message API without max_number_of_messages option.
unless valid_messages.empty?
extend_visibility_timeout(valid_messages)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you extend VisibilityTimeout here instead of changing SQS queue's VisibilityTimeout to 60?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When ReceiveMessage returns several messages, Barbeque::MessageQueue#dequeue returns one message and holds other messages in memory. While the returned message has reasonable visibility timeout (SQS queue's visibility timeout), the held messages don't because they are processed in the next barbeque-worker loop.
I think visibility timeout of the held messages should be extended in each loop.
(However, I noticed that valid_messages.size never becomes more than one 😇 )

Copy link
Collaborator

@k0kubun k0kubun Jul 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Considering such problem and failure possibility on timeout extension, I think receiving multiple messages is not a good idea. So I prefer explicitly specifying MaxNumberOfMessages to 1 (as default though) and using queue's timeout.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I did so 504af15

if message.valid?
return message
else
delete_message(message)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While it's originally my problem (sorry), it would be better to log or notify this case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Invalid messages are already logged and notified with ExceptionHandler.
https://github.com/cookpad/barbeque/blob/v0.7.0/lib/barbeque/message.rb

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see 🙏

client.delete_message(
queue_url: @job_queue.queue_url,
receipt_handle: message.receipt_handle,
def extend_visibility_timeout(messages)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unused method

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops, thanks 5cdb950

@@ -4,6 +4,9 @@

module Barbeque
class MessageQueue
class ExtendVisibilityError < StandardError
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unused constant

Copy link
Collaborator

@k0kubun k0kubun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great improvement 👍

@eagletmt eagletmt merged commit fab5bfe into cookpad:master Jul 21, 2017
@eagletmt eagletmt deleted the save-message-before-delete branch July 21, 2017 04:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants