Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use heartbeat timer to keep message invisible in slow workers #24

Closed
wants to merge 8 commits into from

Conversation

leemhenson
Copy link
Contributor

(this PR depends on #18 and includes all of it's commits)

As I mentioned in #13 there is general need for queue consumers to handle the issue of long running workers exceeding the message visibility timeout on the queue they are consuming messages from. This PR uses a simple timer to extend the timeout every 5 seconds by default.

I've used Celluloid 0.16 because it allows the user of Timers 4.0.1, rather than Timers 1.1 which is used by Celluloid 0.15.2. There is a memory leak fixed in Timers 4 which we'd want. See: https://github.com/celluloid/timers/blob/master/CHANGES.md

I haven't yet put this code through it's paces, but I thought I'd at least get something down 'on paper' for discussion. The specs pass, at least. 🍖

@elsurudo
Copy link
Contributor

@leemhenson This is a great feature that I want myself. However, there is an issue with Celluloid 0.16 causing hangs during shutdown. See: celluloid/celluloid#466

This is why I created this PR to knock the version down originally: #11

It seems Celluloid 0.16 has some other issues as well: celluloid/celluloid#457

Is it impossible or problematic to implement this with Celluloid 0.15.2?

@leemhenson
Copy link
Contributor Author

Well, the specs still pass with 15.2. I'm still trying to integrate shoryuken into an app here in order to test out these changes, so I can't confirm in a real-world scenario yet.

@phstc
Copy link
Collaborator

phstc commented Dec 20, 2014

The PR #25 by @petergoldstein should fix the shutdown problem with 0.16

queues(queue).receive_message(Hash(options))
end

def send_message(queue, body, options = {})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This send_message allows Client.send_message(hash) instead of Client.send_message(hash.to_json).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nvm found it on lib/shoryuken/queue.rb

@phstc
Copy link
Collaborator

phstc commented Dec 31, 2014

Hey @leemhenson

I took the liberty to cherry-pick only the auto_visibility_timeout change, so I have more time to review the aws-sdk-core changes on #26. I hope you don't mind ❤️

After cherry-picking it, I did some minor changes and I would love to hear your opinion @leemhenson @elsurudo

Now it's worker based:

class MyWorker
  include Shoryuken::Worker

  shoryuken_options queue: 'my_queue', auto_visibility_timeout: true
end

and default false to keep the transparency with SQS.

Docs: https://github.com/phstc/shoryuken/wiki/Worker-options#auto_visibility_timeout

The heartbeat now is the queue visibility_timeout minus 5 seconds, just to avoid multiple requests for people having higher visibility_timeout than 5 seconds (which I 🙏 to be most people).

And the new visibility timeout will be set to the default queue visibility timeout. It must be a higher value than the heartbeat (which is guaranteed by visibility_timeout - 5), otherwise the message can become available again before the Celluloid::Actor#every call.

What do you think? Do these changes make sense?

Related code:

https://github.com/phstc/shoryuken/blob/master/lib/shoryuken/processor.rb#L34-L48
https://github.com/phstc/shoryuken/blob/master/lib/shoryuken/worker.rb#L44-L54

@phstc phstc closed this Dec 31, 2014
@leemhenson
Copy link
Contributor Author

Yeah looks good to me. 🍹

@phstc phstc mentioned this pull request Dec 31, 2014
@elsurudo
Copy link
Contributor

elsurudo commented Jan 2, 2015

Looks good to me too. This is great, as it solves the edge condition with one of my jobs that doesn't have an "upper bound" visibility timeout I can reliably guess 👍

@leemhenson leemhenson deleted the heartbeat branch January 8, 2015 16:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants