New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow listening on multiple queues #71
Comments
I've also ran into cases where multiple queues would be nice. ActionMailer's deliver_later puts stuff into a 'mailer' queue by default? |
Yes, though it's |
eww, yes, that's going to confuse a ton of people. |
OK, so in
and then in my Que.mode = :async
# important the worker setup comes after the mode change!
Que::Worker.workers.push Que::Worker.new 'mailers'
Que::Worker.workers.push Que::Worker.new 'default'
Que::Worker.workers.push Que::Worker.new 'default' Running the test app and putting something in both queues led to both queues being emptied 🎉 However the wakeup / wrangler stuff looks like it assumes workers are going to be looking at the same queue so maybe this will complicate things to much. There's a workable workround above and perhaps wanting something more complicated is a sign you need to go to either external workers or start looking at some of the other queuing solutions. |
Just noticed something odd with the setup above. If I add a job to the |
The usage of the "mailers" queue isn't configurable? That sounds like something the ActiveJob adapter should handle, if possible. The named queues functionality in Que was really only meant for the case where different codebases need to access the same queue - in my experience, other systems use named queues mainly to implement priorities, but we have those built in. So I was hesitant to add them in the first place (and I regret it a bit now - it may have been preferable to support multiple job tables for that use case), and I'm hesitant to encourage further use of them. But I suppose it's always better to be more flexible and support more use cases. Anyway, I don't think it'd be a good idea to let one worker pull items from multiple queues, since I think it'd complicate the job lock query by quite a bit, relying as it does on the ordering of the (priority, run_at, job_id) trio. I haven't benchmarked it, but I expect it'd be very bad for performance. We could prepend the queue name to that list, but then workers would always work through their queues in alphabetical order, which would probably be unexpected and unhelpful. Supporting multiple worker pools would be more performant, and would probably be preferable in terms of cleaning up the codebase anyway. I can work on it this weekend. A quick workaround for your current problem would be something like: # In an initializer:
worker = Que::Worker.new("mailers")
Thread.new { loop { sleep(5); worker.wake! } }
at_exit { worker.stop.wait_until_stopped } |
@chanks Thanks for the workround, I will have a play with it later. I really like how focused Que is, it's the perfect queueing system for an app in development on Heroku and so I don't know if making it more complicated to support more use cases is best in the long run. If it wasn't for Rails/ActiveJob using different queue names then I probably wouldn't have been asking about it. It's up to you of course! :) Would telling Que to ignore queue names be workable? I.e. just pretend they don't exist so you don't run into the problem? And if you do need different workers for different queues you could run different rake tasks configured through environment variables. |
Thanks! I do think it makes sense to make worker pools their own class from a modularity and cleanliness perspective. I think it's somewhat similar to the different approaches ActiveRecord and Sequel take to databases. We have this worker pooling functionality crammed into the Que::Worker class, similar to the database logic that's in ActiveRecord::Base, whereas in Sequel databases are just objects that can be instantiated, used, and discarded if necessary. Even though most apps never connect to more than one database, I think it's still a nicer, cleaner model. Having a configuration option to ignore passed queue names and always insert the empty string would probably work, but does feel very hacky. That would be a more appropriate solution (even a default) for the ActiveJob adapter, maybe. |
Alright, so now there's a work in progress in the worker_pool branch. It extracts all the pool logic from the Worker class to a separate WorkerPool class. Each WorkerPool has its own mode, worker_count, wake_interval, and so on, and those configuration methods on the Que module now just change those values for the default worker_pool (which works jobs in ENV['QUE_QUEUE'] || ""). Not merging this to master yet because I'm still unsure about what the API should be for configuring multiple pools (what chrismcg suggested covers worker_count but not the other options), or if there even should be one, since having multiple worker pools in the same process is still not the recommended way to use Que. However, with this branch it should be easier to make that work if you need to: my_pool = Que::WorkerPool.new("my_queue_name")
my_pool.mode = :async
my_pool.worker_count = 8
my_pool.wake_interval = 1
at_exit { my_pool.mode = :off } Anyway, comments and suggestions are welcome. |
Cool! I'll have a look at this at the weekend most likely. Thanks! |
I haven't had a chance to play with this again yet unfortunately. I did put together a Rails PR to allow configuring the ActionMailer default queue name. |
+1 for keeping queue names and allowing workers to process multiple queues, including :all A couple of thoughts.
After looking at the query code, I understand why allowing for workers that can process any job(s) would involve a little bit of work, but I think this is an important feature set, to the point of relegating que to be a very specialized high-scale library only, and not even usable in simpler, smaller scale application deployments, in favor of delayed_job or something that has this built it. |
What is the |
@marshall-lee Currently, in this branch, each worker pool has its own mode, and Que.mode= sets the value on the "default" worker pool, which works whatever queue is in ENV["QUE_QUEUE"], or the default queue if that's not set. :sync mode doesn't mean anything on the non-default pools, and if I keep going with this branch I'd likely shuffle the API around to make a bit more sense. :async or :off basically turns them on and off. @drush The main problem with allowing some workers to handle any job is that it requires the creation of a second index, on [:priority, :run_at, :job_id]. Having a second index will slow down writes for everyone, whether they need to use it or not, and I don't really want to go down the road of having the db schema be dependent on configuration options. Sorry, I'm actually not clear from your message what functionality you'd like to get out of named queues that you can't get from Que currently, especially since named queues weren't meant to be used so frequently (they were intended to only be used when distinct codebases need to access the same job queue, and I wish ActiveJob weren't trying to use them for other things). You can get stats on what types of jobs are common with a count grouped by job_class. If you want jobs to be processable by any worker and to not have to reprovision when a new job class is added, then you probably shouldn't be using named queues at all. |
@chanks It looks like ActiveJob registers all job_class as 'ActiveJob::QueueAdapters::QueAdapter::JobWrapper' So, if you are using ActiveJob in a standard configuration with que, the job_class won't be of any use. However, that's not the use case I think is most relevant. I might have 20 different mailer classes with esoteric class names, but I just want to see them as a group without having to issue a LIKE query of some kind to group them. I still want to use named queues for monitoring job abcklog, and I want workers that can work any job. I don't think this is unreasonable, in fact, I think this is the common case amongst any application using background jobs. |
Well, one more reason to not use ActiveJob. Anyway, there's still not a simple and performant way (that I'm aware of) to support a single worker pool working many named queues. And besides, if you want to nicely categorize your backlog of jobs by their class, you can easily do that with a CASE statement in SQL or some very simple logic in Ruby. |
Any update on this? Sounds like this means Que + ActiveJob isn't a recommended combination? |
I submitted a small PR to ActiveJob to have it stop using named queues (rails/rails#19498). It got a positive response but hasn't been merged yet. In the meantime, no, I wouldn't recommend using ActiveJob. Sorry 😦 |
Thanks for the quick response. I'll keep an eye out for updates. |
Looks like your Rails PR was merged! Nice. |
👍 On Mon, May 4, 2015 at 7:24 AM, John Nagro notifications@github.com wrote:
|
any ideas on how I might monkey patch this change into my existing Rails 4.2 project? I've tried overriding the class in an initializer without luck. |
for the moment, i am just running everything out of the "mailers" queue 🙈 |
Hmm, I don't know offhand why that would be. But then, I'm not using Rails 4.2 for anything and I don't know anything about ActiveJob beyond what was necessary to submit the PR. If you figure it out please let us know. Or in the meantime maybe use edge Rails? |
Question on the discussion here, though not the bug in particular: I don't use named queues for priorities. I use them because I have some really long-running jobs (I'm building a text-analysis system, and jobs can take on the scale of hours or days), and some short-running jobs (like sending e-mails and performing app maintenance tasks). I have to make sure that it won't ever be the case that all the workers totally bind up serving long-running jobs and nobody's around to service short-term jobs. Will there be any way for me to put together a system like that when the named queues are gone? Just thought you might like to see a different use case. |
For those of you looking for a Rails 4.2 monkey patch, you can use the following: module ActiveJob
module QueueAdapters
class QueAdapter
def self.enqueue(job)
JobWrapper.enqueue job.serialize
end
def self.enqueue_at(job, timestamp)
JobWrapper.enqueue job.serialize, run_at: Time.at(timestamp)
end
end
end
end I put this in config/initializers/que.rb, which is also where I set Que.mode, configure the log_formatter, and set |
In addition to the case described by @cpence, I have another valid (I believe) case for full named queue support as per other queuing systems. That is, sometimes different jobs have different resource requirements (high memory or CPU). I have different "classes" of worker running to handle different jobs right now (in a different job system than Que). So in that case, I'd ideally to have the smaller classes subscribe to all queues except the big queues but have the big queues subscribe to certain queues. Or, have certain queues weighted higher so that it can still subscribe to the smaller queues if it is otherwise inactive, though that's less important to me. |
Definitely agree there are times when multiple queue support would be useful. We have 12 workers, 6 of which do very long running We would like to set up Que to say:
I do see you point about having different CPU/memory requirements etc however (or even different credentials). In our case we can get away with them all being the same. |
Yeah, it's supported there. It still requires multiple polling queries, so it's less efficient than using a single queue and separating via numeric priority (for the polling case, for the listening case it shouldn't make a difference). Also, when polling, the queues execute in the order they're declared, so running @hlascelles' case of having some workers always set aside capacity for |
Great. Thanks for the update @chanks. Totally get the priority usage in your example. As mentioned in #186 my use cases are much more about separating work by semantics, not priority, such as scaling up one group of jobs without risking over-scaling another type which could overwhelm upstream services, or having some jobs run on higher-specced worker instances, which can also pull jobs from main transactional queue if otherwise idle. |
No reason to leave this open! |
I've been playing with Que and the ActiveJob wrapper in Rails 4.2, looks really useful so thanks to everyone who contributed so far.
It took me a while to realise that an apps Que workers will only listen to one queue at a time, defaulting to ''. Once I'd figured out I could set
QUE_QUEUE
environment variable I made some progress. However I ran into trouble again when using the newdeliver_later
functionality in ActionMailer. The job it creates puts mails on themailers
queue. I've worked around this in my test app by putting all my jobs on themailers
queue but it's not a great solution.I can think of two solutions to this.
The simpler one would be to let
QUE_QUEUE
potentially be a list of queues separated by commas and tweak the:lock_job
sql to useWHERE queue in ...
A more complicated (but more flexible) solution would be to somehow let the app override set_up_workers and create it's own workers listening to whatever queues were required. This may already be possible by setting
worker_count
to 0 and doingQue::Worker.workers.push Que::Worker.new('foo')
as required. I'll test this out after I've finished writing this ticket.Would you object to a PR allowing something like the following?
The text was updated successfully, but these errors were encountered: