Why Backburner

Merouane Atig edited this page May 31, 2016 · 47 revisions

Backburner is well tested and has a familiar, no-nonsense approach to job processing, but that is of secondary importance. Let's face it: there are a lot of options for background job processing. DelayedJob, Sidekiq and Resque come to mind as a few of the popular projects in this space. So, how do we make sense of which one to use? And why use Backburner over other popular alternatives?

The key to understanding the differences in these projects lies in researching the different projects and protocols that power these popular queue libraries under the hood. Every job queue requires a persistence store that jobs are put into and then later pulled out of. In the case of Resque, jobs are processed through Redis, a persistent key-value store. In the case of DelayedJob, jobs are processed through ActiveRecord and a database such as PostgreSQL.
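To make the "put into and then later pulled out of" idea concrete, here is a minimal sketch of a persistence store reduced to its two essential operations. MemoryStore and its put and reserve methods are hypothetical names invented for this illustration; they are not the API of Backburner, Resque or DelayedJob, and a real store would persist jobs outside the Ruby process.

```ruby
require "json"

# Hypothetical in-memory stand-in for a job persistence store.
# Real stores (Beanstalkd, Redis, a SQL database) live outside the process.
class MemoryStore
  def initialize
    # One FIFO array per named queue, created on first use.
    @queues = Hash.new { |hash, name| hash[name] = [] }
  end

  # Producer side: append a serialized job to a named queue.
  def put(queue, payload)
    @queues[queue].push(payload)
    self
  end

  # Worker side: remove and return the oldest job, or nil if none is waiting.
  def reserve(queue)
    @queues[queue].shift
  end
end

store = MemoryStore.new
store.put("emails", JSON.generate("class" => "WelcomeEmail", "args" => [42]))

job = JSON.parse(store.reserve("emails"))
puts "#{job["class"]} called with #{job["args"].inspect}"
```

Everything beyond these two operations (scheduling, priorities, retries) has to live either in the store itself or in the Ruby library built on top of it, which is the distinction the rest of this page is about.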

One measure of the power of a job persistence store is how much Ruby code is needed to produce the desired job processing features. For example, Redis does not support many queue operations out of the box (e.g. scheduling), so to build a job queue on top of it, Resque has ~3091 lines of source code, plus ~1100 in resque-scheduler if you want to schedule jobs. Similarly, ActiveRecord was not built to support job queues, and DelayedJob has ~2000 lines of source code and requires constant polling of the underlying database. In contrast, Backburner has ~1000 lines of source code. That said, lines of code is a terrible metric for learning anything useful about a project; we are better off digging deeper into the underlying persistence stores.

Evaluating Persistence Stores

The persistence mechanism underlying these gems tells you far more about the important differences than anything else. Criteria you might want to evaluate for a queue include memory efficiency, throughput performance, reliability, priority support, delayed scheduling support, worker scalability and error handling. Also notice how much of the complicated job processing logic is baked into the persistence layer and how much has to be added manually in the Ruby project code.
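As a rough illustration of what it means for priority and delayed scheduling to be "baked into the persistence layer", the sketch below models a store whose put operation accepts a priority and a delay directly. PriorityStore is a hypothetical class written for this page, not code from any of the libraries discussed. The convention that lower numbers are more urgent is borrowed from Beanstalkd, and the now: parameter exists only so the example is deterministic.

```ruby
# Hypothetical store where priority and delay are first-class parameters.
class PriorityStore
  Job = Struct.new(:payload, :priority, :ready_at, :seq)

  def initialize
    @jobs = []
    @seq = 0 # insertion counter, used to break ties in FIFO order
  end

  # Lower priority numbers are served first; delay is in seconds.
  def put(payload, priority: 100, delay: 0, now: Time.now)
    @seq += 1
    @jobs << Job.new(payload, priority, now + delay, @seq)
    self
  end

  # Return the most urgent job whose delay has elapsed, or nil if none is ready.
  def reserve(now: Time.now)
    ready = @jobs.select { |job| job.ready_at <= now }
    job = ready.min_by { |candidate| [candidate.priority, candidate.seq] }
    return nil unless job

    @jobs.delete(job)
    job.payload
  end
end

start = Time.at(0)
store = PriorityStore.new
store.put("low", priority: 200, now: start)
store.put("urgent", priority: 1, now: start)
store.put("later", priority: 0, delay: 60, now: start)

puts store.reserve(now: start).inspect      # "later" is still delayed here
puts store.reserve(now: start + 60).inspect # the delay has elapsed by now
```

When the store offers this natively, the Ruby library on top only has to pass the parameters through. When it does not, the equivalent bookkeeping ends up in the library or in a plugin.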

Beanstalk is probably the best solution for job queues available today: on most of the important criteria it excels over other job persistence solutions. The real question then is... "Why Beanstalk?".

Quick Comparison

Here's a quick chart for comparison:

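| Project | Persistence | Processing | Throughput | LoC | Scheduling | Named Queues | Retry | Priority |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Backburner | Beanstalkd | Any | 5200/sec | ~1070 | Y | Y | Y | Y |
| Resque | Redis | Forking | 300/sec | ~3100 | Y* | Y | Y* | Y* |
| Sidekiq | Redis | Multi-Thread | 800/sec | ~2850 | Y | Y | Y | Y |
| DelayedJob | Database | Single-Thread | 120/sec | ~2000 | Y | Y | Y | Y |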

(*) Requires separate plugin code or incomplete

Lines of code were determined very naively, using the command git ls-files | grep lib | xargs cat | wc -l to count the lines of code in the lib folder.

Job throughput based on Adam's Queue Benchmark. Don't take the numbers too seriously. The benchmark is outdated and not rigorous. Let me know if any other criteria should be added / updated or other libraries added.

"Processing" compares the strategy the worker uses to process jobs out of the box. Resque uses a 'forking' worker and Sidekiq uses a 'threaded' one, but Backburner supports either, as well as a "best of both worlds" threads on forks approach.
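The three strategies can be sketched in plain Ruby to show how they differ. This is an illustration only, not the worker code of Backburner, Resque or Sidekiq: the method names process_threaded, process_forking and process_threads_on_forks, and the helpers spawn_child and collect_child, are all made up for this example. It assumes a platform where Process.fork is available (not Windows or JRuby).

```ruby
# Threaded: several threads in one process drain a shared in-memory queue.
def process_threaded(jobs, threads: 2, &handler)
  pending = Queue.new
  jobs.each { |job| pending << job }
  results = Queue.new
  workers = Array.new(threads) do
    Thread.new do
      loop do
        job = begin
          pending.pop(true) # non-blocking pop raises ThreadError when empty
        rescue ThreadError
          break
        end
        results << handler.call(job)
      end
    end
  end
  workers.each(&:join)
  Array.new(results.size) { results.pop }
end

# Run the block in a forked child and return [pid, reader] for later collection.
def spawn_child
  reader, writer = IO.pipe
  reader.binmode
  writer.binmode
  pid = Process.fork do
    reader.close
    writer.write(Marshal.dump(yield))
    writer.close
    exit!(0) # skip at_exit hooks inherited from the parent
  end
  writer.close
  [pid, reader]
end

# Read the child's marshalled result from the pipe and reap the process.
def collect_child(pid, reader)
  result = Marshal.load(reader.read)
  reader.close
  Process.wait(pid)
  result
end

# Forking: each job runs in its own short-lived child process, one at a time.
def process_forking(jobs, &handler)
  jobs.map do |job|
    pid, reader = spawn_child { handler.call(job) }
    collect_child(pid, reader)
  end
end

# Threads on forks: several child processes, each running a pool of threads.
def process_threads_on_forks(jobs, forks: 2, threads: 2, &handler)
  return [] if jobs.empty?

  slice_size = (jobs.size.to_f / forks).ceil
  children = jobs.each_slice(slice_size).map do |slice|
    spawn_child { process_threaded(slice, threads: threads, &handler) }
  end
  children.flat_map { |pid, reader| collect_child(pid, reader) }
end

double = ->(n) { n * 2 }
p process_threaded([1, 2, 3, 4], &double).sort
p process_forking([1, 2, 3], &double)
p process_threads_on_forks((1..6).to_a, &double).sort
```

The trade-off is the usual one: forking isolates each job's memory and crashes at the cost of process start-up, threads share memory and are cheap to start, and threads on forks combines the two by giving each forked process its own small thread pool.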