Delated_job (or DJ) encapsulates the common pattern of asynchronously executing longer tasks in the background.
It is a direct extraction from Shopify where the job table is responsible for a multitude of core tasks. Amongst those tasks are:
- sending massive newsletters
- image resizing
- http downloads
- updating smart collections
- updating solr, our search server, after product changes
- batch imports
- spam checks
The library evolves around a delayed_jobs table which looks as follows:create_table :delayed_jobs, :force => true do |table| table.integer :priority, :default => 0 # Allows some jobs to jump to the front of the queue table.integer :attempts, :default => 0 # Provides for retries, but still fail eventually. table.text :handler # YAML-encoded string of the object that will do work table.string :last_error # reason for last failure (See Note below) table.datetime :run_at # When to run. Could be Time.now for immediately, or sometime in the future. table.datetime :locked_at # Set when a client is working on this object table.datetime :failed_at # Set when all retries have failed (actually, by default, the record is deleted instead) table.string :locked_by # Who is working on this object (if locked) table.datetime :first_started_at # When first worker picked it up table.datetime :last_started_at # When last worker picked it up (same as first_started_at when no retries) table.datetime :finished_at # Used for statiscics / monitoring table.timestamps end
On failure, the job is scheduled again in 5 seconds + N ** 4, where N is the number of retries.
The default MAX_ATTEMPTS is 25. After this, the job is either deleted (default), or left in the database with “failed_at” set.
With the default of 25 attempts, the last retry will be 20 days later, with the last interval being almost 100 hours.
The default MAX_RUN_TIME is 4.hours. If your job takes longer than that, another computer could pick it up. It’s up to you to
make sure your job doesn’t exceed this time. You should set this to the longest time you think the job could take.
By default, it will delete failed jobs. If you want to keep failed jobs, set
Delayed::Job.destroy_failed_jobs = false. The failed jobs will be marked with non-null failed_at.
Same thing for successful jobs. They’re deleted by default and, to keep them, set
Delayed::Job.destroy_successful_jobs = false. They will be marked with finished_at. This is useful for gathering statistics like how long a job took:
- to be picked up by the first worker: first_started_at – created_at
- in one successful run (the last one that didn’t fail): finished_at – last_started_at
- in failed retries: last_started_at – first_started_at
Here is an example of changing job parameters in Rails:
# config/initializers/delayed_job_config.rb Delayed::Job.destroy_failed_jobs = false Delayed::Job.destroy_successful_jobs = false silence_warnings do Delayed::Job.const_set("MAX_ATTEMPTS", 3) Delayed::Job.const_set("MAX_RUN_TIME", 5.minutes) end
Note: If your error messages are long, consider changing last_error field to a :text instead of a :string (255 character limit).
Jobs are simple ruby objects with a method called perform. Any object which responds to perform can be stuffed into the jobs table.
Job objects are serialized to yaml so that they can later be resurrected by the job runner.
There is also a second way to get jobs in the queue: send_later.BatchImporter.new(Shop.find(1)).send_later(:import_massive_csv, massive_csv)
This will simply create a Delayed::PerformableMethod job in the jobs table which serializes all the parameters you pass to it. There are some special smarts for active record objects
which are stored as their text representation and loaded from the database fresh when the job is actually run later.
Running the jobs
script/generate delayed_job to add
script/delayed_job. This script can then be used to manage a process which will start working off jobs.
- Runs two workers in separate processes.
$ ruby script/delayed_job -e production -n 2 start
$ ruby script/delayed_job -e production stop
You can invoke
rake jobs:work which will start working off jobs. You can cancel the rake task with
Workers can be running on any computer, as long as they have access to the database and their clock is in sync. You can even
run multiple workers on per computer, but you must give each one a unique name. (TODO: put in an example)
Keep in mind that each worker will check the database at least every 5 seconds.
Note: The rake task will exit if the database has any network connectivity problems.
You can invoke
rake jobs:clear to delete all jobs in the queue. There are also the self-explanatory
rake jobs:clear:finished and
- 1.8.1 Restructured the initial migration template, added fields for tracking jobs in queue, and defaulted the queue over to not purge completed jobs. Much of this code came from Helder’s branch.
- 1.8.0: Version number was bumped by collectiveidea’s fork. TODO Document specifically what’s different.
- 1.7.0: Added failed_at column which can optionally be set after a certain amount of failed job attempts. By default failed job attempts are destroyed after about a month.
- 1.6.0: Renamed locked_until to locked_at. We now store when we start a given job instead of how long it will be locked by the worker. This allows us to get a reading on how long a job took to execute.
- 1.5.0: Job runners can now be run in parallel. Two new database columns are needed: locked_until and locked_by. This allows us to use pessimistic locking instead of relying on row level locks. This enables us to run as many worker processes as we need to speed up queue processing.
- 1.2.0: Added #send_later to Object for simpler job creation
- 1.0.0: Initial release
Why Is This Fork Different?
Patrick composed this of collectiveidea’s fork and helder’s fork because collective idea was more feature complete and helder had additional features he requires.