Defining queued jobs

Marcus edited this page Oct 4, 2016 · 6 revisions

Examples

The best way to learn how to define your own jobs is to look at the examples:

  • PublishItemsJob - A job used to publish all the children of a particular node. To create this job, run the PublishItemsTask, passing in the parent as a request variable (e.g. ?parent=1).
  • GenerateGoogleSitemapJob - A job used to create a Google sitemap. If the googlesitemaps module is installed, it includes the priority settings defined there; otherwise it just produces a generic structure. To create an initial instance of this job, call dev/tasks/CreateDummyJob?name=GenerateGoogleSitemapJob. This creates the initial job and queues it; after the job has run once, it automatically schedules itself to run again 24 hours later.
  • CreateDummyJob - A very simple skeleton job.

API Overview

The module comes with an AbstractQueuedJob class that provides much of the boilerplate functionality a job needs to execute within the framework. An example job can be found in queuedjobs/code/jobs/PublishItemsJob.php.

The key points to be aware of are:

  • When defining the constructor for your job, be aware that the QueuedJobService will, when loading the job for execution, create an object of your job type without passing any parameters. Therefore, if you want to pass parameters when initially creating the job, make sure to provide defaults (e.g. __construct($param = null)) and check that each parameter is actually present before using it. The basic rules for constructors are:
    • they must have default values for all parameters, as the QueuedJobService will re-create the job class without passing any constructor parameters
    • they must contain logic that can determine whether the job has been constructed by the job service or by user-land code, and therefore whether the constructor parameters are to be used.

The kind of logic to use in your constructor could be something like:


public function __construct($to = null) {
    if ($to) {
        // we know that we've been called by user code, so
        // do the real initialisation work
    } 
}

Alternatively, you can set properties on the job directly after constructing it from your own code.
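The following is a minimal, self-contained sketch of these constructor rules, not the module's actual code. PersistingJobStub is a hypothetical stand-in for AbstractQueuedJob that only stores assigned properties in a jobData array, and SendReminderJob is an invented job. The final lines imitate, in a much simplified form, a job being re-created without arguments and only afterwards given its saved data.

```php
<?php
// Hypothetical stand-in for AbstractQueuedJob: it keeps only the
// "assigned properties end up in jobData" behaviour. The real class does more.
abstract class PersistingJobStub {
    protected $jobData = array();

    public function __set($name, $value) {
        $this->jobData[$name] = $value;
    }

    public function __get($name) {
        return isset($this->jobData[$name]) ? $this->jobData[$name] : null;
    }

    // Illustrative accessors only; not part of the module's documented API.
    public function getJobData() {
        return $this->jobData;
    }

    public function setJobData(array $data) {
        $this->jobData = $data;
    }
}

// Invented example job.
class SendReminderJob extends PersistingJobStub {
    public function __construct($to = null) {
        if ($to) {
            // Called by user code with a real argument: initialise.
            $this->to = $to;
        }
        // Otherwise: re-created with no arguments, so do nothing here.
    }
}

// User-land code, option 1: pass the value to the constructor.
$job = new SendReminderJob('someone@example.com');

// User-land code, option 2: construct empty, then set the property.
$other = new SendReminderJob();
$other->to = 'someone.else@example.com';

// Simplified imitation of a later reload: construct without
// arguments first, restore the saved data afterwards.
$saved = serialize($job->getJobData());
$restored = new SendReminderJob();
$beforeRestore = $restored->to; // still null at construction time
$restored->setJobData(unserialize($saved));

echo $restored->to, "\n";
```

Both user-land options leave the value in jobData, which is why the constructor has to tolerate being called with no arguments at all.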

  • Job Properties: Jobs that extend AbstractQueuedJob have a default mechanism for persisting values. The magic __set and __get methods store items in the jobData map, which is serialize()d between executions of the job. All you need to do within your job is assign $this->myProperty = 'somevalue'; and the value is persisted for you automatically. However, when the job object is re-created later (i.e. in __construct()), these properties have not yet been loaded, so you cannot rely on them at that point.
  • Special Properties: The queuedjobs framework expects the following properties to be set by a job so that jobs execute smoothly and can be paused, stopped and restarted. You MUST define these for the job to hook into the framework properly.
    • totalSteps - the total number of steps in the job
    • currentStep - the current step in the job
    • isComplete - whether the job is complete yet. This MUST be set to true for the job to be removed from the queue
    • messages - an array of messages about the job that can be displayed to the user via the CMS
  • Titles: Make sure to return a title via getTitle() so that users can be shown what is running in the CMS admin.
  • Job Signatures: When a job is added to the job queue, it is assigned a unique key based on the parameters of the job (see AbstractQueuedJob->getSignature()). If a job with the same signature is already in the queue, the new job is NOT queued. This prevents duplicate jobs being added to a queue, but in some cases queuing duplicates may be exactly what you intend. If so, override the getSignature() method in your custom job class and make sure it returns a unique signature for each instantiation of the job.
  • Job Type: You can use either QueuedJob::QUEUED, which means the job will run within a minute (due to the cronjob), or QueuedJob::IMMEDIATE, which executes the job as soon as possible. The latter forces execution of the job at the end of the currently processing request or, if you have set QueuedJobService::$use_shutdown_function = false, relies on a monitoring job to trigger execution of the job queue (see the lsyncd config section). This job type is useful for small pieces of work (such as deleting a few items at a time or indexing content in a separate search indexer).
  • queueJob(): To actually add a job to a queue, call QueuedJobService->queueJob(Job $job, $startAfter=null). This adds the job to whichever queue is relevant. The optional $startAfter (a date in Y-m-d H:i:s format) delays execution until after that datetime.
  • Switching Users: Jobs can be specified to run as a particular user. By default this is the user that created the job; however, it can be changed by setting a user via the RunAs relationship of the job.
  • Job Execution: The following sequence occurs at job execution:
    • The cronjob looks for jobs that need execution.
    • The job is passed to QueuedJobService->runJob()
    • The user to run as is set into the session
    • The job is initialised. This calls QueuedJob->setup(). Generally, the setup() method should be used to initialise the job, in particular to work out how many total steps will be required (if it is actually possible to determine this). Typically, setup() generates a list of IDs of data objects that need processing, and each call to process() then handles just one of these objects. This approach makes pausing and resuming jobs later a lot easier. It is very important to be aware that setup() is called every time a job is 'started' by a cron execution, meaning that whenever a job is paused and restarted, this code is executed again. Your job class MUST handle this in its setup() method. In some cases it makes no difference, because a restarted job should re-perform everything; in others, the job might only need to process the remainder of what is left.
    • The QueuedJobService enters a loop that executes until the job indicates it is finished (the QueuedJob->jobFinished() method returns true), the job is in some way broken, or a user has paused the job via the CMS. This loop repeatedly calls QueuedJob->process(). Each time it is called, the job should execute the equivalent of one step in the overall process, update its currentStep value, and finally set isComplete once it has actually finished. After each return of process(), the job state is saved so that broken or paused jobs can be picked up again later.
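The setup()/process() cycle and the special properties above can be sketched as follows. This is a self-contained illustration under stated assumptions, not the module's implementation: SteppedJobStub is a hypothetical stand-in for AbstractQueuedJob (property persistence and jobFinished() only), TouchRecordsJob and its allIds/remaining properties are invented, and the loop at the end merely imitates what the QueuedJobService is described as doing.

```php
<?php
// Hypothetical stand-in for AbstractQueuedJob; the real class does more.
abstract class SteppedJobStub {
    protected $jobData = array();

    public function __set($name, $value) {
        $this->jobData[$name] = $value;
    }

    public function __get($name) {
        return isset($this->jobData[$name]) ? $this->jobData[$name] : null;
    }

    public function jobFinished() {
        return (bool) $this->isComplete;
    }

    abstract public function getTitle();
    abstract public function setup();
    abstract public function process();
}

// Invented job that handles one ID per process() call.
class TouchRecordsJob extends SteppedJobStub {
    public function __construct($ids = null) {
        if ($ids !== null) {
            $this->allIds = $ids;
        }
    }

    public function getTitle() {
        return 'Touch records';
    }

    public function setup() {
        // setup() runs on every (re)start, so only build the
        // work list when there is no saved progress yet.
        if ($this->remaining === null) {
            $this->remaining = $this->allIds;
            $this->totalSteps = count($this->allIds);
            $this->currentStep = 0;
            $this->messages = array();
        }
    }

    public function process() {
        // Copy to locals: values read through __get are copies.
        $remaining = $this->remaining;
        if (count($remaining) === 0) {
            $this->isComplete = true;
            return;
        }
        $id = array_shift($remaining);
        // ... the real work on record $id would happen here ...
        $messages = $this->messages;
        $messages[] = "Processed #$id";

        $this->remaining = $remaining;
        $this->messages = $messages;
        $this->currentStep = $this->currentStep + 1;
        if (count($remaining) === 0) {
            $this->isComplete = true;
        }
    }
}

// Imitation of two cron runs: one step, a "pause", then a restart.
$job = new TouchRecordsJob(array(3, 5, 8));
$job->setup();
$job->process();
$job->setup(); // restart: must not throw away progress
while (!$job->jobFinished()) {
    $job->process();
}

echo $job->currentStep, '/', $job->totalSteps, "\n"; // prints "3/3"
```

Keeping the list of outstanding IDs in a persisted property is what lets the second setup() call be a no-op; a job that should start over on every restart would rebuild the list unconditionally instead.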

Terminology

The following are some key parts of the system that you should be familiar with:

AbstractQueuedJob

A base class on which to build your own queued jobs. You don't need to use it, but it takes care of a lot of the boilerplate for you.

QueuedJobService

A service for registering instances of queued jobs.

QueuedJobProcessorTask

The task you run to have queued jobs processed. This must be set up to run via cron.

QueuedJobDescriptor

A QueuedJobDescriptor is the stored representation of a piece of work that could take a while to execute, which is why it is desirable not to have it executing in parallel with other jobs.

A queued job should always attempt to report how many potential dataobjects will be affected by being executed; this will determine which queue it is placed within so that some shorter jobs can execute immediately without needing to wait for a potentially long running job.

Note that in future this may be adapted to work with the messagequeue module to provide a more distributed approach to solving a very similar problem. The messagequeue module is a lot more generalised than this approach, and while this module solves a specific problem, it may in fact be better off working in with the messagequeue module.
