muir/Proc-JobQueue

NAME
     Proc::JobQueue - job queue with dependencies, base class

SYNOPSIS
     use Proc::JobQueue;

     $queue = Proc::JobQueue->new(%parameters);

     $queue->addhost($host, %parameters);

     $queue->add($job);
     $queue->add($job, $host);

     $queue->startmore();

     $queue->hold($new_value);

     $queue->checkjobs();

     $queue->jobdone($job, $do_startmore, @exit_code);

     $queue->alldone();

     $queue->status();

     $queue->startjob($host, $jobnum, $job);

DESCRIPTION
    Generic queue of "jobs". Most likely to be subclassed for different
    situations. Jobs are registered. Hosts are registered. Jobs may or may
    not be tied to particular hosts. Jobs are started on hosts. Jobs may or
    may not have dependencies on each other.

    Proc::JobQueue does not start jobs on its own: something must call
    "startmore()" every now and then. Two subclasses provide this and so
    complete Proc::JobQueue: Proc::JobQueue::EventQueue, which provides an
    event-based framework using IO::Event, and
    Proc::JobQueue::BackgroundQueue, which provides a simple
    loop-until-all-the-jobs-are-done construct.

    From the job's point of view, it will be started with:

      $job->jobnum($jobnum);
      $jobnum = $job->jobnum();
      $job->queue($queue);
      $job->host($host);
      $job->start();

    When jobs complete, they must call:

      $queue->jobdone($job, $do_startmore, @exit_code);

    Jobs are run on hosts which must be added with:

      $queue->addhost($hostname, jobs_per_host => $number_to_run_on_this_host_at_one_time)
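
    The start-up and completion calls described above can be sketched with
    toy classes. ToyJob, ToyQueue and run_one() are hypothetical names
    invented for this illustration; they are not part of Proc::JobQueue,
    and a real job would subclass Proc::JobQueue::Job instead. The sketch
    only shows the order of the calls and the arguments that "jobdone"
    receives.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a job class (not part of Proc::JobQueue).
package ToyJob;

sub new {
    my ($class, %args) = @_;
    return bless { %args }, $class;
}

# Combined getter/setters matching the calls the queue makes.
sub jobnum { my $self = shift; $self->{jobnum} = shift if @_; return $self->{jobnum} }
sub queue  { my $self = shift; $self->{queue}  = shift if @_; return $self->{queue} }
sub host   { my $self = shift; $self->{host}   = shift if @_; return $self->{host} }

# start() does the work, then reports completion back to the queue.
sub start {
    my ($self) = @_;
    my $exit_code = $self->{work}->() ? 0 : 1;    # a true exit code means error
    $self->queue->jobdone($self, 1, $exit_code);
    return;
}

# Hypothetical stub queue that only records what jobdone() receives.
package ToyQueue;

sub new { return bless { done => [], next_jobnum => 1000 }, shift }

# Start one job using the call sequence listed in the text.
sub run_one {
    my ($self, $job, $host) = @_;
    $job->jobnum($self->{next_jobnum}++);
    my $jobnum = $job->jobnum();
    $job->queue($self);
    $job->host($host);
    $job->start();
    return $jobnum;
}

sub jobdone {
    my ($self, $job, $do_startmore, @exit_code) = @_;
    push @{ $self->{done} },
        { jobnum => $job->jobnum, host => $job->host, exit_code => [@exit_code] };
    return;
}

package main;

my $queue  = ToyQueue->new;
my $job    = ToyJob->new(work => sub { 1 });    # work succeeds
my $jobnum = $queue->run_one($job, 'localhost');

printf "job %d on %s finished with exit code %d\n",
    $jobnum, $job->host, $queue->{done}[0]{exit_code}[0];
```

    In the real module the queue decides when to make these calls; the toy
    run_one() makes them immediately so that the sequence is easy to follow.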

    Jobs can be shell commands (Proc::JobQueue::Command), a sequence of
    other jobs (Proc::JobQueue::Sequence), some standard file operations
    (Proc::JobQueue::Move, Proc::JobQueue::Sort), custom subclasses of the
    base job class (Proc::JobQueue::Job), arbitrary Perl code
    (Proc::JobQueue::DependencyJob, Proc::JobQueue::Task), or arbitrary Perl
    code pushed to a remote system to run
    (Proc::JobQueue::RemoteDependencyJob).

CONSTRUCTION
    The parameters for "new" are:

    jobs_per_host (default: 4)
        Default number of jobs to run on each host simultaneously. This can
        be overridden on a per-host basis.

    host_overload (default: 120)
        If any one host has more than this many jobs waiting for it, no
        can-run-on-any-host jobs will be started. This is to prevent the
        queue for this one overloaded host from getting too large.

    jobnum (default: 1000)
        This is the starting job number. Job numbers are sometimes
        displayed. They increment for each new job.

    hold_all (default: 0)
        If true, prevent any jobs from starting until "$queue->hold(0)" is
        called.

    dependency_graph (default: undef)
        A dependency graph to track jobs and tasks that have dependencies
        and are not yet ready to run because of their dependencies.
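
    The "host_overload" and "hold_all" rules above can be restated as a
    small predicate. This is a toy restatement of the documented behaviour,
    not the module's own code: may_start_unbound_job() and the %waiting
    hash (hostname to number of waiting jobs) are hypothetical names used
    only for this sketch.

```perl
use strict;
use warnings;

# Defaults as documented for new().
my %defaults = (
    jobs_per_host => 4,
    host_overload => 120,
    jobnum        => 1000,
    hold_all      => 0,
);

# Hypothetical helper: may a can-run-on-any-host job be started?
# $waiting maps hostname => number of jobs waiting for that host.
sub may_start_unbound_job {
    my ($params, $waiting) = @_;
    return 0 if $params->{hold_all};    # nothing starts while held
    for my $count (values %$waiting) {
        # "more than this many" waiting on any one host blocks unbound jobs
        return 0 if $count > $params->{host_overload};
    }
    return 1;
}

# Override one default, the same way parameters are passed to new().
my %params = (%defaults, host_overload => 2);

print may_start_unbound_job(\%params, { alpha => 1, beta => 2 }) ? "start\n" : "wait\n";
print may_start_unbound_job(\%params, { alpha => 1, beta => 3 }) ? "start\n" : "wait\n";
```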

METHODS
    configure(%params)
        Adjusts the same parameters that can be set with "new".

    addhost($hostname, %params)
        Register a new host. Parameters are:

        jobs_per_host
            The number of jobs that can be run at once on this host. This
            defaults to the "jobs_per_host" parameter of the $queue.

    add($job, $host)
        Add a job object to the runnable queue. The job object must be a
        Proc::JobQueue::Job or subclass of Proc::JobQueue::Job. The $host
        parameter is optional: if not set, the job can be run on any host.

        The $job object is started with:

          $job->jobnum($jobnum);
          $jobnum = $job->jobnum();
          $job->queue($queue);
          $job->host($host);
          $job->start();

        When the job completes, it must call:

          $queue->jobdone($job, $do_startmore, @exit_code);

        Jobs added this way must be ready to run with no dependencies on
        other jobs. Jobs and tasks that have dependencies should be added
        with:

          $queue->graph->add($job);

    graph([Object::Dependency->new()])
        Get or set the dependency graph used to track jobs and tasks that
        have dependencies. The dependency graph is an Object::Dependency
        object (or at least something that implements the same API). Items
        in the dependency graph are not in the runnable queue. They will be
        moved to the runnable queue when they do not have any un-met
        dependencies.

    jobdone($job, $do_startmore, @exit_code)
        When jobs complete, they must call jobdone. If $do_startmore is
        true, then "startmore()" will be called. A true exit code signals an
        error and it is used by Proc::JobQueue::CommandQueue.

    job_part_finished($job)
        This marks the $job as complete and a new job can start in its
        place. For Proc::JobQueue::DependencyJob jobs, this leaves the
        dependency in place.

    alldone
        This checks the job queue. It returns true if all jobs have
        completed and the queue is empty.

    status
        This prints a queue status to STDERR showing what's running on which
        hosts. Printing is suppressed unless
        $Proc::JobQueue::status_frequency seconds have passed since the last
        call to "status()".

    startmore
        This will start more jobs if possible. The return value is true if
        there are no more jobs to start.

    hold($new_value)
        Get (or set if $new_value is defined) the queue's hold-all-jobs
        parameter. If hold-all-jobs is true, no jobs will be started or
        pulled out of the dependency graph (if there is one).
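
    The way items leave the dependency graph once they have no un-met
    dependencies, as described for "graph" above, can be illustrated with a
    plain hash. This is a toy model only: %waits_for, release() and the
    item names are hypothetical, and the code does not use or imitate the
    Object::Dependency API.

```perl
use strict;
use warnings;

# Toy dependency table: item => { names of items it still waits for }.
my %waits_for = (
    sort   => { fetch => 1 },
    report => { fetch => 1, sort => 1 },
);

my @runnable = ('fetch');    # items with no dependencies at all
my @order;                   # order in which items were run

# Mark $done as finished; return the items whose last dependency it was.
sub release {
    my ($done) = @_;
    my @ready;
    for my $item (sort keys %waits_for) {
        delete $waits_for{$item}{$done};
        next if %{ $waits_for{$item} };    # still has un-met dependencies
        push @ready, $item;
        delete $waits_for{$item};          # it has left the graph
    }
    return @ready;
}

# Stand-in for repeated startmore() calls: run whatever is runnable.
while (@runnable) {
    my $item = shift @runnable;
    push @order, $item;
    push @runnable, release($item);
}

print join(' -> ', @order), "\n";
```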

INTERNAL METHODS
    These methods may be needed by subclassers or anyone poking around the
    internals:

    checkjobs
        Check Proc::Background style jobs to see if any have finished.

    startjob($host, $jobnum, $job)
        This starts a single job. It is used by startmore() and probably
        should not be used otherwise.

    suicide
        Called to shut down. Used by Proc::JobQueue::EventQueue.

CANONICAL HOSTNAMES
    Proc::JobQueue needs canonical hostnames. By default it gets them from
    Proc::JobQueue::CanonicalHostnames. You can override this default by
    setting $Proc::JobQueue::host_canonicalizer to the name of a Perl
    module to use instead of Proc::JobQueue::CanonicalHostnames.

    Helper functions are provided by Proc::JobQueue and are available via
    explicit import:

     use Proc::JobQueue qw(my_hostname canonicalize is_remote_host);

SEE ALSO
    Proc::JobQueue::Job Proc::JobQueue::Command
    Proc::JobQueue::DependencyJob Proc::JobQueue::RemoteDependencyJob
    Proc::JobQueue::EventQueue Proc::JobQueue::BackgroundQueue

LICENSE
    Copyright (C) 2007-2008 SearchMe, Inc. Copyright (C) 2008-2010 David
    Sharnoff. Copyright (C) 2011 Google, Inc. This package may be used and
    redistributed under the terms of either the Artistic 2.0 or LGPL 2.1
    license.
