Warren Kretzschmar edited this page Jun 3, 2015 · 1 revision

NAME

DM - Distributed Make: A Perl module for running pipelines

VERSION

version 0.018

SYNOPSIS

use DM 0.002001;

# create a DM object

my $dm = DM->new( dryRun => 0, numJobs => 5 );

# add rule with target, prerequisite, and command to use to update the target

$dm->addRule( 'targetFile', 'prerequisiteFile', 'touch targetFile' );

# add more rules ...

# execute the pipeline

$dm->execute();

DESCRIPTION

DM is a Perl module for running pipelines. DM is based on GNU make. Currently, DM supports running on a single computer or an SGE-managed cluster.

GOOD PRACTICE

  • Never make a directory a dependency. DM creates directories as it needs them.
  • Never create rules that delete files. Delete files by hand instead. Chances are you will be sorry otherwise.
  • DM runs make in dryRun mode by default (this is for your own safety!). Pass 'dryRun => 0' to new() to actually run the commands.

OPTIONS

Any of the following options can be passed to new() to change the defaults for how make is run by DM. The default value is listed after the option name.

GNU make specific options

  • dryRun 1

    Show what would be run, but don't actually run anything. Corresponds to the -n option in GNU make.

  • numJobs 1

    Maximum number of jobs to run concurrently, or "" for the maximum concurrency permitted by dependencies. Applies whether or not jobs are submitted to a queue. Corresponds to the -j option in GNU make.

  • keepGoing 0

    If any job returns a non-zero exit status, the default behaviour is to submit no further jobs and wait for the running jobs to finish. If this option is true, jobs that do not depend on the failed job(s) are still submitted. Corresponds to the -k option in GNU make.

  • alwaysMake 0

    Unconditionally make all targets. Corresponds to -B option in GNU make.

  • touch 0

    If true, touch all targets such that make will think that all files have been made successfully. This is only partially supported, as touch will not create any prerequisite directories. Corresponds to -t option in GNU make.

  • ignoreErrors 0

    Ignore errors returned by commands and carry on. Corresponds to the -i option in GNU make.

  • outputFile [distributedmake.log]

    Log file

SGE specific options

These options are passed to qsub when submitting jobs on an SGE cluster.

  • engineName undef

    Type of engine (localhost, SGE, PBS, LSF). Detected automatically by DM.

  • queue undef

    Corresponds to -q.

  • projectName undef

    Corresponds to -P.

  • PE { name => undef, range => undef }

    Anonymous hash reference. Corresponds to the -pe option.

  • name

    Corresponds to -N option.

Other options

  • globalTmpDir undef

    Directory for storing temporary files that can be accessed by every node on the cluster (usually not /tmp)

  • tmpdir /tmp

    Directory for storing temporary files.
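
As an illustration of how these options combine, the sketch below builds two DM objects: one for a local dry run and one intended for an SGE cluster. The log file, queue, project and parallel environment names are hypothetical placeholders for site-specific values, not defaults taken from DM.

```perl
use strict;
use warnings;
use DM;

# Local dry run: show what would be run, allow up to 4 concurrent jobs,
# and keep submitting jobs that do not depend on a failed job.
my $local = DM->new(
    dryRun     => 1,
    numJobs    => 4,
    keepGoing  => 1,
    outputFile => 'pipeline.log',    # hypothetical log file name
);

# Cluster-oriented object: the same make options plus options that are
# handed to qsub. All values below are placeholders.
my $cluster = DM->new(
    dryRun      => 1,
    numJobs     => 50,
    queue       => 'short.q',                          # qsub -q
    projectName => 'myproject',                        # qsub -P
    PE          => { name => 'smp', range => '4' },    # qsub -pe
);
```

Both objects keep dryRun on, so calling execute() on them would only report what would be run.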

GENERAL FUNCTIONS

new()

Returns a DM object. Options (see OPTIONS) can be passed to new() as key-value pairs.

  • Required Arguments

    none

  • Returns

    DM object

addRule()

This function creates a basic dependency between a prerequisite, target and command. The prerequisite is a file that is required to exist in order to create the target file. The command is used to create the target file from the prerequisite(s).

Alias

ar()

  • Required Arguments

    - target file

    - prerequisite file(s)

    - command

  • Returns

    none
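
A minimal sketch of chaining rules, where the target of one rule serves as the prerequisite of the next. The file names and shell commands are made up for illustration, and input.txt is assumed to exist already.

```perl
use strict;
use warnings;
use DM;

my $dm = DM->new( dryRun => 1 );

# Step 1: sort the (pre-existing) input file.
# The out/ directory is not declared anywhere; DM creates directories
# as it needs them.
$dm->addRule( 'out/sorted.txt', 'input.txt',
    'sort input.txt > out/sorted.txt' );

# Step 2: uses the target of step 1 as its prerequisite, so it is
# only updated after step 1 has completed.
$dm->addRule( 'out/unique.txt', 'out/sorted.txt',
    'uniq out/sorted.txt > out/unique.txt' );
```

Adding rules only records them. Nothing is run until execute() is called.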

execute()

This method is called after all rules have been defined. It writes the make file and executes it. It has no mandatory arguments and accepts only option overrides.

  • Required Arguments

    none

  • Returns

    Exit status. 0 means no problems.
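
One way to act on the returned exit status is sketched below. It reuses the rule from the SYNOPSIS and assumes prerequisiteFile already exists; the warning text is illustrative only.

```perl
use strict;
use warnings;
use DM;

my $dm = DM->new( dryRun => 1 );
$dm->addRule( 'targetFile', 'prerequisiteFile', 'touch targetFile' );

# execute() writes the make file and runs it.
# A return value of 0 means no problems.
my $status = $dm->execute();
if ( $status != 0 ) {
    warn "pipeline finished with exit status $status\n";
}
```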

JOB ARRAY FUNCTIONS

Workflow

First, initialize a job array with startJobArray(). Then add rules to the job array with addJobArrayRule(). Finally, call endJobArray() to signal that no more rules will be added to this particular job array. Multiple job arrays can be defined one after another in this manner. execute() can only be called once the most recently started job array has been closed with endJobArray().

On SGE, the job array will only be started once the prerequisites of all job array rules have been updated. On other platforms, each job will start once its prerequisite has been updated. However, on all platforms, the job array target will only be updated once all rules of that job array have completed successfully.

Only the target specified in startJobArray() should be used as a prerequisite for other rules. The targets specified through addJobArrayRule() should never be used as prerequisites for other rules.
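
The workflow above can be sketched as follows. The chunk file names, the job array target name and the commands are hypothetical, and the chunk*.in files are assumed to exist.

```perl
use strict;
use warnings;
use DM;

my $dm = DM->new( dryRun => 1, numJobs => 3 );

# Name of the job array target (a made-up file name).
my $arrayTarget = 'chunks.done';

# 1. Start the job array and name its overall target.
$dm->startJobArray( target => $arrayTarget );

# 2. Add one rule per chunk. These per-chunk targets are internal to
#    the job array and must not be used as prerequisites elsewhere.
for my $i ( 1 .. 3 ) {
    $dm->addJobArrayRule(
        target  => "chunk$i.out",
        prereqs => ["chunk$i.in"],
        command => "wc -l chunk$i.in > chunk$i.out",
    );
}

# 3. Close the job array before adding further rules or executing.
$dm->endJobArray();

# Downstream rules depend on the job array target only.
$dm->addRule( 'summary.txt', $arrayTarget,
    'cat chunk1.out chunk2.out chunk3.out > summary.txt' );
```

The downstream rule reads the chunk outputs in its command but lists only the job array target as its prerequisite, in line with the restriction above.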

startJobArray()

Does nothing unless 'cluster' eq 'SGE'. Requires 'target' to be specified as a key-value pair: startJobArray( target => $mytarget )

Add in overrides at this point. They will be applied at endJobArray().

Alias

sja()

addJobArrayRule()

This structure is designed to work with SGE's job array functionality. Rules added to a job array are treated as ordinary addRule() rules when running on localhost, LSF or PBS, but are executed as a job array on SGE.

Alias

ajar()

Required Arguments

Takes three inputs (target, prereqs and command), either as key-value pairs:

addJobArrayRule( target => $mytarget, prereqs => \@myprereqs, command => $mycommand );

or as a list:

addJobArrayRule( $mytarget, \@myprereqs, $mycommand );

prereqs may also be a scalar (string).

The target is only for internal updating by the job array. The target may not be used as a prerequisite for another rule. Use the job array target instead.

Returns

none

endJobArray()

Adds the rule that kicks off the job array. See Workflow for further description.

Alias

eja()

Required Arguments

none

Returns

The target of the job array

BUGS

Please report any bugs or feature requests to bug-dm at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=DM. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

AUTHOR

Kiran V Garimella kiran@well.ox.ac.uk and Warren W. Kretzschmar warren.kretzschmar@well.ox.ac.uk

COPYRIGHT AND LICENSE

This software is copyright (c) 2015 by Kiran V Garimella and Warren Kretzschmar.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.