Driver for running pre-execute-post operations for a Cartesian product of parameters

tahandy/chauffeur

Chauffeur

Chauffeur is a simple Python solution to drive multiple executions of a program over a tensor product of parameter combinations.

Chauffeur supports parameterization at all levels of the driver process: it can autogenerate directories and input files, dynamically select executables, and perform pre- and post-execution tasks.


Requirements


Usage

chauffeur.py [-h] [-i INPUT]

optional arguments:
  -h, --help            show this help message and exit
  -i INPUT, --input INPUT
                        YAML input file

Execution flow

In the most general case, chauffeur will attempt to perform the following actions:

  • For each run defined by the tensor product of variables:
    • Copy a template directory to the execution/run directory
    • Process parameterized files
    • Perform pre-execution command
    • Perform execution command
    • Perform post-execution command
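The loop described above can be sketched in a few lines of Python. This is only an illustration of the control flow, not chauffeur's actual code: the function `drive`, the `actions` mapping, and the step names `copy_template` and `process_files` are hypothetical, while `precommand`, `execcommand`, and `postcommand` borrow the driver parameter names documented below.

```python
import itertools

# Ordered steps performed for every run (names partly hypothetical, see above).
STEPS = ("copy_template", "process_files", "precommand", "execcommand", "postcommand")

def drive(variables, actions):
    """Run every available action, in order, once per parameter combination.

    variables: dict mapping a variable name to its list of values.
    actions:   dict mapping a step name to a callable taking the run's parameters.
    """
    names = sorted(variables)
    log = []
    for values in itertools.product(*(variables[n] for n in names)):
        params = dict(zip(names, values))
        for step in STEPS:
            action = actions.get(step)
            if action is not None:  # steps without an action are skipped
                action(params)
                log.append((step, params))
    return log
```

Passing only an `execcommand` action mimics the minimal echo example later in this document, where no template directory or pre/post commands are configured.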

Individual runs can be executed serially or in parallel across threads. This is independent of any parallelism in the underlying executable; your execution command may itself invoke something like MPI to further parallelize each run. For example, you may execute 3 runs concurrently, each using several MPI processes.
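As a rough illustration of task-level parallelism, the sketch below runs independent tasks on a thread pool. It is an assumption about how such a driver could be structured, not chauffeur's implementation; `run_all` and `worker` are hypothetical names, and only `nthreads` corresponds to a documented driver parameter.

```python
from concurrent.futures import ThreadPoolExecutor

def run_all(tasks, worker, nthreads=1):
    """Apply worker to every task, with at most nthreads tasks in flight.

    Results are returned in the same order as tasks, regardless of which
    task finishes first.
    """
    with ThreadPoolExecutor(max_workers=nthreads) as pool:
        return list(pool.map(worker, tasks))
```

In practice `worker` would launch the run's command (for example through `subprocess.run`), and that command is free to start its own MPI job; the thread pool only limits how many runs are active at once.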

Additionally, chauffeur may be used to set up the directory structures and generate submission files for job management systems like PBS/Torque.


Parameterization

Chauffeur operates by parameterizing most aspects of the automation process. A defined parameter is accessed by enclosing its name in '%(...)'. For example, if var1 is defined (see below), its value is inserted wherever %(var1) appears.
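The substitution rule can be illustrated with a small regular-expression sketch. This is not chauffeur's own code; the function name `substitute` and the restriction of parameter names to word characters are assumptions made for the example.

```python
import re

# Matches %(name), where name consists of letters, digits, and underscores.
PARAM = re.compile(r"%\((\w+)\)")

def substitute(text, params):
    """Replace each %(name) in text with the string form of params[name]."""
    return PARAM.sub(lambda m: str(params[m.group(1)]), text)

print(substitute("echo %(var1)", {"var1": 42}))  # prints: echo 42
```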

YAML input structure

The input file that drives chauffeur uses the YAML format and is divided into multiple (potentially optional) sections: driver, userdef, file*, and run*. The file and run directives may be specified multiple times, provided each instance is given a unique suffix.

  • driver specifies executables, threads, etc.
  • userdef specifies user-defined parameters for convenience (optional)
  • file* sections define text files which should undergo parameter replacement (optional)
  • run* sections define the parameter space
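A minimal sketch of how the suffixed sections might be picked out of a parsed input file follows. It assumes the YAML has already been loaded into a plain dictionary, and it assumes run* sections are detected the same way the File* section below describes for file* (top-level keys that contain the keyword); `split_sections` is a hypothetical helper, not part of chauffeur.

```python
def split_sections(config):
    """Return the sorted names of the file* and run* sections in config.

    config is the top-level mapping of the parsed input file.
    """
    files = sorted(key for key in config if "file" in key)
    runs = sorted(key for key in config if "run" in key)
    return files, runs

config = {"driver": {}, "userdef": {}, "file_pbs": {}, "file_input": {},
          "run_10": {}, "run_01": {}}
print(split_sections(config))
# prints: (['file_input', 'file_pbs'], ['run_01', 'run_10'])
```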

Inline formatting

Chauffeur supports inline formatting to control how an evaluated parameter is printed. The inline format follows the format specifiers used by Python 3's string formatting. For example, if var1 is an integer, it can be printed as a width-4 integer with leading zeros using %(var1:04d), where the : marks the beginning of the inline format and 04d is the format specifier.
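To make the effect of a specifier concrete, the sketch below extends simple %(name) substitution with an optional `:spec` part and hands the specifier to Python's built-in `format`. The helper name `substitute_formatted` and the regular expression are assumptions for illustration; chauffeur's own parsing may differ.

```python
import re

# Matches %(name) or %(name:spec); spec is everything up to the closing parenthesis.
PARAM = re.compile(r"%\((\w+)(?::([^)]*))?\)")

def substitute_formatted(text, params):
    """Replace %(name:spec) with format(params[name], spec)."""
    def repl(match):
        name, spec = match.group(1), match.group(2)
        return format(params[name], spec or "")
    return PARAM.sub(repl, text)

print(substitute_formatted("run_%(var1:04d)", {"var1": 7}))  # prints: run_0007
```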

Expressions

Chauffeur includes support for expressions, which are custom combinations of (primarily) numeric parameters. Expressions are enclosed in backticks (`) and are recursively fed to eval. Parameters including expressions must be enclosed in double quotes (").

num: 7
squared: "`pow(%(num),2)`"
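The evaluation step can be sketched as follows, assuming parameters such as %(num) have already been substituted so that only a plain Python expression remains between the backticks. The helper `evaluate_expressions` and the restricted namespace passed to `eval` are assumptions for this example, which handles a single level of backticks only; chauffeur's real namespace and numeric output formatting may differ.

```python
import re

# Matches a backtick-delimited expression with no nested backticks.
BACKTICKS = re.compile(r"`([^`]*)`")

# Names made available to expressions in this sketch (an assumption).
ALLOWED = {"pow": pow, "abs": abs, "min": min, "max": max}

def evaluate_expressions(text):
    """Replace each `expr` in text with the string form of its evaluated value."""
    def repl(match):
        # Builtins are emptied so only the names in ALLOWED can be called.
        return str(eval(match.group(1), {"__builtins__": {}}, ALLOWED))
    return BACKTICKS.sub(repl, text)

print(evaluate_expressions("`pow(7,2)`"))  # prints: 49
```

Because expressions go through `eval`, input files should be treated as trusted code.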

Example

A simple input file which echoes parameters is shown below. Additional examples are provided in the examples directory.

Input:

driver:
  execcommand: "echo %(num) %(times10) %(squared)"
userdef:
  times10: "`%(num)*10`"
run:
  variables:
    num: [1,2,3,10,11,37,72]
  parameters:
    squared: "`pow(%(num),2)`"

Output:

1 10 1.0
2 20 4.0
3 30 9.0
10 100 100.0
11 110 121.0
37 370 1369.0
72 720 5184.0

Driver

The driver directive is used to provide overarching parameters to chauffeur.

Modifiable parameters

precommand
Default: None
Description: Set the command to be executed prior to execcommand. Will be executed in taskdir.

execcommand
Default: None
Description: Set the command to be executed. Will be executed in taskdir.

postcommand
Default: None
Description: Set the command to be executed after execcommand. Will be executed in taskdir.

taskdir
Default: %(cwd)
Description: Set the directory a task is executed in. If not task-level parameterized, will reuse the same directory for each task (be careful if threads are used). Default is the directory chauffeur is called from.

templatedir
Default: None
Description: Set the directory used to initialize task directories.

type
Default: exec
Options: exec, setup
Description: Determines how chauffeur is executed. exec performs all operations, including running execcommand. setup only initializes the run directories and produces the PBS submission script.

skipifexist
Default: True
Options: True, False
Description: If true, skips a task if the taskdir assigned to it exists.

nthreads
Default: 1
Description: Sets the number of parallel tasks to execute at once. Separate from execution parallelism.

pbs_submitscript
Default: %(cwd)/pbs_submit.sh
Description: Sets the location of the PBS submission script. This script executes the commands to submit jobs to the scheduler. Job submission script must be handled in file* directives.

pbs_subcommand
Default: qsub
Description: Sets the job scheduler submission command.

Static parameters

cwd
Value: os.getcwd()
Description: Directory from which chauffeur is executed.

scriptdir
Value: os.path.realpath(__file__)
Description: Directory where chauffeur lives.


File*

The file directive is used to specify files which should be processed. It may be used multiple times in the input. File directives are detected by searching for top-level directives whose names contain file. When including multiple files, you must append unique suffixes to file (e.g. file_1 and file_2).

Modifiable parameters

input
Default: None
Description: Specify input file to be processed. Required if file directive is used.

output
Default: None
Description: Specify resulting output file. Required if file directive is used.

type
Default: None
Options: None, pbs
Description: Specify type of file. If set to pbs, this file will be used as the job scheduler submission script.

parameters
Default: None
Description: Specify additional parameters related to this file. Parameters should be defined as subdirectives:

parameters:
  param1: "foo"
  param2: "bar"

Run*

The run directive is used to specify the parameter space variables that determine the tasks to execute. Runs are executed in lexicographical order (e.g. run_01 will be performed before run_10). Each task is generated by the tensor product of the variables defined. If the input file contains the run directive

run:
  variables:
    var1: [1,2,3]
    var2: ["foo","bar"]

then the following combinations of parameters will be executed (6 total):

1, "foo"
2, "foo"
3, "foo"
1, "bar"
2, "bar"
3, "bar"
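The ordering listed above, with var1 varying fastest, can be reproduced with `itertools.product`. The sketch below is illustrative only; `tensor_product` is a hypothetical name and not a function provided by chauffeur.

```python
import itertools

def tensor_product(variables):
    """Return value tuples in sorted-name order, first name varying fastest."""
    names = sorted(variables)
    # itertools.product varies its LAST argument fastest, so the value lists
    # are supplied slowest-first and each resulting tuple is flipped back.
    slow_to_fast = names[::-1]
    value_lists = [variables[name] for name in slow_to_fast]
    return [tuple(values[::-1]) for values in itertools.product(*value_lists)]

for combo in tensor_product({"var1": [1, 2, 3], "var2": ["foo", "bar"]}):
    print(combo)
# prints (1, 'foo'), (2, 'foo'), (3, 'foo'), (1, 'bar'), (2, 'bar'), (3, 'bar'), one per line
```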

Modifiable parameters

variables
Default: MUST BE DEFINED
Description: Specify the values of the parameter space to be combined. Individual variables should be defined as subdirectives.

variables:
  var1: [1,2,3]
  var2: ["foo","bar"]

variableorder
Default: None
Description: Specify the order that variables should be evaluated in the tensor product. Value should be a list containing the variable names, in order of fastest to slowest varying. By default, variables will be evaluated in lexicographical order.
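The effect of variableorder can be sketched as a reordering of the inputs to the tensor product. In this illustration `ordered_product` is a hypothetical helper, and the check that variableorder names exactly the defined variables is an assumption; the source does not say how chauffeur treats a partial list.

```python
import itertools

def ordered_product(variables, variableorder=None):
    """Return one dict per task; the first name in the order varies fastest."""
    names = list(variableorder) if variableorder else sorted(variables)
    if set(names) != set(variables):  # assumed rule: every variable listed once
        raise ValueError("variableorder must list every defined variable")
    # itertools.product varies its last argument fastest, so feed slowest first.
    slow_to_fast = names[::-1]
    value_lists = [variables[name] for name in slow_to_fast]
    return [dict(zip(slow_to_fast, values))
            for values in itertools.product(*value_lists)]

runs = ordered_product({"var1": [1, 2, 3], "var2": ["foo", "bar"]},
                       variableorder=["var2", "var1"])
print(runs[:2])
# prints: [{'var1': 1, 'var2': 'foo'}, {'var1': 1, 'var2': 'bar'}]
```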

parameters
Default: None
Description: Specify additional parameters related to this task. Parameters should be defined as subdirectives:

parameters:
  param1: "foo"
  param2: "bar"

Userdef

The userdef directive is used to specify parameters which are not directly tied to any task or file. Examples include mathematical expressions that may involve task/file-level parameters, instance-specific identifiers, etc. User defined parameters should be defined as subdirectives.

userdef:
  pi: 3.14159265
  twopi: "`2*%(pi)`"
