drelu edited this page Jan 27, 2012 · 3 revisions

File Management and Staging

Managing files in distributed systems is tedious: differing paths, names, and file versions complicate distributed runs. Since version 0.3.38, BigJob (BJ) includes some basic file staging capabilities. For each BigJob, a directory named after the id of the BigJob is created:

Big-Job

<BIGJOB_WORKING_DIRECTORY>/bj-54aaba6c-32ec-11e1-a4e5-00264a13ca4c/
<BIGJOB_WORKING_DIRECTORY>/bj-3645d5e8-32ec-11e1-b346-00264a13ca4c/
<BIGJOB_WORKING_DIRECTORY>/bj-398e110a-32e9-11e1-ae24-00264a13ca4c/
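
The layout above can be sketched in a few lines of Python. This is an illustration only: the helper name bigjob_directory is hypothetical, and generating the id with uuid.uuid1() is an assumption based on the format of the ids shown, not something taken from the BigJob source.

```python
import os
import uuid


def bigjob_directory(working_directory, bigjob_id=None):
    """Return the per-BigJob directory path (hypothetical helper)."""
    if bigjob_id is None:
        # Assumption: ids are time-based UUIDs, as the examples suggest.
        bigjob_id = str(uuid.uuid1())
    # The directory name is the id prefixed with "bj-".
    return os.path.join(working_directory, "bj-" + bigjob_id)
```

Passing an explicit id reproduces one of the paths listed above; omitting it yields a fresh directory name on every call.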

Files can be staged to the BJ working directory using the filetransfer parameter of start_pilot_job:

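import os

# Stage test.txt, located next to this script, into the BigJob working directory.
bj_filetransfers = ["ssh://" + os.path.dirname(os.path.abspath(__file__))
                    + "/test.txt > BIGJOB_WORK_DIR"]

# The list of file transfers is passed as the last argument.
bj.start_pilot_job(lrms_url,
                   None,
                   number_of_processes,
                   queue,
                   project,
                   workingdirectory,
                   userproxy,
                   walltime,
                   processes_per_node,
                   bj_filetransfers)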

The stdout and stderr of the BJ agent are written to this directory.
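
Each file transfer entry is a single string of the form "source > target", where the target is a placeholder such as BIGJOB_WORK_DIR. The following minimal sketch shows how such a string can be split into its two parts. The function parse_filetransfer is hypothetical and only illustrates the string format; how BigJob parses these entries internally is not described here.

```python
def parse_filetransfer(spec):
    """Split a 'source > target' entry into a (source, target) tuple."""
    # partition() splits at the first ">" and keeps the rest intact.
    source, separator, target = spec.partition(">")
    if not separator or not source.strip() or not target.strip():
        raise ValueError("expected 'source > target', got %r" % spec)
    return source.strip(), target.strip()
```

For example, parse_filetransfer("ssh:///home/user/test.txt > BIGJOB_WORK_DIR") separates the source URL from the BIGJOB_WORK_DIR placeholder, and an entry without a ">" raises a ValueError.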

Sub-Jobs

For each sub-job a sub-directory is created in the directory of the parent BJ:

<BIGJOB_WORKING_DIRECTORY>/bj-54aaba6c-32ec-11e1-a4e5-00264a13ca4c/sj-55010912-32ec-11e1-a4e5-00264a13ca4c
<BIGJOB_WORKING_DIRECTORY>/bj-54aaba6c-32ec-11e1-a4e5-00264a13ca4c/sj-55153072-32ec-11e1-a4e5-00264a13ca4c

By default (i.e. if no working directory is specified in its job description), each sub-job is executed in its own sub-job directory. If a working directory is specified, the sub-job is executed in that directory instead.
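
The rule for choosing the execution directory can be sketched as follows. The function subjob_working_directory and its parameters are hypothetical names used for illustration; the sketch mirrors the behaviour described above and is not taken from the BigJob implementation.

```python
import os


def subjob_working_directory(bigjob_dir, subjob_id, working_directory=None):
    """Return the directory in which a sub-job would be executed."""
    if working_directory:
        # An explicitly specified working directory takes precedence.
        return working_directory
    # Default: the sub-job specific directory below the parent BJ directory.
    return os.path.join(bigjob_dir, "sj-" + subjob_id)
```

Called without a working directory, the function returns a path of the sj-<id> form shown above; with one, it returns that directory unchanged.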

Files can be staged to the sub-job directory by using the filetransfer attribute:

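import os

jd = description()
jd.executable = "/bin/cat"
jd.number_of_processes = "1"
jd.spmd_variation = "single"
# test.txt is read from the sub-job directory, where it is staged below.
jd.arguments = ["test.txt"]
jd.output = "stdout.txt"
jd.error = "stderr.txt"
# Stage test.txt, located next to this script, into the sub-job directory.
jd.filetransfer = ["ssh://" + os.path.dirname(os.path.abspath(__file__))
                   + "/test.txt > SUBJOB_WORK_DIR"]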
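
When several files are staged, building the transfer strings by hand becomes repetitive. A small helper along the following lines may help. The name local_filetransfer and its parameters are hypothetical; the sketch only assembles strings in the same "ssh://<absolute path> > <target>" form used in the examples above.

```python
import os


def local_filetransfer(filename, target, base_dir):
    """Build a 'source > target' entry for a file located in base_dir."""
    # Use an absolute path so the entry does not depend on the current directory.
    source = "ssh://" + os.path.join(os.path.abspath(base_dir), filename)
    return source + " > " + target
```

With this helper, the transfer list above could be written as [local_filetransfer("test.txt", "SUBJOB_WORK_DIR", os.path.dirname(os.path.abspath(__file__)))].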