<h1>3.7. Staging Task Output Data</h1>

Upon completion, tasks often create some amount of data. We have seen in Obtaining Task Details how we can inspect the task’s stdout string, but that will not be useful beyond the most trivial workloads. This section shows how to stage the output data of tasks back to the RP application, and/or to arbitrary storage locations and devices.

In principle, output staging is specified in the same way as the input staging discussed in the previous section:

 - source: which files need to be staged from the context of the task that finished execution
 - target: where the files should be staged to
 - action: how the files should be staged.
 
Note that in this example we rename the output file to a unique name during staging, so that the outputs of different tasks do not overwrite each other:

In [None]:
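# Excerpt from the full example below: 'n' is the number of tasks and
# 'td' is the rp.TaskDescription of task 'i'.
for i in range(0, n):
    td = rp.TaskDescription()
    td.executable     = '/bin/cp'
    td.arguments      = ['-v', 'input.dat', 'output.dat']
    td.input_staging  = ['input.dat']
    td.output_staging = {'source': 'output.dat',
                         'target': 'output_%03d.dat' % i,
                         'action': rp.TRANSFER}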
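
The per-task renaming can be tried without RADICAL-Pilot installed. The helper below is a minimal sketch and not part of the RP API: `make_output_directive` is a hypothetical name, and the string `'Transfer'` is only a stand-in for the `rp.TRANSFER` constant, assumed here so that the snippet runs on its own.

```python
# Stand-in for rp.TRANSFER (assumption: used only to keep this sketch
# independent of a radical.pilot installation).
TRANSFER = 'Transfer'


def make_output_directive(index, action=TRANSFER):
    """Build one output staging directive with a unique target name."""
    return {'source': 'output.dat',
            'target': 'output_%03d.dat' % index,
            'action': action}


# One directive per task; the zero-padded index keeps the targets distinct.
directives = [make_output_directive(i) for i in range(3)]
targets = [d['target'] for d in directives]
print(targets)
```

Every task writes the same `output.dat` in its own context, so only the target name needs to vary to keep the staged copies apart.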

<h2>3.7.1. Running the Example</h2>

Below is an example application which uses the code block above.

We start by importing the radical.pilot module and initializing the reporter facility, which prints well-formatted runtime and progress information.

In [None]:
import os
import sys

verbose  = os.environ.get('RADICAL_PILOT_VERBOSE', 'REPORT')
os.environ['RADICAL_PILOT_VERBOSE'] = verbose

import radical.pilot as rp
import radical.utils as ru

report = ru.Reporter(name='radical.pilot')
report.title('Getting Started (RP version %s)' % rp.version)

We will now import the dotenv module to load our environment variables. To create a new Session, you need to provide the URL of a MongoDB server, which we read from our .env file.

We will set the resource value to 'local.localhost'. Using a resource key other than local.localhost implicitly tells RADICAL-Pilot that it is targeting a remote resource.

In [None]:
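from dotenv import load_dotenv
load_dotenv()

# os.getenv() returns None if the variable is missing, and assigning None
# to os.environ raises a TypeError, so we fail early with a clear message.
RADICAL_PILOT_DBURL = os.getenv("RADICAL_PILOT_DBURL")
if not RADICAL_PILOT_DBURL:
    raise RuntimeError('RADICAL_PILOT_DBURL is not set (check your .env file)')
os.environ['RADICAL_PILOT_DBURL'] = RADICAL_PILOT_DBURL
resource = 'local.localhost'
session = rp.Session()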

All remaining pilot code is wrapped in a try/except block. Whether or not an exception is caught, we can rely on the session object to exist and be valid, so we tear down the whole RP stack via a <i>'session.close()'</i> call in the <i>'finally'</i> clause.
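
The teardown pattern can be tried in isolation. The sketch below is only an illustration: `StubSession` and `run_workload` are hypothetical names standing in for `rp.Session` and the pilot code, so nothing here touches RADICAL-Pilot itself.

```python
class StubSession:
    """Hypothetical stand-in for rp.Session; records whether close() ran."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def run_workload(session, fail=False):
    """Run placeholder pilot code and always close the session afterwards."""
    try:
        if fail:
            raise RuntimeError('simulated failure in pilot code')
        return 'done'
    except Exception as e:
        # Report the problem, then let it propagate to the caller.
        print('caught Exception: %s' % e)
        raise
    finally:
        # Runs both when the body succeeds and when it raises.
        session.close()


ok_session = StubSession()
run_workload(ok_session)

bad_session = StubSession()
try:
    run_workload(bad_session, fail=True)
except RuntimeError:
    pass

print(ok_session.closed, bad_session.closed)
```

Both sessions end up closed, which is the property the real example relies on to shut down any remaining pilots.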

In [None]:
try:

    # read the config used for resource details
    report.info('read config')
    config = ru.read_json('/../config.json')
    report.ok('>>ok\n')

    report.header('submit pilots')

    # Add a Pilot Manager. Pilot managers manage one or more Pilots.
    pmgr = rp.PilotManager(session=session)

    # Define an [n]-core local pilot that runs for [x] minutes
    # Here we use a dict to initialize the description object
    pd_init = {
               'resource'      : resource,
               'runtime'       : 15,  # pilot runtime (min)
               'exit_on_error' : True,
               'project'       : config[resource].get('project', None),
               'queue'         : config[resource].get('queue', None),
               'access_schema' : config[resource].get('schema', None),
               'cores'         : config[resource].get('cores', 1),
               'gpus'          : config[resource].get('gpus', 0),
              }
    pdesc = rp.PilotDescription(pd_init)

    # Launch the pilot.
    pilot = pmgr.submit_pilots(pdesc)


    report.header('submit tasks')

    # Register the Pilot in a TaskManager object.
    tmgr = rp.TaskManager(session=session)
    tmgr.add_pilots(pilot)

    # Create a workload that copies a simple file.  We first create the
    # file right here, and then use it as task input data for each task.
    os.system('hostname >  input.dat')
    os.system('date     >> input.dat')

    n = 128   # number of tasks to run
    report.info('create %d task description(s)\n\t' % n)

    tds = list()
    for i in range(0, n):

        # create a new Task description, and fill it.
        # Here we don't use dict initialization.
        td = rp.TaskDescription()
        td.executable     = '/bin/cp'
        td.arguments      = ['-v', 'input.dat', 'output.dat']
        td.input_staging  = ['input.dat']
        td.output_staging = {'source': 'task:///output.dat',
                              'target': 'client:///output_%03d.dat' % i,
                              'action': rp.TRANSFER}

        tds.append(td)
        report.progress()
    report.ok('>>ok\n')

    # Submit the previously created Task descriptions to the
    # TaskManager. This will trigger the selected scheduler to start
    # assigning Tasks to the Pilots.
    tasks = tmgr.submit_tasks(tds)

    # Wait for all tasks to reach a final state (DONE, CANCELED or FAILED).
    report.header('gather results')
    tmgr.wait_tasks()

    report.info('\n')
    for task in tasks:
        report.plain('  * %s: %s, exit: %3s, out: %s\n'
                % (task.uid, task.state[:4],
                    task.exit_code, task.stdout.strip()[:35]))

    # list the staged output files, then delete them and the sample input file
    report.info('\nresulting data files:\n\n')
    os.system('COLUMNS=80 ls -w 80 output_*.dat 2>/dev/null')
    os.system('rm output_*.dat')
    os.system('rm input.dat')


except Exception as e:
    # Something unexpected happened in the pilot code above
    report.error('caught Exception: %s\n' % e)
    raise

except (KeyboardInterrupt, SystemExit):
    # Ctrl-C raises KeyboardInterrupt, and sys.exit() raises SystemExit
    # (for example when called from a callback).  Neither derives from
    # Exception, so we catch both here to shut down cleanly.
    report.warn('exit requested\n')

finally:
    # always clean up the session, no matter if we caught an exception or
    # not.  This will kill all remaining pilots.
    report.header('finalize')
    session.close()

report.header()
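
Because the example removes the staged files during cleanup, any check for them has to run before that step. The following is a minimal sketch of such a check, assuming the `output_%03d.dat` naming used above; `missing_outputs` is a hypothetical helper and not part of RADICAL-Pilot, and the demonstration uses a scratch directory rather than real task output.

```python
import os
import tempfile


def missing_outputs(n, directory='.'):
    """Return the expected output file names that are absent from directory."""
    expected = ['output_%03d.dat' % i for i in range(n)]
    return [name for name in expected
            if not os.path.isfile(os.path.join(directory, name))]


# Demonstration in a scratch directory: create two of three expected files.
with tempfile.TemporaryDirectory() as tmp:
    for i in (0, 2):
        with open(os.path.join(tmp, 'output_%03d.dat' % i), 'w') as f:
            f.write('placeholder\n')
    missing = missing_outputs(3, tmp)

print(missing)
```

In the application, a call such as `missing_outputs(n)` placed just before the `rm output_*.dat` step would report any task whose output did not arrive in the working directory.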
