# Describing Tasks

The notion of tasks is fundamental in RADICAL-Pilot as tasks define the work to be executed on the target resources.  This notebook will guide the user through the various supported task types, and how to specify their respective workload.  It will also show some means to inspect tasks after (successful or failed) execution.

We assume that you are familiar with deploying, configuring and using RADICAL-Pilot, for example by taking the XYZ introduction tutorial.

All examples in this notebook are executed on localhost.  The host needs to have MPI installed: OpenMPI, MPICH, MVAPICH or any other MPI flavor is supported as long as it provides a standards-compliant `mpiexec` command.


In [3]:
import os
import sys
import pprint

# do not use animated output in notebooks
os.environ['RADICAL_REPORT_ANIME'] = 'False'

import radical.pilot as rp
import radical.utils as ru

# determine the path of the currently active ve to simplify some examples below
ve_path = os.path.dirname(os.path.dirname(ru.which('python3')))
print(ve_path)


/mnt/home/merzky/radical/radical.pilot.work/ve3



### Initial setup and Pilot Submission

Just as demonstrated in the introductory tutorials, we will first configure the reporter output, then set up an RP session, create pilot and task manager instances, and run a small local pilot with 32 cores and 1 GPU assigned to it.


In [4]:
# configure reporter output 
report = ru.Reporter(name='radical.pilot')
report.title('Tutorial: Describing Tasks (RP version %s)' % rp.version)

# create session and managers
session = rp.Session()
pmgr    = rp.PilotManager(session)
tmgr    = rp.TaskManager(session)

# submit a pilot
pilot = pmgr.submit_pilots(rp.PilotDescription({'resource'     : 'local.localhost', 
                                                'runtime'      : 60, 
                                                'cores'        : 32, 
                                                'gpus'         : 1, 
                                                'exit_on_error': True}))

# add the pilot to the task manager and wait for the pilot to become active
tmgr.add_pilots(pilot)
pilot.wait(rp.PMGR_ACTIVE)
report.info('pilot state: %s' % pilot.state)


 Tutorial: Describing Tasks (RP version 1.22.0)

new session: [rp.session.rivendell.merzky.019441.0011]                         \
database   : [mongodb://localhost/am]                                         ok
create pilot manager                                                          ok
create task manager                                                           ok
submit 1 pilot(s)
        pilot.0000   local.localhost          32 cores       1 gpus           ok
pilot state: PMGR_ACTIVE

### Task execution

At this point we have the system set up and ready to execute our workload.  To do so we describe the tasks of which the workload is comprised and submit them for execution.  The goal of this tutorial is to introduce the various attributes available for describing tasks, to explain the execution process in some detail, and to describe how completed or failed tasks can be inspected.

#### RP Executable Tasks vs. Raptor Tasks

RADICAL-Pilot is, in the most general sense, a pilot-based task execution backend.  Its implementation focuses on *executable* tasks, i.e., on tasks which are described by an executable, its command line arguments, its input and output files, and its execution environment.

A more general task execution engine called 'Raptor' is also provided as part of RADICAL-Pilot.  Raptor can additionally execute *function* tasks, i.e., tasks which are defined by a function code entry point, function parameters and return values.  This tutorial focuses on the former, *executable* tasks.  The task types additionally supported by Raptor are the topic of a separate tutorial which can be found [here](link to raptor tutorial).

#### Task Descriptions

The `rp.TaskDescription` class is, as the name suggests, the basis for all task descriptions in RADICAL-Pilot.  Its most important attribute is `mode`: for *executable* tasks the mode must be set to `rp.TASK_EXECUTABLE`, which is the default setting.

Executable tasks have exactly one additional required attribute: the name of the executable.  That can be either an absolute path to the executable on the file system of the target resource, or a plain executable name which is known at runtime in the task's execution environment (we will cover the execution environment setup further below).

In [5]:
# create a minimal executable task
td   = rp.TaskDescription({'executable': '/bin/date'})
task = tmgr.submit_tasks(td)


submit: ########################################################################

The task will be scheduled for execution on the pilot we created above.  We now wait for the task to complete, i.e., to reach one of the final states `DONE`, `CANCELED` or `FAILED`:

In [6]:
tmgr.wait_tasks()

wait  : ########################################################################
	DONE      :     1
                                                                              ok

['DONE']

Congratulations, you successfully executed a RADICAL-Pilot task!

### Task Inspection

Once completed, we can inspect the tasks for details of their execution: we print a summary for all tasks and then inspect one of them in a bit more detail.  The output shows a number of task attributes which can also be set in the task description.  Those are specifically:

  - `uid`: a unique string identifying the task.  If not defined in the task description, RP will generate an ID which is unique within the scope of the current session.
  - `name`: a common name for the task which has no meaning to RP itself but can be used by the application to identify or classify certain tasks.  The task name is not required to be unique.
  - `metadata`: any user defined data.  The only requirement is that the data are serializable via `msgpack` (which RP internally uses as serialization format).  Note that metadata are communicated along with the task itself and as such should usually be very small bits of data.
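
As a minimal sketch of how these three attributes might be set, the snippet below builds plain dictionaries of the kind that are passed to `rp.TaskDescription` elsewhere in this notebook.  The helper name `describe_date_task`, the `uid` pattern and the metadata keys are hypothetical and only serve as illustration; the snippet does not import RADICAL-Pilot.

```python
def describe_date_task(index, campaign):
    """Return a plain dict which could be passed to rp.TaskDescription.

    Hypothetical helper: uid pattern and metadata keys are made up here.
    """
    return {
        'executable': '/bin/date',
        'uid'       : 'date.%06d' % index,       # must be unique if set by hand
        'name'      : 'date-%s' % campaign,      # free-form, need not be unique
        'metadata'  : {'campaign': campaign,     # keep metadata small
                       'index'   : index},
    }


descriptions = [describe_date_task(i, 'demo') for i in range(3)]
uids = [d['uid'] for d in descriptions]

# user-defined uids should not collide
assert len(set(uids)) == len(uids)
print(uids)
```

Keeping the metadata to a handful of small values follows the note above that metadata travel along with the task.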

In [7]:

report.plain('uid             : %s\n' % task.uid)
report.plain('tmgr            : %s\n'
             % task.tmgr.uid)
report.plain('pilot           : %s\n' % task.pilot)
report.plain('name            : %s\n' % task.name)
report.plain('executable      : %s\n' % task.description['executable'])
report.plain('state           : %s\n' % task.state)
report.plain('exit_code       : %s\n' % task.exit_code)
report.plain('stdout          : %s\n' % task.stdout.strip())
report.plain('stderr          : %s\n' % task.stderr)
report.plain('return_value    : %s\n' % task.return_value)
report.plain('exception       : %s\n' % task.exception)
report.plain('\n')
report.plain('endpoint_fs     : %s\n' % task.endpoint_fs)
report.plain('resource_sandbox: %s\n' % task.resource_sandbox)
report.plain('session_sandbox : %s\n' % task.session_sandbox)
report.plain('pilot_sandbox   : %s\n' % task.pilot_sandbox)
report.plain('task_sandbox    : %s\n' % task.task_sandbox)
report.plain('client_sandbox  : %s\n' % task.client_sandbox)
report.plain('metadata        : %s\n' % task.metadata)


uid             : task.000000
tmgr            : tmgr.0000
pilot           : pilot.0000
name            : 
executable      : /bin/date
state           : DONE
exit_code       : 0
stdout          : Sat Mar 25 22:30:38 CET 2023
stderr          : 
return_value    : None
exception       : None

endpoint_fs     : file://localhost/
resource_sandbox: file://localhost/home/merzky/radical.pilot.sandbox
session_sandbox : file://localhost/home/merzky/radical.pilot.sandbox/rp.session.rivendell.merzky.019441.0011
pilot_sandbox   : file://localhost/home/merzky/radical.pilot.sandbox/rp.session.rivendell.merzky.019441.0011/pilot.0000/
task_sandbox    : file://localhost/home/merzky/radical.pilot.sandbox/rp.session.rivendell.merzky.019441.0011/pilot.0000/task.000000/
client_sandbox  : /mnt/home/merzky/radical/radical.pilot.work/docs/source/tutorials

All applications can fail, often for reasons outside the user's control. A task is no different: it can fail as well. Many non-trivial applications need a way to handle failing tasks. Detecting the failure is the first and necessary step, and RP makes that part easy: RP's task state model defines that a failing task immediately goes into the `FAILED` state, and that state information is available as the `task.state` property.

The task also has the `task.stderr` property available for further inspection into the causes of the failure. That property will only be available, though, if the task reached the `EXECUTING` state in the first place.
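
A small helper along the following lines can tally task states for a set of tasks.  This is a sketch only: it works on any objects which expose a `state` attribute, and uses `SimpleNamespace` stand-ins instead of real RP task objects so that it runs without a pilot.  The helper names are hypothetical; the state names are the final states named in this tutorial.

```python
from collections import Counter
from types import SimpleNamespace

# final states as named in this tutorial
FINAL_STATES = {'DONE', 'CANCELED', 'FAILED'}


def summarize_states(tasks):
    """Count how many tasks are in each state."""
    return Counter(task.state for task in tasks)


def all_final(tasks):
    """True if every task has reached one of the final states."""
    return all(task.state in FINAL_STATES for task in tasks)


# stand-in objects (assumption: real tasks expose the same attributes)
demo_tasks = [SimpleNamespace(uid='task.000001', state='DONE'),
              SimpleNamespace(uid='task.000002', state='FAILED'),
              SimpleNamespace(uid='task.000003', state='FAILED')]

summary = summarize_states(demo_tasks)
print(dict(summary))
print(all_final(demo_tasks))
```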

Let's submit a new set of tasks and inspect the failure modes: we will scan `/bin/date` for acceptable single letter arguments:

In [8]:
import string
letters = string.ascii_lowercase + string.ascii_uppercase

report.progress_tgt(len(letters), label='create')

tds = list()
for letter in letters:
    tds.append(rp.TaskDescription({'executable': '/bin/date',
                                   'arguments': ['-' + letter]}))
    report.progress()

report.progress_done()

tasks = tmgr.submit_tasks(tds)

create: ########################################################################
submit: ########################

This time we wait only for the newly submitted tasks and check which ones succeeded.  For those we print the resulting output: we will find exactly 3 valid single-letter options.

In [9]:
tmgr.wait_tasks([task.uid for task in tasks])

for task in tasks:
    if task.state == rp.DONE:
        print('%s: %s: %s' % (task.uid, task.description['arguments'], task.stdout.strip()))


wait  : ########################################################################
	DONE      :     3
	FAILED    :    49
                                                                              ok

task.000021: ['-u']: Sat Mar 25 21:30:47 UTC 2023
task.000035: ['-I']: 2023-03-25
task.000044: ['-R']: Sat, 25 Mar 2023 22:30:47 +0100


### EXERCISE 1

Change the code of the cell above to show the stderr of tasks which did *not* end up in `DONE` state.

### MPI Tasks and Task Resources

In the examples so far, we have been running single-core tasks.  By expanding the task description to include the `ranks` attribute, we can request multiple MPI ranks to be created where each rank uses a certain amount of resources:

  - `cores_per_rank`: the number of cores each rank can use for spawning additional threads or processes
  - `gpus_per_rank`: the number of GPUs each rank can utilize
  - `mem_per_rank`: the size of memory (in Megabytes) which is available to each rank
  - `lfs_per_rank`: the amount of node-local file storage which is available to each rank
  - `threading_type`: how to inform the application about available resources to run threads on
    - `rp.OpenMP`: define `OMP_NUM_THREADS` in the task environment
  - `gpu_type`: how to inform the application about available GPU resources
    - `rp.CUDA`: define `CUDA_VISIBLE_DEVICES` in the task environment
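
To see how `ranks` and `cores_per_rank` combine, the following sketch computes the number of CPU cores requested by a set of description dictionaries and compares it with the 32 cores of the pilot submitted earlier.  The helper `cores_requested` is hypothetical, and the assumption that both attributes default to 1 when unset is ours.

```python
def cores_requested(td):
    """Total cores for one description dict: ranks * cores_per_rank.

    Assumption: both attributes default to 1 when not set.
    """
    return td.get('ranks', 1) * td.get('cores_per_rank', 1)


# four tasks with 1, 2, 3 and 4 ranks, each rank using as many cores
tds = [{'ranks': n + 1, 'cores_per_rank': n + 1} for n in range(4)]

per_task = [cores_requested(td) for td in tds]
total    = sum(per_task)
print(per_task)   # [1, 4, 9, 16]
print(total)      # 30

pilot_cores = 32
print(total <= pilot_cores)   # True
```

With these numbers, all four tasks together request fewer cores than the pilot holds.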

We use the `radical-pilot-hello.sh` command as a test to report on rank creation.  Note though that no core pinning is performed on localhost, and the tasks thus see all CPU cores as available to them.  However the `THREADS` information still reports the correct number of assigned CPU cores.

In [10]:
tds = list()
for n in range(4):
    tds.append(rp.TaskDescription({'executable'    : ve_path + '/bin/radical-pilot-hello.sh',
                                   'arguments'     : [n + 1], 
                                   'ranks'         : (n + 1), 
                                   'cores_per_rank': (n + 1),
                                   'threading_type': rp.OpenMP}))
    report.progress()

report.progress_done()

tasks = tmgr.submit_tasks(tds)
tmgr.wait_tasks([task.uid for task in tasks])

for task in tasks:
    print('--- %s:\n%s\n' % (task.uid, task.stdout.strip()))


....
submit: ########################################################################
wait  : ########################

--- task.000053:
0 : PID     : 28785
0 : NODE    : rivendell
0 : CPUS    : 1111111111111111
0 : GPUS    : 00
0 : RANK    : 0
0 : THREADS : 1
0 : SLEEP   : 1

--- task.000054:
1 : PID     : 28754
1 : NODE    : rivendell
1 : CPUS    : 1111111111111111
1 : GPUS    : 00
1 : RANK    : 1
1 : THREADS : 2
1 : SLEEP   : 2
0 : PID     : 28756
0 : NODE    : rivendell
0 : CPUS    : 1111111111111111
0 : GPUS    : 00
0 : RANK    : 0
0 : THREADS : 2
0 : SLEEP   : 2

--- task.000055:
0 : PID     : 28637
0 : NODE    : rivendell
0 : CPUS    : 1111111111111111
0 : GPUS    : 00
0 : RANK    : 0
0 : THREADS : 3
0 : SLEEP   : 3
1 : PID     : 28631
1 : NODE    : rivendell
1 : CPUS    : 1111111111111111
1 : GPUS    : 00
1 : RANK    : 1
1 : THREADS : 3
1 : SLEEP   : 3
2 : PID     : 28760
2 : NODE    : rivendell
2 : CPUS    : 1111111111111111
2 : GPUS    : 00
2 : RANK    : 2
2 : THREADS : 3
2 : SLEEP   : 3

--- task.000056:
1 : PID     : 28607
1 : NODE    : rivendell
1 : CPUS    : 1111111111111111
1 : GPUS    : 

### Task Data Management

The `TaskDescription` supports diverse means to specify the task's data dependencies and data related properties:

  - `stdout`: path of the file to store the task's standard output in  
  - `stderr`: path of the file to store the task's standard error in
  - `input_staging`: list of file staging directives to stage task input data
  - `output_staging`: list of file staging directives to stage task output data
  
Let us run an example task which uses those 4 attributes: we run a word count on `/etc/passwd` (which we stage as input file) and store the result in an output file (which we fetch back).  We will also stage back the files in which standard output and standard error are stored (although in this simple example both are expected to be empty).
  

In [14]:

td = rp.TaskDescription({'executable'    : '/bin/sh',
                         'arguments'     : ['-c', 'cat input.dat | wc > output.dat'],
                         'stdout'        : 'task_io.out',
                         'stderr'        : 'task_io.err',
                         'input_staging' : [{'source': '/etc/passwd', 'target': 'input.dat'}],
                         'output_staging': [{'source': 'output.dat',  'target': '/tmp/output.test.dat'},
                                            {'source': 'task_io.out', 'target': '/tmp/output.test.out'},
                                            {'source': 'task_io.err', 'target': '/tmp/output.test.err'}]
                        })
task = tmgr.submit_tasks(td)
tmgr.wait_tasks([task.uid])

# let's check the resulting output files
print(ru.sh_callout('ls -la /tmp/output.test.*', shell=True)[0])
print(ru.sh_callout('cat    /tmp/output.test.dat')[0])

submit: ########################################################################
wait  : ########################

-rw-rw-r-- 1 merzky merzky 24 Mar 25 22:33 /tmp/output.test.dat
-rw-rw-r-- 1 merzky merzky  0 Mar 25 22:33 /tmp/output.test.err
-rw-rw-r-- 1 merzky merzky  0 Mar 25 22:33 /tmp/output.test.out

     55      96    3294



The RP data staging capabilities go beyond what is captured in the example above:

  - data can be transferred, copied, moved and linked
  - data locations can be given as absolute paths, or relative to the system's root file system, to RP's resource sandbox, session sandbox, pilot sandbox or task sandbox
  - data staging can be performed not only for tasks, but also for the overall workflow (for example, when many tasks share the same input data)
  
More details on data staging are [documented](here) and are covered in another [tutorial](here). 
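
Since staging directives are plain dictionaries with `source` and `target` keys, as in the example above, they can be generated programmatically.  The sketch below builds such a list from (source, target) pairs and rejects duplicate targets; the helper name `staging_directives` and the duplicate check are our own additions and not part of RP.

```python
def staging_directives(pairs):
    """Turn (source, target) pairs into a list of staging directive dicts.

    Hypothetical helper: raises ValueError if two directives would
    write to the same target.
    """
    directives = []
    seen = set()
    for source, target in pairs:
        if target in seen:
            raise ValueError('duplicate staging target: %s' % target)
        seen.add(target)
        directives.append({'source': source, 'target': target})
    return directives


output_staging = staging_directives([
    ('output.dat',  '/tmp/output.test.dat'),
    ('task_io.out', '/tmp/output.test.out'),
    ('task_io.err', '/tmp/output.test.err'),
])
print(output_staging)
```

The resulting list has the same shape as the `output_staging` value used in the word count example.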

### Task Execution Environment

On HPC systems in particular, it is common to provide application executables via environment modules.  Task execution environments are also frequently used for scripting languages such as Python (virtualenv, venv, conda, etc.).  RP supports the setup of the task execution environment in the following ways:

  1. `environment` dictionary
  2. use `pre_exec` directives to customize task specific environments
  3. prepare and reuse named environments for tasks
  
We will cover these options in the next three examples.

#### Environment Dictionary

Environment variables can be set explicitly in the task description via the `environment` attribute.  When that attribute is not specified, the tasks will be executed in the original environment the pilot found when being placed on the compute nodes.  If the attribute is defined, then the original environment will be augmented by the settings thus specified.  Note that a number of custom environment variables are always provided, such as the various sandbox locations known to RP, as demonstrated below:

In [20]:
td = rp.TaskDescription({'executable' : '/bin/sh',
                         'arguments'  : ['-c', 'echo "$FOO - $BAR - $SHELL"; env | grep RP_ | sort'],
                         'environment': {'FOO': 'foo', 'BAR': 'bar'}
                        })
task = tmgr.submit_tasks(td)
tmgr.wait_tasks([task.uid])
print(task.stdout)

submit: ########################################################################
wait  : ########################

foo - bar - /bin/bash
RP_APP_TUNNEL_ADDR=144.76.72.175:27017
RP_BOOTSTRAP_0_REDIR=True
RP_GTOD=/home/merzky/radical.pilot.sandbox/rp.session.rivendell.merzky.019441.0011//pilot.0000//gtod
RP_PILOT_ID=pilot.0000
RP_PILOT_SANDBOX=/home/merzky/radical.pilot.sandbox/rp.session.rivendell.merzky.019441.0011//pilot.0000/
RP_PROF=/home/merzky/radical.pilot.sandbox/rp.session.rivendell.merzky.019441.0011//pilot.0000//prof
RP_PROF_TGT=/home/merzky/radical.pilot.sandbox/rp.session.rivendell.merzky.019441.0011//pilot.0000//task.000065/task.000065.prof
RP_RANK=0
RP_RANKS=1
RP_RESOURCE=local.localhost
RP_RESOURCE_SANDBOX=/home/merzky/radical.pilot.sandbox
RP_SESSION_ID=rp.session.rivendell.merzky.019441.0011
RP_SESSION_SANDBOX=/home/merzky/radical.pilot.sandbox/rp.session.rivendell.merzky.019441.0011/
RP_TASK_ID=task.000065
RP_TASK_NAME=task.000065
RP_TASK_SANDBOX=/home/merzky/radical.pilot.sandbox/rp.session.rivendell.merzky.019441.0011//pilot.0000//task.000065
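
A task written in Python could read the `RP_*` variables listed above from its environment.  The sketch below filters a mapping for that prefix; it accepts an explicit dictionary so that it can be tried outside of a running task, and the helper name `rp_variables` is hypothetical.

```python
import os


def rp_variables(environ=None):
    """Return all RP_* entries of an environment mapping, sorted by name."""
    environ = os.environ if environ is None else environ
    return {key: environ[key] for key in sorted(environ)
            if key.startswith('RP_')}


# sample values copied from the output above
sample = {'RP_TASK_ID': 'task.000065',
          'RP_RANK'   : '0',
          'RP_RANKS'  : '1',
          'SHELL'     : '/bin/bash'}

found = rp_variables(sample)
rank  = int(found['RP_RANK'])
ranks = int(found['RP_RANKS'])
print('rank %d of %d in %s' % (rank, ranks, found['RP_TASK_ID']))
```

Environment values are always strings, so numeric entries such as `RP_RANK` need an explicit conversion.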



#### Environment Setup with `pre_exec`

The `pre_exec` attribute of the task description can be used to specify a set of shell commands which are to be executed before the actual task is launched.  That mechanism can be used to prepare the task's runtime environment, for example to

  - load a system module
  - export some environment variable
  - run a shell script or shell commands
  - activate some virtualenv environment
  
The example shown below assumes a Python virtualenv exists in `/tmp/ve`.  We will activate that virtualenv and install a Python module (`psutil`) in it:

In [24]:
td = rp.TaskDescription({'executable' : '/bin/sh',
                         'arguments'  : ['-c', 'which python3; pip list | grep psutil'],
                         'pre_exec'   : ['. /tmp/ve/bin/activate', 
                                         'pip install psutil']
                        })
task = tmgr.submit_tasks(td)
tmgr.wait_tasks([task.uid])
print(task.stdout)

submit: ########################################################################
wait  : ########################

Collecting psutil
  Using cached psutil-5.9.4-cp36-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (280 kB)
Installing collected packages: psutil
Successfully installed psutil-5.9.4
/tmp/ve/bin/python3
psutil        5.9.4
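
The `pre_exec` value is an ordinary list of shell command strings, so it can be assembled by a small function.  The following sketch does so for the virtualenv case shown above; the helper name `venv_pre_exec` is hypothetical, and quoting via `shlex.quote` is our own precaution for paths or package names with unusual characters.

```python
import shlex


def venv_pre_exec(ve_dir, packages=()):
    """Build pre_exec commands: activate a virtualenv, then install packages."""
    commands = ['. %s/bin/activate' % shlex.quote(ve_dir)]
    if packages:
        quoted = ' '.join(shlex.quote(pkg) for pkg in packages)
        commands.append('pip install ' + quoted)
    return commands


pre_exec = venv_pre_exec('/tmp/ve', ['psutil'])
print(pre_exec)   # ['. /tmp/ve/bin/activate', 'pip install psutil']
```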



#### Environment Setup with `named_env`

When the same environment is used for many tasks, then the collective sum of their `pre_exec` activities can create a significant runtime overhead, both on the shared filesystem and on the system load.  The third environment setup option mentioned above addresses that problem: applications can prepare a task environment and then use the `named_env` attribute to activate it for the task.  This process is *very* lightweight on system load and runtime overhead, but has some limitations we will explain after the example snippet below:

In [25]:

pilot.prepare_env(env_name='test_env', 
                  env_spec={'type' :  'venv', 
                            'setup': ['psutil']})

td = rp.TaskDescription({'executable' : '/bin/sh',
                         'arguments'  : ['-c', 'which python3; pip list | grep psutil'],
                         'named_env'  : 'test_env'
                        })
task = tmgr.submit_tasks(td)
tmgr.wait_tasks([task.uid])
print(task.stdout)

submit: ########################################################################
wait  : ########################

/home/merzky/radical.pilot.sandbox/rp.session.rivendell.merzky.019441.0011/pilot.0000/env/rp_named_env.test_env/bin/python3
psutil             5.9.4
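
Once an environment has been prepared, any number of task descriptions can refer to it by name without carrying their own `pre_exec` commands.  The sketch below builds several such description dictionaries; the helper `task_in_env` and the shell command it runs are hypothetical, and the environment name matches the one prepared above.

```python
ENV_NAME = 'test_env'   # name used in the prepare_env call above


def task_in_env(script, env_name=ENV_NAME):
    """Description dict for a shell snippet which runs in a named environment."""
    return {'executable': '/bin/sh',
            'arguments' : ['-c', script],
            'named_env' : env_name}


# several tasks share the one prepared environment
env_tds = [task_in_env('which python3') for _ in range(3)]

print(len(env_tds))
print(all(td['named_env'] == ENV_NAME for td in env_tds))
```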



In [26]:
report.header('finalize')
session.close()


--------------------------------------------------------------------------------
finalize


closing session rp.session.rivendell.merzky.019441.0011                        \
close task manager                                                            ok
close pilot manager                                                            \
wait for 1 pilot(s)
                                                                              ok
                                                                              ok
session lifetime: 3954.8s                                                     ok