# RADICAL-Cybertools: RADICAL-EnTK Tutorial

RADICAL-EnTK applications need some care when run in a Jupyter notebook.  In particular, avoid running cells out of order: it is usually best to cleanly terminate the kernel before rerunning any or all cells.  This notebook therefore puts the exercise code into a *single* cell, which you can edit freely and then execute.

## Exercise 1: Change the number of ensemble members (number of pipelines) and number of simulations per pipeline 
  - You are presumably running on a small resource - be gentle ;-)
  - Look at the `for` loop in the program's `main` section
  - Look at the construction of Stage 2 (`s2`)
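
Before changing either number, it can help to estimate how many tasks the ensemble will contain. The following is a minimal sketch in plain Python (no EnTK required); the helper name `count_tasks` is hypothetical and only mirrors the structure of the pipeline below: one seed task, `n_simulations` power tasks, and one summation task per pipeline.

```python
def count_tasks(n_pipelines, n_simulations):
    """Total number of tasks for the ensemble defined in this notebook."""
    # per pipeline: 1 seed task + n_simulations power tasks + 1 sum task
    tasks_per_pipeline = 1 + n_simulations + 1
    return n_pipelines * tasks_per_pipeline


# defaults used in this notebook: 2 pipelines, 5 simulations each
print(count_tasks(n_pipelines=2, n_simulations=5))   # prints 14
```

With only 2 CPUs requested in the resource description, larger values mean that more tasks have to wait for a free core, so increase the numbers gradually.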


In [None]:
%env RADICAL_LOG_LVL=OFF
%env RADICAL_REPORT=TRUE
%env RADICAL_REPORT_ANIME=FALSE

In [None]:
import radical.entk as re


def get_stage_1(sandbox):
    '''
    first stage: create 1 task to generate a random seed number
    '''
    
    s1 = re.Stage()

    t1 = re.Task()
    t1.executable = '/bin/sh'
    t1.arguments  = ['-c', 'od -An -N1 -i /dev/random']
    t1.stdout     = 'random.txt'
    t1.sandbox    = sandbox

    s1.add_tasks(t1)
    return s1


def get_stage_2(sandbox):
    '''
    second stage: create `n_simulations` tasks, where task i computes the
    i'th power of the random seed
    '''
    
    s2 = re.Stage()

    # number of simulation tasks per pipeline (Exercise 1: change this)
    n_simulations = 5
    for i in range(n_simulations):
        t2 = re.Task()
        t2.executable = '/bin/sh'
        t2.arguments  = ['-c', "echo '$(cat random.txt) ^ %d' | bc" % i]
        t2.stdout     = 'power.%03d.txt' % i
        t2.sandbox    = sandbox
        s2.add_tasks(t2)
    
    return s2


def get_stage_3(sandbox):
    '''
    third stage: compute sum over all powers
    '''
    
    s3 = re.Stage()

    t3 = re.Task()
    t3.executable = '/bin/sh'
    t3.arguments  = ['-c', 'cat power.*.txt | paste -sd+ | bc']
    t3.stdout     = 'sum.txt'
    t3.sandbox    = sandbox

    # download the result while renaming to get unique files per pipeline
    t3.download_output_data = ['sum.txt > %s.sum.txt' % sandbox]
    
    s3.add_tasks(t3)
    return s3


def generate_pipeline(uid):
    '''
    Generate a single simulation pipeline, i.e., a new ensemble member.
    The pipeline consists of the three stages defined above.
    '''

    # all tasks in this pipeline share the same sandbox
    sandbox = uid

    # assemble three stages into a pipeline and return it
    p = re.Pipeline()
    p.add_stages([get_stage_1(sandbox), 
                  get_stage_2(sandbox), 
                  get_stage_3(sandbox)])

    return p

appman = re.AppManager()

appman.resource_desc = {
    'resource': 'local.localhost',
    'walltime': 10,
    'cpus'    : 2
}

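# number of ensemble members, i.e., pipelines (Exercise 1: change this)
n_pipelines = 2

# create one pipeline per ensemble member, each with a unique uid / sandbox
ensemble = set()
for cnt in range(n_pipelines):
    ensemble.add(generate_pipeline(uid='pipe.%03d' % cnt))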

appman.workflow = ensemble
appman.run()

for cnt in range(n_pipelines):
    with open('pipe.%03d.sum.txt' % cnt) as fin:
        data = fin.read()
    print('%3d -- %25d' % (cnt, int(data)))
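
To sanity-check the printed sums, the arithmetic of stages 2 and 3 can be reproduced in plain Python. This is a sketch under assumptions: the helper `expected_sum` is hypothetical, and the seed value `3` is only an illustration, since the actual seed stays in `random.txt` inside the pipeline sandbox and is not downloaded.

```python
def expected_sum(seed, n_simulations):
    """Sum of seed^i for i = 0 .. n_simulations-1 (stages 2 and 3 combined)."""
    return sum(seed ** i for i in range(n_simulations))


# illustrative seed only; the real seed is drawn at runtime in stage 1
print(expected_sum(seed=3, n_simulations=5))   # 1 + 3 + 9 + 27 + 81 = 121
```

Because the exponent starts at 0, every sum includes the term `seed ^ 0 = 1`, so the result grows quickly when `n_simulations` is increased.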