## Introduction

In this notebook, we show how to encode two protocols (ESMACS and TIES) using the EnTK API. 
Each protocol runs simulations of a different physical system. Here, we use two different 48k-atom systems from the BRD4-GSK library and run every simulation with the NAMD engine.

**Note**: This notebook is not to be executed. The notebook format allows us to explain  
step-by-step. An example that you can run is placed in 'radical.entk/examples/'. 

Let's start with something simple!

We will start with the following constraints/assumptions and extend  
as we go forward:

* Number of replicas (ESMACS) = 25
* Number of replicas (TIES) = 5
* Number of lambda windows = 13
* Each simulation runs on 16 cores 

For each ESMACS/TIES protocol we will perform the following steps:

1. Create a single Pipeline (4 Stages for ESMACS, 2 Stages for TIES)
2. Create stages with the same number of Tasks (25 for ESMACS, 65 for TIES) 
3. Describe a resource to use for execution
4. Submit the Pipeline for execution
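
The replica and lambda-window counts above fix the number of Tasks in each Stage. A minimal sketch of that arithmetic, using the constants from this notebook (the derived variable names are only illustrative):

```python
num_replicas_esmacs = 25
num_replicas_ties = 5
num_lambda_windows = 13

# ESMACS: one Task per replica in every Stage
esmacs_tasks_per_stage = num_replicas_esmacs
# TIES: one Task per (replica, lambda window) pair in every Stage
ties_tasks_per_stage = num_replicas_ties * num_lambda_windows

# Stage counts as described in this notebook: 4 for ESMACS, 2 for TIES
esmacs_total_tasks = 4 * esmacs_tasks_per_stage
ties_total_tasks = 2 * ties_tasks_per_stage

print(esmacs_tasks_per_stage, ties_tasks_per_stage)  # 25 65
print(esmacs_total_tasks, ties_total_tasks)          # 100 130
```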

**Let's start with the code!**

## Encoding the ESMACS protocol 

### Step 0 -- Import the API objects and set our constants

In [None]:
from radical.entk import Pipeline, Stage, Task, AppManager
# AppManager accepts the Pipeline and resource specification.

num_replicas_esmacs  = 25
cores_per_simulation = 16

### Step 1 -- Create an ESMACS Pipeline For System: BRD_GSK_3_4

In [None]:
esmacs_pipe = Pipeline()
esmacs_pipe.name = 'esmacs_brd_gsk_3_4'
# Names are user-defined, but required when we need to reference data 
# across tasks. We will see how in a few minutes. 

### Step 2 -- Create Stage 1 (Minimization) 

It is important to add the Minimization Stage to the ESMACS Pipeline first!  
EnTK will respect the order you specify within a Pipeline.

In [None]:
esmacs_min_stage = Stage()
esmacs_min_stage.name = 'esmacs-min-brd-gsk-3-4'
for x in range(num_replicas_esmacs):
    esmacs_min_task = create_esmacs_min_task(x)
    # We define 'create_esmacs_min_task()' below

    
    esmacs_min_stage.add_tasks(esmacs_min_task)
    # The order in which the tasks are created and added to 
    # a Stage is not important since they all run concurrently.
    
esmacs_pipe.add_stages(esmacs_min_stage)
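
To make the ordering rule concrete, here is a toy model of the semantics described above: Stages run one after another, while the Tasks inside a Stage run concurrently. This is plain Python with a thread pool, not EnTK code; `run_pipeline` and `make_task` are hypothetical names used only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def make_task(label):
    """Return a callable that stands in for one simulation."""
    return lambda: label

def run_pipeline(stages):
    """Run stages in list order; run the tasks of each stage concurrently."""
    log = []
    for stage_name, tasks in stages:
        with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
            # map() yields results in submission order,
            # regardless of the order in which tasks finish
            results = list(pool.map(lambda task: task(), tasks))
        # leaving the 'with' block waits for every task of this stage,
        # so the next stage starts only after this one is complete
        log.append((stage_name, results))
    return log

stages = [
    ('min', [make_task('min_%s' % i) for i in range(3)]),
    ('eq1', [make_task('eq1_%s' % i) for i in range(3)]),
]
log = run_pipeline(stages)
stage_order = [name for name, _ in log]
print(stage_order)  # ['min', 'eq1']
```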

### Defining the functions that create the tasks

For all tasks, we will specify the following properties:

* executable
* arguments
* input data
* additional attributes (resource requirements) 


In [None]:
def create_esmacs_min_task(ind):
    esmacs_min_task = Task()
    esmacs_min_task.name = 'esmacs_min_%s'%ind
    esmacs_min_task.executable = ['namd2']
    esmacs_min_task.arguments = ['esmacs-brd-gsk-3-4-stage-1.conf'] 
    esmacs_min_task.cores = cores_per_simulation
    esmacs_min_task.copy_input_data = []
    return esmacs_min_task

### Step 3 -- Create Stage 2 (Equilibration 1) 

It is important to add the Equilibration 1 Stage to the ESMACS Pipeline right after the Minimization Stage!  
EnTK will respect the order you specify within a Pipeline.

In [None]:
esmacs_eq1_stage = Stage()
esmacs_eq1_stage.name = 'esmacs-eq1-brd-gsk-3-4'
for x in range(num_replicas_esmacs):
    esmacs_eq1_task = create_esmacs_eq1_task(x)
    # We define 'create_esmacs_eq1_task()' below

    
    esmacs_eq1_stage.add_tasks(esmacs_eq1_task)
    # The order in which the tasks are created and added to 
    # a Stage is not important since they all run concurrently.
    
esmacs_pipe.add_stages(esmacs_eq1_stage)

In [None]:
def create_esmacs_eq1_task(ind):
    esmacs_eq1_task = Task()
    esmacs_eq1_task.name = 'esmacs_eq1_%s'%ind
    esmacs_eq1_task.executable = ['namd2']
    esmacs_eq1_task.arguments = ['esmacs-brd-gsk-3-4-stage-2.conf'] 
    esmacs_eq1_task.cores = cores_per_simulation
    esmacs_eq1_task.copy_input_data = []
    return esmacs_eq1_task
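
The empty `copy_input_data` list is where a task would name files produced by an earlier stage, which is also why the Pipeline, Stage, and Task names matter. The EnTK user guide describes a reference of the form `$Pipeline_<name>_Stage_<name>_Task_<name>/<file>` for this purpose. The sketch below only builds such a string with plain Python: the helper `staged_file_ref` and the file name `min.coor` are hypothetical, and the exact placeholder syntax is an assumption to check against the installed EnTK version.

```python
def staged_file_ref(pipeline_name, stage_name, task_name, filename):
    """Build a reference to a file written by an earlier task (assumed syntax)."""
    return '$Pipeline_%s_Stage_%s_Task_%s/%s' % (
        pipeline_name, stage_name, task_name, filename)

# Names taken from the cells above; 'min.coor' is a made-up file name
ref = staged_file_ref('esmacs_brd_gsk_3_4',
                      'esmacs-min-brd-gsk-3-4',
                      'esmacs_min_0',
                      'min.coor')
print(ref)
```

A string like `ref` would then be placed inside the `copy_input_data` list of the task that consumes the file.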

### Step 4 -- Create Stage 3 (Equilibration 2) 

It is important to add the Equilibration 2 Stage to the ESMACS Pipeline right after the Equilibration 1 Stage!  
EnTK will respect the order you specify within a Pipeline.

In [None]:
esmacs_eq2_stage = Stage()
esmacs_eq2_stage.name = 'esmacs-eq2-brd-gsk-3-4'
for x in range(num_replicas_esmacs):
    esmacs_eq2_task = create_esmacs_eq2_task(x)
    # We define 'create_esmacs_eq2_task()' below

    
    esmacs_eq2_stage.add_tasks(esmacs_eq2_task)
    # The order in which the tasks are created and added to 
    # a Stage is not important since they all run concurrently.
    
esmacs_pipe.add_stages(esmacs_eq2_stage)

In [None]:
def create_esmacs_eq2_task(ind):
    esmacs_eq2_task = Task()
    esmacs_eq2_task.name = 'esmacs_eq2_%s'%ind
    esmacs_eq2_task.executable = ['namd2']
    esmacs_eq2_task.arguments = ['esmacs-brd-gsk-3-4-stage-3.conf'] 
    esmacs_eq2_task.cores = cores_per_simulation
    esmacs_eq2_task.copy_input_data = []
    return esmacs_eq2_task

### Step 5 -- Create Stage 4 (Production MD) 

It is important to add the Production MD Stage to the ESMACS Pipeline last, after the Equilibration 2 Stage!  
EnTK will respect the order you specify within a Pipeline.

In [None]:
esmacs_md_stage = Stage()
esmacs_md_stage.name = 'esmacs-md-brd-gsk-3-4'
for x in range(num_replicas_esmacs):
    esmacs_md_task = create_esmacs_md_task(x)
    # We define 'create_esmacs_md_task()' below

    
    esmacs_md_stage.add_tasks(esmacs_md_task)
    # The order in which the tasks are created and added to 
    # a Stage is not important since they all run concurrently.
    
esmacs_pipe.add_stages(esmacs_md_stage)

In [None]:
def create_esmacs_md_task(ind):
    esmacs_md_task = Task()
    esmacs_md_task.name = 'esmacs_md_%s'%ind
    esmacs_md_task.executable = ['namd2']
    esmacs_md_task.arguments = ['esmacs-brd-gsk-3-4-stage-4.conf'] 
    esmacs_md_task.cores = cores_per_simulation
    esmacs_md_task.copy_input_data = []
    return esmacs_md_task
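
The four ESMACS task functions differ only in the stage label and the number in the configuration file name. One possible way to capture that pattern is a small table-driven helper. The sketch below returns plain dictionaries rather than EnTK `Task` objects so that it runs without EnTK installed; `ESMACS_STAGES` and `esmacs_task_spec` are hypothetical names.

```python
# (label, stage number) pairs, in the order the Stages are added to the Pipeline
ESMACS_STAGES = [('min', 1), ('eq1', 2), ('eq2', 3), ('md', 4)]

def esmacs_task_spec(label, stage_number, ind, cores=16):
    """Collect the properties one ESMACS task would be given."""
    return {
        'name': 'esmacs_%s_%s' % (label, ind),
        'executable': ['namd2'],
        'arguments': ['esmacs-brd-gsk-3-4-stage-%s.conf' % stage_number],
        'cores': cores,
        'copy_input_data': [],
    }

# One spec per stage for replica 0
specs = [esmacs_task_spec(label, number, 0) for label, number in ESMACS_STAGES]
for spec in specs:
    print(spec['name'], spec['arguments'][0])
```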

## Encoding the TIES protocol 

### Step 0 -- Set our constants


In [None]:
num_replicas_ties  = 5
num_lambda_windows  = 13
cores_per_simulation = 16

### Step 1 -- Create a TIES Pipeline For System: BRD_GSK_3_7


In [None]:
ties_pipe = Pipeline()
ties_pipe.name = 'ties_brd_gsk_3_7'
# Names are user-defined, but required when we need to reference data 
# across tasks. We will see how in a few minutes. 

### Step 2 -- Create Stage 1 (Equilibration) 

It is important to add the Equilibration Stage to the TIES Pipeline first!  
EnTK will respect the order you specify within a Pipeline.

In [None]:
ties_eq_stage = Stage()
ties_eq_stage.name = 'ties-eq-brd-gsk-3-7'
for x in range(num_replicas_ties):
    for y in range(num_lambda_windows): 
        ties_eq_task = create_ties_eq_task(x,y)
        # We define 'create_ties_eq_task()' below
    
        ties_eq_stage.add_tasks(ties_eq_task)
        # The order in which the tasks are created and added to 
        # a Stage is not important since they all run concurrently.

ties_pipe.add_stages(ties_eq_stage)
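
The nested loops visit every (replica, lambda window) index pair exactly once, which is where the 65 Tasks per TIES Stage come from. As a quick check, the same pairs can be enumerated with `itertools.product` from the standard library (the variable `pairs` is only illustrative):

```python
from itertools import product

num_replicas_ties = 5
num_lambda_windows = 13

# Same iteration order as the nested loops: replica outer, lambda window inner
pairs = list(product(range(num_replicas_ties), range(num_lambda_windows)))

print(len(pairs))  # 65
print(pairs[0])    # (0, 0)
print(pairs[-1])   # (4, 12)
```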

In [None]:
def create_ties_eq_task(replica,lambda_value):
    ties_eq_task = Task()
    ties_eq_task.name = 'ties_eq_rep_%s_lambda_%s' % (replica, lambda_value)
    ties_eq_task.executable = ['namd2']
    ties_eq_task.arguments = ['ties-brd-gsk-3-7-stage-1.conf'] 
    ties_eq_task.cores = cores_per_simulation
    ties_eq_task.copy_input_data = []
    return ties_eq_task
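
When a task name is built from two indices with the `%` operator, both values have to be wrapped in a tuple. Without the parentheses, Python applies `%` to the first value only and raises a `TypeError`. A short demonstration, with arbitrary example indices:

```python
replica, lambda_value = 2, 7

try:
    # Parsed as ('...' % replica), lambda_value:
    # one value is supplied for two %s placeholders
    broken = 'ties_eq_rep_%s_lambda_%s' % replica, lambda_value
except TypeError as err:
    error_message = str(err)
    print(error_message)

# Wrapping both values in a tuple fills both placeholders
name = 'ties_eq_rep_%s_lambda_%s' % (replica, lambda_value)
print(name)  # ties_eq_rep_2_lambda_7
```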

### Step 3 -- Create Stage 2 (Production MD) 

It is important to add the Production MD Stage to the TIES Pipeline after the Equilibration Stage!  
EnTK will respect the order you specify within a Pipeline.

In [None]:
ties_md_stage = Stage()
ties_md_stage.name = 'ties-md-brd-gsk-3-7'
for x in range(num_replicas_ties):
    for y in range(num_lambda_windows): 
        ties_md_task = create_ties_md_task(x,y)
        # We define 'create_ties_md_task()' below
    
        ties_md_stage.add_tasks(ties_md_task)
        # The order in which the tasks are created and added to 
        # a Stage is not important since they all run concurrently.

ties_pipe.add_stages(ties_md_stage)

In [None]:
def create_ties_md_task(replica,lambda_value):
    ties_md_task = Task()
    ties_md_task.name = 'ties_md_rep_%s_lambda_%s' % (replica, lambda_value)
    ties_md_task.executable = ['namd2']
    ties_md_task.arguments = ['ties-brd-gsk-3-7-stage-2.conf'] 
    ties_md_task.cores = cores_per_simulation
    ties_md_task.copy_input_data = []
    return ties_md_task

### Describe a resource to use for execution

Create a dictionary with the three mandatory keys: **resource**,  
**walltime**, and **cpus**. 

Note that **resource** is 'local.localhost' to execute locally. At the end of  
this notebook, we have provided a link to the resources supported by default  
along with documentation on how new resources can be supported.

In [None]:
resource_desc = {

        'resource': 'local.localhost',
        'walltime': 10,
        'cpus': 2
    }
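
Before submitting, it can help to sanity-check the description against the workload. The sketch below is plain Python with hypothetical helper names (`missing_keys`, `cores_for_stage`); it checks for the three mandatory keys and estimates how many cores one Stage would need for all of its Tasks to run at the same time. Whether EnTK performs a similar check itself is not assumed here.

```python
MANDATORY_KEYS = ('resource', 'walltime', 'cpus')

def missing_keys(resource_desc):
    """Return the mandatory keys that the description lacks."""
    return [key for key in MANDATORY_KEYS if key not in resource_desc]

def cores_for_stage(num_tasks, cores_per_task):
    """Cores needed for all Tasks of one Stage to run concurrently."""
    return num_tasks * cores_per_task

resource_desc = {'resource': 'local.localhost', 'walltime': 10, 'cpus': 2}

print(missing_keys(resource_desc))  # []
print(cores_for_stage(25, 16))      # 400  (one ESMACS Stage)
print(cores_for_stage(5 * 13, 16))  # 1040 (one TIES Stage)
```

With these numbers, the 2 cpus requested in the local description are far below what even a single 16-core simulation asks for, which is consistent with this notebook not being meant for execution.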

### Submit the ESMACS and TIES Pipelines for execution

We create an AppManager object and provide it with the resource description and  
the ESMACS and TIES Pipelines. We then run our application.

In [None]:
amgr = AppManager()
amgr.resource_desc = resource_desc
amgr.workflow = set([esmacs_pipe,ties_pipe])
amgr.run()

That completes our EnTK script! 

The above steps create our ESMACS and TIES protocols with EnTK.  

For ESMACS, we create one Pipeline with four Stages: minimization, equilibration 1, equilibration 2, and production MD.  
Each Stage consists of 25 Tasks. 

For TIES, we create one Pipeline with two Stages: equilibration and production MD.  
Each Stage consists of 65 Tasks. 

In a standalone script, the task creation functions shown above must be defined before the loops that call them.

## That's all folks!

## Additional Information:

* EnTK Documentation: https://radicalentk.readthedocs.io/
* EnTK Repository: https://github.com/radical-cybertools/radical.entk
* Installation instructions: https://radicalentk.readthedocs.io/en/latest/install.html
* User Guide: https://radicalentk.readthedocs.io/en/latest/user_guide.html
* Adaptive examples: https://radicalentk.readthedocs.io/en/latest/advanced_examples.html

Please feel free to ask any questions now, or send us a question via our 
[mailing list](https://groups.google.com/d/forum/ensemble-toolkit-users) or [GitHub issues](https://github.com/radical-cybertools/radical.entk/issues).