## Introduction

In this notebook, we show how to create an ESMACS protocol using the EnTK API.
We use a 48k-atom system (BRD4-GSK) and run each simulation with the NAMD engine.

**Note**: This notebook is not meant to be executed. The notebook format allows us to explain  
the protocol step by step. A runnable example is placed in '../examples/'; please see  
the prerequisites for running the example in the same folder.

We know the ESMACS protocol can be represented as:

<img style="float: left;" src="./BioExcel_HTBAC.jpg">

Let's start with something simple!

We will start with the following constraints/assumptions and extend  
as we go forward:

* Number of replicas (simulation tasks) = 25
* Each simulation runs on 16 cores 

We will perform the following steps:

1. Create a single Pipeline with 4 Stages
2. Create each Stage with the same number of simulation Tasks (25)
3. Describe a resource to use for execution
4. Submit the Pipeline for execution
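
As a quick sizing check, the assumptions above can be turned into numbers. This is plain arithmetic rather than EnTK code, and the variable `num_stages` is introduced here only for illustration:

```python
# Back-of-the-envelope sizing for the protocol described above.
num_replicas = 25          # simulation tasks per Stage
cores_per_simulation = 16  # cores used by each simulation
num_stages = 4             # minimization, equilibration 1, equilibration 2, production MD

# Cores needed to run every replica of one Stage at the same time.
cores_per_stage = num_replicas * cores_per_simulation

# Total number of tasks the Pipeline contains across all Stages.
total_tasks = num_stages * num_replicas

print(cores_per_stage)
print(total_tasks)
```

Running all replicas of a Stage at once therefore takes 400 cores, and the whole Pipeline contains 100 tasks.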

**Let's start with the code!**

## Encoding the ESMACS protocol 

### Step 0 -- Import the API objects and set our constants

In [None]:
from radical.entk import Pipeline, Stage, Task, AppManager
# AppManager accepts the Pipeline and resource specification.

num_replicas  = 25
cores_per_simulation = 16

### Step 1 -- Create a Pipeline

In [None]:
esmacs_pipe = Pipeline()
esmacs_pipe.name = 'esmacs'
# Names are user-defined, but required when we need to reference data 
# across tasks. We will see how in a few minutes. 
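
To illustrate why names matter: as far as we understand the EnTK user guide, a later task can refer to a file produced by an earlier one through a placeholder built from the Pipeline, Stage, and Task names, placed in `copy_input_data`. The sketch below only builds such a string in plain Python. The helper `staged_file` and the file name `min.coor` are hypothetical, and the placeholder format itself is an assumption that should be checked against the documentation for your EnTK version.

```python
def staged_file(pipeline_name, stage_name, task_name, filename):
    """Build a data reference from the names of a Pipeline, Stage, and Task.

    Assumption: the '$Pipeline_..._Stage_..._Task_...' format follows the
    EnTK user guide; verify it for the EnTK version you use.
    """
    return '$Pipeline_%s_Stage_%s_Task_%s/%s' % (
        pipeline_name, stage_name, task_name, filename)


# Hypothetical example: replica 0 of a later Stage wants a file written by
# the minimization task 'min_0' in the Stage 'min' of the Pipeline 'esmacs'.
ref = staged_file('esmacs', 'min', 'min_0', 'min.coor')
print(ref)
```

A string like this would go into the `copy_input_data` list of the consuming task, which is why each Pipeline, Stage, and Task in this notebook is given an explicit name.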

### Step 2 -- Create Stage 1 (Minimization) 

It is important to add the Minimization Stage to the ESMACS Pipeline first!  
EnTK will respect the order you specify within a Pipeline.

In [None]:
min_stage = Stage()
min_stage.name = 'min'
for x in range(num_replicas):
    min_task = create_min_task(x)
    # We define 'create_min_task()' below

    
    min_stage.add_tasks(min_task)
    # The order in which the tasks are created and added to 
    # a Stage is not important since they all run concurrently.
    
esmacs_pipe.add_stages(min_stage)
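
To make the ordering rule concrete, here is a toy model in plain Python. It does not use EnTK: `run_pipeline` is a hypothetical stand-in that visits Stages in the order they were added and treats the tasks inside a Stage as an unordered group.

```python
def run_pipeline(stages):
    """Return (stage name, set of task names) pairs in execution order."""
    schedule = []
    for name, tasks in stages:  # Stages are visited in insertion order
        # A frozenset drops the order of the tasks, mirroring the fact that
        # task order inside a Stage carries no meaning.
        schedule.append((name, frozenset(tasks)))
    return schedule


num_replicas = 25
stages = []
stages.append(('min', ['min_%s' % i for i in range(num_replicas)]))
# Tasks added in reverse order: the resulting Stage is the same group of tasks.
stages.append(('eq1', ['eq1_%s' % i for i in reversed(range(num_replicas))]))

schedule = run_pipeline(stages)
print([name for name, _ in schedule])
```

Swapping the two `append` calls would change the schedule, while reordering the tasks inside either list would not.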

### Defining the functions that create the tasks

For all the tasks, we will specify the following properties:

* executable
* arguments
* input data
* additional attributes (resource requirements)


In [None]:
def create_min_task(ind):
    min_task = Task()
    min_task.name = 'min_%s'%ind
    min_task.executable = ['namd2']
    min_task.arguments = ['esmacs-stage-1.conf'] 
    min_task.cores = cores_per_simulation
    min_task.copy_input_data = []
    return min_task

### Step 2 -- Create Stage 2 (Equilibration 1) 

It is important to add the Equilibration 1 Stage to the ESMACS Pipeline right after the Minimization Stage!  
EnTK will respect the order you specify within a Pipeline.

In [None]:
eq1_stage = Stage()
eq1_stage.name = 'eq1'
for x in range(num_replicas):
    eq1_task = create_eq1_task(x)
    # We define 'create_eq1_task()' below

    
    eq1_stage.add_tasks(eq1_task)
    # The order in which the tasks are created and added to 
    # a Stage is not important since they all run concurrently.
    
esmacs_pipe.add_stages(eq1_stage)

In [None]:
def create_eq1_task(ind):
    eq1_task = Task()
    eq1_task.name = 'eq1_%s'%ind
    eq1_task.executable = ['namd2']
    eq1_task.arguments = ['esmacs-stage-2.conf'] 
    eq1_task.cores = cores_per_simulation
    eq1_task.copy_input_data = []
    return eq1_task

### Step 3 -- Create Stage 3 (Equilibration 2) 

It is important to add the Equilibration 2 Stage to the ESMACS Pipeline right after the Equilibration 1 Stage!  
EnTK will respect the order you specify within a Pipeline.

In [None]:
eq2_stage = Stage()
eq2_stage.name = 'eq2'
for x in range(num_replicas):
    eq2_task = create_eq2_task(x)
    # We define 'create_eq2_task()' below

    
    eq2_stage.add_tasks(eq2_task)
    # The order in which the tasks are created and added to 
    # a Stage is not important since they all run concurrently.
    
esmacs_pipe.add_stages(eq2_stage)

In [None]:
def create_eq2_task(ind):
    eq2_task = Task()
    eq2_task.name = 'eq2_%s'%ind
    eq2_task.executable = ['namd2']
    eq2_task.arguments = ['esmacs-stage-3.conf'] 
    eq2_task.cores = cores_per_simulation
    eq2_task.copy_input_data = []
    return eq2_task

### Step 4 -- Create Stage 4 (Production MD) 

It is important to add the MD Stage to the ESMACS Pipeline last, after the Equilibration 2 Stage!  
EnTK will respect the order you specify within a Pipeline.

In [None]:
md_stage = Stage()
md_stage.name = 'md'
for x in range(num_replicas):
    md_task = create_md_task(x)
    # We define 'create_md_task()' below

    
    md_stage.add_tasks(md_task)
    # The order in which the tasks are created and added to 
    # a Stage is not important since they all run concurrently.
    
esmacs_pipe.add_stages(md_stage)

In [None]:
def create_md_task(ind):
    md_task = Task()
    md_task.name = 'md_%s'%ind
    md_task.executable = ['namd2']
    md_task.arguments = ['esmacs-stage-4.conf'] 
    md_task.cores = cores_per_simulation
    md_task.copy_input_data = []
    return md_task
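
The four task-creation functions differ only in the name prefix and the NAMD configuration file. One possible way to express that is a small lookup table. The sketch below uses plain dictionaries instead of EnTK `Task` objects so that it runs without EnTK installed; `STAGE_CONFS` and `describe_task` are hypothetical names, not part of this notebook's script or of the EnTK API.

```python
cores_per_simulation = 16

# Stage name -> NAMD configuration file, taken from the functions above.
STAGE_CONFS = {
    'min': 'esmacs-stage-1.conf',
    'eq1': 'esmacs-stage-2.conf',
    'eq2': 'esmacs-stage-3.conf',
    'md': 'esmacs-stage-4.conf',
}


def describe_task(stage_name, ind):
    """Return the attributes one task would carry, as a plain dictionary."""
    return {
        'name': '%s_%s' % (stage_name, ind),
        'executable': ['namd2'],
        'arguments': [STAGE_CONFS[stage_name]],
        'cores': cores_per_simulation,
        'copy_input_data': [],
    }


task = describe_task('eq2', 3)
print(task['name'], task['arguments'])
```

In an actual EnTK script, the same table could drive a single function that sets these attributes on a `Task` object, at the cost of being slightly less explicit than four separate functions.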

### Step 5 -- Describe a resource to use for execution

Create a dictionary with three mandatory keys: **resource**,  
**walltime**, and **cpus**. 

Note that **resource** is 'local.localhost' to execute locally. At the end of  
this notebook, we have provided a link to the resources supported by default  
along with documentation on how new resources can be supported.

In [None]:
resource_desc = {
    'resource': 'local.localhost',
    'walltime': 10,
    'cpus': 2
}
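
A small sanity check can catch a missing key or an undersized request before submission. The sketch below is plain Python, not an EnTK feature. `check_resource_desc` is a hypothetical helper, and the core comparison assumes that every task of a Stage should be able to run at the same time.

```python
MANDATORY_KEYS = ('resource', 'walltime', 'cpus')


def check_resource_desc(desc, num_replicas, cores_per_simulation):
    """Return (missing keys, number of cores short of full concurrency)."""
    missing = [key for key in MANDATORY_KEYS if key not in desc]
    cores_needed = num_replicas * cores_per_simulation
    shortfall = max(0, cores_needed - desc.get('cpus', 0))
    return missing, shortfall


resource_desc = {'resource': 'local.localhost', 'walltime': 10, 'cpus': 2}
missing, shortfall = check_resource_desc(resource_desc, 25, 16)
print(missing, shortfall)
```

With the values used in this notebook, no key is missing, but the request is 398 cores short of running 25 sixteen-core simulations concurrently, so a real run would need a larger **cpus** value.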

### Step 6 -- Submit the Pipeline for execution

We create an AppManager object and provide it with the resource description and  
the ESMACS Pipeline. We then run our application.

In [None]:
amgr = AppManager()
amgr.resource_desc = resource_desc
amgr.workflow = [esmacs_pipe]
amgr.run()

That completes our EnTK script!

The steps above create our ESMACS protocol with EnTK.  
We create one Pipeline with four Stages: minimization, equilibration 1, equilibration 2, and production MD.  
Each Stage consists of 25 Tasks, built by the task-creation functions defined alongside each Stage.

## That's all folks!

## Additional Information:

* EnTK Documentation: https://radicalentk.readthedocs.io/
* EnTK Repository: https://github.com/radical-cybertools/radical.entk
* Installation instructions: https://radicalentk.readthedocs.io/en/latest/install.html
* User Guide: https://radicalentk.readthedocs.io/en/latest/user_guide.html
* Adaptive examples: https://radicalentk.readthedocs.io/en/latest/advanced_examples.html

Please feel free to ask any questions now, or send us a question via our 
[mailing list](https://groups.google.com/d/forum/ensemble-toolkit-users) or [GitHub issues](https://github.com/radical-cybertools/radical.entk/issues).