# Molecular Dynamics Lite workflow
This notebook implements a simple molecular dynamics workflow to demonstrate [Parsl Python parallel scripting](https://parsl-project.org/) in a Jupyter notebook.

## Step 1: Define workflow inputs
This PW workflow can be launched either from its form in the `Compute` tab or directly in this notebook. When running directly in the notebook, the user must take the extra step of defining the workflow inputs in the notebook.

In [None]:
import os
from os.path import exists

print('Define workflow inputs...')

# Start assuming workflow is launched from the form.
run_in_notebook=False

if (exists("./params.run")):
    print("Running from a PW form.")
    
else:
    print("Running from a notebook.")
    
    # Set flag for later
    run_in_notebook=True
    
    #TO DO: AUTOMATE THE PROCESS OF GRABBING PW.CONF.
    
    # Manually set workflow inputs here
    #params="npart;input;25:50:25|steps;input;3000:6000:3000|mass;input;0.01:0.02:0.01|trsnaps;input;5:10:5|"
    params="npart;input;25:50:25|steps;input;3000:6000:3000|mass;input;0.01|trsnaps;input;5|"
    
    print(params)
    
    # Write to params.run
    with open("params.run","w") as f:
        n_char_written = f.write(params+"\n")
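
The `params` string above packs one entry per parameter, with entries separated by `|` and fields inside each entry separated by `;`. As a minimal sketch for inspecting such a string in the notebook, the hypothetical helper `parse_params` below (not part of the workflow) splits it into names and raw value tokens. It deliberately does not interpret the colon-separated tokens, since expanding them into cases is left to `prepinputs.py` in Step 4.

```python
def parse_params(params):
    """Split a params.run-style string into {name: [raw value tokens]}.

    Hypothetical helper for illustration only; tokens are kept as strings
    because their interpretation is left to prepinputs.py.
    """
    parsed = {}
    for entry in params.strip().split("|"):
        if not entry:
            continue  # skip the empty piece after the trailing '|'
        name, _kind, values = entry.split(";")
        parsed[name] = values.split(":")
    return parsed


example = "npart;input;25:50:25|steps;input;3000:6000:3000|mass;input;0.01|trsnaps;input;5|"
for name, tokens in parse_params(example).items():
    print(name, tokens)
```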

## Step 2: Configure Parsl
The molecular dynamics software itself is a lightweight, precompiled executable written in C. The executable is distributed with this workflow in `./models/mdlite`, and along with input files, it is staged to the remote resources and does not need to be preinstalled.

The core visualization tool used here is a precompiled binary of [c-ray](https://github.com/vkoskiv/c-ray) distributed with this workflow in `./models/c-ray`. The executable is staged to remote resources and does not need to be preinstalled.

In addition to a Miniconda environment containing Parsl, the only other dependency of this workflow is ImageMagick's `convert` tool for image format conversion (`.ppm` to `.png`) and building animated `.gif` files from `.png` frames.
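
Since Step 4 calls `convert` through `os.system` on the machine running the notebook, it can help to confirm the tool is on `PATH` before launching a long sweep. The following is a minimal sketch using the standard library's `shutil.which`. The helper name `missing_tools` is hypothetical, and the check only covers the local machine, not the remote resources.

```python
import shutil


def missing_tools(tools):
    """Return the subset of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


# `convert` is the only external tool named as a dependency above.
missing = missing_tools(["convert"])
if missing:
    print("Missing required tools:", ", ".join(missing))
else:
    print("All required tools found.")
```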

In [None]:
print("Configuring Parsl...")
#from glob import glob

import parsl
from parsl.app.app import python_app, bash_app
from parsl.data_provider.files import File
from path import Path
from parslpw import pwconfig,pwargs

if (not run_in_notebook):
    print(pwargs)

parsl.load(pwconfig)
print("pwconfig loaded")

## Step 3: Define Parsl workflow apps
These apps are decorated with Parsl's `@bash_app` and as such are executed in parallel on the compute resources that are defined in the PW configuration loaded above.  Functions that are **not** decorated are not executed in parallel on remote resources. The files that need to be staged to remote resources will be marked with Parsl's `File()` (or its PW extension, `Path()`) in the workflow.

In [None]:
print("Defining Parsl workflow apps...")

@bash_app
def md_run(stdout='md.run.stdout', stderr='md.run.stderr', inputs=[], outputs=[]):
    return '''
    %s/runMD.sh "%s" metric.out trj.out
    outdir=%s
    mkdir -p $outdir
    mv trj.out $outdir/
    mv metric.out $outdir/
    ''' % (inputs[1],inputs[0],outputs[0])

@bash_app
def md_vis(stdout='md.vis.stdout', stderr='md.vis.stderr', inputs=[], outputs=[]):
    return '''
    frame=%s
    outdir=%s
    %s/renderframe $outdir/trj.out $outdir/f_$frame.ppm $frame
    ''' % (inputs[0],outputs[0],inputs[1])

#==================================================================================
# Experimenting with this app to speedup workflow:
@bash_app
def md_vis_2(stdout='md.vis.stdout', stderr='md.vis.stderr', inputs=[], outputs=[]):
    return '''
    %s/renderframe %s %s %s
    ''' % (inputs[1],inputs[2],outputs[0],inputs[0])
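
A function decorated with `@bash_app` returns a shell script as a string, which Parsl then runs on the compute resource. To preview that script without submitting anything, the sketch below repeats the `md_run` template as a plain function. The name `md_run_script` and the example values are hypothetical, and plain strings stand in for the `Path()` objects, so the paths seen on a remote resource may differ.

```python
def md_run_script(case, model_dir, out_dir):
    """Build the same shell script that md_run returns, without Parsl."""
    return '''
    %s/runMD.sh "%s" metric.out trj.out
    outdir=%s
    mkdir -p $outdir
    mv trj.out $outdir/
    mv metric.out $outdir/
    ''' % (model_dir, case, out_dir)


# Illustrative values only; real cases come from cases.list.
script = md_run_script("example-case", "./models/mdlite", "./results/case_0")
print(script)
```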

## Step 4: Workflow
This cell executes the workflow itself.

In [None]:
print("Running workflow...")

#============================================================================
# SETUP PARAMETER SWEEP
#============================================================================
# Generate a case list from params.run (the ranges of the parameters to sweep)
os.system("python ./models/mexdex/prepinputs.py params.run cases.list")

# Each line in cases.list is a unique combination of the parameters to sweep.
with open("cases.list","r") as f:
    cases_list = f.readlines()

#============================================================================
# SIMULATE
#============================================================================
# For each line in cases.list, run and visualize a molecular dynamics simulation
# These empty lists will store the futures of Parsl-parallelized apps.
# Use Path for staging because multiple files in ./models/mdlite are needed
# and multiple files in ./results/case_* are sent back to the platform.
md_run_fut = []
for ii, case in enumerate(cases_list):        
    # Run simulation
    md_run_fut.append(md_run(
        inputs=[case,
            Path("./models/mdlite")],
        outputs=[Path("./results/case_"+str(ii))]))
    
# Call result() on every app future so that execution
# blocks here until all simulations are complete.
for run in md_run_fut:
    run.result()

#============================================================================
# VISUALIZE
#============================================================================
md_vis_fut = []
for ii, case in enumerate(cases_list):
    # Get number of frames to render for this case
    nframe = int(case.split(',')[4])
    
    # Render each frame in the simulation
    for ff in range(0,nframe):
        #===============================================
        #md_vis_fut.append(md_vis(
        #    inputs=[ff,
        #            Path("./models/c-ray"),
        #            Path("./results/case_"+str(ii))],
        #    outputs=[Path("./results/case_"+str(ii))]))
        
        #============================================================================
        # Use Path to stage in multiple files in ./models/c-ray.
        # Use Path to stage in the single necessary input file, trj.out.
        # Use Path to stage out the single output file, a single .ppm image.
        md_vis_fut.append(md_vis_2(
            inputs=[ff,
                    Path("./models/c-ray"),
                    Path("./results/case_"+str(ii)+"/trj.out")],
            outputs=[Path("./results/case_"+str(ii)+"/f_"+str(ff).zfill(3)+".ppm")]))
        
for vis in md_vis_fut:
    vis.result()
    
# Compile frames into movies
for ii, case in enumerate(cases_list):
    os.system("cd ./results/case_"+str(ii)+"; convert -delay 10 *.ppm mdlite.gif")

# Compile movies into Design Explorer results
os.system("./models/mexdex/postprocess.sh mdlite_dex.csv mdlite_dex.html ./")
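
Step 4 submits every app first and only afterwards calls `result()`, so the tasks can run concurrently instead of one at a time. Parsl app futures follow the `concurrent.futures.Future` interface, so the same submit-then-wait pattern can be sketched with the standard library alone. Here `simulate` is a hypothetical stand-in for a simulation, and a local thread pool stands in for the remote executor.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate(case_index):
    """Stand-in for a long-running simulation; returns a label."""
    return "case_%d done" % case_index


with ThreadPoolExecutor(max_workers=4) as pool:
    # Submit everything first: each call returns a future immediately.
    futures = [pool.submit(simulate, ii) for ii in range(3)]
    # Then block on each future in turn, as Step 4 does with run.result().
    results = [fut.result() for fut in futures]

print(results)
```

Calling `result()` inside the submission loop instead would wait for each task before submitting the next one, which removes the parallelism.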

## Step 5: Interact and clean up
This step is only necessary when running directly in a notebook. These intermediate and log files are removed to keep the workflow file structure clean if this workflow is pushed into the PW Market Place. Feel free to comment out these lines to inspect intermediate files as needed. The first two files are explicitly created by the workflow: `params.run` in Step 1 and `cases.list` in Step 4. The other files are generated automatically for logging, keeping track of workers, or starting up workers.

The outputs of this workflow are stored in the `results` folder and they can be interactively visualized with the Design Explorer by clicking on `mdlite_dex.html` which also uses `mdlite_dex.csv`. You can also embed output ![figures.](https://go.parallel.works/pwide/go-vpn-user-1.parallel.works/63001/files/download/?id=69d906f4-44e7-4765-bda7-9dfbee4261b8 "Molecules")

In [None]:
if (run_in_notebook):
    !rm -f params.run
    !rm -f cases.list
    !rm -rf runinfo
    !rm -rf __pycache__
    !rm -rf parsl-task.*
    !rm -rf *.pid
    !rm -rf *.started
    !rm -rf *.cancelled
    !rm -rf *.cogout
    !rm -rf lastid*
    !rm -rf launchcmd.*
    !rm -rf parsl-htex-worker.sh
    !rm -rf pw.conf
    # Move the outputs elsewhere.
    #!mkdir -p /pw/storage/mdlite-out
    #!mv ./results /pw/storage/mdlite-out
    #!mv mdlite_dex.* /pw/storage/mdlite-out
    !rm -rf ./results
    !rm -f mdlite_dex.*
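
The `!rm` lines above are IPython shell escapes, so they only work inside a notebook. If the same cleanup is ever needed from a plain Python script, one possible approach is sketched below with `glob` and `shutil`. The helper `remove_matching` is hypothetical and not part of the workflow.

```python
import glob
import os
import shutil


def remove_matching(patterns, root="."):
    """Delete files and directories under `root` matching any glob pattern.

    Returns the sorted list of removed paths.
    """
    removed = []
    for pattern in patterns:
        for path in glob.glob(os.path.join(root, pattern)):
            if os.path.isdir(path) and not os.path.islink(path):
                shutil.rmtree(path)  # directories such as runinfo
            else:
                os.remove(path)  # regular files and symlinks
            removed.append(path)
    return sorted(removed)
```

For example, `remove_matching(["params.run", "cases.list", "runinfo", "parsl-task.*"])` would cover the first few entries of the list above. Patterns that match nothing are skipped silently, which mirrors the behavior of `rm -f`.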