# Airavata Experiment SDK - Molecular Dynamics Example

This SDK lets users define, plan, and execute molecular dynamics experiments.
This example shows how to authenticate, set up a NAMD experiment, add replicas, create an execution plan, and monitor its execution.

## Install the required packages

First, remove any previously installed copy of the `airavata-python-sdk-test` package, then install the SDK in editable mode from the local source tree.

In [None]:
%pip uninstall -y airavata-python-sdk-test
%pip cache purge
%pip install -e airavata-api/airavata-client-sdks/airavata-python-sdk

## Import the Experiments SDK

In [2]:
%cd airavata-api/airavata-client-sdks/airavata-python-sdk/samples
import airavata_experiments as ae
from airavata_experiments import md

## Authenticate for Remote Execution

To authenticate for remote execution, call the `ae.login()` function.
It prompts you for your credentials and authenticates your session.

In [None]:
ae.login()

Once authenticated, the `ae.list_runtimes()` function can be called to list HPC resources that the user can access.

In [None]:
runtimes = ae.list_runtimes()
display(runtimes)

## Upload Experiment Files

Drag and drop the experiment files into the workspace where this notebook runs.
This example expects them in a `data` directory with the following layout:

```bash
(sh) $: tree data
data
├── b4pull.pdb
├── b4pull.restart.coor
├── b4pull.restart.vel
├── b4pull.restart.xsc
├── par_all36_water.prm
├── par_all36m_prot.prm
├── pull.conf
├── structure.pdb
└── structure.psf

1 directory, 9 files

```
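
Before defining the experiment, it can help to confirm that every file was uploaded. The following is a minimal sketch that uses only the Python standard library; the helper name `missing_files` and the constant `EXPECTED_FILES` are hypothetical and not part of the SDK.

```python
from pathlib import Path
from typing import Sequence

# File names copied from the `tree data` listing above.
EXPECTED_FILES = (
    "b4pull.pdb",
    "b4pull.restart.coor",
    "b4pull.restart.vel",
    "b4pull.restart.xsc",
    "par_all36_water.prm",
    "par_all36m_prot.prm",
    "pull.conf",
    "structure.pdb",
    "structure.psf",
)


def missing_files(data_dir: str, expected: Sequence[str] = EXPECTED_FILES) -> list[str]:
    """Return the expected file names that are not present in data_dir."""
    root = Path(data_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_files("data")
    if missing:
        print("Missing input files:", ", ".join(missing))
    else:
        print("All input files are present.")
```

An empty result means all nine files from the listing are in place.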

## Define a NAMD Experiment

The `md.NAMD.initialize()` function defines a NAMD experiment.
Provide the paths to the `.conf` file, the `.pdb` file, the `.psf` file, the parameter files (`ffp_files`, here the `.prm` files), and any other files the NAMD run needs.
You can preview the function signature through auto-completion.

```python
def initialize(
    name: str,
    config_file: str,
    pdb_file: str,
    psf_file: str,
    ffp_files: list[str],
    other_files: list[str] = [],
    parallelism: Literal['CPU', 'GPU'] = "CPU",
    num_replicas: int = 1
) -> Experiment[ExperimentApp]
```

In [None]:
exp = md.NAMD.initialize(
    name="yasith_namd_experiment",
    config_file="data/pull.conf",
    pdb_file="data/structure.pdb",
    psf_file="data/structure.psf",
    ffp_files=[
      "data/par_all36_water.prm",
      "data/par_all36m_prot.prm"
    ],
    other_files=[
      "data/b4pull.pdb",
      "data/b4pull.restart.coor",
      "data/b4pull.restart.vel",
      "data/b4pull.restart.xsc",
    ],
    parallelism="GPU",
)

To add a replica run, call the `exp.add_replica()` function.
Call it once for each replica you want to add.
Optional resource constraints can be passed in this call.

In [None]:
exp.add_replica()

## Create Execution Plan

Call the `exp.plan()` function to turn the experiment definition and its replicas into a stateful execution plan.
The plan can be exported to JSON and imported back later.

In [None]:
plan = exp.plan()  # this will create a plan for the experiment
plan.describe()  # this will describe the plan

## Execute the Plan

In [None]:
plan.save()  # save the plan to the database
plan.launch()  # launch the plan
plan.save_json("plan.json")  # save a local JSON copy of the plan
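
The file written by `plan.save_json("plan.json")` can be inspected with the standard `json` module. The layout of that file is not described here, so this sketch assumes nothing about specific fields and only lists the top-level keys; the helper name `top_level_keys` is hypothetical.

```python
import json
from pathlib import Path


def top_level_keys(path: str) -> list[str]:
    """Return the sorted top-level keys of a JSON file, or [] if it is not a JSON object."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    return sorted(data) if isinstance(data, dict) else []


# Only inspect the file if it has already been written.
if Path("plan.json").exists():
    print(top_level_keys("plan.json"))
```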

## Load and Describe the Launched Plan

In [None]:
assert plan.id is not None
plan = ae.plan.load(plan.id)
plan.describe()

## List all Plans the User Created

In [None]:
import pandas as pd
plans = ae.plan.query()
display(pd.DataFrame([plan.model_dump(include={"id"}) for plan in plans]))

|   | id |
|---|---|
| 0 | 16781b12-fd99-496c-b815-c0fdc5889664 |
| 1 | 2a09f1c4-8a0a-46e4-bbdd-ffab13be3d5b |
| 2 | 2a7896fa-5898-42f6-92b6-c053e4a702ba |
| 3 | 2e206dab-ada7-45a6-a2ea-940adf9ef646 |
| 4 | 4fb8a73b-8333-4c73-8e74-5dd103f8a22f |
| 5 | 5197d68c-63ec-4d13-bac5-24484e1d0ca6 |
| 6 | 54b9dcd6-a5e8-4a05-9690-aacd346de55c |
| 7 | 562a195e-83f9-4de4-af5b-c43b4a2a40f6 |
| 8 | 768d97d5-233b-4450-a7e3-4df31f1fac3c |
| 9 | 82814692-63fa-48e1-9e26-78b75269f513 |


## Check Plan Status

In [None]:
plan.status()

## Block Until Plan Completes

In [None]:
plan.join()
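
When a time limit is needed, one alternative to blocking is to poll in a loop. The sketch below is generic: `wait_until`, `check`, `timeout`, and `interval` are hypothetical names, and `check` is any zero-argument callable that returns `True` once the work is finished. How to derive such a check from `plan.status()` depends on its return value, which is not shown here.

```python
import time
from typing import Callable


def wait_until(check: Callable[[], bool], timeout: float = 3600.0, interval: float = 30.0) -> bool:
    """Poll check() until it returns True or `timeout` seconds elapse.

    Returns True if check() succeeded, False if the timeout was reached first.
    """
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        time.sleep(min(interval, remaining))


# Demo with a stand-in check that succeeds on its third call.
calls = {"n": 0}


def _fake_check() -> bool:
    calls["n"] += 1
    return calls["n"] >= 3


print(wait_until(_fake_check, timeout=5.0, interval=0.01))  # True
```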

## Stop Plan Execution

In [None]:
plan.stop()

## Run File Operations on Plan

Display the status of each replica (task) and the files it generated.

In [None]:
for task in plan.tasks:
    status = task.status()
    print(status)
    # task.upload("data/sample.txt")
    files = task.ls()
    display(files)
    display(task.cat("NAMD.stderr"))
    # task.download("NAMD.stdout", "./results")
    task.download("NAMD_Repl_1.out", "./results")
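
The loop above downloads the same file name into `./results` for every task. If each replica's copy should be kept separately, one option is a directory per replica. This is a minimal sketch; `replica_dir` is a hypothetical helper, and because it is not stated here whether `task.download()` creates missing directories, the directory is created up front.

```python
from pathlib import Path


def replica_dir(base: str, index: int) -> str:
    """Create (if needed) and return a per-replica directory such as results/replica_1."""
    path = Path(base) / f"replica_{index}"
    path.mkdir(parents=True, exist_ok=True)
    return str(path)


# Intended use inside the loop (requires a launched plan):
# for index, task in enumerate(plan.tasks, start=1):
#     task.download("NAMD_Repl_1.out", replica_dir("./results", index))
```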

Display the intermediate results generated by each replica (task).

In [None]:
for index, task in enumerate(plan.tasks):

    @task.context(packages=["matplotlib", "pandas"])
    def analyze(x, y, index, num_tasks) -> None:
        from matplotlib import pyplot as plt
        import pandas as pd
        df = pd.read_csv("data.csv")
        plt.figure(figsize=(x, y))
        plt.plot(df["x"], df["y"], marker="o", linestyle="-", linewidth=2, markersize=6)
        plt.title(f"Plot for Replica {index} of {num_tasks}")

    analyze(3, 4, index+1, len(plan.tasks))