# WorkGraph example to run Geometry Optimisation and Descriptors

## Aim

This notebook shows how to run several calculations in sequence on a set of structures. In this example we load the structures from a file, run a geometry optimisation (geomopt) on each one, calculate descriptors for the optimised structures, and finally run a filter script that splits the structures into train.xyz, test.xyz and valid.xyz.

### Setup

The initial setup is very similar to the other tutorials, such as `singlepoint.ipynb`, which explain each step in more detail.

Load the AiiDA profile, model and code:

In [None]:
from aiida import load_profile
load_profile()

In [None]:
from aiida_mlip.data.model import ModelData
uri = "https://github.com/stfc/janus-core/raw/main/tests/models/mace_mp_small.model"
model = ModelData.from_uri(uri, architecture="mace_mp", cache_dir="mlips")

In [None]:
from aiida.orm import load_code
janus_code = load_code("janus@localhost")

Inputs should include the model, code, metadata, and any other keyword arguments expected by the calculation we are running:

In [None]:
from aiida.orm import Str, Float, Bool
inputs = {
    "code": janus_code,
    "model": model,
    "arch": Str(model.architecture),
    "precision": Str("float64"),
    "device": Str("cpu"),
    "fmax": Float(0.1), 
    "opt_cell_lengths": Bool(False), 
    "opt_cell_fully": Bool(True), 
    "metadata": {"options": {"resources": {"num_machines": 1}}},
}

We now load the calculations we want to run:

In [None]:
from aiida.plugins import CalculationFactory

geomoptCalc = CalculationFactory("mlip.opt")
descriptorsCalc = CalculationFactory("mlip.descriptors")


Now we can create our WorkGraph. This involves passing in the inputs, then looping through the structures and running the calculations on each one.

The WorkGraph below does the following:
* For each structure, run the geomopt calculation and take its `final_structure` output
* Pass that optimised structure into the descriptors calculation
* Collect the `xyz_output` of every descriptors calculation and pass them all to `process_and_split_data`
* `process_and_split_data` creates the train.xyz, test.xyz and valid.xyz files and returns a dictionary of file paths
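
The splitting step in the last bullet can be illustrated without AiiDA. The following is a minimal sketch of a random train/test/valid split; it is not the implementation in `sample_split`. The function name `split_indices`, the 80/10/10 fractions and the seed are all assumptions chosen for illustration.

```python
import random


def split_indices(n_items, train_frac=0.8, test_frac=0.1, seed=0):
    """Shuffle range(n_items) and cut it into train/test/valid index lists.

    Hypothetical helper: the fractions and seed are illustrative defaults only.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # local RNG keeps the split reproducible
    n_train = int(n_items * train_frac)
    n_test = int(n_items * test_frac)
    return {
        "train": indices[:n_train],
        "test": indices[n_train:n_train + n_test],
        "valid": indices[n_train + n_test:],  # whatever is left over
    }


splits = split_indices(20)
print({name: len(idx) for name, idx in splits.items()})
```

Each index appears in exactly one of the three lists, so no structure ends up in more than one output file.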

In [None]:
from aiida.orm import Str, Float, Bool, Int
from ase.io import read
from aiida_workgraph import WorkGraph
from aiida.orm import StructureData
from sample_split import process_and_split_data

initial_structure = "structures/lj-traj.xyz"
num_structs = len(read(initial_structure, index=":"))

with WorkGraph("Calculation Workgraph") as wg:
    wg.inputs = inputs
    final_structures = {}

    for i in range(num_structs):
        structure = StructureData(ase=read(initial_structure, index=i))

        geomopt_calc = wg.add_task(
            geomoptCalc,
            code=wg.inputs.code,
            model=wg.inputs.model,
            arch=wg.inputs.arch,
            precision=wg.inputs.precision,
            device=wg.inputs.device,
            metadata=wg.inputs.metadata,
            fmax=wg.inputs.fmax,
            opt_cell_lengths=wg.inputs.opt_cell_lengths,
            opt_cell_fully=wg.inputs.opt_cell_fully,
            struct=structure,
        )

        descriptors_calc = wg.add_task(
            descriptorsCalc,
            code=wg.inputs.code,
            model=wg.inputs.model,
            arch=wg.inputs.arch,
            precision=wg.inputs.precision,
            device=wg.inputs.device,
            metadata=wg.inputs.metadata,
            struct=geomopt_calc.outputs.final_structure,
            calc_per_element=True,
        )

        final_structures[f"structs{i}"] = descriptors_calc.outputs.xyz_output

    wg.add_task(
        process_and_split_data,
        config_types=Str(""),
        n_samples=Int(num_structs),
        prefix=Str(""),
        scale=Float(1.0e5),
        append_mode=Bool(False),
        trajectory_data=final_structures,
    )
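
The shape of this graph (two chained steps per structure, then one task that consumes every result) can be mimicked with plain Python functions. This is only an analogy, assuming nothing about WorkGraph internals: `optimise`, `describe` and `collect` are hypothetical stand-ins that return strings. In the real WorkGraph, the values stored in `final_structures` are task outputs that are only resolved when the graph runs.

```python
def optimise(frame):
    """Stand-in for the geomopt task (hypothetical)."""
    return f"opt({frame})"


def describe(optimised):
    """Stand-in for the descriptors task (hypothetical)."""
    return f"desc({optimised})"


def collect(trajectory_data):
    """Stand-in for the final task: receives every per-structure result at once."""
    return sorted(trajectory_data.values())


frames = ["frame0", "frame1", "frame2"]
final = {}
for i, frame in enumerate(frames):
    # Same key pattern as the WorkGraph loop: one entry per structure.
    final[f"structs{i}"] = describe(optimise(frame))

summary = collect(final)
print(summary)
```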


Visualise the WorkGraph:

In [None]:
wg


Run the tasks:

In [None]:
wg.run()

We should get a dictionary with filepaths:

In [None]:
wg.tasks.process_and_split_data.outputs.result
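
One way to sanity-check the split is to count the frames in each file that was written. The sketch below assumes the files follow the usual XYZ layout (an atom-count line, a comment line, then one line per atom). The helper `count_xyz_frames` is hypothetical, and the demo writes a small temporary file rather than reading the real output.

```python
import tempfile
from pathlib import Path


def count_xyz_frames(path):
    """Count frames in an XYZ file by hopping from one atom-count line to the next."""
    lines = Path(path).read_text().splitlines()
    frames = 0
    pos = 0
    while pos < len(lines):
        if not lines[pos].strip():
            pos += 1  # tolerate blank lines between frames
            continue
        n_atoms = int(lines[pos].split()[0])
        pos += n_atoms + 2  # skip the count line, the comment line and the atoms
        frames += 1
    return frames


with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.xyz"
    demo.write_text(
        "2\nframe 0\nAr 0.0 0.0 0.0\nAr 0.0 0.0 3.8\n"
        "2\nframe 1\nAr 0.0 0.0 0.0\nAr 0.0 0.0 3.9\n"
    )
    n_frames = count_xyz_frames(demo)

print(n_frames)
```

With ASE available, `len(read(path, index=":"))`, as used earlier in this notebook, gives the same count for a real output file.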