# Geometry optimisation calculation

## Run geometry optimisation

To run a geometry optimisation with aiida-mlip, you need to define the inputs as AiiDA data types and pass them to the calculation.

To start, you will need a structure to optimise. 

The structure in this example is NaCl, generated with `ase.build`. Alternatively, you can read one of the structures in the `structures` folder.
The input structure in aiida-mlip must be stored as a `StructureData` node.
We can print some properties of the structure, for example the cell or the atomic sites.

In [None]:
from aiida import load_profile
load_profile()

from aiida.orm import StructureData
from ase.io import read
from ase.build import bulk


# Alternative: read a structure from a file instead of building it
# structure = StructureData(ase=read("structures/qmof-ffeef76.cif"))
structure = StructureData(ase=bulk("NaCl", "rocksalt", 5.63))

print(f"Initial cell parameters: {structure.cell}")
print(f"Structure's atoms sites: {structure.sites}")
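
As a sanity check on the printed cell, the sketch below builds the primitive face-centred cubic cell that a rock-salt lattice with a = 5.63 Å corresponds to and computes its volume with the standard library only. It assumes the conventional fcc primitive vectors (0, a/2, a/2), (a/2, 0, a/2), (a/2, a/2, 0); the helper name `cell_volume` is illustrative and is not part of ASE or aiida-mlip.

```python
# Illustrative check, independent of ASE/AiiDA: primitive fcc cell for rock salt.
a = 5.63  # lattice constant in angstrom, as passed to bulk() above
b = a / 2

# Assumed primitive fcc lattice vectors (rows), in angstrom.
cell = [
    [0.0, b, b],
    [b, 0.0, b],
    [b, b, 0.0],
]


def cell_volume(cell):
    """Return the cell volume |a1 . (a2 x a3)| for a 3x3 list of row vectors."""
    a1, a2, a3 = cell
    cross = [
        a2[1] * a3[2] - a2[2] * a3[1],
        a2[2] * a3[0] - a2[0] * a3[2],
        a2[0] * a3[1] - a2[1] * a3[0],
    ]
    return abs(sum(x * y for x, y in zip(a1, cross)))


volume = cell_volume(cell)
print(f"Primitive cell volume: {volume:.3f} A^3")
```

The primitive cell holds one Na and one Cl atom, so its volume is a quarter of the conventional cubic cell volume a³.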

Next, we choose a model and architecture for the calculation and store them as `ModelData`, a data type specific to this plugin.
In this example we use `mace_mp` with a model downloaded from "https://github.com/stfc/janus-core/raw/main/tests/models/mace_mp_small.model", and the file is saved in the cache folder (default="~/.cache/mlips/"):


In [None]:
from aiida_mlip.data.model import ModelData
url = "https://github.com/stfc/janus-core/raw/main/tests/models/mace_mp_small.model"
model = ModelData.download(url, architecture="mace_mp", cache_dir="mlips")
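
The local-file example in this tutorial uses the path `mlips/mace_mp/mace_mp_small.model`, which suggests that downloaded files are stored as `<cache_dir>/<architecture>/<filename>`. The sketch below reproduces that path under this assumption. `expected_model_path` is a hypothetical helper written for illustration; the layout is inferred from this tutorial and not from the plugin API.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def expected_model_path(url, architecture, cache_dir="mlips"):
    """Guess where a downloaded model is stored.

    Assumes the layout cache_dir/architecture/filename (an inference from
    this tutorial, not a documented guarantee of aiida-mlip).
    """
    filename = PurePosixPath(urlparse(url).path).name
    return PurePosixPath(cache_dir) / architecture / filename


url = "https://github.com/stfc/janus-core/raw/main/tests/models/mace_mp_small.model"
print(expected_model_path(url, "mace_mp"))
```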

If the model file is already saved locally, we can load it with:

In [None]:
# from aiida_mlip.data.model import ModelData
# model = ModelData.local_file("mlips/mace_mp/mace_mp_small.model", architecture="mace_mp")

Another input that must be defined as an AiiDA type is the code. Assuming the code is stored as `janus` on the `localhost` computer, it can be loaded as follows:


In [None]:
from aiida.orm import load_code
code = load_code("janus@localhost")

The other inputs can be set up as AiiDA `Str`, `Float` or `Bool` nodes. Every input has a default except the structure and the code. The possible inputs are:

In [None]:
from aiida.orm import Str, Float, Bool
inputs = {
    "code": code,
    "model": model,
    "struct": structure,
    "arch": Str(model.architecture),
    "precision": Str("float64"),
    "device": Str("cpu"),
    "fmax": Float(0.1),
    "vectors_only": Bool(False),
    "fully_opt": Bool(True),
    "metadata": {"options": {"resources": {"num_machines": 1}}},
}

It's worth noting that the architecture is already defined within the model, accessible through the architecture property in the ModelData. Even if not explicitly provided as input, it will be automatically retrieved from the model.
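
Of these inputs, `fmax` sets the force convergence threshold. A minimal sketch of the kind of test this implies is given below, assuming the ASE-style criterion in which the optimisation stops once the largest force norm on any atom falls below `fmax` (in eV/Å). The function `is_converged` and the force values are illustrative only and are not taken from aiida-mlip.

```python
import math


def is_converged(forces, fmax):
    """Return True if every per-atom force norm is below fmax."""
    largest = max(math.sqrt(fx**2 + fy**2 + fz**2) for fx, fy, fz in forces)
    return largest < fmax


# Made-up forces (eV/angstrom) for a two-atom cell, for illustration only.
forces = [[0.05, 0.0, 0.0], [0.0, 0.08, 0.05]]
print(is_converged(forces, fmax=0.1))
print(is_converged(forces, fmax=0.05))
```

A tighter `fmax` generally means more optimisation steps before the criterion is met.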

Next, load the calculation class through its entry point:

In [None]:
from aiida.plugins import CalculationFactory
geomoptCalc = CalculationFactory("mlip.opt")

Since we are running a geometry optimisation, the entry point is `mlip.opt`.
Finally, run the calculation:


In [None]:
from aiida.engine import run_get_node
result, node = run_get_node(geomoptCalc, **inputs)
print("CALCULATION FINISHED")

## Analyse results

`result` is a dictionary of the outputs produced by the calculation, while `node` contains information about the calculation node itself:


In [None]:
print(f"Printing output nodes dictionary: {result}")
print(" ")
print(f"Printing calcjob node info: {node}")
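
Before inspecting the outputs, it can help to confirm that the calculation finished successfully. The sketch below formats a short status line from the `pk`, `is_finished_ok` and `exit_status` attributes of a process node. `summarise_process` is an illustrative helper, and a stand-in object is used so that the snippet runs without an AiiDA profile.

```python
from types import SimpleNamespace


def summarise_process(node):
    """Return a one-line status summary for a finished process node."""
    if node.is_finished_ok:
        status = "finished OK"
    else:
        status = f"failed (exit status {node.exit_status})"
    return f"Process <{node.pk}> {status}"


# Stand-in object so the sketch runs on its own; in the notebook, pass the
# `node` returned by run_get_node instead.
fake_node = SimpleNamespace(pk=123, is_finished_ok=True, exit_status=0)
print(summarise_process(fake_node))
```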


In this case there are more outputs than for a single-point calculation, such as the output structure and the trajectory of the optimisation.
We can see how many steps it took to optimise:

In [None]:
print(f"The number of optimisation steps is: {result['traj_output'].numsteps}")


You can also interact with the calculation through the `verdi` CLI. Use `verdi process list` to show the list of calculations.


In [None]:
! verdi process list -a

The last calculation in the list is the most recent one. Its PK should match the PK shown when the node was printed.
You can interact with the data through `verdi` commands by passing the PK of the calculation of interest.
Every calcjob node has a results dictionary, which is printed by running:


In [None]:
! verdi calcjob res PK

With the node show command we can see the inputs and outputs of the calculation.

In [None]:
! verdi node show PK

For the geometry optimisation we are most likely interested in the final structure and the trajectory of the geometry optimisation. Let's compare the initial and final cell parameters and see if they changed.

In [None]:
from aiida.orm import load_node

print(f"Initial cell parameters: {structure.cell}")
# Replace PK with the PK of the output structure node (listed by `verdi node show`)
final_structure = load_node(PK)
print(f"Final cell parameters: {final_structure.cell}")
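
To quantify the change rather than compare the printed matrices by eye, the sketch below computes the percentage change in the length of each lattice vector. The helper names are illustrative and the cells are made up for the example; in the notebook, `structure.cell` and `final_structure.cell` could be passed instead.

```python
import math


def cell_lengths(cell):
    """Return the lengths of the three lattice vectors of a 3x3 cell."""
    return [math.sqrt(sum(component**2 for component in vector)) for vector in cell]


def relative_length_change(initial_cell, final_cell):
    """Return the percentage change in each lattice-vector length."""
    return [
        100.0 * (final - initial) / initial
        for initial, final in zip(cell_lengths(initial_cell), cell_lengths(final_cell))
    ]


# Made-up cells for illustration: the "final" cell is the initial one scaled by 1%.
initial_cell = [[0.0, 2.815, 2.815], [2.815, 0.0, 2.815], [2.815, 2.815, 0.0]]
final_cell = [[1.01 * x for x in vector] for vector in initial_cell]
changes = relative_length_change(initial_cell, final_cell)
print([round(change, 3) for change in changes])
```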

## Plot energies and visualise provenance graph

Now let's analyse the steps of the optimisation. We'll run a single-point calculation on every step to see how the energy changed. We'll also use this to visualise a more complex provenance graph with several connected calculations.
Note that the outputs of the calculation can be accessed either with the `load_node` function when the PK is known, or directly through the `outputs` attribute of the calcjob node.
Note also that we use the `calcfunction` decorator to extract the individual structures from the `TrajectoryData`.

In [None]:
from aiida.orm import load_node
from aiida.engine import calcfunction


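# Either take the trajectory from the calcjob node outputs...
# traj = node.outputs.traj_output
# ...or load it by PK (replace PK with the PK of the TrajectoryData)
traj = load_node(PK)
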
@calcfunction
def prepare_struct_inputs(traj, index):
    return traj.get_step_structure(index.value)

url = "https://github.com/stfc/janus-core/raw/main/tests/models/mace_mp_small.model"
model = ModelData.download(url, architecture="mace_mp", cache_dir="mlips")
list_of_nodes = []


inputs = {
    "code": code,
    "model": model,
    "precision": Str("float64"),
    "device": Str("cpu"),
    "metadata": {"options": {"resources": {"num_machines": 1}}},
}
    
# Load the single-point calculation class once, outside the loop
singlepointCalc = CalculationFactory("mlip.sp")

for index in range(traj.numsteps):
    print(index)
    # Extract the structure at this step, keeping the provenance link
    struc = prepare_struct_inputs(traj, index)
    inputs["struct"] = struc
    result, node = run_get_node(singlepointCalc, **inputs)
    list_of_nodes.append(node)
print("calculations ended")

Let's print the list of calcjob nodes that we just created:

In [None]:
list_of_nodes

Now we can use it to get the energy at every step and plot the results.
(A better alternative to a list of nodes might be an AiiDA group; see the high-throughput-screening tutorial.)

In [None]:
import matplotlib.pyplot as plt

steps = []
energies = []

# Loop through each node to extract step number and energy level
for step, node in enumerate(list_of_nodes):
    energy = node.outputs.results_dict.get_dict()['info']['mace_mp_energy']
    steps.append(step)
    energies.append(energy)

# Plotting the energy levels over steps
plt.figure(figsize=(10, 6))
plt.plot(steps, energies, marker='o', linestyle='-', color='g')
plt.title('Energy Levels over Steps')
plt.xlabel('Step')
plt.ylabel('Energy')
plt.show()


We can see that the energy decreased, which is what we want in a geometry optimisation process.
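
The same observation can be checked numerically instead of reading it off the plot. The sketch below reports the total energy change and whether the energy ever increased between consecutive steps. `energy_summary` is an illustrative helper and the energies are made up; in the notebook, the `energies` list built above could be passed instead.

```python
def energy_summary(energies):
    """Return the total energy change and whether the energy never increased."""
    total_change = energies[-1] - energies[0]
    never_increased = all(
        later <= earlier for earlier, later in zip(energies, energies[1:])
    )
    return total_change, never_increased


# Made-up energies (eV) for illustration only, not results from this calculation.
example_energies = [-10.0, -10.4, -10.5, -10.5]
change, never_increased = energy_summary(example_energies)
print(f"Total change: {change:.3f} eV, never increased: {never_increased}")
```

An occasional small increase between steps does not necessarily indicate a problem, so the second value is best read as a quick diagnostic and not as a pass or fail test.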

Now let's generate the provenance graph. (Insert PK number of the TrajectoryData in the code)

In [None]:
! verdi node graph generate PK

The provenance graph shows both the calculation that created the `TrajectoryData` and the calculations that we ran using the structures in the `TrajectoryData`.
This is made possible by the `calcfunction` decorator that we used. If we had not used it, the graph would stop at the `TrajectoryData` and the other single-point calculations would be independent.