# Linking Workflows together for automated substituent parameter generation

While the `SubstituentParameterWorkchain` is versatile for calculating the properties of individual substituents, running it for a large set of substituents is non-trivial. In the author's use case, AIMAll is not available on compute clusters and is run on a local desktop. To avoid overloading the local computer, controllers can be used to limit the number of active processes. Here, by linking several controllers together, the steps of the full `SubstituentParameterWorkchain` can be reproduced.

The controllers are implementations of the `FromGroupSubmissionController` from `aiida-submission-controller`. When instantiating a controller, you provide the `parent_group` that it scans for inputs. The controllers identify each node by the value of its "smiles" extra, which can be set as shown in the code block below. Every node in the `parent_group` must have a unique "smiles" value. **NO NODES CAN HAVE DUPLICATE SMILES OR THE CONTROLLERS WILL NOT WORK**.

```python
node.base.extras.set("smiles","some_smiles")
```
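Because the controllers depend on unique "smiles" values, it can be useful to check a batch of SMILES strings for duplicates before tagging nodes and adding them to the parent group. The following is a minimal sketch in plain Python. The helper `find_duplicate_smiles` and the example labels are hypothetical and not part of `aiida-aimall`. It compares strings literally, so two different strings that describe the same molecule would not be flagged.

```python
from collections import defaultdict


def find_duplicate_smiles(entries):
    """Return {smiles: [labels]} for every SMILES string that appears more than once.

    `entries` is an iterable of (label, smiles) pairs; the label can be a node pk
    or any other identifier. Strings are compared literally (no canonicalization).
    """
    seen = defaultdict(list)
    for label, smiles in entries:
        seen[smiles].append(label)
    return {smi: labels for smi, labels in seen.items() if len(labels) > 1}


# Hypothetical (pk, smiles) pairs gathered before adding nodes to the parent group
entries = [(101, "*C"), (102, "*CC"), (103, "*C"), (104, "*O")]
duplicates = find_duplicate_smiles(entries)
print(duplicates)  # {'*C': [101, 103]}
```

An empty result means the batch is safe to add as far as literal string matching can tell.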

Once the extras are set and the groups to scan are configured, the controllers link together: each controller places its relevant output nodes in the group scanned by the next controller in the protocol and sets their "smiles" extras. The code below shows how the controllers can be linked for substituent parameter calculation. We set a low `max_concurrent` for the AIM controllers so as not to overload the local computer.

```python
from datetime import datetime
import time

from aiida import load_profile, orm
from aiida.engine.processes.control import play_processes
from aiida.orm import Dict, load_group
from aiida.plugins import DataFactory
from aiida_aimall.controllers import (
    AimAllSubmissionController,
    AimReorSubmissionController,
    GaussianSubmissionController,
    SmilesToGaussianController,
)

load_profile()

# AIMQB command-line options shared by both AIM controllers
AimqbParameters = DataFactory('aimall.aimqb')
aim_params = AimqbParameters(parameter_dict={"naat": 2, "nproc": 2, "atlaprhocps": True})

# Step 1: SMILES -> Gaussian optimization; resulting wfx files go to 'opt_wfx'
smile_controller = SmilesToGaussianController(
    parent_group_label='smiles',
    group_label='g16_opt',
    code_label='gaussian@cedar',
    g16_opt_params=orm.Dict(dict={
        'link0_parameters': {
            '%chk': 'aiida.chk',
            "%mem": "2000MB",  # Currently set to use 8000 MB in .sh files
            "%nprocshared": 4,
        },
        'functional': 'wb97xd',
        'basis_set': 'aug-cc-pvtz',
        'route_parameters': {'opt': None, 'freq': None, 'Output': 'WFX'},
        "input_parameters": {"output.wfx\n": None, "output2.wfx": None},
    }),
    wfxgroup="opt_wfx",
    nprocs=4,
    mem_mb=3200,
    time_s=60*60*24*7,
    max_concurrent=100,
)

# Step 2: AIMAll on the optimized wfx (local, one at a time); reoriented structures go to 'reor_structs'
aimreor_controller = AimReorSubmissionController(
    parent_group_label='opt_wfx',
    group_label='opt_aim',
    max_concurrent=1,
    code_label='aimall@localhost',
    reor_group='reor_structs',
    aimparameters=aim_params,
)

# Step 3: Gaussian single point on the reoriented structures; wfx files go to 'reor_wfx'
gaussian_controller = GaussianSubmissionController(
    parent_group_label='reor_structs',
    group_label='gaussian_sp',
    max_concurrent=100,
    code_label='gaussian@cedar',
    g16_sp_params=Dict(dict={
        'link0_parameters': {
            '%chk': 'aiida.chk',
            "%mem": "2000MB",
            "%nprocshared": 4,
        },
        'functional': 'wb97xd',
        'basis_set': 'aug-cc-pvtz',
        'charge': 0,
        'multiplicity': 1,
        'route_parameters': {'nosymmetry': None, 'Output': 'WFX'},
        "input_parameters": {"output.wfx": None},
    }),
    wfxgroup='reor_wfx',
)

# Step 4: final AIMAll calculation on the single-point wfx (local, one at a time)
aimall_controller = AimAllSubmissionController(
    code_label='aimall@localhost',
    parent_group_label='reor_wfx',
    group_label='sp_aim',
    max_concurrent=1,
    aimparameters=aim_params,
    aim_parser='aimall.group',
)
```



The links between the controllers are implied in their definitions. For example, `AimReorSubmissionController` has an input `reor_group`, which is the label of the group in which the workchain stores the structures it generates. `GaussianSubmissionController` in turn has the input `parent_group_label='reor_structs'`. So, in this case, `AimReorSubmissionController` places its generated structures in a group, and `GaussianSubmissionController` then scans that same group for its inputs.
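The chain of group labels can be written down and checked before anything is submitted. The sketch below is plain Python and only restates the labels used in the controller definitions above. The `STAGES` table and the `broken_links` helper are hypothetical conveniences, not part of `aiida-aimall`.

```python
# Hypothetical summary of the group labels used in the controller definitions:
# (controller name, group it reads from, group its products are placed in)
STAGES = [
    ("SmilesToGaussianController", "smiles", "opt_wfx"),
    ("AimReorSubmissionController", "opt_wfx", "reor_structs"),
    ("GaussianSubmissionController", "reor_structs", "reor_wfx"),
    ("AimAllSubmissionController", "reor_wfx", None),  # last stage, nothing downstream
]


def broken_links(stages):
    """Return (upstream, downstream) name pairs whose group labels do not match."""
    broken = []
    for (up_name, _, up_out), (down_name, down_in, _) in zip(stages, stages[1:]):
        if up_out != down_in:
            broken.append((up_name, down_name))
    return broken


print(broken_links(STAGES))  # []
```

A mistyped label in any one controller would show up here as a mismatched pair, which is easier to spot than a controller that silently never finds any inputs.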

These controllers are then run in a `while` loop, either in a Jupyter notebook or in a script:

```python
# Continues from the controller definitions above (uses load_group, play_processes, time)
from datetime import datetime


def prune_group(group_label):
    """Remove older duplicates from a group so that every "smiles" extra is unique.

    When two nodes share a SMILES, the node with the lower pk is removed.
    """
    group = load_group(group_label)
    newest = {}  # smiles -> newest node seen so far
    for node in list(group.nodes):
        smi = node.base.extras.get('smiles')
        if smi in newest:
            older, newer = sorted((newest[smi], node), key=lambda n: n.pk)
            group.remove_nodes(older)
            newest[smi] = newer
        else:
            newest[smi] = node


total_jobs = smile_controller.num_to_run + smile_controller.num_already_run
while aimall_controller.num_already_run < total_jobs:
    play_processes(all_entries=True)
    # prune_group('smiles')
    smile_controller.submit_new_batch()
    # The inner loop runs 10 times, checking every 3 minutes for finished
    # calculations whose outputs can be submitted to the next step.
    # After 10 passes (30 minutes), the outer loop runs again and checks the
    # Gaussian optimizations, which take longer to finish.
    # Adjust the loop count and sleep time to suit your needs.
    for _ in range(10):
        prune_group('opt_wfx')
        prune_group('reor_structs')
        prune_group('reor_wfx')
        for name, controller in (
            ("Gaussian", gaussian_controller),
            ("AIMAll", aimall_controller),
            ("AIMReor", aimreor_controller),
        ):
            try:
                controller.submit_new_batch()
            except Exception as exc:  # keep polling, but still allow Ctrl+C to stop the loop
                print(f"Skipping {name} this loop at {datetime.now()}: {exc}")
        time.sleep(180)
```
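To see why the loop has to call `submit_new_batch` repeatedly, it may help to look at a toy model of a concurrency-limited controller. The sketch below does not use AiiDA and is not the `aiida-submission-controller` implementation. `ToyController` and its methods are hypothetical, and the model assumes only that each call submits as many pending items as there are free slots under `max_concurrent`.

```python
class ToyController:
    """Toy stand-in for a submission controller; it does not talk to AiiDA."""

    def __init__(self, pending, max_concurrent):
        self.pending = list(pending)  # inputs not yet submitted
        self.active = []              # submitted and still running
        self.finished = []            # completed
        self.max_concurrent = max_concurrent

    def submit_new_batch(self):
        """Submit only as many pending items as there are free slots."""
        free_slots = max(self.max_concurrent - len(self.active), 0)
        batch, self.pending = self.pending[:free_slots], self.pending[free_slots:]
        self.active.extend(batch)
        return batch

    def finish_all_active(self):
        """Pretend every running job completed between two polls."""
        self.finished.extend(self.active)
        self.active = []


controller = ToyController(pending=["*C", "*CC", "*O"], max_concurrent=1)
polls = 0
while len(controller.finished) < 3:
    controller.submit_new_batch()
    controller.finish_all_active()
    polls += 1
print(polls)  # 3
```

With `max_concurrent=1`, three inputs need three polls, which mirrors why the AIM controllers above are polled frequently while they work through their queue one calculation at a time.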