# Running alchemical free energy calculations

`ASAP-Alchemy` provides a set of automated workflows which, when combined, create an end-to-end pipeline enabling the routine running of state-of-the-art alchemical free energy calculations at (Alchemi)scale! In this tutorial we cover configuring and executing each of the workflows via the Python API. Note that `ASAP-Alchemy` also provides a CLI which can execute any of the workflows and accepts custom configuration files, giving the same flexibility as the API but with easier execution. All of our workflows also have tried and tested defaults which are used in production, so feel free to skip configuring each workflow unless you really need the extra flexibility.

## General design
Each of the workflows in `ASAP-Alchemy` is designed following the same factory pattern, which gives them a similar API and should make them all feel familiar. Each workflow begins as a reusable configuration object which defines the runtime options of that pipeline and can then be applied to sets of molecules.
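
To make the pattern concrete, here is a minimal sketch of a reusable configuration object that can be saved, reloaded and applied to a set of inputs. `ToyPrepWorkflow`, its fields and its `run` method are hypothetical stand-ins written with the standard library only; they are not the `ASAP-Alchemy` classes, which are introduced in the sections that follow.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class ToyPrepWorkflow:
    """Hypothetical stand-in for a workflow configuration object."""

    max_confs: int = 300
    strict_stereo: bool = True

    def to_file(self, filename: str) -> None:
        # Serialise the settings so the same pipeline can be reused later.
        Path(filename).write_text(json.dumps(asdict(self), indent=2))

    @classmethod
    def from_file(cls, filename: str) -> "ToyPrepWorkflow":
        # Rebuild an identical configuration from the saved JSON.
        return cls(**json.loads(Path(filename).read_text()))

    def run(self, smiles: list[str]) -> dict:
        # Applying the configuration to inputs returns a result that
        # records the settings it was produced with.
        return {"settings": asdict(self), "ligands": list(smiles)}


workflow = ToyPrepWorkflow(max_confs=10)
with tempfile.TemporaryDirectory() as tmp:
    path = str(Path(tmp) / "toy-workflow.json")
    workflow.to_file(path)
    reloaded = ToyPrepWorkflow.from_file(path)

result = reloaded.run(["CCO", "CCN"])
```

The useful property is that the configuration, not the result, is the reusable object: one saved file can be applied to many ligand series with identical settings.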

# ASAP-Alchemy Prep
The first stage in the workflow is called `prep` and has two main jobs:

- **State Enumeration**: Enumerate the tautomers, protomers and stereoisomers of the input ligands

- **Constrained Pose Generation**: Generate initial poses for the ligands while constraining each ligand to match the crystal structure reference conformation.

We will now walk through the process of building a standard `AlchemyPrepWorkflow` and assigning the required component parts:

In [None]:
from asapdiscovery.alchemy.schema.prep_workflow import AlchemyPrepWorkflow
from asapdiscovery.data.operators.state_expanders.protomer_expander import EpikExpander
from asapdiscovery.data.operators.state_expanders.stereo_expander import StereoExpander
from asapdiscovery.docking.schema.pose_generation import (
    OpenEyeConstrainedPoseGenerator,
    RDKitConstrainedPoseGenerator,
)

prep_workflow = AlchemyPrepWorkflow()

## Stereo expansion

Molecules with unknown stereo centers should be fully expanded before generating an initial pose to ensure we know the exact identity of the molecule we are making the prediction for. This also allows us to predict free energy differences between stereoisomers, which might offer more insight to your medicinal chemistry team.

So let's add our `openeye` stereo expander module to the workflow and set it to only expand any undefined stereo centers in the input molecules:

In [None]:
stereo_expander = StereoExpander(stereo_expand_defined=False)
prep_workflow.stereo_expander = stereo_expander

## Charge and Tautomeric expansion

Druglike molecules often have multiple accessible protonation and tautomeric states at experimental pH which can contribute to binding, and considering only a single state in alchemical free energy calculations can introduce significant error. One option is to use tools like OpenEye to try to predict the most reasonable form of a molecule at the experimental pH in question; another is to enumerate all possible forms and use a state penalty correction scheme based on the predicted pKa of each state. You can read more about this at <https://pubs.acs.org/doi/10.1021/acs.jctc.8b00826>.

`ASAP-Alchemy` provides expanders for these enumeration options (`EpikExpander`, `ProtomerExpander`, `TautomerExpander`); however, they are under active development, so by default we skip this stage by setting it to `None`:

In [None]:
prep_workflow.charge_expander = None
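
To give a feel for what a pKa-based state penalty looks like, here is a rough sketch for a single titratable site that follows the Henderson-Hasselbalch relation. This is a simplified illustration and not necessarily the scheme used in the paper linked above or in `ASAP-Alchemy`; the function names and the example pKa are assumptions.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol K)


def protonated_fraction(pka: float, ph: float) -> float:
    """Population of the protonated form of a single site (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def state_penalty(fraction: float, temperature: float = 298.15) -> float:
    """Free energy cost (kcal/mol) of restricting the ligand to a state with this population."""
    return -R_KCAL * temperature * math.log(fraction)


# Hypothetical site with pKa 6.4 at an assay pH of 7.4:
# the protonated form makes up 1/11 of the population.
frac = protonated_fraction(pka=6.4, ph=7.4)
penalty = state_penalty(frac)
print(f"fraction = {frac:.3f}, penalty = {penalty:.2f} kcal/mol")
```

A minor state that is one pH unit away from its pKa therefore costs roughly 1.4 kcal/mol at room temperature, which is large compared with the accuracy usually expected from relative free energy calculations.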

## Pose Generator

We now need to select a backend which will be used to generate the initial poses for our molecules. We have two options available: `OpenEyeConstrainedPoseGenerator` and `RDKitConstrainedPoseGenerator`. Both implementations use the same general workflow to generate the poses. First, find the MCS overlap between the target ligand and some reference ligand (normally extracted from a crystal reference structure) and constrain the overlapping atoms of the target ligand to match the reference. Then, for any atoms not constrained, enumerate the rotamers of any rotatable bonds using the backend toolkit and filter down to a single favorable conformer for each target ligand in the series.

By default we use the `RDKitConstrainedPoseGenerator` as follows:

In [None]:
pose_generator = RDKitConstrainedPoseGenerator(
    max_confs = 300,  # The maximum number of conformers to try and generate
    rms_thresh = 0.2, # The RMSD between the heavy atoms which should be used to filter duplicated conformers
    mcs_timeout = 1, # The maximum time in seconds to run the MCS search for
    clash_cutoff = 2.0,  # The distance cutoff for which we check for clashes between poses and the receptor in Angstroms
    selector = 'Chemgauss3',  # The method which should be used to select the best conformer, an openeye docking score in this case
    backup_score = 'Sage', # If the main scoring function fails the intramolecular energies calculated with this force field will be used to select the best conformer
)
prep_workflow.pose_generator = pose_generator

## Core Smarts 

In some cases you may want to define the core yourself, or you may not want to constrain the full MCS between a target molecule and the reference ligand. In these cases you can provide a core SMARTS pattern which will be used to define the MCS. While [SMARTS](https://www.daylight.com/dayhtml/doc/theory/theory.smarts.html) patterns can be created by hand, we recommend using ChemDraw or <https://smarts.plus/>. By default we let the pose generator find the MCS, which can be done by setting the `core_smarts` field to `None`.

In [None]:
prep_workflow.core_smarts = None

## Stereochemistry filtering

During the pose generation process some molecules may end up with inconsistent stereochemistry, that is, the stereochemistry we intended does not match the 3D geometry of the molecule. This often happens when our reference ligand contains a stereocenter and the target ligand we are trying to pose has the opposite stereochemistry. In these cases a sensible pose cannot be generated, and manual modeling of the reference is often required to allow opposing stereocenters to be generated. To ensure all molecules have the correct stereochemistry we set the `strict_stereo` flag to `True`:

In [None]:
prep_workflow.strict_stereo = True

## Experimental references

In prospective predictions it is desirable to not only predict a ranking of the molecules in the series but to predict the absolute binding affinity, so that the results can be compared across alchemical networks and against other ligands with experimental data which are not included in the network.

In practice this is done by including experimental reference compounds in the alchemical network and then shifting the final predictions by the mean of the experimental absolute binding affinities. `ASAP-Alchemy` allows users to provide a list of ligands with experimental data as references which we then try to generate poses for. From this list, `ASAP-Alchemy` will extract `n` suitable ligands to include in your network.

The number of references to include from that list can be controlled via the `n_references` field which by default is set to 3:

In [None]:
prep_workflow.n_references = 3
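
The shifting step mentioned above can be illustrated with a few lines of arithmetic. This is a sketch under the assumption that the offset is chosen so that the mean prediction over the reference compounds matches their mean experimental affinity; all names and numbers are made up for illustration and the production code may compute the offset differently.

```python
from statistics import mean

# Hypothetical predicted affinities (kcal/mol), centred on zero.
predicted = {"lig-0": -0.8, "lig-1": 0.3, "ref-0": 0.9, "ref-1": -0.4}
# Hypothetical experimental affinities (kcal/mol) for the reference compounds.
experimental = {"ref-0": -8.1, "ref-1": -9.6}

# Offset that makes the mean prediction over the references equal
# the mean experimental value.
shift = mean(experimental.values()) - mean(predicted[name] for name in experimental)

# Every ligand is shifted by the same constant, so relative
# differences between ligands are unchanged.
absolute = {name: value + shift for name, value in predicted.items()}

for name, value in absolute.items():
    print(f"{name}: {value:.2f} kcal/mol")
```

Because only a constant is added, the ranking of the ligands is untouched; the references simply anchor the predictions to the experimental scale.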

This completes the construction of the default `AlchemyPrepWorkflow`, and we can now view the settings of the workflow

In [None]:
prep_workflow

## Saving and Loading

At this point, configured workflows can be saved to and loaded from `JSON`, meaning that a workflow can be reused multiple times throughout a project to ensure that a consistent pipeline is applied to an entire series of alchemical free energy calculations.

In [None]:
prep_workflow.to_file(filename="My-prep-workflow.json")
prep_workflow_2 = AlchemyPrepWorkflow.from_file("My-prep-workflow.json")

## Running Alchemy prep

We are now ready to run our configured prep workflow and create an alchemy dataset which can be used in the next sections of the guide. Let's inspect the function to create the dataset and generate the missing parts:

In [None]:
prep_workflow.create_alchemy_dataset?

First grab a reference complex, in this example we will use a real ASAP-enabled target ([SARS-CoV-2 NSP3 macrodomain](https://asapdiscovery.org/outputs/molecules/#ASAP-SARS-COV-2-NSP3-MAC1)) and download a prepared complex from the asapdiscovery test suite which includes a prepared receptor and ligand.

In [None]:
from asapdiscovery.data.testing.test_resources import fetch_test_file
from asapdiscovery.data.schema.complex import PreppedComplex

mac1_complex = PreppedComplex.parse_file(fetch_test_file("constrained_conformer/complex.json"))
mac1_complex


We now need a list of target molecules for which we want to estimate the binding affinity via relative free energy calculations. In this case we'll define some very simple molecules as an example:

In [None]:
from asapdiscovery.data.schema.ligand import Ligand
from rdkit.Chem import Draw

molecules_smiles = [
    "CC(C)[C@H](Nc1ncnc2[nH]c(Cl)cc12)c4ccc3CCCS(=O)(=O)c3c4",
    "CC(C)[C@H](Nc1ncnc2[nH]c(F)cc12)c4ccc3CCCS(=O)(=O)c3c4",
    "CC(C)[C@H](Nc1ncnc2[nH]c(O)cc12)c4ccc3CCCS(=O)(=O)c3c4"
]
target_ligands = [
    Ligand.from_smiles(smiles=smiles, compound_name=f"ASAP-MAC1-{i}")
    for i, smiles in enumerate(molecules_smiles)
]
Draw.MolsToGridImage(mols=[mol.to_rdkit() for mol in target_ligands], legends=[mol.compound_name for mol in target_ligands])

Finally we can optionally pass in a set of reference ligands which can be added to the dataset to ensure the absolute predictions of the binding affinities are comparable with other experimental results. These ligands can be provided from any source so long as they are marked as experimental, we also provide tools to extract reference ligands from a [CDD vault](https://www.collaborativedrug.com/). Here we will make some example reference ligands:

In [None]:
reference_smiles = [
    "Cc4cc3c(N[C@H](c2ccc1CCCS(=O)(=O)c1c2)C(C)C)ncnc3[nH]4",
    "C[C@H](Nc1ncnc2[nH]ccc12)c4ccc3CCCS(=O)(=O)c3c4",
    "CC(C)[C@H](Nc1ncnc2[nH]c(N)cc12)c4ccc3CCCS(=O)(=O)c3c4"
]
reference_ligands = [
    Ligand.from_smiles(smiles=smiles, compound_name=f"ASAP-EXP-MAC1-{i}", experimental=True)
    for i, smiles in enumerate(reference_smiles)
]
Draw.MolsToGridImage(mols=[mol.to_rdkit() for mol in reference_ligands], legends=[mol.compound_name for mol in reference_ligands])

We can now run the workflow and generate poses for the molecules, for this example we will reduce the number of conformers which should be generated as the modifications to the ligands are rigid and the conformer generation step is slow. In production it is recommended to use around 300 with RDKit.

In [None]:
# reduce the number of conformers just for this example
prep_workflow.pose_generator.max_confs = 10
alchemy_dataset = prep_workflow.create_alchemy_dataset(
    dataset_name="my-first-asap-dataset",
    ligands=target_ligands,
    reference_complex=mac1_complex,
    reference_ligands=reference_ligands,
)

We now have an alchemy dataset object which contains information about the workflow we have just run, including the inputs and any errors that might cause poses to not be generated for some ligands. We can also inspect provenance information about the stages run, including the versions of software. For example, let's look at the pose generator and the software versions used:

In [None]:
alchemy_dataset.pose_generator

In [None]:
alchemy_dataset.provenance["RDKitConstrainedPoseGenerator"]

We can also inspect the posed ligands and write them to file to view them:

In [None]:
alchemy_dataset.save_posed_ligands(filename='posed_ligands.sdf')
alchemy_dataset.posed_ligands[0]

We might want to view the ligands in the receptor and overlay the reference structure, so now we will write the reference receptor to a local PDB file and the ligand to an SDF file. You can then use any molecule viewer (like PyMOL) or use the example below which uses `py3Dmol`:

In [None]:
import py3Dmol
alchemy_dataset.reference_complex.target.to_pdb_file('mac1-receptor.pdb')
alchemy_dataset.reference_complex.ligand.to_sdf('mac1-ref-ligand.sdf')
view = py3Dmol.view(width=400, height=300)
view.addModel(open('mac1-receptor.pdb').read(), 'pdb')
view.setStyle({'chain':['A', 'B']}, {'cartoon': {'color': 'spectrum'}})
view.addModel(alchemy_dataset.reference_complex.ligand.to_sdf_str())
for mol in alchemy_dataset.posed_ligands:
    view.addModel(mol.to_sdf_str())
view.setStyle({'model': [i + 1 for i in range(len(alchemy_dataset.posed_ligands))]}, {"stick":{}})
view.zoomTo({'model':1})
view.show()

This dataset object can then be saved to file and acts as a form of provenance for the `prep` workflow. It should contain all of the information required to reproduce the dataset should someone want to repeat your work, i.e. all settings, versioning, ligand and protein structures, et cetera.

In [None]:
alchemy_dataset.to_file("my-alchemy-dataset.json")

# ASAP-Alchemy Plan

We are now ready to plan an alchemical free energy network using a state-of-the-art workflow based on [OpenFE](https://docs.openfree.energy/en/stable/) infrastructure. The free energy calculations can also be configured for local or distributed execution using [Alchemiscale](https://github.com/openforcefield/alchemiscale). Following the format in the `prep` pipeline, we start with a `FreeEnergyCalculationFactory` which we will configure component by component before applying it to our `alchemy_dataset` prepared above.

This workflow handles the following stages:
- **Network Planning**: Choosing the optimal transformations between ligands based on some atom mapping and scoring metrics

- **Protocol Settings**: Defining the run time settings of the resulting network ready for execution using OpenFE

In [None]:
from asapdiscovery.alchemy.schema.fec import FreeEnergyCalculationFactory
from asapdiscovery.alchemy.schema.network import (
    NetworkPlanner, 
    KartografAtomMapper, 
    LomapAtomMapper, 
    PersesAtomMapper, 
    RadialPlanner,
    MaximalPlanner, 
    MinimalRedundantPlanner, 
    MinimalSpanningPlanner
)
alchemy_factory = FreeEnergyCalculationFactory()

## Network Planner

The first component of the `FreeEnergyCalculationFactory` is the `NetworkPlanner` module, which configures how the optimal transformations should be selected using OpenFE tooling and can be constructed via:

In [None]:
network_planner = NetworkPlanner()

### Atom Mapping Engine

Our free energy calculations use a hybrid topology approach by default, so we need an atom mapping which identifies the atoms of ligand A that should be alchemically transformed into atoms of ligand B. Atoms that are not mapped should be consistent between ligands A/B. We currently support all of the available OpenFE atom mappers (`LomapAtomMapper`, `PersesAtomMapper` and `KartografAtomMapper`) but use Lomap by default, which can be constructed as follows:

In [None]:
atom_mapper = LomapAtomMapper(
    timeout = 20, # The timeout in seconds of the MCS algorithm in rdkit
    threed = True, # If spatial information should be used to choose between symmetrically equivalent mappings 
    max3d = 1000, # Maximum discrepancy in Angstroms between atoms before the mapping is not allowed
    element_change = True, # Whether to allow element changes in the mappings
    seed = '', # An optional seed SMARTS string to speed up the MCS, if left blank this is automatically generated
    shift = True, # When determining 3D overlap translate the molecules to minimise the RMSD during alignment
)
network_planner.atom_mapping_engine = atom_mapper

### Transformation Scorer

Given the proposed atom mapping (which is generated between every possible combination of ligands in the target set) we need to choose the best edges to run during this campaign. To this end, OpenFE provides scoring metrics which rank the proposed atom mappings; our default is to use the lomap scorer:

In [None]:
network_planner.scorer = "default_lomap"

### Network Planning

Once we have all of the possible edges scored we then need to pick a strategy to build the network which will determine how many connections it has. The optimal network should provide a balance between speed (not too many edges), accuracy and redundancy. OpenFE provides many basic planning methods (`RadialPlanner`, `MaximalPlanner`, `MinimalSpanningPlanner`, `MinimalRedundantPlanner`) with more under active development. For now, our default is to use the `MinimalRedundantPlanner` which builds a minimal spanning tree ensuring all ligands are connected to the network but also adds `n` extra redundant edge(s) per node which ensures each node is in at least one cycle if `n`=1. `n` can be controlled by the user:

In [None]:
planning_method = MinimalRedundantPlanner(
    redundancy = 2
)
network_planner.network_planning_method = planning_method
network_planner
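
To build some intuition for how a spanning tree plus redundancy behaves, here is a small standard-library sketch. It uses one simple strategy, repeating a best-score spanning tree construction on the edges not yet used, and is only an illustration: it is not the OpenFE implementation, and the function names, ligand labels and scores are all hypothetical.

```python
def max_spanning_tree(nodes, scored_edges):
    """Kruskal's algorithm: greedily keep the best-scoring edges that join separate components."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving keeps lookups short
            n = parent[n]
        return n

    tree = []
    for score, a, b in sorted(scored_edges, reverse=True):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            tree.append((score, a, b))
    return tree


def redundant_network(nodes, scored_edges, redundancy=2):
    """Stack `redundancy` spanning trees, each built from edges the previous ones did not use."""
    remaining = list(scored_edges)
    chosen = []
    for _ in range(redundancy):
        tree = max_spanning_tree(nodes, remaining)
        chosen.extend(tree)
        remaining = [edge for edge in remaining if edge not in tree]
    return chosen


ligands = ["A", "B", "C", "D", "E"]
# Hypothetical mapping scores for every ligand pair (higher is better).
edges = [
    (0.90, "A", "B"), (0.85, "B", "C"), (0.80, "C", "D"), (0.75, "D", "E"),
    (0.60, "A", "C"), (0.55, "B", "D"), (0.50, "C", "E"),
    (0.30, "A", "D"), (0.25, "B", "E"), (0.20, "A", "E"),
]
network = redundant_network(ligands, edges, redundancy=2)
print(len(network), "edges selected out of", len(edges))
```

With five ligands this keeps eight of the ten possible edges and drops the two worst-scoring ones, while every ligand ends up with at least two connections.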

The planner can then be saved to file like all `ASAP-Alchemy` workflows and reused throughout a discovery project or combined into our `FreeEnergyCalculationFactory`:

In [None]:
network_planner.to_file(filename='my-network-planner.json')
alchemy_factory.network_planner = network_planner

## Alchemy Protocol

We can now define the extensive set of runtime settings which will be used in the alchemical free energy calculations starting with the OpenFE protocol, so far only one is supported which is the `RelativeHybridTopologyProtocol` and all of the settings are directly related to this protocol. We plan on adding support for other types of calculations in the future so check back soon!

In [None]:
alchemy_factory.protocol = 'RelativeHybridTopologyProtocol'

### Solvent Settings

To accurately represent the experimental conditions our simulations will be performed in explicit solvent, and we begin by defining the settings of the solvent:

In [None]:
from asapdiscovery.alchemy.schema.fec import SolventSettings
from openff.units import unit
solvent = SolventSettings(
    smiles = "O", # The smiles pattern of the solvent
    positive_ion = "Na+", # The positive monoatomic ion which should be used to neutralize the system
    negative_ion = "Cl-", # The negative monoatomic ion which should be used to neutralize the system
    neutralize = True, # If we should add ions to neutralize the total charge of the system
    ion_concentration = 0.15 * unit.molar, # The ionic concentration required in molar units
)
alchemy_factory.solvent_settings = solvent

### Force Field Settings

The protocol also gives us control over the force fields used to parameterize all of the components of the system including the small molecule force field which allows for bespoke parameters (which we will be covering in a different tutorial). For now, we use the standard default settings but ensure that we always use the most recent OpenFF force field which at the time of writing is openff-2.1.0:

In [None]:
from gufe import settings
ff_settings = settings.OpenMMSystemGeneratorFFSettings(
    small_molecule_forcefield="openff-2.1.0"
)
alchemy_factory.forcefield_settings = ff_settings

### Thermodynamic Settings 

We can explicitly set the temperature and pressure of the simulation to ensure we match the experimental conditions:

In [None]:
thermo_settings = settings.ThermoSettings(
    temperature = 298.15 * unit.kelvin,
    pressure = 1 * unit.bar
)
alchemy_factory.thermo_settings = thermo_settings

<div class="alert alert-block alert-warning">
<b>Warning:</b> The next few sections offer advanced control over the simulation settings specific to OpenMM and are normally best left to their defaults!
</div>

### OpenMM System Settings

This allows us to change the non-bonded settings in our simulation, such as the method used to calculate the long-range charge interactions and the cutoff for short-range non-bonded interactions.



In [None]:
from openfe.protocols.openmm_rfe.equil_rfe_settings import (
    AlchemicalSamplerSettings,
    AlchemicalSettings,
    IntegratorSettings,
    OpenMMEngineSettings,
    SimulationSettings,
    SolvationSettings,
    SystemSettings,
)
alchemy_factory.system_settings = SystemSettings()

### Solvation Settings

Not to be confused with solvent settings, the solvation settings control how the solvent will be added to the system, i.e. the water model used (3, 4 or 5 point water) and how much solvent should be added to ensure a minimum distance between the solutes and edge of the surrounding solvent.

In [None]:
alchemy_factory.solvation_settings = SolvationSettings()

### Alchemical Settings

This controls the used lambda schedule and the creation of the hybrid system:

In [None]:
alchemy_factory.alchemical_settings = AlchemicalSettings()

### Alchemical Sampler Settings

This defines the equilibrium sampler (ReplicaExchangeSampler, SAMSSampler or MultistateSampler) to use and its runtime settings; currently we use `repex` (ReplicaExchangeSampler) by default. The only change we make here is to set the number of repeats to one. This means that each edge is only executed once in a single job, and we instead do repeats by running the calculation multiple times in parallel across different GPUs via Alchemiscale; see the `n_repeats` field on the `FreeEnergyCalculationFactory` class.

In [None]:
alchemy_factory.alchemical_sampler_settings = AlchemicalSamplerSettings(
    n_repeats = 1
)

### Engine Settings

OpenMM specific settings are defined here like the compute platform to use (CPU/GPU), by default we leave this as `None` which allows OpenMM to select the fastest available for us on the hardware that the simulations are run on:

In [None]:
alchemy_factory.engine_settings = OpenMMEngineSettings()

### Integrator Settings

We can also configure the settings used to build the `LangevinSplittingDynamicsMove` integrator in OpenMM, such as the timestep or collision rate:

In [None]:
alchemy_factory.integrator_settings = IntegratorSettings()

### Simulation Settings

General settings about the simulation length are defined here:

In [None]:
simulation_settings = SimulationSettings(
    equilibration_length=1.0 * unit.nanoseconds,
    production_length=5.0 * unit.nanoseconds,
)
alchemy_factory.simulation_settings = simulation_settings

### Repeats

To ensure our estimation of the relative free energy is accurate we repeat each calculation and take the average prediction. We also use the standard deviation across repeats as an estimate of error. In the future this might also allow us to detect possible bad edges whose repeats differ significantly and remove these unreliable edges, although this functionality has not yet been built into `ASAP-Alchemy`. By default we do two repeats of each edge, meaning each edge is run twice in total as separate jobs on Alchemiscale to increase throughput:

In [None]:
alchemy_factory.n_repeats = 2  
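
The averaging over repeats can be sketched in a few lines. The numbers below are invented, and the flagging step with its 0.5 kcal/mol tolerance is purely hypothetical, since detecting bad edges is not yet part of `ASAP-Alchemy`; the use of the sample standard deviation is also an assumption.

```python
from statistics import mean, stdev

# Hypothetical ΔΔG estimates (kcal/mol) from independent repeats of each edge.
repeats = {
    ("lig-0", "lig-1"): [-1.20, -1.00],
    ("lig-1", "lig-2"): [0.40, 1.60],
}

# Mean as the prediction, sample standard deviation as the error estimate.
summary = {edge: (mean(values), stdev(values)) for edge, values in repeats.items()}

# Hypothetical check: flag edges whose repeats disagree by more than 0.5 kcal/mol.
suspect = [edge for edge, (_, spread) in summary.items() if spread > 0.5]

for edge, (estimate, spread) in summary.items():
    print(f"{edge[0]} -> {edge[1]}: {estimate:+.2f} ± {spread:.2f} kcal/mol")
```

With only two repeats the standard deviation is a coarse estimate, so a check like this is better treated as a prompt to inspect an edge than as an automatic filter.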

This then completes our `FreeEnergyCalculationFactory` which can now be saved to file and reused over the course of a campaign:

In [None]:
alchemy_factory.to_file("my-alchemy-factory.json")
alchemy_factory

## The FreeEnergyCalculationNetwork

We are now ready to create a `FreeEnergyCalculationNetwork` by applying our `alchemy_factory` to our `alchemy_dataset`. First, let's inspect the function and work out how to feed in our dataset:

In [None]:
alchemy_factory.create_fec_dataset?

The function requires us to create a `gufe` `ProteinComponent` for the receptor and also has some optional fields which have not been covered yet. 

#### Optional Fields

- `central_ligand`: Only needed if we are using a `RadialPlanner` as our network planning method, in which case you should provide the ligand and ensure it is not also passed in the ligands list.

- `experimental_protocol`: Used to associate a CDD vault experimental protocol with the dataset which allows automated retrieval of experimental data (potencies) but in general is a useful source of provenance to know what experimental data the network should be compared against.

- `target`: The biological target for which the alchemical free energy network is being run. Again, a useful source of provenance.

In [None]:
from gufe.components.proteincomponent import ProteinComponent

gufe_receptor = ProteinComponent.from_pdb_file('mac1-receptor.pdb')
fec_network = alchemy_factory.create_fec_dataset(
    dataset_name = alchemy_dataset.dataset_name,
    receptor = gufe_receptor,
    ligands = alchemy_dataset.posed_ligands,
    experimental_protocol = 'MAC1-protocol',
    target = "SARS-CoV-2-MAC1"
)

We now have a `FreeEnergyCalculationNetwork` which contains our planned network and all of the runtime settings needed to compute the edges using OpenFE. First let's inspect the network we just generated. As in the `prep` workflow, the settings and software versions used are saved to ensure the workflow is reproducible:

In [None]:
fec_network.network.dict(include={"atom_mapping_engine", "scorer", "network_planning_method", "provenance"})

We can also view the planned network using some OpenFE tools, as we can see each of the ligands has at least two connections in the graph corresponding to the level of redundancy we requested in the network planning method:

In [None]:
from openfe.utils.atommapping_network_plotting import plot_atommapping_network
%matplotlib inline
plot_atommapping_network(fec_network.network.to_ligand_network())

With the `OpenFE` CLI you can also view the `graphml` file for this network by running `openfe view-ligand-network <file>` which should pop up an interactive GUI view of the network for you to inspect the molecule structures and even the atom mapping per edge.

We can also inspect the individual edges of the network and the atom mappings generated which describe how the atoms will be transformed during a simulation:

In [None]:
mapping = next(iter(fec_network.network.to_ligand_network().edges))
mapping

We can now save this network to file to execute later:

In [None]:
fec_network.to_file('fec-network.json')

# ASAP-Alchemy Submit

We are now ready to execute our alchemical network and estimate the relative binding affinity of the ligands. Currently there are two ways to do this: locally with OpenFE, or on distributed compute via Alchemiscale. At ASAP we use Alchemiscale exclusively, allowing us to manage thousands of simultaneous calculations across many networks and many supercomputers; however, it is also very easy to execute the calculations locally. We have an interface to convert the network to the OpenFE `AlchemicalNetwork` object, which can then be executed using either the CLI or API provided by OpenFE; see the [tutorials](https://docs.openfree.energy/en/stable/tutorials/index.html) for more information:

In [None]:
openfe_network = fec_network.to_alchemical_network()
openfe_network

We provide an Alchemiscale helper within `ASAP-Alchemy` to make it easier to work with our `FreeEnergyCalculationNetwork` objects. It can be used to submit, execute, restart and gather results from an Alchemiscale instance and is just a wrapper around the Alchemiscale client. The class does require that your login `identifier` and `key` are set as the environment variables `ALCHEMISCALE_ID` and `ALCHEMISCALE_KEY` respectively:

In [None]:
import os
from asapdiscovery.alchemy.utils import AlchemiscaleHelper
from alchemiscale import Scope

os.environ["ALCHEMISCALE_ID"] = 'my-id'
os.environ["ALCHEMISCALE_KEY"] = 'my-key'
helper = AlchemiscaleHelper()
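
In practice you will usually export these variables in your shell rather than hard-coding credentials in a notebook. A small check like the sketch below can then fail early with a clear message if either one is missing; `missing_credentials` is a hypothetical helper written for this tutorial, not part of `ASAP-Alchemy` or Alchemiscale.

```python
import os

REQUIRED = ("ALCHEMISCALE_ID", "ALCHEMISCALE_KEY")


def missing_credentials(environ=os.environ):
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED if not environ.get(name)]


# Demonstrated with an explicit mapping so the real environment is not touched.
print(missing_credentials({"ALCHEMISCALE_ID": "my-id"}))
```

Calling `missing_credentials()` with no arguments checks the real environment, so it can be placed just before the helper is constructed.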

We can now create the network on the Alchemiscale instance using the helper which once complete will return a new copy of our `FreeEnergyCalculationNetwork` object with a results field setup and a network key which can be used to identify the network on Alchemiscale:

In [None]:
submitted_network = helper.create_network(planned_network=fec_network, scope=Scope(org="MYORG", campaign="SARS-CoV-2-MAC1", project="mac1-testing"))

Now we need to create tasks for each of the transformations defined in our network and action them, which queues them for execution. These two stages are handled by one convenience method on the helper called `action_network`. Note that we now use the `submitted_network` version of our network as it contains the `network_key` used to find the network on Alchemiscale:

In [None]:
task_keys = helper.action_network(planned_network=submitted_network)

Once all of the calculations have finished or once the results are needed the results can be gathered from alchemiscale using the `collect_results` helper function which returns a new copy of the `FreeEnergyCalculationNetwork` with results for the completed edges:

In [None]:
network_with_results = helper.collect_results(planned_network=submitted_network)

# ASAP-Alchemy Predict

Once we have a successful set of transformations we can estimate the absolute binding affinity of our ligands by combining the relative measures via the maximum likelihood estimator method implemented in cinnabar, a best practices method for reporting the results of free energy calculations. ASAP-Alchemy has an interface to Cinnabar to make it easy to turn the simulation results into useful information for the med chem team. For this example we will use the TYK2 network which has been curated as part of the [protein-ligand benchmark](https://github.com/OpenFreeEnergy/protein-ligand-benchmark) dataset, we have the results of a subsection of this network in our testing suite.

In [None]:
from asapdiscovery.alchemy.schema.fec import FreeEnergyCalculationNetwork
tyk2_network = FreeEnergyCalculationNetwork.from_file(fetch_test_file('tyk2_result_network.json'))
tyk2_network.results

We can see that the `tyk2_network` now has a results field which has references to the network on the alchemiscale instance and a set of `Transformationresults` which contain the results of each simulated edge of the network. As the OpenFE `RelativeHybridTopologyProtocol` simulates the `complex` and `solvent` phases separately we need to combine these to estimate the relative free energy between the ligands in the transformation. We can do this by using the interface with `cinnabar`:

In [None]:
measures = tyk2_network.results.to_cinnabar_measurements()
measures
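
The combination of the two phases follows the usual thermodynamic cycle: the relative binding free energy is the complex leg minus the solvent leg. The sketch below shows that arithmetic with made-up numbers, assuming the two legs are independent so that their uncertainties add in quadrature; `combine_legs` is a hypothetical helper, not the function used internally.

```python
import math


def combine_legs(dg_complex, err_complex, dg_solvent, err_solvent):
    """Relative binding free energy of a transformation from its two legs (kcal/mol)."""
    ddg = dg_complex - dg_solvent
    # Independent errors are combined in quadrature.
    err = math.sqrt(err_complex**2 + err_solvent**2)
    return ddg, err


# Hypothetical results for one edge of the network.
ddg, err = combine_legs(dg_complex=-12.4, err_complex=0.3, dg_solvent=-11.1, err_solvent=0.4)
print(f"ΔΔG = {ddg:.2f} ± {err:.2f} kcal/mol")
```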

We can also convert directly to a `FEMap` object and use it to calculate the absolute binding affinity estimates. First let's draw a graph of the converted network to check we get the expected 9 edges and 10 ligands:

In [None]:
fe_map = tyk2_network.results.to_fe_map()
fe_map.draw_graph()

Now we can check that the calculated relative affinities have been correctly converted and we can view them as a table:

In [None]:
fe_map.generate_absolute_values()
fe_map.get_relative_dataframe()

We can now view the estimated absolute binding affinities for these molecules as a pandas table which we can use to rank our compounds and provide feedback to the med chem team:

In [None]:
fe_map.get_absolute_dataframe()

Hold on, those absolute affinities look a little off! You will notice that they are centred around `0` as we have no experimental reference or absolute affinity prediction to centre the results around. This is one of the reasons why we would inject experimentally measured ligands into the network during the prep stage described above. Luckily for this example, we have experimental data for all of the ligands, so let's use cinnabar to assess the accuracy of our predictions vs experiment:

In [None]:
import pandas as pd
tyk2_reference_data = pd.read_csv(fetch_test_file('tyk2_reference_data.csv'))
for _, row in tyk2_reference_data.iterrows():
    fe_map.add_experimental_measurement(
        label=row['Molecule Name'],
        value=row['IC50_GMean (µM)'] * unit.micromolar,
        uncertainty=0 * unit.molar 
    )
fe_map.generate_absolute_values()
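
The experimental values above are supplied as IC50 concentrations, whereas the predictions are free energies. As a rough guide to how the two scales relate, the sketch below converts a concentration to a binding free energy, assuming the IC50 can be treated as a dissociation constant with a 1 M standard state. The helper is hypothetical and the exact conversion applied by cinnabar may differ in detail.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol K)


def ic50_to_dg(ic50_molar: float, temperature: float = 298.15) -> float:
    """Approximate binding free energy (kcal/mol), treating the IC50 as a Kd at 1 M standard state."""
    return R_KCAL * temperature * math.log(ic50_molar)


dg_micromolar = ic50_to_dg(1e-6)  # a 1 µM compound
dg_nanomolar = ic50_to_dg(1e-9)   # a 1 nM compound
print(f"1 µM: {dg_micromolar:.2f} kcal/mol, 1 nM: {dg_nanomolar:.2f} kcal/mol")
```

Each tenfold improvement in potency corresponds to about 1.4 kcal/mol at room temperature, which is a handy yardstick when reading the plots that follow.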

We can now plot the estimated relative binding affinities vs experiment:

In [None]:
from cinnabar.plotting import plot_DDGs, plot_DGs
plot_DDGs(fe_map.to_legacy_graph())

Finally we can plot the estimated absolute binding affinities vs experiment:

In [None]:
plot_DGs(fe_map.to_legacy_graph())

Before using the predictions in production it is common to benchmark the system under study to assess its suitability for free energy calculations. This normally involves curating a set of similar ligands which have reliable experimental measures of affinity and estimating their affinity using the production protocol. To help with debugging these benchmarks we have created some tools to produce interactive graphs which make it easier to identify outliers. Here we will generate interactive equivalents of the cinnabar graphs using `ASAP-Alchemy`. First, we use a utility function which extracts the absolute and relative predictions from the cinnabar `FEMap` and inserts experimental data extracted from a formatted CSV file. Note that the absolute predictions are also automatically shifted to match the experimental range of binding affinity values:

In [None]:
from asapdiscovery.alchemy.predict import get_data_from_femap, create_absolute_report, create_relative_report
fe_map = tyk2_network.results.to_fe_map()
fe_map.generate_absolute_values()
absolute_df, relative_df = get_data_from_femap(
    fe_map=fe_map,
    ligands=tyk2_network.network.ligands,
    assay_units='IC50',
    reference_dataset=fetch_test_file('tyk2_reference_data.csv')
)

Now we can produce an interactive relative report to inspect each transformation and identify outliers:

In [None]:
relative_layout = create_relative_report(dataframe=relative_df)
relative_layout.embed()

The same can also be done for the absolute report:

In [None]:
absolute_layout = create_absolute_report(dataframe=absolute_df)
absolute_layout.embed()

Both of these reports can also be saved to file and shared with others using the `save` function which will create an interactive `html` file:

In [None]:
absolute_layout.save(filename='tyk2-absolute.html', title='tyk2-benchmark', embed=True)