## Constrained docking protocol

In this tutorial, we will demonstrate how you can use `rush-py` to conduct a large-scale virtual screen on a target using a constrained docking protocol.

We will use the FDA-approved drug subset of the ZINC20 database as our sample ligand library; the same protocol could, in principle, scale on Rush to screens of tens of millions of ligands.

## 0.0) Imports

In [1]:
import requests
import csv
import shutil
import json
import numpy as np
from pathlib import Path
from enum import Enum

from rdkit.Chem import MolFromSmiles, MolToSmiles
from rdkit.Chem import AllChem, SDMolSupplier
from rdkit.Chem import rdFMCS, rdRascalMCES
from rdkit.Chem import rdDistGeom
from rdkit import Chem
from rdkit.Chem.rdMolAlign import AlignMol
from rdkit.Chem import rdForceFieldHelpers

from typing import List, Optional

## 1.1) Configuration

In [2]:
# |hide
import os
import pathlib

WORK_DIR = pathlib.Path("~/qdx/constrained-docking/").expanduser()
if WORK_DIR.exists():
    !rm -r $WORK_DIR
os.makedirs(WORK_DIR)
os.chdir(WORK_DIR)

In [3]:
# Define our project information
DESCRIPTION = "rush-py constrained docking protocol"
TAGS = ["qdx", "rush-py-v2", "demo", "constrained-docking"]
WORK_DIR = Path.home() / "qdx" / "constrained-docking"

In [4]:
import rush
import asyncio

client = await rush.build_provider_with_functions()

## 0.1) Virtual screen configuration
This virtual screen assumes that you have a protein with a characterized binding ligand and want to dock a set of alternative ligands to the same protein.
Save the input files to the directory in which you run this notebook.

The key configuration items are:

`REFERENCE_LIGAND_FILEPATH`: the path to an SDF file containing a known binder for your protein target.

`PROTEIN_TARGET_FILEPATH`: the path to a PDB file containing your protein target. The bound ligand must already be removed from this file.

`VIRTUAL_SCREEN_LIBRARY_URL`: a URL to a Zinc20 csv virtual screen database download.

`SORT_BY`: sort the GNINA minimised poses output by a particular score. For more information on GNINA outputs, see [here](https://github.com/gnina/gnina).


## 0.2) Example protein and ligand
For this example, we fetch a protein and ligand from RCSB. We use the CDK structure from PDB entry 3PXY, with its bound JWS648 inhibitor as the known ligand that serves as our template.

In [5]:
class SortKey(Enum):
    CNN_SCORE = "cnn_score"
    AFFINITY = "affinity"
    CNN_AFFINITY = "cnn_affinity"


REFERENCE_LIGAND_FILEPATH = Path.cwd() / "3pxy_B_JWS.sdf"
PROTEIN_TARGET_FILEPATH = Path.cwd() / "3pxy_cleaned.pdb"
VIRTUAL_SCREEN_LIBRARY_URL = (
    "https://zinc20.docking.org/substances/subsets/fda.csv?count=all"
)
SORT_BY = SortKey.CNN_SCORE

OBJECT_POSE_FILEPATH = Path.cwd() / "objects"

In [6]:
!pdb_fetch '3pxy' |  pdb_delhetatm > 3pxy_cleaned.pdb
REFERENCE_LIGAND_FILEPATH.write_bytes(
    requests.get(
        "https://models.rcsb.org/v1/3pxy/ligand?auth_seq_id=299&label_asym_id=B&encoding=sdf&filename=3pxy_B_JWS.sdf"
    ).content
)
!ls

3pxy_B_JWS.sdf	3pxy_cleaned.pdb


## 0.3) Constrained docking code
The block of code below contains the helper functions needed to perform constrained docking as part of a large-scale virtual screen.
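
As a rough orientation before reading the helpers, two of their smaller pieces can be sketched without RDKit: the similarity gate that decides whether a common substructure is large enough to use, and the conversion of 0-based atom indices into the 1-based string passed to rxdock. The atom counts below are hypothetical, and `match_ratio` and `to_tethered_string` are illustrative names rather than functions from the notebook.

```python
def match_ratio(n_mcs_atoms: int, n_query_atoms: int, n_reference_atoms: int) -> float:
    """Fraction of the larger molecule that the common substructure covers."""
    return min(n_mcs_atoms / n_query_atoms, n_mcs_atoms / n_reference_atoms)


def to_tethered_string(zero_based_indices) -> str:
    """Convert 0-based atom indices to a 1-based, comma-separated string."""
    return ",".join(str(index + 1) for index in zero_based_indices)


# Hypothetical counts: a 6-atom MCS shared by a 20-atom query and a 24-atom reference.
ratio = match_ratio(6, 20, 24)
# 0.20 mirrors the default min_threshold used by meets_similarity_threshold.
passes = ratio >= 0.20
tethered = to_tethered_string((0, 1, 2))
print(ratio, passes, tethered)
```

Taking the minimum of the two ratios means the gate is governed by the larger molecule, so a small shared fragment is not enough to tether a much larger ligand.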

In [7]:
MCS_FAILURES = []
ALIGNMENT_FAILURES = []


def get_mcs(
    query_ligand: Chem.Mol, reference: Chem.Mol, timeout=20, **kwargs
) -> Optional[Chem.Mol]:
    """Return the maximum common substructure as a query molecule, or None."""
    if kwargs.get("ignore_heavy_atom"):
        atom_comparison_method = rdFMCS.AtomCompare.CompareAnyHeavyAtom
    else:
        atom_comparison_method = rdFMCS.AtomCompare.CompareElements

    if kwargs.get("ignore_bond_order"):
        bond_comparison_method = rdFMCS.BondCompare.CompareAny
    else:
        bond_comparison_method = rdFMCS.BondCompare.CompareOrder

    if kwargs.get("use_rascal_mces"):
        # RASCAL compares exactly two molecules and returns a (possibly empty)
        # list of results; it does not take the rdFMCS comparison options
        results = rdRascalMCES.FindMCES(reference, query_ligand)
        mcs = results[0] if results else None

    else:
        mcs = rdFMCS.FindMCS(
            [reference, query_ligand],
            threshold=0.9,
            completeRingsOnly=kwargs.get("complete_rings_only", True),
            atomCompare=atom_comparison_method,
            bondCompare=bond_comparison_method,
            timeout=timeout,
        )
    if mcs is not None and meets_similarity_threshold(
        mcs, query_ligand, reference
    ):
        return Chem.MolFromSmarts(mcs.smartsString, mergeHs=True)
    return None


def meets_similarity_threshold(
    mcs, query_ligand: Chem.Mol, reference: Chem.Mol, min_threshold=0.20
) -> bool:
    if mcs_result_exists(mcs, query_ligand):
        mcs_mol = Chem.MolFromSmarts(mcs.smartsString, mergeHs=True)
        match_ratio = min(
            mcs_mol.GetNumAtoms() / query_ligand.GetNumAtoms(),
            mcs_mol.GetNumAtoms() / reference.GetNumAtoms(),
        )
        return match_ratio >= min_threshold

    return False


INITIAL_FILTER_DISTANCE = 1000


def get_diverse_substructure_matches(
    reference, mcs_mol, minimum_difference=5
) -> List[tuple]:
    """
    Prune the MCS substructure matches within a molecule, keeping only those that differ
    from every match already kept in at least `minimum_difference` atom positions.
    This captures distinct mappings of the substructure while discarding those that are
    not meaningfully different (e.g. a rotation about a bond).
    The default of 5 balances retaining novel matches against discarding near-duplicates.
    """
    substructures = reference.GetSubstructMatches(mcs_mol, uniquify=False)

    output_structures = [substructures[0]]
    for substructure in substructures[1:]:
        distance = INITIAL_FILTER_DISTANCE
        j = 0

        while (distance >= minimum_difference) and j < len(output_structures):
            ref = np.array(output_structures[j])
            distance = sum(np.array(substructure) != ref)
            j += 1

        if distance >= minimum_difference:
            output_structures.append(substructure)

    return output_structures


def get_tethered_atoms(substruct_match) -> str:
    """
    Return a formatted string of atom indexes to pass to "TETHERED ATOMS" configuration option for rxdock

    Example input:
    (0, 1, 2)

    Example output:
    1,2,3
    """
    return ",".join(str(index + 1) for index in substruct_match)


def mcs_result_exists(mcs, query) -> bool:
    # An empty or missing SMARTS string means no common substructure was found
    return bool(mcs.smartsString)


def get_force_field(geom_calc, query_mol):
    ff = rdForceFieldHelpers.UFFGetMoleculeForceField(query_mol, confId=0)

    for i in geom_calc.coordMap:
        point = geom_calc.coordMap[i]
        point_idx = ff.AddExtraPoint(point.x, point.y, point.z, fixed=True) - 1
        ff.AddDistanceConstraint(point_idx, i, 0, 0, 100.0)
    ff.Initialize()

    return ff


def get_template_aligned_pose(
    query, reference, query_ligand_map, geom_calc, n_tries=5
) -> Chem.Mol:
    """
    Align the query molecule to the "template" (reference molecule).
    Distance constraints are enforced on the input ligand to mould the geometry of its matching substructure to that of the template.
    """
    temp_query_mol = Chem.Mol(query)

    min_energy = np.inf
    bestmol = None

    for _ in range(n_tries):
        # Repeat the alignment step at each starting configuration
        # Places internal geometry (bond angles, torsion, etc) of parts of the ligand
        # that matches that in the reference
        # Need to error here if ci > 0
        AllChem.EmbedMolecule(temp_query_mol, geom_calc)

        # This step is critical as it places the molecule in a good starting position relative to the reference
        try:
            AlignMol(temp_query_mol, reference, atomMap=query_ligand_map)
        except Exception:
            print(f"Failed to align {MolToSmiles(query)}")
            ALIGNMENT_FAILURES.append(temp_query_mol)

        try:
            ff = get_force_field(geom_calc, temp_query_mol)
        except Exception:
            print(f"Failed to get forcefield for {MolToSmiles(temp_query_mol)}")
            return bestmol

        minimize_tries = 4
        more_to_minimize = ff.Minimize(energyTol=1e-4, forceTol=1e-3)
        while more_to_minimize and minimize_tries:
            more_to_minimize = ff.Minimize(energyTol=1e-4, forceTol=1e-3)
            minimize_tries -= 1
        current_energy = ff.CalcEnergy()

        if current_energy < min_energy:
            bestmol = Chem.Mol(temp_query_mol)
            min_energy = current_energy
        temp_query_mol = Chem.Mol(query)

        try:
            AlignMol(temp_query_mol, reference, atomMap=query_ligand_map)
        except Exception:
            print(f"Failed to align {MolToSmiles(query)}")
            ALIGNMENT_FAILURES.append(temp_query_mol)

    return bestmol


def get_substructure_matches(molecule, mcs_mol, uniquify=True) -> Chem.Mol:
    return molecule.GetSubstructMatches(mcs_mol, uniquify=uniquify)


def get_dist_geom_calculator(
    reference, query_matches, reference_match, constrained_atoms
):
    geom_calc = rdDistGeom.ETKDGv3()
    geom_calc.trackFailures = True
    geom_calc.coordMap = {
        query_matches[atom_idx]: reference.GetConformer().GetAtomPosition(
            reference_match[atom_idx]
        )
        for atom_idx in range(len(constrained_atoms))
    }

    return geom_calc


def get_initial_poses(query, reference, max_symmetry=5):
    """
    Use maximum common substructure between query and template ligand to generate initial docking poses
    of query ligand

    """
    mcs_mol = get_mcs(query, reference)
    if not mcs_mol:
        # Record the ligand that had no usable common substructure
        MCS_FAILURES.append(query)
        return []
    reference_matches = get_diverse_substructure_matches(reference, mcs_mol)
    molecule_matches = get_substructure_matches(query, mcs_mol)

    constrained_atoms = molecule_matches[0]
    constrained_atom_ids = get_tethered_atoms(constrained_atoms)

    molhits = query.GetSubstructMatch(mcs_mol)

    poses = []

    n_iterations = min(len(reference_matches), max_symmetry)

    for i in range(n_iterations):
        geom_calc = get_dist_geom_calculator(
            reference, molhits, reference_matches[i], constrained_atoms
        )

        posed_mol = get_template_aligned_pose(
            query,
            reference,
            [
                (m_idx, r_idx)
                for m_idx, r_idx in zip(molhits, reference_matches[i])
            ],
            geom_calc,
        )
        if posed_mol:
            posed_mol.SetProp("TETHERED ATOMS", constrained_atom_ids)
            poses.append(posed_mol)

    return poses
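
The pruning logic in `get_diverse_substructure_matches` can be hard to follow from the `while` loop alone. The following is a minimal pure-Python sketch of the same idea, assuming each match is a tuple of atom indices of equal length: a candidate is kept only if it differs from every previously kept match in at least `minimum_difference` positions. The names `hamming` and `prune_matches` and the sample matches are illustrative and not part of the notebook.

```python
from typing import List, Sequence, Tuple


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions at which two equal-length index tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def prune_matches(
    matches: Sequence[Sequence[int]], minimum_difference: int = 5
) -> List[Tuple[int, ...]]:
    """Keep the first match, then only matches far enough from all kept ones."""
    if not matches:
        return []
    kept = [tuple(matches[0])]
    for candidate in matches[1:]:
        if all(hamming(candidate, k) >= minimum_difference for k in kept):
            kept.append(tuple(candidate))
    return kept


# Hypothetical matches of a 6-atom substructure.
sample_matches = [
    (0, 1, 2, 3, 4, 5),
    (1, 0, 2, 3, 4, 5),  # differs in 2 positions: treated as a near-duplicate
    (5, 4, 3, 2, 1, 0),  # differs in all 6 positions: kept
]
diverse = prune_matches(sample_matches)
print(diverse)
```

Unlike the notebook helper, this sketch returns an empty list when there are no matches instead of raising an `IndexError`.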

## 1.0) Virtual screen library
This section downloads a virtual screen library from ZINC20 and converts it into RDKit molecules that we can constrain.

In [8]:
def load_zinc20_virtual_screen_library(url) -> List[Chem.Mol]:
    ligands = []
    with requests.get(url, stream=True) as response:
        response.raise_for_status()

        lines = (line.decode("utf-8") for line in response.iter_lines())
        reader = csv.DictReader(lines)

        for row in reader:
            if "smiles" in row:
                mol = MolFromSmiles(row["smiles"])
                if mol is None:
                    # Skip rows whose SMILES RDKit cannot parse
                    continue
                mol = Chem.AddHs(mol)
                # For reproducibility. In principle it won't matter much because we're moving it to fit the template
                Chem.rdDistGeom.EmbedMolecule(mol, 1, randomSeed=0xF00D)
                mol.SetProp("ZINC_ID", row["zinc_id"])

                ligands.append(mol)

    return ligands


query_ligands = load_zinc20_virtual_screen_library(VIRTUAL_SCREEN_LIBRARY_URL)

[01:18:03] UFFTYPER: Unrecognized atom type: S_6+6 (1)
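
The CSV handling in the loader can be tried offline before pointing it at a large download. This is a small sketch that uses only the standard library and an in-memory sample; the `zinc_id` and `smiles` column names come from the loader above, while the identifiers in `SAMPLE_CSV` and the helper name `read_library_rows` are made up for illustration.

```python
import csv
import io

# Hypothetical sample in the same column layout the loader expects.
SAMPLE_CSV = (
    "zinc_id,smiles\n"
    "ZINC000000000001,CCO\n"
    "ZINC000000000002,\n"
    "ZINC000000000003,c1ccccc1\n"
)


def read_library_rows(lines):
    """Yield-style helper returning (zinc_id, smiles) pairs, skipping empty SMILES."""
    reader = csv.DictReader(lines)
    rows = []
    for row in reader:
        smiles = (row.get("smiles") or "").strip()
        if not smiles:
            continue  # nothing to convert into a molecule
        rows.append((row["zinc_id"], smiles))
    return rows


rows = read_library_rows(io.StringIO(SAMPLE_CSV))
print(rows)
```

`csv.DictReader` accepts any iterable of text lines, which is why the loader can feed it decoded lines from a streamed HTTP response in the same way.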


In [9]:
# Pass the path as a string; some RDKit versions do not accept pathlib.Path here
suppl = SDMolSupplier(str(REFERENCE_LIGAND_FILEPATH))
reference_ligand = suppl[0]
assert reference_ligand is not None



## 1.1) Generate initial poses
In this stage, we generate initial constrained poses from our query ligands that we will feed to our docking pipeline.

In [10]:
from mpire import WorkerPool
import multiprocessing

poses = []


def process_query_ligand(query_ligand):
    """Function to process each query ligand with get_initial_poses"""
    # print(MolToSmiles(query_ligand))
    return get_initial_poses(query_ligand, reference_ligand)


num_processes = max(multiprocessing.cpu_count() - 5, 1)

with WorkerPool(n_jobs=num_processes, enable_insights=True) as pool:
    poses.extend(pool.map_unordered(process_query_ligand, query_ligands))

Failed to align [H]c1c(OC([H])([H])[C@@]2([H])OC(=O)N([H])C2([H])[H])c([H])c(C([H])([H])[H])c([H])c1C([H])([H])[H]
Failed to align [H]c1c(OC([H])([H])[C@@]2([H])OC(=O)N([H])C2([H])[H])c([H])c(C([H])([H])[H])c([H])c1C([H])([H])[H]
Failed to align [H]N([H])C12C([H])([H])C3([H])C([H])([H])[C@@](C([H])([H])[H])(C1([H])[H])C([H])([H])[C@](C([H])([H])[H])(C3([H])[H])C2([H])[H]
Failed to align [H]c1c([H])c([H])c(C(=O)[C@]([H])(N(C([H])([H])C([H])([H])[H])C([H])([H])C([H])([H])[H])C([H])([H])[H])c([H])c1[H]
Failed to align [H]c1c(OC([H])([H])[C@@]2([H])OC(=O)N([H])C2([H])[H])c([H])c(C([H])([H])[H])c([H])c1C([H])([H])[H]
Failed to align [H]Oc1c([H])c([H])c(O[C@]2([H])O[C@]([H])(C([H])([H])O[H])[C@@]([H])(O[H])[C@]([H])(O[H])[C@@]2([H])O[H])c([H])c1[H]
Failed to align [H]c1c([H])c([H])c2c(c1[H])C(=O)N([C@@]1([H])C(=O)N([H])C(=O)C([H])([H])C1([H])[H])C2=O
Failed to align [H]c1c(OC([H])([H])[C@@]2([H])OC(=O)N([H])C2([H])[H])c([H])c(C([H])([H])[H])c([H])c1C([H])([H])[H]
Failed to align [H]N([H])C12

In [11]:
poses = [pose for pose in poses if pose]
poses

[[<rdkit.Chem.rdchem.Mol at 0x7fffc9b9eb10>],
 [<rdkit.Chem.rdchem.Mol at 0x7fffc9b9ecf0>],
 [<rdkit.Chem.rdchem.Mol at 0x7fffc9b9ed90>],
 [<rdkit.Chem.rdchem.Mol at 0x7fffc9b9eed0>],
 [<rdkit.Chem.rdchem.Mol at 0x7fffc9b9e390>],
 [<rdkit.Chem.rdchem.Mol at 0x7fffd08d1120>],
 [<rdkit.Chem.rdchem.Mol at 0x7fffc9b9e8e0>],
 [<rdkit.Chem.rdchem.Mol at 0x7fffc9b9ef20>],
 [<rdkit.Chem.rdchem.Mol at 0x7fffd08cbdd0>],
 [<rdkit.Chem.rdchem.Mol at 0x7fffc9b9c810>],
 [<rdkit.Chem.rdchem.Mol at 0x7fffc9b9e610>],
 [<rdkit.Chem.rdchem.Mol at 0x7fffc9b9e9d0>],
 [<rdkit.Chem.rdchem.Mol at 0x7fffc9b9e2f0>],
 [<rdkit.Chem.rdchem.Mol at 0x7fffc9b9d800>],
 [<rdkit.Chem.rdchem.Mol at 0x7fffc9b9e7a0>],
 [<rdkit.Chem.rdchem.Mol at 0x7fffc9b9eb60>],
 [<rdkit.Chem.rdchem.Mol at 0x7fffc9b9e890>],
 [<rdkit.Chem.rdchem.Mol at 0x7fffc9b9e3e0>],
 [<rdkit.Chem.rdchem.Mol at 0x7fffc9b9e340>],
 [<rdkit.Chem.rdchem.Mol at 0x7fffc9b9e6b0>],
 [<rdkit.Chem.rdchem.Mol at 0x7fffc9b9e4d0>,
  <rdkit.Chem.rdchem.Mol at 0x7fffc

## 1.2) Write poses to SDF files
In this stage, we write the template-aligned initial poses generated in the previous step to SDF files, one file per pose.

In [14]:
INITIAL_POSE_PATH = Path.cwd() / "initial_poses"
if os.path.exists(INITIAL_POSE_PATH):
    shutil.rmtree(INITIAL_POSE_PATH)
os.makedirs(INITIAL_POSE_PATH)

In [15]:
molecule_hashes = {}
os.chdir(INITIAL_POSE_PATH)
for query_poses in poses:
    for query_pose in query_poses:
        mol_hash = hash(query_pose)
        molecule_hashes[mol_hash] = query_pose

        filename = f"{mol_hash}.sdf"
        writer = Chem.SDWriter(filename)
        writer.write(query_pose)
        writer.close()
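
Python's built-in `hash()` for objects that do not define their own `__hash__` is derived from object identity, so the file names above are likely to change from one session to the next. If reproducible names matter, one option is to derive the key from stable data instead. The sketch below is an alternative rather than what the notebook does: it assumes you have the ligand's ZINC ID, a SMILES string, and a pose index available, and `pose_key` is a hypothetical helper.

```python
import hashlib


def pose_key(zinc_id: str, smiles: str, pose_index: int) -> str:
    """Build a short, deterministic key from stable ligand data."""
    payload = f"{zinc_id}|{smiles}|{pose_index}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:16]


# Hypothetical ligand; the same inputs always give the same key.
key_a = pose_key("ZINC000000000001", "CCO", 0)
key_b = pose_key("ZINC000000000001", "CCO", 0)
key_c = pose_key("ZINC000000000001", "CCO", 1)
print(key_a, key_c)
```

Including the pose index keeps multiple poses of the same ligand from overwriting each other.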

## 1.3) Run constrained docking via the Rush platform
In this step, we use rxdock and gnina to run our constrained docking workflow.

In [16]:
os.chdir(WORK_DIR)
rxdock_outputs = []

TETHERED_DOCKING_CONFIGURATION = {
    "rot_mode": "TETHERED",
    "trans_mode": "TETHERED",
    "max_rot": 3.0,
    "max_trans": 1.0,
}
for pose in molecule_hashes:
    print(pose)
    (conformers, scores, sdf) = await client.rxdock(
        None,
        None,
        {"n_runs": 10},
        TETHERED_DOCKING_CONFIGURATION,
        None,
        PROTEIN_TARGET_FILEPATH,
        INITIAL_POSE_PATH / f"{pose}.sdf",
    )
    rxdock_outputs.append((sdf, pose))

await asyncio.gather(
    *(
        sdf.download(
            filename=f"{pose}_rxdock.sdf",
        )
        for sdf, pose in rxdock_outputs
    )
)

8796036112049
8796036112079
8796036112089
8796036112109
8796036111929
8796043268370
8796036112014
8796036112114
8796043267037
8796036111489
8796036111969
8796036112029
8796036111919
8796036111744
8796036111994
8796036112054
8796036112009
8796036111934
8796036111924
8796036111979
2024-05-14 01:20:03,924 - rush - INFO - Argument 7bae9de4-19a9-4326-be55-6b66923ec30b is now ModuleInstanceStatus.ADMITTED
2024-05-14 01:20:04,033 - rush - INFO - Argument 5e8b72ba-8a7d-4fcc-8e91-60fd3eb89b5e is now ModuleInstanceStatus.RESOLVING
2024-05-14 01:20:04,063 - rush - INFO - Argument a074bc77-b987-4e92-aeeb-db305658ad66 is now ModuleInstanceStatus.RESOLVING
2024-05-14 01:20:04,068 - rush - INFO - Argument e1c2df85-08b6-41ee-a122-db647f334b19 is now ModuleInstanceStatus.RESOLVING
2024-05-14 01:20:04,082 - rush - INFO - Argument cd511845-1c88-4112-a57a-a16d05eaa326 is now ModuleInstanceStatus.RESOLVING
2024-05-14 01:20:04,084 - rush - INFO - Argument 4155f66d-e767-494b-aea8-efedce8d099d is now ModuleIn

[PosixPath('objects/8796036112049_rxdock.sdf'),
 PosixPath('objects/8796036112079_rxdock.sdf'),
 PosixPath('objects/8796036112089_rxdock.sdf'),
 PosixPath('objects/8796036112109_rxdock.sdf'),
 PosixPath('objects/8796036111929_rxdock.sdf'),
 PosixPath('objects/8796043268370_rxdock.sdf'),
 PosixPath('objects/8796036112014_rxdock.sdf'),
 PosixPath('objects/8796036112114_rxdock.sdf'),
 PosixPath('objects/8796043267037_rxdock.sdf'),
 PosixPath('objects/8796036111489_rxdock.sdf'),
 PosixPath('objects/8796036111969_rxdock.sdf'),
 PosixPath('objects/8796036112029_rxdock.sdf'),
 PosixPath('objects/8796036111919_rxdock.sdf'),
 PosixPath('objects/8796036111744_rxdock.sdf'),
 PosixPath('objects/8796036111994_rxdock.sdf'),
 PosixPath('objects/8796036112054_rxdock.sdf'),
 PosixPath('objects/8796036112009_rxdock.sdf'),
 PosixPath('objects/8796036111934_rxdock.sdf'),
 PosixPath('objects/8796036111924_rxdock.sdf'),
 PosixPath('objects/8796036111979_rxdock.sdf')]
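
The download step relies on `asyncio.gather`, which runs the awaitables concurrently and returns their results in the order they were passed in, regardless of which finishes first. The sketch below demonstrates that pattern with a stand-in coroutine; `fake_download` and its delays are hypothetical and do not call the Rush API. In a notebook you would `await download_all(...)` directly, whereas a plain script needs `asyncio.run`.

```python
import asyncio


async def fake_download(name: str, delay: float) -> str:
    """Stand-in for a remote download; sleeps, then returns a local path."""
    await asyncio.sleep(delay)
    return f"objects/{name}_rxdock.sdf"


async def download_all(jobs):
    """Start every download at once and collect results in submission order."""
    return await asyncio.gather(
        *(fake_download(name, delay) for name, delay in jobs)
    )


# The first job is the slowest, yet it still comes back first in the result list.
jobs = [("a", 0.03), ("b", 0.02), ("c", 0.01)]
paths = asyncio.run(download_all(jobs))
print(paths)
```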

In [17]:
gnina_results = []

for pose in molecule_hashes:
    (docked_ligands, results) = await client.gnina_pdb(
        PROTEIN_TARGET_FILEPATH,
        Path.cwd() / "objects" / f"{pose}_rxdock.sdf",
        INITIAL_POSE_PATH / f"{pose}.sdf",
        {"num_modes": 10, "exhaustiveness": 8, "minimise": True},
    )
    gnina_results.append((pose, results, docked_ligands))

await asyncio.gather(*([output[1].get() for output in gnina_results]))

await asyncio.gather(
    *(
        [
            output[2].download(filename=f"{output[0]}_gnina.sdf", overwrite=True)
            for output in gnina_results
        ]
    )
)

2024-05-14 01:35:58,341 - rush - INFO - Argument dde77355-43f6-457c-968b-93a165f0e6e7 is now ModuleInstanceStatus.DISPATCHED
2024-05-14 01:35:58,477 - rush - INFO - Argument 558638b0-470b-475e-9a91-d5cbffced3e4 is now ModuleInstanceStatus.RESOLVING
2024-05-14 01:35:58,478 - rush - INFO - Argument 05def68d-f5ec-420e-a312-3151ccaca942 is now ModuleInstanceStatus.RESOLVING
2024-05-14 01:35:58,604 - rush - INFO - Argument dc4e617a-a108-4f1c-b3fa-d0ba0241c8f1 is now ModuleInstanceStatus.RESOLVING
2024-05-14 01:35:58,609 - rush - INFO - Argument 787257fa-6f97-4849-8a0b-0dad9c59219d is now ModuleInstanceStatus.RESOLVING
2024-05-14 01:35:58,622 - rush - INFO - Argument 205f31ad-38d3-4a78-be52-4c29c5295f6b is now ModuleInstanceStatus.RESOLVING
2024-05-14 01:35:58,623 - rush - INFO - Argument 23ac01fe-f468-4752-a7cd-6990515df2a0 is now ModuleInstanceStatus.RESOLVING
2024-05-14 01:35:58,624 - rush - INFO - Argument e826a6c2-b68e-4b64-ab1e-fb70e0a010fb is now ModuleInstanceStatus.RESOLVING
2024-05

[PosixPath('objects/8796036112049_gnina.sdf'),
 PosixPath('objects/8796036112079_gnina.sdf'),
 PosixPath('objects/8796036112089_gnina.sdf'),
 PosixPath('objects/8796036112109_gnina.sdf'),
 PosixPath('objects/8796036111929_gnina.sdf'),
 PosixPath('objects/8796043268370_gnina.sdf'),
 PosixPath('objects/8796036112014_gnina.sdf'),
 PosixPath('objects/8796036112114_gnina.sdf'),
 PosixPath('objects/8796043267037_gnina.sdf'),
 PosixPath('objects/8796036111489_gnina.sdf'),
 PosixPath('objects/8796036111969_gnina.sdf'),
 PosixPath('objects/8796036112029_gnina.sdf'),
 PosixPath('objects/8796036111919_gnina.sdf'),
 PosixPath('objects/8796036111744_gnina.sdf'),
 PosixPath('objects/8796036111994_gnina.sdf'),
 PosixPath('objects/8796036112054_gnina.sdf'),
 PosixPath('objects/8796036112009_gnina.sdf'),
 PosixPath('objects/8796036111934_gnina.sdf'),
 PosixPath('objects/8796036111924_gnina.sdf'),
 PosixPath('objects/8796036111979_gnina.sdf')]

## 1.4) Sorting and presentation of results
In this section, we sort and display the top-scoring molecules from our virtual screen, including the QDXF conformers for each pose.

In [18]:
POSE = 0
GNINA_SCORES = 1
GNINA_SDF_RESULTS = 2

In [19]:
# Destructure the gnina scores from the Rush output
gnina_results = [
    (result[POSE], result[GNINA_SCORES].value, result[GNINA_SDF_RESULTS])
    for result in gnina_results
]


def find_best_pose(sublist, sort_key=SORT_BY):
    return max(item[sort_key.value] for item in sublist[1])


sorted_hits = sorted(gnina_results, key=find_best_pose, reverse=True)
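
Taking the maximum and sorting in descending order suits `cnn_score` and `cnn_affinity`, where higher values are better. GNINA's `affinity` is an energy-like score where lower (more negative) values are better, so sorting by it needs the opposite direction. The following sketch shows one way to make the ranking direction depend on the key; `LOWER_IS_BETTER`, `best_value`, and `rank_hits` are illustrative names, `SortKey` is redefined only to keep the block self-contained, and the sample scores are copied from the output shown later in this notebook.

```python
from enum import Enum


class SortKey(Enum):
    CNN_SCORE = "cnn_score"
    AFFINITY = "affinity"
    CNN_AFFINITY = "cnn_affinity"


# Keys for which a smaller value indicates a better pose.
LOWER_IS_BETTER = {SortKey.AFFINITY}


def best_value(poses, sort_key):
    """Best per-molecule value for the chosen key, respecting its direction."""
    values = [pose[sort_key.value] for pose in poses]
    return min(values) if sort_key in LOWER_IS_BETTER else max(values)


def rank_hits(hits, sort_key):
    """Sort (molecule id, poses) pairs from best to worst for the chosen key."""
    return sorted(
        hits,
        key=lambda hit: best_value(hit[1], sort_key),
        reverse=sort_key not in LOWER_IS_BETTER,
    )


hits = [
    ("8796036112049", [
        {"cnn_score": 0.78999656, "affinity": -7.91674, "cnn_affinity": 5.2846885},
        {"cnn_score": 0.7427761, "affinity": -7.14572, "cnn_affinity": 5.297982},
    ]),
    ("8796036112079", [
        {"cnn_score": 0.79343325, "affinity": -8.49743, "cnn_affinity": 4.441545},
        {"cnn_score": 0.7437881, "affinity": -8.34523, "cnn_affinity": 4.2680154},
    ]),
    ("8796036112089", [
        {"cnn_score": 0.31951967, "affinity": -5.85809, "cnn_affinity": 4.20428},
        {"cnn_score": 0.21155676, "affinity": -5.71027, "cnn_affinity": 4.047934},
    ]),
]

by_cnn_score = [mol_id for mol_id, _ in rank_hits(hits, SortKey.CNN_SCORE)]
by_affinity = [mol_id for mol_id, _ in rank_hits(hits, SortKey.AFFINITY)]
by_cnn_affinity = [mol_id for mol_id, _ in rank_hits(hits, SortKey.CNN_AFFINITY)]
print(by_cnn_score, by_affinity, by_cnn_affinity)
```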

In [20]:
posed_conformers = []
for result in gnina_results:
    (conformers,) = await client.convert("SDF", result[GNINA_SDF_RESULTS])
    posed_conformers.append((result[POSE], conformers))
await asyncio.gather(
    *(
        [
            output[1].download(
                filename=f"{output[0]}_gnina_conformer.json", overwrite=True
            )
            for output in posed_conformers
        ]
    )
)

2024-05-14 01:41:09,686 - rush - INFO - Argument bcbf0b2f-1f00-4255-bd21-579b6336a6de is now ModuleInstanceStatus.ADMITTED
2024-05-14 01:41:09,852 - rush - INFO - Argument 890a2a69-daff-47c7-8eb9-5d732f43bbc7 is now ModuleInstanceStatus.RESOLVING
2024-05-14 01:41:09,964 - rush - INFO - Argument ae70b5c1-cea9-4dac-8bfb-51845d86b28f is now ModuleInstanceStatus.ADMITTED
2024-05-14 01:41:09,979 - rush - INFO - Argument 04c0d86d-3bea-4e12-969d-f5be615185ba is now ModuleInstanceStatus.RESOLVING
2024-05-14 01:41:10,004 - rush - INFO - Argument 21148b46-d72e-4532-a224-6128449b87a0 is now ModuleInstanceStatus.ADMITTED
2024-05-14 01:41:10,007 - rush - INFO - Argument 5e2debea-df81-4633-a50e-096ec02d602b is now ModuleInstanceStatus.ADMITTED
2024-05-14 01:41:10,015 - rush - INFO - Argument 3f2c19a1-0408-40a9-a9c9-5405493f87ca is now ModuleInstanceStatus.RESOLVING
2024-05-14 01:41:10,017 - rush - INFO - Argument 39ad93cf-dca2-4553-9c6d-43efeea24152 is now ModuleInstanceStatus.RESOLVING
2024-05-14 0

[PosixPath('objects/8796036112049_gnina_conformer.json'),
 PosixPath('objects/8796036112079_gnina_conformer.json'),
 PosixPath('objects/8796036112089_gnina_conformer.json'),
 PosixPath('objects/8796036112109_gnina_conformer.json'),
 PosixPath('objects/8796036111929_gnina_conformer.json'),
 PosixPath('objects/8796043268370_gnina_conformer.json'),
 PosixPath('objects/8796036112014_gnina_conformer.json'),
 PosixPath('objects/8796036112114_gnina_conformer.json'),
 PosixPath('objects/8796043267037_gnina_conformer.json'),
 PosixPath('objects/8796036111489_gnina_conformer.json'),
 PosixPath('objects/8796036111969_gnina_conformer.json'),
 PosixPath('objects/8796036112029_gnina_conformer.json'),
 PosixPath('objects/8796036111919_gnina_conformer.json'),
 PosixPath('objects/8796036111744_gnina_conformer.json'),
 PosixPath('objects/8796036111994_gnina_conformer.json'),
 PosixPath('objects/8796036112054_gnina_conformer.json'),
 PosixPath('objects/8796036112009_gnina_conformer.json'),
 PosixPath('ob

In [21]:
final_output = []
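# Iterating over gnina_results keeps submission order; iterate over sorted_hits
# instead to emit molecules ranked by SORT_BY
for result in gnina_results: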
    mhash = result[POSE]
    file_path = Path.cwd() / "objects" / f"{mhash}_gnina_conformer.json"
    conformers = json.loads(file_path.read_text())
    scores = result[GNINA_SCORES]
    final_output.append(
        (
            mhash,
            molecule_hashes[mhash],
            [{**i, **j} for i, j in zip(conformers, scores)],
        )
    )
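
The merge above pairs each conformer with its score record by position. Two details are easy to miss: `zip` silently stops at the shorter list, and with `{**i, **j}` any key present in both dictionaries takes its value from the second. The sketch below makes the length check explicit; the record contents are simplified, hypothetical stand-ins for the real QDXF conformers and GNINA scores, and `merge_pose_records` is an illustrative helper.

```python
def merge_pose_records(conformers, scores):
    """Merge conformer and score dicts pairwise, refusing mismatched lengths."""
    if len(conformers) != len(scores):
        raise ValueError(
            f"{len(conformers)} conformers but {len(scores)} score records"
        )
    return [{**conformer, **score} for conformer, score in zip(conformers, scores)]


# Hypothetical, heavily simplified records.
conformers = [
    {"topology": {"symbols": ["C", "O"]}},
    {"topology": {"symbols": ["C", "O"]}},
]
scores = [
    {"cnn_score": 0.79, "affinity": -7.9},
    {"cnn_score": 0.74, "affinity": -7.1},
]

merged = merge_pose_records(conformers, scores)
print(merged[0])
```

Raising on a mismatch surfaces a dropped pose early instead of quietly producing a shorter table.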

## 1.4.1) Final table presentation
Here we display the results of our virtual screen. `rxdock` is configured for 10 runs per query ligand and GNINA for up to 10 modes, but this protocol displays only the first 2 poses per molecule by default (`N_POSES_PER_MOLECULE`).

In [22]:
N_TOP_HITS = 10
N_POSES_PER_MOLECULE = 2

for molecule in final_output[0:N_TOP_HITS]:
    print("=" * 50)
    print(f"{MolToSmiles(molecule[1])}, hash: {molecule[0]}")

    for idx, pose in enumerate(molecule[2][0:N_POSES_PER_MOLECULE]):
        print(f"POSE {idx}")
        print(f"CNN score: {pose['cnn_score']}")
        print(f"Affinity: {pose['affinity']}")
        print(f"CNN affinity: {pose['cnn_affinity']}")
        # print(f'{pose['topology']['symbols'][0:5]}') if you want to inspect the output QDXF conformer
        print("-" * 50)

[H]c1c(OC(F)(F)F)c([H])c2sc(N([H])[H])nc2c1[H], hash: 8796036112049
POSE 0
CNN score: 0.78999656
Affinity: -7.91674
CNN affinity: 5.2846885
--------------------------------------------------
POSE 1
CNN score: 0.7427761
Affinity: -7.14572
CNN affinity: 5.297982
--------------------------------------------------
[H]c1c([H])c([H])c(C(=O)c2c([H])c([H])c([H])c([H])c2[H])c([H])c1[H], hash: 8796036112079
POSE 0
CNN score: 0.79343325
Affinity: -8.49743
CNN affinity: 4.441545
--------------------------------------------------
POSE 1
CNN score: 0.7437881
Affinity: -8.34523
CNN affinity: 4.2680154
--------------------------------------------------
[H]OC([H])([H])[C@@]([H])(O[H])C([H])([H])Oc1c([H])c([H])c([H])c([H])c1OC([H])([H])[H], hash: 8796036112089
POSE 0
CNN score: 0.31951967
Affinity: -5.85809
CNN affinity: 4.20428
--------------------------------------------------
POSE 1
CNN score: 0.21155676
Affinity: -5.71027
CNN affinity: 4.047934
--------------------------------------------------
[H]c