## Icolos Docking Workflow Demo
Icolos can perform automated docking, with support for advanced features such as ensemble docking and pose rescoring. 

In this notebook, we demonstrate a minimal working example of a docking workflow using LigPrep and Glide. More complex workflow examples, including rescoring methods and ensemble docking, and a comprehensive list of additional settings can be found in the documentation.

Files required to execute the workflow are provided in the accompanying IcolosData repository, available at https://github.com/MolecularAI/IcolosData.

Note: we provide an `icoloscommunity` environment, which should be used for this notebook. It contains the Jupyter dependencies in addition to the Icolos production environment requirements, allowing you to execute workflows from within the notebook.

### Step 1: Prepare input files
The following files are required to start the docking run:
* Receptor grid (normally prepared in the Maestro GUI)
* SMILES strings for the compounds to dock, in `.smi` or `.csv` format
* Icolos config file: a `JSON` file containing the run settings. Templates for the most common workflows can be found in the `examples` folder of the main Icolos repository.
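
Before launching a run, it can help to confirm that the input files listed above actually exist. The following is a minimal sketch rather than part of Icolos: the helper `missing_inputs` and the file names `compounds.smi` and `grid.zip` are hypothetical, and the one-line SMILES-plus-name layout written to the test file is an assumed example layout.

```python
import os
import tempfile


def missing_inputs(paths):
    """Return the entries of `paths` that do not exist on disk, after expanding `~`."""
    return [p for p in paths if not os.path.exists(os.path.expanduser(p))]


# A temporary directory stands in for the data folder so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    smiles_file = os.path.join(tmp, "compounds.smi")  # hypothetical file name
    grid_file = os.path.join(tmp, "grid.zip")  # hypothetical file name, never created

    # Assumed layout: a SMILES string followed by a compound name.
    with open(smiles_file, "w") as f:
        f.write("CC(=O)Nc1ccc(O)cc1 Paracetamol\n")

    missing = missing_inputs([smiles_file, grid_file])
    print("Missing inputs:", [os.path.basename(p) for p in missing])
```

In the notebook itself, the same helper could be called on `grid_path` and `smiles_path` once they are defined.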

In [13]:
import os
import json
import subprocess
import pandas as pd

# set up some file paths to use the provided test data
# please amend as appropriate
icolos_path = "~/Icolos"
data_dir = "~/IcolosData"
output_dir = "../output"
config_dir = "../config/docking"
# create the output and config directories if they do not exist yet
for path in [output_dir, config_dir]:
    os.makedirs(path, exist_ok=True)
grid_path = os.path.expanduser(os.path.join(data_dir, "Glide/1UYD_grid_constraints.zip"))
smiles_path = os.path.expanduser(os.path.join(data_dir, "molecules/paracetamol.smi"))




In [14]:
conf={
    "workflow": {
        "header": {
            "workflow_id": "docking_minimal",
            "description": "demonstration docking job with LigPrep + Glide",
            "environment": {
                "export": [
                ]
            },
            "global_variables": {
                
            }
        },
        "steps": [{
                "step_id": "initialization_smile",
                "type": "initialization",
                "input": {
                    # specify compounds parsed from the .smi file
                    "compounds": [{
                            "source": smiles_path,
                            "source_type": "file",
                            "format": "SMI"
                        }
                    ]
                }
            }, {
                "step_id": "Ligprep",
                "type": "ligprep",
                "execution": {
                    "prefix_execution": "module load schrodinger/2021-2-js-aws",
                    "parallelization": {
                        "cores": 2,
                        "max_length_sublists": 1
                    },
                    # automatic resubmission on job failure
                    "failure_policy": {
                        "n_tries": 3
                    }
                },
                "settings": {
                    "arguments": {
                        # flags and params passed straight to LigPrep
                        "flags": ["-epik"],
                        "parameters": {
                            "-ph": 7.0,
                            "-pht": 2.0,
                            "-s": 10,
                            "-bff": 14
                        }
                    },
                    "additional": {
                        "filter_file": {
                            "Total_charge": "!= 0"
                        }
                    }
                },
                "input": {
                    # load initialized compounds from the previous step
                    "compounds": [{
                            "source": "initialization_smile",
                            "source_type": "step"
                        }
                    ]
                }
            }, {
                "step_id": "Glide",
                "type": "glide",
                "execution": {
                    "prefix_execution": "module load schrodinger/2021-2-js-aws",
                    "parallelization": {
                        "cores": 4,
                        "max_length_sublists": 1
                    },
                    "failure_policy": {
                        "n_tries": 3
                    }
                },
                "settings": {
                    "arguments": {
                        "flags": [],
                        "parameters": {
                            "-HOST": "cpu-only"
                        }
                    },
                    "additional": {
                        # glide configuration for the .in file
                        "configuration": {
                            "AMIDE_MODE": "trans",
                            "EXPANDED_SAMPLING": "True",
                            "GRIDFILE": [grid_path],
                            "NENHANCED_SAMPLING": "1",
                            "POSE_OUTTYPE": "ligandlib_sd",
                            "POSES_PER_LIG": "3",
                            "POSTDOCK_NPOSE": "25",
                            "POSTDOCKSTRAIN": "True",
                            "PRECISION": "SP",
                            "REWARD_INTRA_HBONDS": "True"
                        }
                    }
                },
                "input": {
                    # take embedded compounds from the previous step
                    "compounds": [{
                            "source": "Ligprep",
                            "source_type": "step"
                        }
                    ]
                },
                "writeout": [
                    # write an SDF file with all conformers
                    {
                        "compounds": {
                            "category": "conformers"
                        },
                        "destination": {
                            "resource": os.path.join(output_dir,"docked_conformers.sdf"),
                            "type": "file",
                            "format": "SDF"
                        }
                    },
                    # write a csv file with the top docking score per compound
                    {
                        "compounds": {
                            "category": "conformers",
                            "selected_tags": ["docking_score"],
                            "aggregation": {
                                "mode": "best_per_compound",
                                "key": "docking_score"
                            }
                        },
                        "destination": {
                            "resource": os.path.join(output_dir, "docked_conformers.csv"),
                            "type": "file",
                            "format": "CSV"
                        }
                    }
                ]
            }
        ]
    }
}


with open(os.path.join(config_dir, "docking_conf.json"), 'w') as f:
    json.dump(conf, f, indent=4)
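
Each step in the config refers to its predecessor by `step_id` (for example, `Glide` reads from `Ligprep`), so a misspelled ID is an easy mistake to make. The sketch below is a hypothetical pre-flight check, not Icolos's own validation: `check_step_sources` is an illustrative helper, and it assumes that steps run in the order listed and may only read from steps defined before them.

```python
def check_step_sources(conf):
    """Return (step_id, source) pairs where a step reads from a step not defined before it."""
    problems = []
    seen = set()
    for step in conf["workflow"]["steps"]:
        for compound_input in step.get("input", {}).get("compounds", []):
            is_step_ref = compound_input.get("source_type") == "step"
            if is_step_ref and compound_input["source"] not in seen:
                problems.append((step["step_id"], compound_input["source"]))
        seen.add(step["step_id"])
    return problems


# A stripped-down config reusing the step IDs from the workflow above.
mini_conf = {
    "workflow": {
        "steps": [
            {"step_id": "initialization_smile",
             "input": {"compounds": [{"source": "compounds.smi", "source_type": "file"}]}},
            {"step_id": "Ligprep",
             "input": {"compounds": [{"source": "initialization_smile", "source_type": "step"}]}},
            {"step_id": "Glide",
             # deliberate typo: "LigPrep" does not match the step_id "Ligprep"
             "input": {"compounds": [{"source": "LigPrep", "source_type": "step"}]}},
        ]
    }
}

print(check_step_sources(mini_conf))
```

Passing the full `conf` dictionary to the same function should return an empty list if every step reference resolves.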

The workflow can be executed by running the following command (with paths amended as necessary). The cell below launches it from the notebook via `subprocess`; the same command can also be run in a terminal.

In [15]:
# this run will take a few minutes to complete
icolos_executor = os.path.join(icolos_path, "executor.py")
docking_conf = os.path.join(config_dir, "docking_conf.json")

command = f"python {icolos_executor} -conf {docking_conf}"
subprocess.run(command, shell=True)





CompletedProcess(args='python ~/Icolos/executor.py -conf ../config/docking/docking_conf.json', returncode=0)
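
The call above reports a `returncode` of 0, but `subprocess.run` does not raise an error by default when a command fails. As one possible variant, the sketch below passes the command as an argument list with `check=True`, so a non-zero exit code raises `subprocess.CalledProcessError`. The helper `run_command` is hypothetical, and a trivial Python one-liner stands in for the Icolos executor so the example runs anywhere.

```python
import subprocess
import sys


def run_command(args):
    """Run a command given as an argument list; raise CalledProcessError on a non-zero exit code."""
    return subprocess.run(args, check=True, capture_output=True, text=True)


# Stand-in command. For Icolos, the list could be built from the paths defined
# earlier, e.g. [sys.executable, os.path.expanduser(icolos_executor), "-conf", docking_conf].
result = run_command([sys.executable, "-c", "print('workflow finished')"])
print(result.returncode, result.stdout.strip())
```

Without `shell=True`, the `~` in a path is not expanded by a shell, which is why the commented example wraps the executor path in `os.path.expanduser`.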

We will now briefly inspect the results file.

In [16]:
results = pd.read_csv(os.path.join(output_dir, "docked_conformers.csv"))
results.head()

Unnamed: 0,_Name,compound_name,docking_score
0,0:0:2,Paracetamol,-6.02349
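
The `Unnamed: 0` column is the name pandas assigns to a column with an empty header, which suggests the CSV stores a row index in its first column. The sketch below inlines the row shown above (assuming an empty first header in the file) and reads that column as the index with `index_col=0`, then picks the row with the lowest docking score. For Glide, more negative scores indicate better predicted binding.

```python
import io

import pandas as pd

# The CSV content shown above, inlined so the sketch runs without the output file.
# The empty first header is an assumption based on the "Unnamed: 0" column name.
csv_text = ",_Name,compound_name,docking_score\n0,0:0:2,Paracetamol,-6.02349\n"

# Read the first column as the row index instead of a data column.
results = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Sort ascending so the most negative (best) score comes first.
best = results.sort_values("docking_score").iloc[0]
print(best["compound_name"], best["docking_score"])
```

With a real run, `io.StringIO(csv_text)` would be replaced by the path to `docked_conformers.csv`.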
