# Docking Ligands to Proteins

> ### In this tutorial we will cover:
> - how we can use BuildAMol's `docking` extension to facilitate the docking process of Molecules

BuildAMol is intended to let users _build_ molecular structures in any way they like. It also tries to support downstream research workflows in order to provide a streamlined user experience. We therefore added a `docking` extension in version `1.2.9` that makes it easier to use [AutoDock Vina](https://github.com/ccsb-scripps/AutoDock-Vina) from Python.

The `docking` extension acts as a wrapper and forwards to different docking libraries. Currently available docking backends are: (1) `easydock` package by [Minibaeva et al. (2023)](https://jcheminf.biomedcentral.com/articles/10.1186/s13321-023-00772-2) and (2) `dockstring` by [García-Ortegón et al. (2022)](https://pubs.acs.org/doi/full/10.1021/acs.jcim.1c01334). Naturally, you will need to install these libraries and their dependencies in order to run the docking. Check out their GitHub pages to learn more about their installation (it's not difficult 😄).

That being said, let's get started!

## Loading a Protein and getting a Ligand

To keep things simple, we will use the protein and ligand from the [Ligand Design Tutorial](https://biobuild.readthedocs.io/en/latest/examples/ligand_design.html). 

In [1]:
import buildamol as bam

protein = bam.read_pdb("files/DRD2.pdb")
ligand = bam.read_pdb("files/DRD2_ligand.pdb")

In [None]:
v = protein.py3dmol("cartoon", color="white")
v += ligand.py3dmol("stick", color="blue")
v.show()

## Loading the Docking Backend

Next we import the `docking` extension. Its central entry point is the `dock` function. By default the backend is set to `easydock`, but we can change it using the `set_docking_backend` function.

In [37]:
from buildamol.extensions import docking

Alternatively, we can directly import a specific backend using: 

```python
# directly import from whatever backend you prefer
from buildamol.extensions.docking import easydock
# or
from buildamol.extensions.docking import dockstring
```

In any case we now have access to a `dock` function that allows us to dock our ligand to our protein.
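
To illustrate the wrapper idea, here is a minimal sketch of how a module-level `dock` function can forward to a selectable backend. This is not BuildAMol's actual implementation; `register_backend`, `set_backend`, and the stand-in backends are hypothetical names made up for this example.

```python
# Illustrative only: a minimal backend registry, NOT BuildAMol's actual code.
from typing import Callable, Dict

_BACKENDS: Dict[str, Callable[..., str]] = {}
_current = "easydock"  # mirrors the default named in this tutorial


def register_backend(name: str, func: Callable[..., str]) -> None:
    """Make a docking function available under a backend name."""
    _BACKENDS[name] = func


def set_backend(name: str) -> None:
    """Select which registered backend `dock` forwards to."""
    global _current
    if name not in _BACKENDS:
        raise ValueError(f"unknown backend {name!r}; choose from {sorted(_BACKENDS)}")
    _current = name


def dock(**kwargs) -> str:
    """Forward the call unchanged to the currently selected backend."""
    return _BACKENDS[_current](**kwargs)


# Stand-in backends that only report which one was called.
register_backend("easydock", lambda **kw: "easydock called")
register_backend("dockstring", lambda **kw: "dockstring called")

print(dock(center=(0, 0, 0)))  # easydock called
set_backend("dockstring")
print(dock(center=(0, 0, 0)))  # dockstring called
```

The benefit of such a design is that the calling code stays the same regardless of which library does the work; only the backend selection changes.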

## Docking our Ligand (using _easydock_, default)

In order to dock our ligand, both `easydock` and `dockstring` need to know where we want to dock to, i.e. we need to pass the coordinates of a binding pocket. In this case, we luckily know where the binding pocket is: it's just where our current ligand is already docked 😉
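
When no pre-docked ligand is available, the box has to come from somewhere else. As a rough sketch, assuming we already have the coordinates of atoms known to line the pocket, a box center and size could be derived in plain Python as shown here. The function name `bounding_box` and the 5 Å padding are assumptions made for this example and are not part of BuildAMol.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float, float]


def bounding_box(coords: Iterable[Point], padding: float = 5.0):
    """Return (center, size) of an axis-aligned box around the coordinates.

    `padding` is added on every side, in the same units as the coordinates
    (ångström for PDB files). The value 5.0 is an arbitrary example default.
    """
    xs, ys, zs = zip(*coords)
    lows = (min(xs), min(ys), min(zs))
    highs = (max(xs), max(ys), max(zs))
    # midpoint of the box along each axis
    center = tuple((lo + hi) / 2 for lo, hi in zip(lows, highs))
    # extent along each axis plus padding on both sides
    size = tuple((hi - lo) + 2 * padding for lo, hi in zip(lows, highs))
    return center, size


# Made-up coordinates standing in for pocket atoms
pocket = [(1.0, 2.0, 3.0), (3.0, 6.0, 5.0), (2.0, 4.0, 10.0)]
center, size = bounding_box(pocket)
print(center)  # (2.0, 4.0, 6.5)
print(size)    # (12.0, 14.0, 17.0)
```

Note that this returns the midpoint of the box, which can differ from the center of geometry (the mean of all coordinates). In this tutorial neither step is needed, since we simply reuse the center of the existing ligand.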

In [38]:
# where to dock (only works because our ligand came from a previous docking run, normally we would
# need to figure out a binding site first or write some code to determine a binding site)
center = ligand.center_of_geometry

# define the box size for docking
size = (20, 20, 20)

# run docking
docked_poses = docking.dock(protein=protein, ligand=ligand, center=center, box_size=size)


Used element 'C' for Atom (name=C20) with given element 'A'


Used element 'C' for Atom (name=C21) with given element 'A'


Used element 'C' for Atom (name=C25) with given element 'A'


Used element 'C' for Atom (name=C22) with given element 'A'


Used element 'C' for Atom (name=C24) with given element 'A'


Used element 'C' for Atom (name=C23) with given element 'A'


Used element 'C' for Atom (name=C36) with given element 'A'


Used element 'C' for Atom (name=C37) with given element 'A'


Used element 'C' for Atom (name=C41) with given element 'A'


Used element 'C' for Atom (name=C38) with given element 'A'


Used element 'C' for Atom (name=C40) with given element 'A'


Used element 'C' for Atom (name=C39) with given element 'A'



`docked_poses` will be a `Molecule` instance with multiple `Model`s (one for each docked pose). 

> In case of `dockstring` each `Model` will additionally have a `docking_score` attribute (easydock sadly doesn't provide a per-pose docking score).
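
Because the `docking_score` attribute is only present with one of the two backends, code that should work with both may want to read it defensively. The following is a small sketch using stand-in objects instead of real `Model` instances; the helper name `pose_scores` is hypothetical.

```python
from types import SimpleNamespace


def pose_scores(models):
    """Return one score per model, or None where no `docking_score` is set."""
    return [getattr(m, "docking_score", None) for m in models]


# Stand-ins for `Model` objects: only the attribute access matters here.
with_scores = [SimpleNamespace(docking_score=-11.6), SimpleNamespace(docking_score=-11.1)]
without_scores = [SimpleNamespace(), SimpleNamespace()]

print(pose_scores(with_scores))     # [-11.6, -11.1]
print(pose_scores(without_scores))  # [None, None]
```

With real docking results one would presumably pass `docked_poses.models` to such a helper.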

In [41]:
# let's check out the models (=poses)
print(docked_poses.models)

[Model(1), Model(2), Model(3), Model(4), Model(5)]


And with that we can visualize the results with py3dmol:

In [42]:
v = protein.py3dmol("cartoon", color="white")
v += ligand.py3dmol("stick", color="black")

# split the multi-model molecule into single-model molecules
docked_poses = docked_poses.split_models()
for pose, color in zip(docked_poses, ["red", "green", "blue", "yellow", "orange"]):
    v += pose.py3dmol("stick", color=color)
v.show()


The id `0` is already used for a sibling of this entity. Changing id from `2` to `0` might create access inconsistencies to children of the parent entity.


The id `0` is already used for a sibling of this entity. Changing id from `3` to `0` might create access inconsistencies to children of the parent entity.


The id `0` is already used for a sibling of this entity. Changing id from `4` to `0` might create access inconsistencies to children of the parent entity.


The id `0` is already used for a sibling of this entity. Changing id from `5` to `0` might create access inconsistencies to children of the parent entity.



And there we are! Honestly, the poses from `easydock` look somewhat worse than the ones from `dockstring` (at least without tweaking the default parameters), but `easydock` is faster to compute. If you want to use `dockstring` instead, here's how:

## Docking our Ligand (using _dockstring_)

To switch backends we simply need to call `set_docking_backend` with the argument `"dockstring"`. The rest of the procedure remains the same:

In [43]:
# switch docking backend
docking.set_docking_backend("dockstring")

# run docking
docked_poses = docking.dock(protein=protein, ligand=ligand, center=center, box_size=size)


Although Mac use is supported, docking scores on Mac do not always perfectly match scores from Linux. Therefore, extra care should be taken when comparing results to other platforms. In particular, the baselines in the DOCKSTRING paper were computed on Linux, so please do not directly compare your docking scores to the scores reported on the paper.



Since we are now using `dockstring`, we also get access to the docking scores:

In [45]:
for model in docked_poses.models:
    print(model, model.docking_score)

Model(0) -11.6
Model(1) -11.1
Model(2) -11.1
Model(3) -11.0
Model(4) -10.7
Model(5) -10.6
Model(6) -10.5
Model(7) -10.2
Model(8) -10.1


As the output shows, the models are sorted by default from lowest to highest docking score (remember that a lower docking score is better).
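
Since lower is better, picking the top pose or all poses close to it comes down to a few lines of plain Python. The sketch below works on the score values printed above; the helper names and the window of 0.75 score units are arbitrary choices for illustration.

```python
# Docking scores as printed above, in model order
scores = [-11.6, -11.1, -11.1, -11.0, -10.7, -10.6, -10.5, -10.2, -10.1]


def best_index(scores):
    """Index of the lowest (= best) docking score."""
    return min(range(len(scores)), key=scores.__getitem__)


def within_window(scores, window=0.75):
    """Indices of all poses scoring within `window` of the best pose."""
    best = min(scores)
    return [i for i, s in enumerate(scores) if s - best <= window]


print(best_index(scores))     # 0
print(within_window(scores))  # [0, 1, 2, 3]
```

The returned indices are positions in the list of scores, so they line up with the order of the models in the output above.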

In [48]:
# then visualize the docked poses
import matplotlib.colors

# use matplotlib's named CSS4 colors so each pose gets its own color
colors = list(matplotlib.colors.CSS4_COLORS.keys())

v = protein.py3dmol("cartoon", color="white")
v += ligand.py3dmol("stick", color="black")

# split the multi-model molecule into single-model molecules
docked_poses = docked_poses.split_models()
for pose, color in zip(docked_poses, colors[:len(docked_poses)]):
    v += pose.py3dmol("stick", color=color)

v.show()


The id `0` is already used for a sibling of this entity. Changing id from `1` to `0` might create access inconsistencies to children of the parent entity.


The id `0` is already used for a sibling of this entity. Changing id from `2` to `0` might create access inconsistencies to children of the parent entity.


The id `0` is already used for a sibling of this entity. Changing id from `3` to `0` might create access inconsistencies to children of the parent entity.


The id `0` is already used for a sibling of this entity. Changing id from `4` to `0` might create access inconsistencies to children of the parent entity.


The id `0` is already used for a sibling of this entity. Changing id from `5` to `0` might create access inconsistencies to children of the parent entity.


The id `0` is already used for a sibling of this entity. Changing id from `6` to `0` might create access inconsistencies to children of the parent entity.


The id `0` is already used for a sibling of this entity. 

And that's it for this tutorial! Thanks for checking it out; hopefully you found it helpful for your research. Good luck with your project using BuildAMol!