# UMADock from CafChem tools. Docks a ligand from a SMILES string into a predefined protein binding site (taken from the DUD-E structures) and calculates the electronic binding energy with Meta's UMA MLIP.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/MauricioCafiero/UMADock/blob/main/notebooks/UMA_Dock.ipynb)

## This notebook allows you to:
- Input a SMILES string and generate a set number of conformers.
- Dock the conformers in one of several available protein binding sites (DRD2, HMGCR, MAOB, ADRB2).
- Optimize the best pose from each conformer.
- Calculate the electronic binding energy with the UMA MLIP (including ligand desolvation and strain energy).
- Visualize the best pose.
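
The energy bookkeeping in the fourth step can be pictured as a sum of three terms. The sketch below is only an illustration: the function names, the argument names, and the way the terms are combined are assumptions and are not taken from the UMADock source. It treats the interaction energy as the complex minus its separated parts, the strain as the cost of distorting the ligand into its bound geometry, and the desolvation term as a penalty supplied by the caller.

```python
# Illustrative decomposition of an electronic binding energy.
# All names are hypothetical; UMADock may combine these terms differently.

def interaction_energy(e_complex, e_protein, e_ligand_bound):
    """Energy of the complex relative to its two separated parts."""
    return e_complex - e_protein - e_ligand_bound


def strain_energy(e_ligand_bound, e_ligand_relaxed):
    """Cost of distorting the ligand from its relaxed to its bound geometry."""
    return e_ligand_bound - e_ligand_relaxed


def binding_energy(e_complex, e_protein, e_ligand_bound,
                   e_ligand_relaxed, e_desolvation):
    """Interaction energy plus strain and desolvation (same units throughout)."""
    return (interaction_energy(e_complex, e_protein, e_ligand_bound)
            + strain_energy(e_ligand_bound, e_ligand_relaxed)
            + e_desolvation)


# Made-up numbers in kcal/mol, chosen only to show the bookkeeping.
example = binding_energy(
    e_complex=-1030.0, e_protein=-800.0, e_ligand_bound=-200.0,
    e_ligand_relaxed=-205.0, e_desolvation=10.0,
)
print(f"{example:.1f}")
```

With these made-up inputs, a favorable interaction energy is partly offset by the two penalty terms, which is the same pattern seen when the plain docking energy is compared with the binding energy that includes desolvation and strain.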

## Requirements:
- This notebook will install several libraries: py3Dmol, fairchem-core, and RDKit.
- Runs well on an L4 GPU; will run faster on an A100.

## Set-up

### Install libraries

In [1]:
!pip install -q fairchem-core
!pip install -q py3Dmol
!pip install -q rdkit


### Get CafChem from GitHub, import libraries, download UMA

In [11]:
!git clone https://github.com/MauricioCafiero/CafChem.git

!git clone https://github.com/MauricioCafiero/UMADock.git

Cloning into 'CafChem'...
remote: Enumerating objects: 918, done.
remote: Counting objects: 100% (197/197), done.
remote: Compressing objects: 100% (139/139), done.
remote: Total 918 (delta 154), reused 58 (delta 58), pack-reused 721 (from 3)
Receiving objects: 100% (918/918), 45.14 MiB | 30.13 MiB/s, done.
Resolving deltas: 100% (528/528), done.
Cloning into 'UMADock'...
remote: Enumerating objects: 10, done.
remote: Counting objects: 100% (10/10), done.
remote: Compressing objects: 100% (10/10), done.
remote: Total 10 (delta 3), reused 0 (delta 0), pack-reused 0 (from 0)
Receiving objects: 100% (10/10), 11.90 KiB | 11.90 MiB/s, done.
Resolving deltas: 100% (3/3), done.


In [12]:
import py3Dmol
import os
import torch
from google.colab import files
import numpy as np
from fairchem.core import FAIRChemCalculator, pretrained_mlip

from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem import Draw

import UMADock.UMADock as ud

In [13]:
device = "cuda" if torch.cuda.is_available() else "cpu"

predictor = pretrained_mlip.get_predict_unit("uma-s-1", device=device)
calculator = FAIRChemCalculator(predictor, task_name="omol")
model = "UMA-OMOL"

## UMA Dock

In [14]:
# Ligand SMILES (this string corresponds to carbidopa) and the number of conformers to generate
test_confs = ud.conformers("O=C(O)[C@@](NN)(Cc1cc(O)c(O)cc1)C",20)
em_mols = test_confs.get_confs(use_random=True)
ex_mols = test_confs.expand_conf()
xyz_strings = test_confs.get_XYZ_strings()
confs = test_confs.prep_XYZ_docking()

In [15]:
ldopa_dock = ud.UMA_Dock(confs, 20, calculator, ud.DRD2_data)

There are 1 molecules with size: 216
for 2, 218


In [16]:
new_molecules, ies, distances = ldopa_dock.dock()

nudging
nudging
Nudged distance is: 4.687 and binding energy is: 139.999.
adding fragment: conf_0
nudging
Added 1 conf_0 fragments
nudging
Added 0 conf_1 fragments
nudging
Added 0 conf_2 fragments
nudging
Nudged distance is: 3.629 and binding energy is: 63.289.
adding fragment: conf_3
nudging
Added 1 conf_3 fragments
nudging
Nudged distance is: 1.604 and binding energy is: 55.260.
adding fragment: conf_4
nudging
Nudged distance is: 2.439 and binding energy is: 43.798.
adding fragment: conf_4
nudging
Nudged distance is: 2.550 and binding energy is: 205.093.
adding fragment: conf_4
Added 3 conf_4 fragments
nudging
nudging
nudging
Nudged distance is: 4.887 and binding energy is: 32.758.
adding fragment: conf_5
nudging
Nudged distance is: 2.740 and binding energy is: 203.321.
adding fragment: conf_5
Added 2 conf_5 fragments
Added 0 conf_6 fragments
Added 0 conf_7 fragments
nudging
nudging
Nudged distance is: 3.246 and binding energy is: 166.844.
adding fragment: conf_8
Added 1 conf_8 fragm
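
The docking log above reports, for each conformer, the poses that were kept and their distances. A minimal sketch of how such results might be summarized follows. It assumes that `distances` holds one sequence of pose distances per conformer, which matches how it is indexed later in this notebook. The helper name `summarize_poses` is hypothetical, and the sample values are copied from the log lines above.

```python
def summarize_poses(distances):
    """Return (conformer index, pose count, best pose index, shortest distance).

    Conformers with no poses get None for the last two fields.
    """
    summary = []
    for conf_idx, pose_distances in enumerate(distances):
        n_poses = len(pose_distances)
        if n_poses > 0:
            # Index of the pose with the smallest distance.
            best_pose = min(range(n_poses), key=lambda i: pose_distances[i])
            summary.append((conf_idx, n_poses, best_pose, pose_distances[best_pose]))
        else:
            summary.append((conf_idx, 0, None, None))
    return summary


# Distances for conf_0 to conf_5, copied from the log above.
sample = [[4.687], [], [], [3.629], [1.604, 2.439, 2.550], [4.887, 2.740]]
for row in summarize_poses(sample):
    print(row)
```

Checking the pose count before taking a minimum matters here because several conformers end up with no poses at all, and a minimum over an empty sequence raises an error.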

In [17]:
ies, ebes = ldopa_dock.post_process(criteria='distance')

1 files written for conf_0.
0 files written for conf_1.
0 files written for conf_2.
1 files written for conf_3.
3 files written for conf_4.
2 files written for conf_5.
0 files written for conf_6.
0 files written for conf_7.
1 files written for conf_8.
1 files written for conf_9.
0 files written for conf_10.
1 files written for conf_11.
0 files written for conf_12.
0 files written for conf_13.
3 files written for conf_14.
1 files written for conf_15.
1 files written for conf_16.
2 files written for conf_17.
1 files written for conf_18.
0 files written for conf_19.
best pose by distance for conf_0 is: 4.687 at location: 0
No poses for conf_1
No poses for conf_2
best pose by distance for conf_3 is: 3.629 at location: 0
best pose by distance for conf_4 is: 1.604 at location: 0
best pose by distance for conf_5 is: 2.740 at location: 1
No poses for conf_6
No poses for conf_7
best pose by distance for conf_8 is: 3.246 at location: 0
best pose by distance for conf_9 is: 1.367 at location: 0
No poses for conf_10
best pose by distance for conf_11 is: 3.291 at location: 0
No poses for conf_12
No poses for conf_13
best pose by distance for conf_14 is: 3.775 at location: 2
best pose by distance for conf_15 is: 3.904 at location: 0
best pose by distance for conf_16 is: 4.542 at location: 0
best pose by distance for conf_17 is: 4.375 at location: 0
best pose by distance for conf_18 is: 4.873 at location: 0
No poses for conf_19
Optimizing best pose for fragment conf_0.
Initial energy: -6149.436272 ha
      Step     Time          Energy          fmax
BFGS:    0 14:11:37  -167334.786563        7.707813
BFGS:    1 14:11:37  -167338.163799        7.842006
BFGS:    2 14:11:38  -167341.571229        5.589957
BFGS:    3 14:11:38  -167343.588710        3.638434
BFGS:    4 14:11:39  -167345.350692        2.250819
BFGS:    5 14:11:39  -167346.419291        1.715857
BFGS:    6 14:11:39  -167346.951669        1.722543
BFGS:    7 14:11:39  -167347.524084        2.116415
BFGS:    8 14:11:40  -167348.308798        2.467724
BFGS:    9 14:11:40  -167348.937409        5.669857
BFGS:   10 14:11:40  -167349.966244        3.824518
BFGS:   11 14:11:41  -167351.539832        3.632508
BFGS:   12 14:11:41  -167352.239915        2.488977
BFGS:   13 14:11:41  -167352.607370        1.748753
BFGS:   14 14:11:42  -167352.899591        1.219688
BFGS:   15 14:11:42  -167353.032931        1.146141
BFGS:   16 14:11:42  -1
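
The two energies in this log are in different units: the "Initial energy" line is labeled in hartree, while the step table has the layout of an ASE BFGS optimizer log, and ASE works in eV. A small conversion helper, sketched here with standard conversion factors, makes the numbers comparable. The function names are hypothetical and are not part of UMADock.

```python
HARTREE_TO_EV = 27.211386245988   # 1 hartree in eV
HARTREE_TO_KCAL_MOL = 627.509474  # 1 hartree in kcal/mol
EV_TO_KCAL_MOL = HARTREE_TO_KCAL_MOL / HARTREE_TO_EV


def hartree_to_ev(energy_ha):
    """Convert an energy from hartree to eV."""
    return energy_ha * HARTREE_TO_EV


def ev_to_kcal_mol(energy_ev):
    """Convert an energy from eV to kcal/mol."""
    return energy_ev * EV_TO_KCAL_MOL


# Initial energy from the log above, converted to the units of the BFGS table.
initial_ev = hartree_to_ev(-6149.436272)
print(f"{initial_ev:.3f} eV")
```

The converted value lands close to the energy reported at BFGS step 0, which suggests that both lines describe the same starting structure.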

In [20]:
out_text = ""
for i, (pre_opt, post_opt) in enumerate(zip(ies, ebes)):
  out_text += f"Conformer {i} =========================================================================\n"
  if pre_opt != -1.0:  # -1.0 marks a conformer with no good poses
    out_text += f"Optimized Docking energy: {pre_opt:10.3f}, Binding energy with desolvation and strain: {post_opt:10.3f}\n"
  else:
    out_text += "No good poses\n"

print(out_text)

Optimized Docking energy:    -26.271, Binding energy with desolvation and strain:    -10.942
No good poses
No good poses
Optimized Docking energy:    -10.073, Binding energy with desolvation and strain:     20.492
Optimized Docking energy:    -24.340, Binding energy with desolvation and strain:    -11.808
Optimized Docking energy:    -16.222, Binding energy with desolvation and strain:     -4.839
No good poses
No good poses
Optimized Docking energy:    -19.476, Binding energy with desolvation and strain:     -9.215
Optimized Docking energy:    -45.155, Binding energy with desolvation and strain:    -26.896
No good poses
Optimized Docking energy:    -21.581, Binding energy with desolvation and strain:     -1.952
No good poses
No good poses
Optimized Docking energy:    -15.722, Binding energy with desolvation and strain:      1.157
Optimized Docking energy:    -35.336, Binding energy with desolvation and strain:    -17.609
Optimized Docking energy:    -16.859, Binding energy with desolva

In [22]:
best_conf_idx = np.argmin(ebes)
best_energy = ebes[best_conf_idx]

best_pose_idx = np.argmin(distances[best_conf_idx])

print(f"The lowest electronic binding energy came from conformer {best_conf_idx}, \
and pose {best_pose_idx} = {best_energy:.3f} kcal/mol")

ud.view_from_file(f'/content/opt_files/DRD2_w_conf_{best_conf_idx}{best_pose_idx}_OPTIMIZED.xyz',
                  ud.DRD2_data, confs[best_conf_idx])

The lowest electronic binding energy came from conformer 9, and pose 0 = -26.896 kcal/mol
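
`np.argmin(ebes)` considers every entry, including conformers for which no pose was found. The following sketch filters those out first. It assumes, based on the printing loop above, that such conformers are marked with `-1.0` in `ies`. Whether `ebes` uses the same marker is not shown in this notebook, so the filter keys on `ies` only. The helper name `best_conformer` is hypothetical, and the sample values are the first six conformers from the output above.

```python
NO_POSE = -1.0  # assumed marker for "no good poses", taken from the printing loop


def best_conformer(ies, ebes, no_pose=NO_POSE):
    """Return (index, energy) of the lowest binding energy among conformers with poses."""
    candidates = [
        (ebe, idx)
        for idx, (ie, ebe) in enumerate(zip(ies, ebes))
        if ie != no_pose
    ]
    if not candidates:
        raise ValueError("no conformer produced a pose")
    energy, idx = min(candidates)
    return idx, energy


# First six conformers from the printed results.
ies_sample = [-26.271, -1.0, -1.0, -10.073, -24.340, -16.222]
# The -1.0 entries below are placeholders; the real values stored for
# conformers without poses are not shown in the notebook output.
ebes_sample = [-10.942, -1.0, -1.0, 20.492, -11.808, -4.839]

idx, energy = best_conformer(ies_sample, ebes_sample)
print(f"conformer {idx}: {energy:.3f} kcal/mol")
```

Filtering on the marker guards against the case where every real binding energy is higher than the placeholder, in which a plain minimum would select a conformer that was never docked.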
