# Stable species conformer search

Leverage ETKDG and GeoMol as 3D geometry embedders for stochastic conformer generation.

The idea is to replace the currently hardcoded steps with modular, interchangeable methods. The steps are:
- initial conformer embedding (ETKDG, GeoMol)
- optimization/energy (MMFF, UFF, GFN-FF, GFN2-xTB)
- pruning (torsion fingerprints, CREGEN)
- convergence metrics (conformational entropy/partition function)
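
The convergence metric in the last bullet can be illustrated with a short calculation. The sketch below is not RDMC's implementation; it assumes conformer energies in kcal/mol and computes a Boltzmann partition function and the corresponding conformational entropy, $S = -R \sum_i p_i \ln p_i$. The function names and the example energies are hypothetical.

```python
import math

R = 0.0019872  # kcal/(K*mol), same value as used later in this notebook
T = 298.0  # K


def boltzmann_weights(energies, temperature=T):
    """Return (partition function, weights), with energies taken relative to the minimum."""
    e_min = min(energies)
    factors = [math.exp(-(e - e_min) / (R * temperature)) for e in energies]
    q = sum(factors)
    return q, [f / q for f in factors]


def conformational_entropy(energies, temperature=T):
    """Entropy S = -R * sum(p_i * ln p_i) in kcal/(K*mol)."""
    _, weights = boltzmann_weights(energies, temperature)
    return -R * sum(p * math.log(p) for p in weights if p > 0.0)


# Hypothetical relative energies (kcal/mol) of four unique conformers
energies = [0.0, 0.5, 1.2, 3.0]
q, weights = boltzmann_weights(energies)
entropy = conformational_entropy(energies)
print(f"q = {q:.3f}, S = {entropy:.3e} kcal/(K*mol)")
```

Under these assumptions, the entropy stops changing once new iterations only add high-energy conformers, because their Boltzmann weights are negligible. That is what makes it usable as a convergence signal.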

In [None]:
from rdmc.conformer_generation.embedders import *
from rdmc.conformer_generation.optimizers import *
from rdmc.conformer_generation.pruners import *
from rdmc.conformer_generation.metrics import *
from rdmc.conformer_generation.generators import StochasticConformerGenerator
from rdmc.conformer_generation.utils import dict_to_mol

from rdmc import RDKitMol
from rdtools.view import mol_viewer, interactive_conformer_viewer, conformer_viewer

T = 298  # K
R = 0.0019872  # kcal/(K*mol)
HARTREE_TO_KCAL_MOL = 627.503

%load_ext autoreload
%autoreload 2

## 1. Test embedder

Create the 3D geometry for the molecule specified by the SMILES string (`smi`). The molecule has no embedded 3D conformer yet, so the visualization shows a 2D depiction of it.

In [None]:
smi = "[C:1]([C@@:2]([O:3][H:12])([C:4]([N:5]([C:6](=[O:7])[H:16])[H:15])([H:13])[H:14])[H:11])([H:8])([H:9])[H:10]"  # example 1
smi = "CN1C2=C(C=C(C=C2)Cl)C(=NCC1=O)C3=CC=CC=C3"  # example 2 (overrides example 1; comment out to use example 1)

mol_viewer(RDKitMol.FromSmiles(smi))

### 1.1 ETKDG embedder

In [None]:
n_confs = 10  # Number of conformers to create

embedder = ETKDGEmbedder()  # Initialize conformer embedder
unique_mol_data = embedder(smi, n_confs)  # Embed molecule 3D geometries with ETKDG
mol = dict_to_mol(unique_mol_data)  # Convert raw data to a molecule object

In [None]:
visualize_conf_id = 2

mol_viewer(mol, conf_id=visualize_conf_id)  # visualize the molecule

### 1.2 GeoMol embedder

You can skip this block if you don't have GeoMol installed. To install GeoMol:
```
git clone https://github.com/xiaoruiDong/GeoMol  # Clone GeoMol repo
cd GeoMol  # Go to the GeoMol repo
make  # Build; select the CUDA version if asked
pip install -e .  # Install GeoMol in editable mode
```

Supported options:
- `dataset`: `drugs` or `qm9`
- `device`: `cpu` or `cuda` (or a specific CUDA device such as `cuda:0`)

In [None]:
n_confs = 10  # Number of conformers to create
dataset = "drugs"
device = "cuda"

embedder = GeoMolEmbedder(dataset=dataset, track_stats=True, temp_schedule="none", device=device)  # Initialize conformer embedder
unique_mol_data = embedder(smi, n_confs)  # Embed molecule 3D geometries with GeoMol
mol = dict_to_mol(unique_mol_data)  # Convert raw data to a molecule object

In [None]:
visualize_conf_id = 2

mol_viewer(mol, conf_id=visualize_conf_id)  # visualize the molecule

## 2. Create a conformer generation workflow

### 2.1 Choose each component
- embedder
- optimizer
- pruner
- metric

You can also use a default configuration by passing `config` to the generator. To see what the default configuration is, open a new cell and run `StochasticConformerGenerator.set_config?`.
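
One way the four components listed above might fit together is an iterative loop: embed a batch, optimize it, prune duplicates, then check the metric. The sketch below is a toy illustration under that assumption and is not the actual `StochasticConformerGenerator` code. `run_workflow`, the `toy_*` components, the one-dimensional "torsion angle" conformers, and the stopping rule are all hypothetical.

```python
import random


def run_workflow(embed, optimize, prune, metric, n_per_iter,
                 min_iters=2, max_iters=5, window=2, threshold=0.01):
    """Repeat embed -> optimize -> prune until the metric stops changing."""
    unique, history = [], []
    for iteration in range(1, max_iters + 1):
        candidates = optimize(embed(n_per_iter))
        unique = prune(unique + candidates)
        history.append(metric(unique))
        # Assumed stopping rule: relative change over the last `window` iterations
        if iteration >= min_iters and len(history) > window:
            old, new = history[-window - 1], history[-1]
            if abs(new - old) <= threshold * max(abs(new), 1e-12):
                break
    return unique, history


# Toy components: a "conformer" is a single torsion angle in degrees
rng = random.Random(0)
MINIMA = [-60.0, 60.0, 180.0]  # hypothetical torsion minima


def toy_embed(n):
    """Generate n random starting angles."""
    return [rng.uniform(-180.0, 180.0) for _ in range(n)]


def toy_optimize(angles):
    """Snap each angle to the nearest minimum, using circular distance."""
    def circ(a, b):
        d = abs(a - b) % 360.0
        return min(d, 360.0 - d)
    return [min(MINIMA, key=lambda m: circ(a, m)) for a in angles]


def toy_prune(angles):
    """Keep one representative per distinct angle."""
    return sorted(set(angles))


def toy_metric(conformers):
    """Use the number of unique conformers as a stand-in metric."""
    return float(len(conformers))


unique, history = run_workflow(toy_embed, toy_optimize, toy_prune, toy_metric, n_per_iter=20)
print(f"unique conformers: {unique}, metric history: {history}")
```

Because each component is just a callable, swapping the embedder or optimizer does not require touching the loop, which is the point of the modular design.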

In [None]:
# Embedder
embedder = ETKDGEmbedder(track_stats=True)
# if you installed GeoMol, you can uncomment the following line
# embedder = GeoMolEmbedder(dataset="drugs", track_stats=True, temp_schedule="none", device="cpu") # Initialize conformer embedder

# Optimizer:
optimizer = MMFFOptimizer()
# if you installed XTB, you can uncomment the following line
# optimizer = XTBOptimizer()

# Pruner
pruner = TorsionPruner(max_chk_threshold=30)

# Metric
metric = SCGMetric(metric="entropy", window=5, threshold=0.005)

### 2.2 Conformer generation

In [None]:
smi = "CN1C2=C(C=C(C=C2)Cl)C(=NCC1=O)C3=CC=CC=C3"

mol_viewer(RDKitMol.FromSmiles(smi))

In [None]:
n_conformers_per_iter = 100
min_iters = 2
max_iters = 5

scg = StochasticConformerGenerator(
    smiles=smi,
    embedder=embedder,
    optimizer=optimizer,
    pruner=pruner,
    metric=metric,
    min_iters=min_iters,
    max_iters=max_iters,
)

unique_mol_data = scg(n_conformers_per_iter)
print(
    f"Number of conformers: {len(unique_mol_data)}\n"
    f"Metric: {scg.metric.metric_history[-1]:.3e}"
)