# Working with RDKit

[RDKit](https://www.rdkit.org) is a fantastic open-source cheminformatics package. It provides a wealth of tools, including powerful functionality for working with [smiles](https://github.com/suneelbvs/rdkit_tutorials/blob/master/1_Reading%20and%20Writing%20Smiles%20using%20rdKit.ipynb) and drawing [two-dimensional representations of molecules](https://www.rdkit.org/docs/Cookbook.html).

In the [2023.2 release of sire](https://sire.openbiosim.org) we added new [sire.convert](https://sire.openbiosim.org/tutorial/index_part05.html) functions. These enable [sire](https://sire.openbiosim.org) to interconvert with [RDKit](https://www.rdkit.org), thereby adding both [smiles](https://sire.openbiosim.org/tutorial/part05/03_smiles.html) and [2D visualisation support](https://sire.openbiosim.org/tutorial/part05/02_view.html).

For example, we can use [RDKit](https://www.rdkit.org) directly to create a molecule from a smiles string and generate optimised 3D coordinates.

In [None]:
from rdkit import Chem
from rdkit.Chem import AllChem

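# Parse the smiles string into an RDKit molecule
rdkit_mol = Chem.MolFromSmiles(
    "CC1([C@@H](N2[C@H](S1)[C@@H](C2=O)NC(=O)CC3=CC=CC=C3)C(=O)O)C")

# Add explicit hydrogens, generate 3D coordinates,
# then relax the geometry with the UFF force field
rdkit_mol = AllChem.AddHs(rdkit_mol)
AllChem.EmbedMolecule(rdkit_mol)
AllChem.UFFOptimizeMolecule(rdkit_mol)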

rdkit_mol
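
Each of these RDKit calls can fail quietly: `Chem.MolFromSmiles` returns `None` for a string it cannot parse, and `AllChem.EmbedMolecule` returns `-1` if it cannot generate coordinates. As a minimal sketch, the steps above could be wrapped with explicit checks. The helper name `build_3d` and the fixed `randomSeed` are our own illustrative choices, not part of RDKit or sire.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def build_3d(smiles, seed=42):
    """Hypothetical helper: parse smiles, add hydrogens, embed and optimise."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        # RDKit signals a parse failure by returning None
        raise ValueError(f"Could not parse smiles string: {smiles!r}")

    mol = Chem.AddHs(mol)

    # EmbedMolecule returns the new conformer ID, or -1 on failure
    if AllChem.EmbedMolecule(mol, randomSeed=seed) == -1:
        raise RuntimeError("Could not generate 3D coordinates")

    AllChem.UFFOptimizeMolecule(mol)
    return mol


# Ethanol is used here only because it is small and quick to embed
ethanol = build_3d("CCO")
print(ethanol.GetNumAtoms(), ethanol.GetNumConformers())
```

Fixing the random seed makes the embedded coordinates reproducible between runs, which is handy when comparing structures.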

With [sire.convert](https://sire.openbiosim.org/api/index_convert.html) we can convert this [RDKit](https://www.rdkit.org) molecule into a [sire](https://sire.openbiosim.org) molecule! This allows us to use [sire's integration with NGLView](https://sire.openbiosim.org/cheatsheet/view.html#id1) to get a nice 3D view of the molecule.

In [None]:
import sire as sr

sr_mol = sr.convert.to(rdkit_mol, "sire")
sr_mol.view()

To make things easier, we've wrapped up the [RDKit](https://www.rdkit.org) code, and put it behind a new function, [sire.smiles](https://sire.openbiosim.org/api/sire.html#sire.smiles).

In [None]:
sr_mol = sr.smiles(
    "CC1([C@@H](N2[C@H](S1)[C@@H](C2=O)NC(=O)CC3=CC=CC=C3)C(=O)O)C")

sr_mol.view()

Conversions can go both ways. This means that we can use [sire.convert](https://sire.openbiosim.org/api/index_convert.html) to go back from the [sire](https://sire.openbiosim.org) molecule to the [RDKit](https://www.rdkit.org) molecule.

In [None]:
rdkit_mol = sr.convert.to(sr_mol, "rdkit")
rdkit_mol

[RDKit](https://www.rdkit.org) has great functionality for generating smiles strings from molecules, e.g. using [rdkit.Chem.MolToSmiles](https://rdkit.org/docs/source/rdkit.Chem.rdmolfiles.html#rdkit.Chem.rdmolfiles.MolToSmiles).

In [None]:
Chem.MolToSmiles(rdkit_mol)

We have wrapped up this code and exposed it as a new [.smiles](https://sire.openbiosim.org/api/mol.html#sire.mol.SelectorMol.smiles) function on all of our molecular containers.

In [None]:
sr_mol.smiles(include_hydrogens=True)

Here, we included hydrogens, which makes the smiles string a bit long! By default, we only include hydrogens that are needed to resolve structural ambiguities.

In [None]:
sr_mol.smiles()
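
The same trade-off can be seen in plain RDKit. The sketch below writes a smiles string for a molecule with explicit hydrogens, then again after stripping them with `Chem.RemoveHs`. It only illustrates explicit versus implicit hydrogens in RDKit, and is not necessarily how `.smiles` is implemented inside sire.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles(
    "CC1([C@@H](N2[C@H](S1)[C@@H](C2=O)NC(=O)CC3=CC=CC=C3)C(=O)O)C")

# With explicit hydrogen atoms, every H is written out as [H]
mol_h = Chem.AddHs(mol)
smiles_with_h = Chem.MolToSmiles(mol_h)

# Removing them again leaves the hydrogens implicit
smiles_without_h = Chem.MolToSmiles(Chem.RemoveHs(mol_h))

print(len(smiles_with_h), len(smiles_without_h))
print(smiles_without_h)
```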

In addition to creating smiles strings, [RDKit](https://www.rdkit.org) can also be used to create [2D images of molecules](https://www.rdkit.org/docs/Cookbook.html). We've wrapped up this code into a new [.view2d](https://sire.openbiosim.org/cheatsheet/view.html#d-views) function, which is also available on all molecular containers.

In [None]:
sr_mol.view2d()
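
For comparison, here is a rough sketch of drawing a 2D depiction directly with RDKit's `rdMolDraw2D` module. The canvas size is an arbitrary choice, and this is an assumption about the kind of RDKit code involved rather than the exact code behind `.view2d`.

```python
from rdkit import Chem
from rdkit.Chem.Draw import rdMolDraw2D

mol = Chem.MolFromSmiles(
    "CC1([C@@H](N2[C@H](S1)[C@@H](C2=O)NC(=O)CC3=CC=CC=C3)C(=O)O)C")

# Create an SVG canvas (width, height in pixels)
drawer = rdMolDraw2D.MolDraw2DSVG(400, 300)

# Compute 2D coordinates if needed and draw the molecule
rdMolDraw2D.PrepareAndDrawMolecule(drawer, mol)
drawer.FinishDrawing()

# The drawing is returned as SVG text, which can be saved or displayed
svg = drawer.GetDrawingText()
print(svg[:60])
```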

The powerful thing is that these functions work *even if* the molecule was not originally created by [RDKit](https://www.rdkit.org) or from a smiles string. For example, let's load cholesterol from an SDF file. Cholesterol is included as a [tutorial molecule](https://sire.openbiosim.org/tutorial/part01/02_loading_a_molecule.html#loading-from-files), so we can download it from [sire's tutorial site](https://sire.openbiosim.org/tutorial/part01/05_loading_from_multiple_files.html#loading-from-multiple-files).

In [None]:
mols = sr.load(sr.expand(sr.tutorial_url, "cholesterol.sdf"))
mols.view()

In [None]:
mols.smiles()

In [None]:
mols.view2d()

And, of course, you can convert this molecule to [RDKit](https://rdkit.org), so you can take advantage of all of the other [amazing functionality that RDKit provides!](https://www.rdkit.org/docs/Cookbook.html)

In [None]:
rdkit_mol = sr.convert.to(mols, "rdkit")
rdkit_mol
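
As a small taste of what you could do next, the sketch below calculates a molecular formula and weight with RDKit's descriptor functions. So that it runs without sire, it builds cholesterol from a smiles string with the stereochemistry left out. That string is our own stand-in, and in the workflow above you would pass the converted molecule instead.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors, rdMolDescriptors

# Stand-in for the converted molecule: cholesterol connectivity only,
# with no stereochemistry (an assumption made to keep this self-contained)
chol = Chem.MolFromSmiles("CC(C)CCCC(C)C1CCC2C1(CCC3C2CC=C4C3(CCC(C4)O)C)C")

formula = rdMolDescriptors.CalcMolFormula(chol)  # molecular formula string
mol_wt = Descriptors.MolWt(chol)                 # average molecular weight
n_rings = rdMolDescriptors.CalcNumRings(chol)    # number of rings

print(formula, round(mol_wt, 2), n_rings)
```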

If you want to try this yourself, please feel free to connect to [try.openbiosim.org](https://try.openbiosim.org) and start a notebook. You can download the notebook used to generate this post onto the server by running this command in one of the notebook code cells.

```
! wget https://github.com/OpenBioSim/posts/raw/main/sire/002_rdkit/rdkit.ipynb
```

The new [sire.convert](https://sire.openbiosim.org/tutorial/index_part05.html) functionality can be used to convert to the native types of other popular molecular modelling packages. Look forward to our next blog post, where we will show you how [conversion to OpenMM](https://sire.openbiosim.org/cheatsheet/openmm.html) lets you run [minimisation and molecular dynamics](https://sire.openbiosim.org/tutorial/part05/04_dynamics.html)!