## Short Outlook: Cheminformatics with Python

Cheminformatics applies computational methods to chemical problems, for example representing, searching, and analyzing molecular structures.
RDKit is a free and open-source cheminformatics library: www.rdkit.org

In [None]:
# install RDKit
!pip install rdkit

In [None]:
# import RDKit
from rdkit import Chem
smiles = "NCCCCO"  # SMILES strings encode molecular structures as text
mol = Chem.MolFromSmiles(smiles)  # create an RDKit molecule object (None if the SMILES cannot be parsed)
mol

In [None]:
# computing molecular descriptors
from rdkit.Chem import Descriptors
Descriptors.MolWt(mol)
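
To make clear what a descriptor such as `MolWt` represents, the molecular weight of the example molecule (`NCCCCO`, formula C4H11NO) can be reproduced by hand. The following is a minimal sketch in plain Python, not RDKit's implementation; the helper name `formula_mass` and the rounded atomic masses in the two tables are assumptions made for illustration.

```python
# Rounded standard atomic weights (averaged over natural isotopes), in g/mol.
AVERAGE_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}
# Rounded masses of the most abundant isotope of each element, in Da.
MONOISOTOPIC_MASS = {"C": 12.0, "H": 1.007825, "N": 14.003074, "O": 15.994915}


def formula_mass(atom_counts, mass_table):
    """Sum atomic masses for a formula given as {element: count} (hypothetical helper)."""
    return sum(mass_table[element] * count for element, count in atom_counts.items())


# NCCCCO (4-amino-1-butanol) has the formula C4H11NO.
formula = {"C": 4, "H": 11, "N": 1, "O": 1}
average_mw = formula_mass(formula, AVERAGE_MASS)
monoisotopic_mw = formula_mass(formula, MONOISOTOPIC_MASS)
print(f"average: {average_mw:.3f}, monoisotopic: {monoisotopic_mw:.3f}")
```

The average value should be close to what `Descriptors.MolWt(mol)` reports, while the monoisotopic value corresponds to `Descriptors.ExactMolWt(mol)`.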


In [None]:
Descriptors.MolLogP(mol), Descriptors.MolMR(mol)


## Lipinski's Rule of 5

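Lipinski's Rule of 5 is a set of simple heuristics for estimating whether a molecule is likely to be orally bioavailable ("drug-like"): a molecular weight of at most 500 Da, a logP of at most 5, no more than 5 hydrogen-bond donors, and no more than 10 hydrogen-bond acceptors. A molecule is usually still considered drug-like if it violates at most one of these criteria.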

In [None]:
def check_lipinski(mol):
    """Check Lipinski's Rule of 5 for drug-likeness"""
    mw = Descriptors.MolWt(mol)  # average molecular weight (ExactMolWt would give the monoisotopic mass)
    logp = Descriptors.MolLogP(mol)
    hbd = Descriptors.NumHDonors(mol)
    hba = Descriptors.NumHAcceptors(mol)

    print("Lipinski's Rule of 5 Analysis:")
    print(f"Molecular Weight: {mw:.2f} (<=500)")
    print(f"LogP: {logp:.2f} (<=5)")
    print(f"H-bond Donors: {hbd} (<=5)")
    print(f"H-bond Acceptors: {hba} (<=10)")
    
    violations = sum([
        mw > 500,
        logp > 5,
        hbd > 5,
        hba > 10
    ])
    print(f"Number of violations: {violations}")
    return violations <= 1

check_lipinski(mol)
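
The rule logic itself does not depend on RDKit. As a sketch, it can be separated from the descriptor calculation so that it is easy to test on its own. The function name `count_lipinski_violations` and the input values below are hypothetical and chosen only for illustration; the thresholds follow the checks in `check_lipinski`.

```python
def count_lipinski_violations(mw, logp, hbd, hba):
    """Count how many of the four Rule of 5 criteria are violated (hypothetical helper)."""
    checks = [
        mw > 500,   # molecular weight above 500 Da
        logp > 5,   # too lipophilic
        hbd > 5,    # more than 5 hydrogen-bond donors
        hba > 10,   # more than 10 hydrogen-bond acceptors
    ]
    return sum(checks)


# Illustrative, made-up descriptor values (not computed from real molecules).
small_polar = count_lipinski_violations(mw=89.1, logp=-0.5, hbd=2, hba=2)
large_greasy = count_lipinski_violations(mw=720.0, logp=6.3, hbd=3, hba=12)

# A molecule passes if it violates at most one criterion.
print(small_polar, small_polar <= 1)
print(large_greasy, large_greasy <= 1)
```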

## Molecular fingerprints

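Molecular fingerprints encode the structure of a molecule as a fixed-length vector, typically of bits, which can be used for similarity searches and as input for machine learning models. The Morgan fingerprint used here marks the presence of circular atom environments up to a given radius.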


In [None]:
# computing fingerprints
from rdkit.Chem import AllChem
fp = AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=2048)  # Morgan fingerprint with radius 2, folded to 2048 bits
fp.ToBitString()
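
A common use of such bit vectors is to compare molecules. The sketch below computes the Tanimoto similarity (number of shared on-bits divided by the number of bits set in either fingerprint) directly from two bit strings of the kind returned by `ToBitString()`. The function name `tanimoto` and the short 8-bit strings are illustrative assumptions; RDKit provides its own similarity functions in `rdkit.DataStructs`, which would normally be used instead.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto similarity of two equal-length bit strings such as '0101' (hypothetical helper)."""
    if len(bits_a) != len(bits_b):
        raise ValueError("fingerprints must have the same length")
    on_a = {i for i, bit in enumerate(bits_a) if bit == "1"}
    on_b = {i for i, bit in enumerate(bits_b) if bit == "1"}
    union = on_a | on_b
    if not union:
        return 0.0  # convention chosen here for two all-zero fingerprints
    return len(on_a & on_b) / len(union)


# Toy 8-bit fingerprints, made up for illustration (real ones have e.g. 2048 bits).
fp_a = "11010000"
fp_b = "10011000"
similarity = tanimoto(fp_a, fp_b)
print(f"{similarity:.2f}")
```

A value of 1 means both fingerprints set exactly the same bits, and a value of 0 means they share none.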

## 3D visualization

3D visualization displays the structure of a molecule in three-dimensional space.
Because a SMILES string carries no coordinates, an initial 3D structure is estimated first and then usually optimized with a force field (e.g. MMFF, the Merck Molecular Force Field).


In [None]:
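mol_3d = Chem.AddHs(mol)  # add explicit hydrogen atoms, needed for a realistic geometry
AllChem.EmbedMolecule(mol_3d)  # estimate 3D coordinates; returns -1 if embedding fails
AllChem.MMFFOptimizeMolecule(mol_3d)  # optimize the structure with MMFF; returns 0 if converged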


In [None]:
#!pip install nglview
import nglview as nv
nv.show_rdkit(mol_3d)