# Using the InteractiveMolecule widget

For this example we need pandas and RDKit. If you don't have these packages yet, run the cell below to install them. Note that you will need conda to install RDKit.

In [4]:
!conda install -c conda-forge -y pandas rdkit

Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /usr/local/Caskroom/miniconda/base

  added / updated specs:
    - pandas
    - rdkit


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    boost-1.74.0               |   py38hb0f0857_5         320 KB  conda-forge
    conda-4.12.0               |   py38h50d1736_0        1020 KB  conda-forge
    greenlet-1.1.2             |   py38ha048514_1          86 KB  conda-forge
    kiwisolver-1.3.2           |   py38he9d5cce_0          52 KB
    matplotlib-base-3.4.3      |   py38hc7d2367_2         7.3 MB  conda-forge
    numpy-1.20.3               |   py38h5cb586d_2         5.6 MB  conda-forge
    pillow-7.2.0               |   py38hef457fe_2         639 KB  conda-forge
    pycairo-1.21.0             |   py38h2e817b2_1         102 KB  conda-forge
    rdkit-2021.03.5         
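
If a later import fails with `ModuleNotFoundError`, the packages may have been installed into a different environment than the one the notebook kernel is running in. As a minimal sketch (the helper name `missing_packages` is made up for this example and is not part of any library), the standard library can report which packages the current kernel cannot find:

```python
import importlib.util
import sys


def missing_packages(names):
    """Return the top-level package names that the current kernel cannot import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# The interpreter the kernel uses; conda must install into this environment.
print('Kernel interpreter:', sys.executable)
print('Missing packages:', missing_packages(['pandas', 'rdkit', 'trident_chemwidgets']))
```

If the list is not empty, installing into the environment that owns the printed interpreter (and restarting the kernel) is usually the next thing to try.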

Now we can import `trident_chemwidgets`, `pandas`, and the `Chem` module from RDKit.

In [5]:
import trident_chemwidgets as tcw
import pandas as pd
from rdkit import Chem

ModuleNotFoundError: No module named 'rdkit'

Now we can create a small function to featurize our molecules with basic information per atom. 

**IMPORTANT:** *the order of the data rows in the pandas DataFrame or dict must match the standard ordering of atoms as returned by the RDKit `.GetAtoms()` function.* You can generate this data any way you see fit (e.g. calculated values from RDKit, as in the function below, or attention values from a Graph Attention Network). The only constraint is the atom ordering. If you are using RDKit-based featurizers like those from DeepChem, this standard ordering should already be the default. Take care when using custom featurizers.

In [None]:
def featurize_mol(smiles):
    # Init feature dict
    feature_dict = {
        'Chiral Tag': [],
        'Formal Charge': [],
        'Mass': [],
        'Total Hs': [],
        'Total Valence': []
    }
    
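    # Get atoms from SMILES (MolFromSmiles returns None if parsing fails)
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f'Could not parse SMILES: {smiles!r}')
    atoms = mol.GetAtoms()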
    
    # Use RDKit to get all the atom properties
    for atom in atoms:
        feature_dict['Chiral Tag'].append(atom.GetChiralTag())
        feature_dict['Formal Charge'].append(atom.GetFormalCharge())
        feature_dict['Mass'].append(atom.GetMass())
        feature_dict['Total Hs'].append(atom.GetTotalNumHs())
        feature_dict['Total Valence'].append(atom.GetTotalValence())
        
    return pd.DataFrame.from_dict(feature_dict)
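
When per-atom values come from a custom featurizer rather than from iterating over `.GetAtoms()`, the rows may need to be sorted into atom-index order first. The following is a sketch under that assumption: `records_to_atom_data` is a hypothetical helper, and each record is assumed to carry the index that RDKit's `atom.GetIdx()` would report. It returns a dict of column lists in atom order and raises an error if any atom is missing or duplicated.

```python
def records_to_atom_data(records, n_atoms):
    """Turn unordered per-atom records into column lists sorted by atom index.

    records: iterable of (atom_index, {column_name: value}) pairs, in any order.
    n_atoms: expected number of atoms, e.g. from mol.GetNumAtoms() in RDKit.
    """
    records = list(records)
    indices = sorted(idx for idx, _ in records)
    if indices != list(range(n_atoms)):
        raise ValueError(
            f'Expected exactly one record per atom index 0..{n_atoms - 1}, got {indices}'
        )

    # Append values column by column while walking the records in atom order.
    columns = {}
    for _, features in sorted(records, key=lambda rec: rec[0]):
        for name, value in features.items():
            columns.setdefault(name, []).append(value)

    # Every column must end up with one value per atom.
    ragged = [name for name, values in columns.items() if len(values) != n_atoms]
    if ragged:
        raise ValueError(f'Columns with missing values: {ragged}')
    return columns


# Hypothetical attention scores for a three-atom molecule, delivered out of order.
scores = [(2, {'Attention': 0.1}), (0, {'Attention': 0.7}), (1, {'Attention': 0.2})]
print(records_to_atom_data(scores, n_atoms=3))
```

The resulting dict can be wrapped in `pd.DataFrame(...)` if a DataFrame is preferred.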

Here we'll explore the atom features of the ibuprofen molecule, SMILES string `CC(C)CC1=CC=C(C=C1)C(C)C(=O)O`. We'll use the function defined above to get atom-level data.

In [None]:
atom_data = featurize_mol('CC(C)CC1=CC=C(C=C1)C(C)C(=O)O')
atom_data.head()

Now we can use the InteractiveMolecule widget to explore the data attached to each atom.

In [None]:
w = tcw.InteractiveMolecule('CC(C)CC1=CC=C(C=C1)C(C)C(=O)O', data=atom_data)
w

The `smiles` attribute of the widget returns the SMILES string it was created with.

In [None]:
w.smiles