# Quickstart

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/cbouy/mols2grid/blob/master/docs/notebooks/quickstart.ipynb)

The easiest way to use mols2grid is through the `mols2grid.display` function. The input can be a DataFrame, a list of RDKit molecules, or an SDFile.

In [None]:
# uncomment and run if you're on Google Colab
# !pip install rdkit mols2grid
# !wget https://raw.githubusercontent.com/rdkit/rdkit/master/Docs/Book/data/solubility.test.sdf

In [None]:
from pathlib import Path

from rdkit import RDConfig

import mols2grid


SDF_FILE = (
    f"{RDConfig.RDDocsDir}/Book/data/solubility.test.sdf"
    if Path(RDConfig.RDDocsDir).is_dir()
    else "solubility.test.sdf"
)

Let's start with an SDFile (`.sdf` and `.sdf.gz` are both supported):

In [None]:
mols2grid.display(SDF_FILE)

From this interface, you can:

- Make simple text searches using the searchbar on the bottom right
- Make substructure queries by clicking on 🔎 > SMARTS and typing in the searchbar
- Sort molecules by clicking on `Sort by` and selecting a field (click again to reverse the order)
- Select molecules by ticking their checkboxes, then export the selection to a SMILES or CSV file, or copy it to the clipboard (the clipboard option might be blocked depending on how you are running the notebook)

We can also use a pandas DataFrame as input, containing a column of RDKit molecules (specified using `mol_col=...`) or SMILES strings (specified using `smiles_col=...`):

In [None]:
df = mols2grid.sdf_to_dataframe(SDF_FILE)
subset_df = df.sample(50, random_state=0xac1d1c)
mols2grid.display(subset_df, mol_col="mol")
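
The cell above uses `mol_col`. For the `smiles_col` route, a minimal sketch of a suitable DataFrame might look like the following. The column names `SMILES` and `name` and the three molecules are made up for illustration and are not taken from the solubility dataset.

```python
import pandas as pd

# Hypothetical data for illustration: one column of SMILES strings
# plus one extra field that could be shown or searched in the grid
smiles_df = pd.DataFrame(
    {
        "SMILES": ["CCO", "c1ccccc1", "CC(=O)O"],
        "name": ["ethanol", "benzene", "acetic acid"],
    }
)
```

Passing this frame as `mols2grid.display(smiles_df, smiles_col="SMILES")` should then render the same kind of grid as the `mol_col` call.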

A list of RDKit molecules works as well:

In [None]:
mols = subset_df["mol"].to_list()
mols2grid.display(mols)

But the main point of mols2grid is that the widget lets you access your selections from Python afterwards:

In [None]:
mols2grid.get_selection()
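
The selection comes back as a dictionary keyed by the id of each selected molecule in the grid. A small sketch of working with such a dictionary follows, using a hand-written stand-in rather than a real selection. The id-to-SMILES layout and the three entries are assumptions for illustration.

```python
# Stand-in for the result of mols2grid.get_selection(); the contents are
# made up, and the {id: SMILES} layout is assumed for illustration
selection = {7: "CC(=O)O", 0: "CCO", 3: "c1ccccc1"}

# Sort by grid id so the result does not depend on click order
positions = sorted(selection)
smiles = [selection[pos] for pos in positions]

print(positions)
print(smiles)
```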

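If the grid was built from a DataFrame, you can get the rows corresponding to your selection by indexing that same DataFrame by position (here `subset_df`, which is also where the list of molecules shown in the last grid came from):

In [None]:
subset_df.iloc[list(mols2grid.get_selection().keys())]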
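
Note that `.iloc` selects rows by position, not by index label, which matters once a DataFrame has been sampled or filtered. The sketch below shows the difference with a toy DataFrame and a hand-written list of selected positions, both hypothetical. It assumes the selection keys are positions in the displayed grid.

```python
import pandas as pd

# Toy DataFrame with shuffled, non-contiguous index labels,
# similar to what df.sample() leaves behind
toy_df = pd.DataFrame({"name": ["a", "b", "c", "d"]}, index=[42, 7, 19, 3])

# Stand-in for list(mols2grid.get_selection().keys())
selected_positions = [0, 2]

# .iloc is positional: it picks the first and third rows, whatever their labels
picked = toy_df.iloc[selected_positions]
print(picked["name"].tolist())
```

Under that assumption, the selection should be applied to the same DataFrame (or list) that was passed to `display`.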

Finally, you can save the grid as a standalone HTML document. Simply replace `display` with `save` and pass the path to the output file as `output="path/to/molecules.html"`:

In [None]:
mols2grid.save(mols, output="quickstart-grid.html")